A Stereotypes-Based Hybrid Recommender System for Media Items Guy Shani and Amnon Meisles and Yan gleyzer Department of Computer Science Ben Gurion University, Israel (shanigu, am, ygleyzer]@cs. bgu ac il Lior rokach and david ben shime Department of Computer Science, Department of Information Systems Engineering Ben Gurion University, Israel dliorrk, dudibs@bguacil Modern systems avoid querying the database(of either user-item ratings of item or item descriptions) directly and Many Recommender Systems use either Collaborative Filter- ing(CF)or Content-Based (CB)techniques to receive recom- adopt a statistical model(e.g. Decision Tree, SVD ma- mendations for products. Both approaches have advantages trix, Dependency Network) to allow scaling up to millions and weaknesses. Combining the two approaches together can of users and items. Stereotypes(also known as commu- vercome most weaknesses. However, most hybrid systems nities)are a way to define an abstract user that has gen- combine the two methods in an ad-hoc manner eral properties similar to a set (community) of real users In this paper we present an hybrid approach for recommen- Stereotypes are used in recommender systems for varying dations, where a user profile is a weighted combination of purposes, ranging from initial user profile creation to gene user stereotypes, created automatically through a clustering ating recommendations(Montaner et al. 2003) process. Each stereotype is defined by an ontology of item All methods use some type of a user profile attributes. Our approach provides good recommendations for model) for recommendation. CF systems usually items that were rated in the past and is also able to handle new a vector of rated items while CB systems maintain n a rated ems that were never observed by the system. set of item attributes. We suggest creating a set of stereo- Our algorithm is implemented in a commercial system fo ype content-based profiles and using an affinity vector of recommending media items. The system is envisioned to unction as personalized media(audio, video, print)service stereotypes as the user profile within mobile phones, online media portals, sling boxes, n this paper we present a recommendation system, tc. It is currently under development within Deutsche called MediaScout, that is currently being implemented for Telekom Laboratories- Innovations of Integrated Communi- Deutsche Telekom Laboratories and intended to be used cation projects mobile phones in Germany for recommending media items such as clips and movie trailers for viewing over the cellular phone. We therefore use media data as examples in this pa Introduction per but note that our system can be used for other purposes as Recommender systems systems that recommend item well. The system is designed to allow mobile, portal or sling to users -can be found in many modern web sites for box users to obtain most appropriate content(both based on various applications such as helping users find web pages ser preferences). As an example, MediaScout provides an entertain me option for a user(e.g, waiting for a bus), which e-commerce sites, recommending TV programs to users of when selected delivers content that fits user's personal pref- interactive TV and showing personalized advertisements. There are two dominating approaches(see e.g. Mon We suggest to classify new users to clusters through an aner(Montaner et al. 2003)to creating recommendation interactive questionnaire, generated automatically from the stems. The Collaborative Filtering(CF) approach con stereotypes after each update. Existing are automat siders the recommended items only by a unique identifier cally classified to new stereotypes through the update pro and recommends items that were purchased together, igi cess and do not need to undergo the questionnaire again ing any attribute of the item. Content-Based(CB) recom endations are generated based on an item profile -a set Recommender Systems of attributes of an item-discarding purchase information Each of these methods has its pros and cons but it seems that With the explosion of data available online, recommender a hybrid approach can overcome most of the disadvantages systems became very popular. While there are many types of the two methods of recommender systems ranging from manually predefined un-personalized recommendations to fully automatic gen Copyright 2007, Association for the Advancement of Artificial eral purpose recommendation engines, two dominating ap Intelligence(www.aaai.org).Allrightsreserved proaches have emerged-Collaborative Filtering and Con-
A Stereotypes-Based Hybrid Recommender System for Media Items Guy Shani and Amnon Meisles and Yan Gleyzer Department of Computer Science Ben Gurion University, Israel {shanigu,am,ygleyzer}@cs.bgu.ac.il Lior Rokach and David Ben-Shimon Department of Computer Science, Department of Information Systems Engineering Ben Gurion University, Israel {liorrk,dudibs}@bgu.ac.il Abstract Many Recommender Systems use either Collaborative Filtering (CF) or Content-Based (CB) techniques to receive recommendations for products. Both approaches have advantages and weaknesses. Combining the two approaches together can overcome most weaknesses. However, most hybrid systems combine the two methods in an ad-hoc manner. In this paper we present an hybrid approach for recommendations, where a user profile is a weighted combination of user stereotypes, created automatically through a clustering process. Each stereotype is defined by an ontology of item attributes. Our approach provides good recommendations for items that were rated in the past and is also able to handle new items that were never observed by the system. Our algorithm is implemented in a commercial system for recommending media items. The system is envisioned to function as personalized media (audio, video, print) service within mobile phones, online media portals, sling boxes, etc. It is currently under development within Deutsche Telekom Laboratories - Innovations of Integrated Communication projects. Introduction Recommender Systems — systems that recommend items to users — can be found in many modern web sites for various applications such as helping users find web pages that interest them, recommending products to customers in e-commerce sites, recommending TV programs to users of interactive TV and showing personalized advertisements. There are two dominating approaches (see e.g. Montaner (Montaner et al. 2003)) to creating recommendation systems. The Collaborative Filtering (CF) approach considers the recommended items only by a unique identifier and recommends items that were purchased together, ignoring any attribute of the item. Content-Based (CB) recommendations are generated based on an item profile — a set of attributes of an item — discarding purchase information. Each of these methods has its pros and cons but it seems that a hybrid approach can overcome most of the disadvantages of the two methods. Copyright c 2007, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Modern systems avoid querying the database (of either user-item ratings of item or item descriptions) directly and adopt a statistical model (e.g. Decision Tree, SVD matrix, Dependency Network) to allow scaling up to millions of users and items. Stereotypes (also known as communities) are a way to define an abstract user that has general properties similar to a set (community) of real users. Stereotypes are used in recommender systems for varying purposes, ranging from initial user profile creation to generating recommendations (Montaner et al. 2003). All methods use some type of a user profile (or user model) for recommendation. CF systems usually maintain a vector of rated items while CB systems maintain a rated set of item attributes. We suggest creating a set of stereotype content-based profiles and using an affinity vector of stereotypes as the user profile. In this paper we present a recommendation system, called MediaScout, that is currently being implemented for Deutsche Telekom Laboratories and intended to be used in mobile phones in Germany for recommending media items such as clips and movie trailers for viewing over the cellular phone. We therefore use media data as examples in this paper but note that our system can be used for other purposes as well. The system is designed to allow mobile, portal or sling box users to obtain most appropriate content (both based on user preferences). As an example, MediaScout provides an entertain me option for a user (e.g., waiting for a bus), which when selected delivers content that fits user’s personal preferences through her mobile phone. We suggest to classify new users to clusters through an interactive questionnaire, generated automatically from the stereotypes after each update. Existing users are automatically classified to new stereotypes through the update process and do not need to undergo the questionnaire again. Recommender Systems With the explosion of data available online, recommender systems became very popular. While there are many types of recommender systems ranging from manually predefined un-personalized recommendations to fully automatic general purpose recommendation engines, two dominating approaches have emerged - Collaborative Filtering and Con- 76
tent Based recommendations may not be able to direct the user towards items she is un aware of but may like Collaborative Filtering Nevertheless, CB systems can easily provide valid recom- mendations to new users, assuming that their profile is spec ing for recommendations often ask for the advise of friends. ified, even if they never used the system before. CB engines Over the internet the population that can supply advises is can provide recommendations for new items that were never very large. The problem hence shifts into identifying what rated before based on the item description and are therefore of this population is relevant for the current very useful in environments where new items are constantl CF methods identify similarity between users based on added items they have rated and recommend new items similar users have liked. CF algorithms vary by the method they use Hybrid approaches to identify similar users. Originally Nearest-Neighbor ap The disadvantages CF and CB can be reduced by combin proaches based on the Pearson Correlation, computing sim- ing the two approaches into a hybrid method (Burke 2002) larity between users directly over the database of user-item Many hybrid approaches use two recommendation algo- ratings were implemented. Modern systems tend to learn rithms and combine their results in some manner, such as some statistical model from the database and then use it for ng the results by their relet recommending previously rated items to a new audience of the two algorithms, switching from CB into CF once the Model-based approaches usually sacrifice some accuracy in cold-start phase is over, or using the output of one algorithm favor of a rapid recommendation generation process( Breese as input to the second algorithm. et al. 1998), better scaling up to modern applications For example, the PVT system(Smyth and Cotter 1999) The main advantage of CF is that it is independent of the that operates in a similar setting of online recommendations specification of the item and can therefore provide recom to TV shows, maintains content-based user profiles and op mendations for complex items which are very different yet erates two distinct recommendation engines. The CB en- are often used together. The major drawback of this ap- gine recommends based on program similarity to the pro- proach is the inability to create good recommendations for grams the user has liked so far, and the CF engine finds the new users that have not yet rated many items, and for new k nearest neighbors based on user profile similarity and rec items that were not rated by many users(known as the cold- ommends programs that the neighbors liked. The final list start problem) of recommendations is then created by combining the two lists Content-Based recommendation It seems that a more appropriate combination would be to The ideas of content-based( CB)recommendations originate create an algorithm that is by nature a hybrid of CF and CB in the field of information filtering. where documents are not an ad-hoc combination of two independent algorithms searched given some analysis of their text Items are hence Stereotypes defined by a set of features or attributes. Such systems de- fine a user using preferences over this set of features, and Modeling users by stereotypes(or communities) is a well obtain recommendations by matching user profiles and item studied concept(Rich 1998). Stereotypes are a generaliza profiles looking for best matches. Some researchers(Mon tion of users-an abstract user entity that provides a general taner et al. 2003) separate methods that learn preferred at description for a set of similar users(a community) tributes from rated items(called content-based) from meth In CF systems stereotypes are described by a set of rat ods that ask the user to specify her preferences over item ings over items, and user similarity can be identified by attributes(called demographic filtering), but we refer to all their affinity to various stereotypes. In CB systems stereo- methods that recommend based on item attribute preferences types are a set of preferences over item attributes, and users content-based recommendation can belong to a single stereotype(Rich 1998)or to multi Content-based approaches rarely learn statistical models ple stereotypes(Orwant 1995).Recommendations are e co nd usually match user profiles and item profiles directly puted based on the stereotype and then normalized given the User and item profiles are very sensitive to profile definitions user affinity to a stereotype which attributes are relevant and which attributes should be ignored. It is also difficult to create an initial profile of Feedback the user, specifying the interests and preferences of the user In order to adapt and refine recommendations to changes in Users are reluctant to provide thorough descriptions of the user tastes, most recommender systems rely on some mech ings they like and do not like. It is also possible that users anism of feedback from users. Feedback is usually in the re unaware of their preferences. For example, a user cannot form of a rating over an item that can be either numeric(on acquisition of user preferences is usually considered a bot As users are usually reluctant to rate items explicitly, tleneck for the practical use of these systems. Content-based some research focused on obtaining implicit ratings-es recommendations may also result in very expected items and timating the user ratings through her observable operations Some researchers (e.g. (Burke 2002) further divide these scrolled down the article, or clicked on a link inside the arti- classes. but we restrict ourselves to the definitions below cle. then we can assume that the article was useful for her. If
tent Based recommendations1 . Collaborative Filtering Collaborative filtering stems from the idea that people looking for recommendations often ask for the advise of friends. Over the internet the population that can supply advises is very large. The problem hence shifts into identifying what part of this population is relevant for the current user. CF methods identify similarity between users based on items they have rated and recommend new items similar users have liked. CF algorithms vary by the method they use to identify similar users. Originally Nearest-Neighbor approaches based on the Pearson Correlation, computing similarity between users directly over the database of user-item ratings were implemented. Modern systems tend to learn some statistical model from the database and then use it for recommending previously rated items to a new audience. Model-based approaches usually sacrifice some accuracy in favor of a rapid recommendation generation process (Breese et al. 1998), better scaling up to modern applications. The main advantage of CF is that it is independent of the specification of the item and can therefore provide recommendations for complex items which are very different yet are often used together. The major drawback of this approach is the inability to create good recommendations for new users that have not yet rated many items, and for new items that were not rated by many users (known as the coldstart problem). Content-Based recommendation The ideas of content-based (CB) recommendations originate in the field of information filtering, where documents are searched given some analysis of their text. Items are hence defined by a set of features or attributes. Such systems de- fine a user using preferences over this set of features, and obtain recommendations by matching user profiles and item profiles looking for best matches. Some researchers (Montaner et al. 2003) separate methods that learn preferred attributes from rated items (called content-based) from methods that ask the user to specify her preferences over item attributes (called demographic filtering), but we refer to all methods that recommend based on item attribute preferences as content-based recommendation. Content-based approaches rarely learn statistical models and usually match user profiles and item profiles directly. User and item profiles are very sensitive to profile definitions — which attributes are relevant and which attributes should be ignored. It is also difficult to create an initial profile of the user, specifying the interests and preferences of the user; Users are reluctant to provide thorough descriptions of the things they like and do not like. It is also possible that users are unaware of their preferences. For example, a user cannot know whether she likes an actor she never seen. In fact the acquisition of user preferences is usually considered a bottleneck for the practical use of these systems. Content-based recommendations may also result in very expected items and 1 Some researchers (e.g. (Burke 2002)) further divide these classes, but we restrict ourselves to the definitions below. may not be able to direct the user towards items she is unaware of but may like. Nevertheless, CB systems can easily provide valid recommendations to new users, assuming that their profile is specified, even if they never used the system before. CB engines can provide recommendations for new items that were never rated before based on the item description and are therefore very useful in environments where new items are constantly added. Hybrid approaches The disadvantages CF and CB can be reduced by combining the two approaches into a hybrid method (Burke 2002). Many hybrid approaches use two recommendation algorithms and combine their results in some manner, such as combining the results by their relevance, mixing the output of the two algorithms, switching from CB into CF once the cold-start phase is over, or using the output of one algorithm as input to the second algorithm. For example, the PVT system (Smyth and Cotter 1999) that operates in a similar setting of online recommendations to TV shows, maintains content-based user profiles and operates two distinct recommendation engines. The CB engine recommends based on program similarity to the programs the user has liked so far, and the CF engine finds the k nearest neighbors based on user profile similarity and recommends programs that the neighbors liked. The final list of recommendations is then created by combining the two lists. It seems that a more appropriate combination would be to create an algorithm that is by nature a hybrid of CF and CB, not an ad-hoc combination of two independent algorithms. Stereotypes Modeling users by stereotypes (or communities) is a well studied concept (Rich 1998). Stereotypes are a generalization of users — an abstract user entity that provides a general description for a set of similar users (a community). In CF systems stereotypes are described by a set of ratings over items, and user similarity can be identified by their affinity to various stereotypes. In CB systems stereotypes are a set of preferences over item attributes, and users can belong to a single stereotype (Rich 1998) or to multiple stereotypes (Orwant 1995). Recommendations are computed based on the stereotype and then normalized given the user affinity to a stereotype. Feedback In order to adapt and refine recommendations to changes in user tastes, most recommender systems rely on some mechanism of feedback from users. Feedback is usually in the form of a rating over an item that can be either numeric (on a scale of, e.g., 1 to 5) or binary (like/dislike). As users are usually reluctant to rate items explicitly, some research focused on obtaining implicit ratings — estimating the user ratings through her observable operations. For example, in the domain of web browsing, if the user scrolled down the article, or clicked on a link inside the article, then we can assume that the article was useful for her. If 77
the user, however, only read the title and then went back to Methods that ask the user to specify her preferences over the former page, we can assume that the web page was not item attributes are also known as preference elicitation or useful preference based search. Viappiani et al.(Paolo Viappi ani and Evaluating 2006) describes a preference elicitation MediaScout method and an example-based critiquing, that avoids the problems of preference fluency, domain knowledge and user This paper presents the MediaScout system, designed to de effort liver media content recommendations over mobile phones. The system uses a stereotype approach combining elements from both content-based and collaborative filtering ap- proaches. We explain below how the system is constructed Select the movie you prefer how new users are introduced to the system, how recom- mendations are generated and how to update the model We take the content-based approach here, defining an onto ogy over media items, defined by an expert in the field. It is reasonable to assume that an expert will be able to identify movie to see. A media item profile is an instantiation of this ontology, and a stereotype profile assigns relevance values Pulp Fiction Armageddon Shrek 2 for various attribute values of the ontology. For example, a movie profile may have Bruce Willis and Samuel L Jackson Submit s actors, and a stereotype profile may assign to Bruce willis as an actor the value 0.97 and to mel gibson as an actor the value 0.85 while assigning to Mel Gibson as a director the Figure 1: MediaScout questionnaire example value 0.67 Receiving recommendations for stereotypes can be done We propose to combine these two appi by matching item profiles with the stereotype profile, result ing in relevance values over media items ing the user a set of simple questions, such as whether she A user in our domain is modeled by an affinity list of likes an actor, picking a favored movie out of a set of 2 or 3 movies. An example of such a question can be seen stereotypes.A user may belong for example to one stereo- Figure 1. We choose an anytime approach-the user may type with relevance 0. 8 and to another stereotype with rele answer as many questions as she likes. Whenever the user vance 0.7 stops answering questions we have some classification of the user into stereotypes. The advantages of the anytime ap Initialization proach have been noted in the past(Pearl Pu and Torrens. Our initial stereotypes are manually defined by an expert An expert in the field of movies is able to identify several Our approach is based on a decision tree, with questions at types of movie watchers, such as people who like action the decision nodes. Each node, both leaves and inner nodes, movies and people who prefer Sci-Fi. Identifying the rel is assigned an affinity vector of stereotypes. If the user does evant actors directors and other attributes of these stereo- not wish to answer more questions, the current affinity vec types will also be done by the expert. tor will be assigned to her. The more questions the user When a new user registers into the system we need to cre answers the more specific her affinity vector becomes ate an affinity vector of stereotypes for her. Research in the rea has mainly focused on using a set of examples(ega Recommendations number of movies the user likes)or through a form spec Once the system obtains a user profile definition in the form ifying the user interests. Such approaches are problematic of an affinity vector, we can generate recommendations for while rating observed movies is a painless process, us this user, based on the relevant stereotypes ng only a set of rated movies can cause the system to later First, we need to compute recommendations for the recommend only movies similar to the ones the user rated. stereotypes. As a stereotype describes a content-based pro- Asking users to fill lengthy forms is usually considered a te file, explicitly stating preferences over the possible values of dious chore and users tend to either avoid it or answer ques item attributes, we activate a matching engine that compute tions arbitrarily(e. g. always picking the first answer) the relevance of a media item to a stereotype. As the num Smyth and Cotter(Smyth and Cotter 1999), for example, ber of stereotypes is not expected to be too high, these lists request a new user to fill a form specifying the features she can be persistent in the database. Moreover, it is expected likes and programs she likes and dislikes. They note that that many items will have low relevance to stereotypes users were willing to fill the easier parts of the part but re the lists can be truncated after some threshold luctant to look for programs within the lists Once a request for a recommendation for user u with
the user, however, only read the title and then went back to the former page, we can assume that the web page was not useful. MediaScout This paper presents the MediaScout system, designed to deliver media content recommendations over mobile phones. The system uses a stereotype approach combining elements from both content-based and collaborative filtering approaches. We explain below how the system is constructed, how new users are introduced to the system, how recommendations are generated and how to update the model. Stereotype Model We take the content-based approach here, defining an ontology over media items, defined by an expert in the field. It is reasonable to assume that an expert will be able to identify the key features relevant for people when they choose which movie to see. A media item profile is an instantiation of this ontology, and a stereotype profile assigns relevance values for various attribute values of the ontology. For example, a movie profile may have Bruce Willis and Samuel L. Jackson as actors, and a stereotype profile may assign to Bruce Willis as an actor the value 0.97 and to Mel Gibson as an actor the value 0.85 while assigning to Mel Gibson as a director the value 0.67. Receiving recommendations for stereotypes can be done by matching item profiles with the stereotype profile, resulting in relevance values over media items. A user in our domain is modeled by an affinity list of stereotypes. A user may belong for example to one stereotype with relevance 0.8 and to another stereotype with relevance 0.7. Initialization Our initial stereotypes are manually defined by an expert. An expert in the field of movies is able to identify several types of movie watchers, such as people who like action movies and people who prefer Sci-Fi. Identifying the relevant actors, directors and other attributes of these stereotypes will also be done by the expert. When a new user registers into the system we need to create an affinity vector of stereotypes for her. Research in the area has mainly focused on using a set of examples (e.g. a number of movies the user likes) or through a form specifying the user interests. Such approaches are problematic — while rating observed movies is a painless process, using only a set of rated movies can cause the system to later recommend only movies similar to the ones the user rated. Asking users to fill lengthy forms is usually considered a tedious chore and users tend to either avoid it or answer questions arbitrarily (e.g. always picking the first answer). Smyth and Cotter (Smyth and Cotter 1999), for example, request a new user to fill a form specifying the features she likes and programs she likes and dislikes. They note that users were willing to fill the easier parts of the part but reluctant to look for programs within the lists. Methods that ask the user to specify her preferences over item attributes are also known as preference elicitation or preference based search. Viappiani et al. (Paolo Viappiani and Evaluating 2006) describes a preference elicitation method and an example-based critiquing, that avoids the problems of preference fluency, domain knowledge and user effort. Figure 1: MediaScout questionnaire example. We propose to combine these two approaches by asking the user a set of simple questions, such as whether she likes an actor, picking a favored movie out of a set of 2 or 3 movies. An example of such a question can be seen in Figure 1. We choose an anytime approach — the user may answer as many questions as she likes. Whenever the user stops answering questions we have some classification of the user into stereotypes. The advantages of the anytime approach have been noted in the past (Pearl Pu and Torrens. ). Our approach is based on a decision tree, with questions at the decision nodes. Each node, both leaves and inner nodes, is assigned an affinity vector of stereotypes. If the user does not wish to answer more questions, the current affinity vector will be assigned to her. The more questions the user answers the more specific her affinity vector becomes. Recommendations Once the system obtains a user profile definition in the form of an affinity vector, we can generate recommendations for this user, based on the relevant stereotypes. First, we need to compute recommendations for the stereotypes. As a stereotype describes a content-based pro- file, explicitly stating preferences over the possible values of item attributes, we activate a matching engine that computes the relevance of a media item to a stereotype. As the number of stereotypes is not expected to be too high, these lists can be persistent in the database. Moreover, it is expected that many items will have low relevance to stereotypes so the lists can be truncated after some threshold. Once a request for a recommendation for user u with 78
affinity vector v is received, we compute the relevance of nedia item i to user u as follows Page 1 of 1 Media info relevance(, u)=2 v(s)relevance(i,s)(I) yu hanus where relevance(i, s)is the persisted relevance of item i make ft furry so i hope you to stereotype s. Note that this process is much faster than matching engine, and therefore can scale up much betle the rf you lke it tell matching each user with all items in the database a parton unexpected content. It is useful to try and explore the user preferences by presenting new media content that the user is unaware of, as many users cannot accurately define what they like(e. g.(ten Hagen et al. 2003; Ziegler et al. 2005)). Presenting to the user a list of 5 recommendations we can both maintain a high level of acceptance and explore, by al ways fixing the first 3 recommendations to the most likely items as predicted by Equation 1, and exploring over the Figure 2: MediaScout recommendation list 2 items only. Exploration is done over two different axes using an E-greedy exploration we select items that have lower relevance given the relevance list. More exploration Affinity vector update A rapid online update, executed is done by sometimes randomly boosting the relevance of a after each feedback received from the user. random stereotype in the relevance equation. Figure la il- When a feedback from a user(either positive or negative strates the list of recommendations that the user gets when is received, we attempt to update the affinity vector accord she presses the" EntertainMe"button ngly. We look for the rated item in the relevance lists of the stereotypes. If an item rated positively is relevant to a stereo- Feedbacks type, we boost the relevance of the stereotype to the user. If Our system supports both positive and negative feedbacks an item was rated negatively by the user, we lower the rel As we believe that users find it easier to specify binary, evance of stereotypes that recommended it to the user. We like/dislike feedbacks rather than numeric values we use use a decaying A learning factor so that the user affinity vec binary feedback, but our system can be easily adapted for tor will eventually converge. a is reset once the stereotype numeric feedbacks as well model is rebuilt(see below) We use two different types of feedbacks -explicit and Recommendation lists recomputation Due to the addi- implicit ratings. For explicit ratings the user can select while tion of new items into the database, it is required to recom atching a media item whether she likes or dislikes it. When pute the persisted recommendation lists of the stereotypes a user is presented with a list of 5 items and selects the 3rd We compute new relevance values for new items, and merge item we assume that she did not like the first two items and them into the persisted lists of recommendations. As we pre- notify the system of a negative response for the first two fer to recommend to the user new items we also decrease the aware whether the user watched the item fully or decided to existing item relevance is unneeded that ra constant factor items. Our media content domain uses streaming technol- relevance of items currently in the lists by a constant factor ogy to show media content to users. The server is therefore known as the items). Note that recomputation of stop after a few seconds. If the user decided to stop watch The frequency of this recomputation depends on the ap ing after a few seconds we assume that she did not like the arance of new items in the database. In our case it is suf media item and the system is notified of a negative rating ficient to execute the recomputation only once a day. After a user requests the MediaScout for recommenda Stereotype model reconstruction As the initial stereo- ions, the system displays a list of recommendations. When type model is created by an expert and the initial user affin the user selects a recommendation she may view it(pressing ty vectors are manually created through a questionnaire, theplay button) and provide positive or negative feedback we introduce an automatic stereotype discovering phase de using the the combobox below(see Figure 1b) signed to find new stereotypes automatically and compute The gathered feedbacks(implicit and explicit) are used for new affinity vectors for users. This automatic construction both model update methods we discuss in the next section ignores the current user model of affinity vectors of stereo- types Model Update To create new stereotypes we use a clustering algorithm In most environments the relevant items list for a user need Cluster-analysis is a process that receives as input a set of to be updated every so often due to the insertion of new objects, user profiles in our case, and generates clusters items, changes in the information we have over the user (groups) of similar users. The users within a group have (through feedback for example) and general new trends of high similarity and the users between groups are less simi- er tastes. Our system implements three update phases de lar. The number of clusters can be manually predefined or signed to refine the stereotype model can be automatically induced by the clustering algorithm
affinity vector v is received, we compute the relevance of media item i to user u as follows: relevance(i, u) = X s∈stereotypes v(s)relevance(i, s) (1) where relevance(i, s) is the persisted relevance of item i to stereotype s. Note that this process is much faster than matching each user with all items in the database using the matching engine, and therefore can scale up much better. We would also like to surprise the user sometimes with unexpected content. It is useful to try and explore the user preferences by presenting new media content that the user is unaware of, as many users cannot accurately define what they like (e.g. (ten Hagen et al. 2003; Ziegler et al. 2005)). Presenting to the user a list of 5 recommendations we can both maintain a high level of acceptance and explore, by always fixing the first 3 recommendations to the most likely items as predicted by Equation 1, and exploring over the last 2 items only. Exploration is done over two different axes — using an ǫ-greedy exploration we select items that have lower relevance given the relevance list. More exploration is done by sometimes randomly boosting the relevance of a random stereotype in the relevance equation. Figure 1a illustrates the list of recommendations that the user gets when she presses the ”EntertainMe” button. Feedbacks Our system supports both positive and negative feedbacks. As we believe that users find it easier to specify binary, like/dislike feedbacks rather than numeric values, we use binary feedback, but our system can be easily adapted for numeric feedbacks as well. We use two different types of feedbacks — explicit and implicit ratings. For explicit ratings the user can select while watching a media item whether she likes or dislikes it. When a user is presented with a list of 5 items and selects the 3rd item we assume that she did not like the first two items, and notify the system of a negative response for the first two items. Our media content domain uses streaming technology to show media content to users. The server is therefore aware whether the user watched the item fully or decided to stop after a few seconds. If the user decided to stop watching after a few seconds we assume that she did not like the media item and the system is notified of a negative rating. After a user requests the MediaScout for recommendations, the system displays a list of recommendations. When the user selects a recommendation she may view it (pressing the ”play” button) and provide positive or negative feedback using the the combobox below (see Figure 1b). The gathered feedbacks (implicit and explicit) are used for both model update methods we discuss in the next section. Model Update In most environments the relevant items list for a user need to be updated every so often due to the insertion of new items, changes in the information we have over the user (through feedback for example) and general new trends of user tastes. Our system implements three update phases designed to refine the stereotype model. (a) (b) Figure 2: MediaScout recommendation list. Affinity vector update A rapid online update, executed after each feedback received from the user. When a feedback from a user (either positive or negative) is received, we attempt to update the affinity vector accordingly. We look for the rated item in the relevance lists of the stereotypes. If an item rated positively is relevant to a stereotype, we boost the relevance of the stereotype to the user. If an item was rated negatively by the user, we lower the relevance of stereotypes that recommended it to the user. We use a decaying λ learning factor so that the user affinity vector will eventually converge. λ is reset once the stereotype model is rebuilt (see below). Recommendation lists recomputation Due to the addition of new items into the database, it is required to recompute the persisted recommendation lists of the stereotypes. We compute new relevance values for new items, and merge them into the persisted lists of recommendations. As we prefer to recommend to the user new items, we also decrease the relevance of items currently in the lists by a constant factor (known as ’aging’ the items). Note that recomputation of existing item relevance is unneeded. The frequency of this recomputation depends on the appearance of new items in the database. In our case it is suf- ficient to execute the recomputation only once a day. Stereotype model reconstruction As the initial stereotype model is created by an expert and the initial user affinity vectors are manually created through a questionnaire, we introduce an automatic stereotype discovering phase designed to find new stereotypes automatically and compute new affinity vectors for users. This automatic construction ignores the current user model of affinity vectors of stereotypes. To create new stereotypes we use a clustering algorithm. Cluster-analysis is a process that receives as input a set of objects, user profiles in our case, and generates clusters (groups) of similar users. The users within a group have high similarity and the users between groups are less similar. The number of clusters can be manually predefined or can be automatically induced by the clustering algorithm. 79
In hard clustering, the objects are divided into crisp clus- use them to define new users. New users will not have affin- ters, where each object belongs to exactly one cluster. In ly to the automatically generated stereotypes until they will soft( fuzzy)clustering, the objects belong to more than one undergo a model reconstruction phase. Alternatively one cluster, and associated with each of the objects are member- can automatically learn a new decision tree for the question- ship grades which indicate the degree to which the object naire from the preferences database by using tree induction belongs to the different clusters. For each cluster, a central algo stereotype profile(centroid) is generated. The centroid is a weighted average of the users' profiles that belongs to the Discussion cluster based on their membership grades Our system combines features from various approaches to In this application e the FCM(Fuzzy C-Means)al- recommendations. Our initial user affinity vector construc gorithm(e.g.(Kolen and Hutcheson 2002)), mainly due to tion is content-based oriented in that we try to identify at its efficient implementation that scales up well to a large tributes of the content of the media item(.g. actors, direc number of users. Moreover the FCM uses the probabilistic tor, genre) that the user prefers constraint that the memberships of an object across clusters Our heavy update incorporates aspects of content-based sum to 1. This constraint suits our approach for weighting (CB)and collaborative filtering(CF). Merging media item stereotypes. attributes to create a user profile of preferred attributes of Instead of using the usual distance metric of fCm. we media elements is classic CB, while clustering users to- suggest to evaluate the similarity of two users with the Pear- gether is more CF oriented. When we create the stereotyp ons Correlation metric(see, e.g.(Breese et al. 1998)), through clustering we average over a number of users. the that compares the similarity of ratings for items(positive resulting recommendations are not based only on items the and negative). Our experiments show it to produce superior user has seen and liked, but also on items that similar users esults to standard vector distance metrics such as l have seen and liked, which is a CF approach to recommen The heavy update phase includes four steps dation User Representation Construction: for each user we com- The task of classifying new users to stereotypes is a ma- ute a profile similar to the stereotype profiles-a list of jor concern for us. We ask users to go through a question- preferred values for each possible attribute in the onto- naire, providing answers to questions to provide information ogy. This profile is automatically computed by observing as to their preferences. Our approach is anytime, allowing of an attribute of an item. we add the value into the user If a user stops anywhere within the questionnaire, we still profile. For example, if the user has liked a movie with have a classification of the user to stereotypes. Bruce Willis, we add Bruce Willis to the list of actors pre Our questionnaire is also easy to use: Questions are short ferred by the user. This can be thought of as merging the and require the user to select one of a handful of items(e. g edia items together into a single profile. This profile i a movie), or attribute values(e.g. and actor). Answers are used only for the update process, not for computing rec also provided through images of the available options that ommendations for the user. the user has to click upon The disadvantages of the various approaches are solved Clustering User Profiles: we now use a clustering algo rithm to create clusters of similar users using a distance The difficulty of having the user properly specify her pref- metric over the similarity of user profiles computed in the erences is eased through our interactive any-time question- former step. We use a soft-clustering algorithm, resulting naire. Moreover, even if the profile is inaccurate, it will be a list of cluster centroids and for each user. its distances refined through ou updates to better refle from the centroids of clusters he user preferences. The common problem of CB systems Stereotype Construction: we use each cluster as a new that can recommend to the user only items similar to stereotype. To create the stereotype profile, we merge ones she has seen and liked does not exist here because u users that are close to the cluster centroid more than a also receive recommendations based on data regarding predefined threshold. A cluster(stereotype) profile is de erences of similar users fined by the attribute values the users close to its centroi have favored. This merging is weighted by the distance Experimental Evaluation of users from the centroid so that closer users have higher Unfortunately, we cannot yet report results from our accep ance tests. To examine the predictive power of the proposed Affinity Vector Reconstruction: for each user, we define algorithm, we run a prediction test comparing our approach her new affinity vector as the membership grades to the to a number of other algorithms to show its capabilities clusters. Note that these grades are already computed by FCM and there is no need for additional computations Dataset used The user shall no longer refer to the manually gener- In order to make our results reproducible, we provide here ated stereotypes, only to the new, automatically generated Lens-database, consisting of 5868 users who rated at least We still maintain the manually generated stereotypes and 2www.movielens.org
In hard clustering, the objects are divided into crisp clusters, where each object belongs to exactly one cluster. In soft (fuzzy) clustering, the objects belong to more than one cluster, and associated with each of the objects are membership grades which indicate the degree to which the object belongs to the different clusters. For each cluster, a central stereotype profile (centroid) is generated. The centroid is a weighted average of the users’ profiles that belongs to the cluster based on their membership grades. In this application we use the FCM (Fuzzy C-Means) algorithm (e.g. (Kolen and Hutcheson 2002)), mainly due to its efficient implementation that scales up well to a large number of users. Moreover the FCM uses the probabilistic constraint that the memberships of an object across clusters sum to 1. This constraint suits our approach for weighting stereotypes. Instead of using the usual distance metric of FCM, we suggest to evaluate the similarity of two users with the Pearson’s Correlation metric (see, e.g. (Breese et al. 1998)), that compares the similarity of ratings for items (positive and negative). Our experiments show it to produce superior results to standard vector distance metrics such as L2. The heavy update phase includes four steps: • User Representation Construction: for each user we compute a profile similar to the stereotype profiles — a list of preferred values for each possible attribute in the ontology. This profile is automatically computed by observing the list of media items rated positively. For each value of an attribute of an item, we add the value into the user profile. For example, if the user has liked a movie with Bruce Willis, we add Bruce Willis to the list of actors preferred by the user. This can be thought of as merging the media items together into a single profile. This profile is used only for the update process, not for computing recommendations for the user. • Clustering User Profiles: we now use a clustering algorithm to create clusters of similar users using a distance metric over the similarity of user profiles computed in the former step. We use a soft-clustering algorithm, resulting in a list of cluster centroids and for each user, its distances from the centroids of clusters. • Stereotype Construction: we use each cluster as a new stereotype. To create the stereotype profile, we merge users that are close to the cluster centroid more than a predefined threshold. A cluster (stereotype) profile is de- fined by the attribute values the users close to its centroid have favored. This merging is weighted by the distance of users from the centroid so that closer users have higher impact. • Affinity Vector Reconstruction: for each user, we define her new affinity vector as the membership grades to the clusters. Note that these grades are already computed by FCM and there is no need for additional computations. The user shall no longer refer to the manually generated stereotypes, only to the new, automatically generated ones. We still maintain the manually generated stereotypes and use them to define new users. New users will not have affinity to the automatically generated stereotypes until they will undergo a model reconstruction phase. Alternatively one can automatically learn a new decision tree for the questionnaire from the preferences database by using tree induction algorithms. Discussion Our system combines features from various approaches to recommendations. Our initial user affinity vector construction is content-based oriented in that we try to identify attributes of the content of the media item (e.g. actors, director, genre) that the user prefers. Our heavy update incorporates aspects of content-based (CB) and collaborative filtering (CF). Merging media item attributes to create a user profile of preferred attributes of media elements is classic CB, while clustering users together is more CF oriented. When we create the stereotypes through clustering, we average over a number of users. The resulting recommendations are not based only on items the user has seen and liked, but also on items that similar users have seen and liked, which is a CF approach to recommendations. The task of classifying new users to stereotypes is a major concern for us. We ask users to go through a questionnaire, providing answers to questions to provide information as to their preferences. Our approach is anytime, allowing the user to stop answering questions whenever she chooses. If a user stops anywhere within the questionnaire, we still have a classification of the user to stereotypes. Our questionnaire is also easy to use; Questions are short and require the user to select one of a handful of items (e.g. a movie), or attribute values (e.g. and actor). Answers are also provided through images of the available options that the user has to click upon. The disadvantages of the various approaches are solved. The cold start problem is handled using expert knowledge. The difficulty of having the user properly specify her preferences is eased through our interactive any-time questionnaire. Moreover, even if the profile is inaccurate, it will be refined through our light and heavy updates to better reflect the user preferences. The common problem of CB systems that can recommend to the user only items similar to the ones she has seen and liked does not exist here because users also receive recommendations based on data regarding preferences of similar users. Experimental Evaluation Unfortunately, we cannot yet report results from our acceptance tests. To examine the predictive power of the proposed algorithm, we run a prediction test comparing our approach to a number of other algorithms to show its capabilities. Dataset Used In order to make our results reproducible, we provide here the results over a public domain dataset. We used the MovieLens 2 database, consisting of 5868 users who rated at least 2www.movielens.org 80