Francesco Ricci, Lior Rokach and Bracha Shapira

type of promotional message derived by analyzing the data collected by the RS (transactions of the users).

We mentioned above some important motivations as to why e-service providers introduce RSs. But users also may want a RS, if it will effectively support their tasks or goals. Consequently, a RS must balance the needs of these two players and offer a service that is valuable to both.

Herlocker et al. [25], in a paper that has become a classical reference in this field, define eleven popular tasks that a RS can assist in implementing. Some may be considered as the main or core tasks that are normally associated with a RS, i.e., to offer suggestions for items that may be useful to a user. Others might be considered as more “opportunistic” ways to exploit a RS. As a matter of fact, this task differentiation is very similar to what happens with a search engine. Its primary function is to locate documents that are relevant to the user’s information need, but it can also be used to check the importance of a Web page (looking at the position of the page in the result list of a query) or to discover the various usages of a word in a collection of documents.

• Find some good items: Recommend to a user some items as a ranked list along with predictions of how much the user would like them (e.g., on a one- to five-star scale). This is the main recommendation task that many commercial systems address (see, for instance, Chapter 9). Some systems do not show the predicted rating.
• Find all good items: Recommend all the items that can satisfy some user needs. In such cases it is insufficient to just find some good items. This is especially true when the number of items is relatively small or when the RS is mission-critical, such as in medical or financial applications. In these situations, in addition to the benefit derived from carefully examining all the possibilities, the user may also benefit from the RS ranking of these items or from additional explanations that the RS generates.
• Annotation in context: Given an existing context, e.g., a list of items, emphasize some of them depending on the user’s long-term preferences. For example, a TV recommender system might annotate which TV shows displayed in the electronic program guide (EPG) are worth watching (Chapter 18 provides interesting examples of this task).
• Recommend a sequence: Instead of focusing on the generation of a single recommendation, the idea is to recommend a sequence of items that is pleasing as a whole. Typical examples include recommending a TV series; a book on RSs after having recommended a book on data mining; or a compilation of musical tracks [99], [39].
• Recommend a bundle: Suggest a group of items that fits well together. For instance, a travel plan may be composed of various attractions, destinations, and accommodation services that are located in a delimited area. From the point of view of the user, these various alternatives can be considered and selected as a single travel destination [87].
1 Introduction to Recommender Systems Handbook

• Just browsing: In this task, the user browses the catalog without any imminent intention of purchasing an item. The task of the recommender is to help the user to browse the items that are more likely to fall within the scope of the user’s interests for that specific browsing session. This is a task that has also been supported by adaptive hypermedia techniques [23].
• Find credible recommender: Some users do not trust recommender systems, so they play with them to see how good they are at making recommendations. Hence, some systems may also offer specific functions to let users test the system’s behavior, in addition to those just required for obtaining recommendations.
• Improve the profile: This relates to the capability of the user to provide (input) information to the recommender system about what he likes and dislikes. This is a fundamental task that is strictly necessary to provide personalized recommendations. If the system has no specific knowledge about the active user, then it can only provide him with the same recommendations that would be delivered to an “average” user.
• Express self: Some users may not care about the recommendations at all. Rather, what is important to them is that they be allowed to contribute their ratings and express their opinions and beliefs. The user’s satisfaction with that activity can still act as leverage for holding the user tightly to the application (as we mentioned above in discussing the service provider’s motivations).
• Help others: Some users are happy to contribute information, e.g., their evaluation of items (ratings), because they believe that the community benefits from their contribution. This could be a major motivation for entering information into a recommender system that is not used routinely. For instance, with a car RS, a user who has already bought her new car is aware that the rating entered in the system is more likely to be useful for other users than for the next time she buys a car.
• Influence others: In Web-based RSs, there are users whose main goal is to explicitly influence other users into purchasing particular products. As a matter of fact, there are also some malicious users that may use the system just to promote or penalize certain items (see Chapter 25).

As these various points indicate, the role of a RS within an information system can be quite diverse. This diversity calls for the exploitation of a range of different knowledge sources and techniques, and in the next two sections we discuss the data a RS manages and the core technique used to identify the right recommendations.

1.3 Data and Knowledge Sources

RSs are information processing systems that actively gather various kinds of data in order to build their recommendations. Data is primarily about the items to suggest and the users who will receive these recommendations. But, since the data and knowledge sources available for recommender systems can be very diverse, ultimately, whether they can be exploited or not depends on the recommendation
technique (see also Section 1.4). This will become clearer in the various chapters included in this handbook (see in particular Chapter 11).

In general, there are recommendation techniques that are knowledge poor, i.e., they use very simple and basic data, such as user ratings/evaluations for items (Chapters 5 and 4). Other techniques are much more knowledge dependent, e.g., using ontological descriptions of the users or the items (Chapter 3), or constraints (Chapter 6), or social relations and activities of the users (Chapter 19). In any case, as a general classification, data used by RSs refers to three kinds of objects: items, users, and transactions, i.e., relations between users and items.

Items. Items are the objects that are recommended. Items may be characterized by their complexity and their value or utility. The value of an item may be positive if the item is useful for the user, or negative if the item is not appropriate and the user made a wrong decision when selecting it. We note that when a user is acquiring an item she will always incur a cost, which includes the cognitive cost of searching for the item and the real monetary cost eventually paid for the item.

For instance, the designer of a news RS must take into account the complexity of a news item, i.e., its structure, the textual representation, and the time-dependent importance of any news item. But, at the same time, the RS designer must understand that even if the user is not paying for reading news, there is always a cognitive cost associated with searching and reading news items. If a selected item is relevant for the user, this cost is dominated by the benefit of having acquired useful information, whereas if the item is not relevant, the net value of that item for the user, and of its recommendation, is negative.
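The cost-benefit reasoning above can be illustrated with a minimal sketch. The function name `net_value` and the numeric values are assumptions made only for illustration; the text does not prescribe any particular scale, and benefit and costs are simply taken to share one arbitrary unit.

```python
def net_value(benefit: float, cognitive_cost: float, monetary_cost: float = 0.0) -> float:
    """Net value of an acquired item: benefit minus the total cost incurred.

    All quantities are hypothetical and expressed in the same arbitrary unit.
    """
    return benefit - (cognitive_cost + monetary_cost)


# A relevant, free news item: the benefit dominates the cognitive cost.
relevant_news = net_value(benefit=5.0, cognitive_cost=1.0)

# An irrelevant, free news item: no benefit, but the cognitive cost is still paid.
irrelevant_news = net_value(benefit=0.0, cognitive_cost=1.0)
```

Under these assumed numbers the relevant item has a positive net value and the irrelevant one a negative net value, even though neither has a monetary cost.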
In other domains, e.g., cars or financial investments, the true monetary cost of the items becomes an important element to consider when selecting the most appropriate recommendation approach.

Items with low complexity and value are: news, Web pages, books, CDs, movies. Items with larger complexity and value are: digital cameras, mobile phones, PCs, etc. The most complex items that have been considered are insurance policies, financial investments, travel, and jobs [72].

RSs, according to their core technology, can use a range of properties and features of the items. For example, in a movie recommender system, the genre (such as comedy, thriller, etc.), as well as the director and actors, can be used to describe a movie and to learn how the utility of an item depends on its features. Items can be represented using various information and representation approaches, e.g., in a minimalist way as a single id code, in a richer form as a set of attributes, or even as a concept in an ontological representation of the domain (Chapter 3).

Users. Users of a RS, as mentioned above, may have very diverse goals and characteristics. In order to personalize the recommendations and the human-computer interaction, RSs exploit a range of information about the users. This information can be structured in various ways, and again the selection of what information to model depends on the recommendation technique.

For instance, in collaborative filtering, users are modeled as a simple list containing the ratings provided by the user for some items. In a demographic RS, sociodemographic attributes such as age, gender, profession, and education are used. User data is said to constitute the user model [21, 32]. The user model profiles the
user, i.e., encodes her preferences and needs. Various user modeling approaches have been used and, in a certain sense, a RS can be viewed as a tool that generates recommendations by building and exploiting user models [19, 20]. Since no personalization is possible without a convenient user model, the user model will always play a central role unless the recommendation is non-personalized, as in a top-10 selection. For instance, considering again a collaborative filtering approach, the user is either profiled directly by her ratings of items or, using these ratings, the system derives a vector of factor values, where users differ in how each factor is weighted in their model (Chapters 5 and 4).

Users can also be described by their behavior pattern data, for example, site browsing patterns (in a Web-based recommender system) [107], or travel search patterns (in a travel recommender system) [60]. Moreover, user data may include relations between users, such as the trust level of these relations (Chapter 20). A RS might utilize this information to recommend to users items that were preferred by similar or trusted users.

Transactions. We generically refer to a transaction as a recorded interaction between a user and the RS. Transactions are log-like data that store important information generated during the human-computer interaction and that are useful for the recommendation generation algorithm the system is using. For instance, a transaction log may contain a reference to the item selected by the user and a description of the context (e.g., the user goal/query) for that particular recommendation. If available, that transaction may also include explicit feedback the user has provided, such as the rating for the selected item. In fact, ratings are the most popular form of transaction data that a RS collects. These ratings may be collected explicitly or implicitly.
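As a rough illustration of such log-like data, the sketch below models one transaction as a small record holding the selected item, the context, and an optional explicit rating. The class name `Transaction`, its field names, and the helper `explicit_ratings` are hypothetical choices made for this example, not a schema taken from the handbook.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Transaction:
    """One recorded interaction between a user and the RS (illustrative fields)."""
    user_id: str
    item_id: str                      # reference to the item the user selected
    context: str                      # e.g., the user's goal or query
    rating: Optional[float] = None    # explicit feedback, if the user gave any


def explicit_ratings(log: List[Transaction]) -> Dict[Tuple[str, str], float]:
    """Collect the explicitly rated (user, item) pairs from a transaction log."""
    return {(t.user_id, t.item_id): t.rating for t in log if t.rating is not None}


log = [
    Transaction("u1", "book42", context="query: data mining", rating=4.0),
    Transaction("u1", "book7", context="query: data mining"),  # selected, not rated
    Transaction("u2", "book42", context="browsing", rating=5.0),
]
ratings = explicit_ratings(log)
```

Transactions without a rating stay in the log; they carry no explicit feedback, but a system could still treat the selection itself as an implicit signal.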
In the explicit collection of ratings, the user is asked to provide her opinion about an item on a rating scale. According to [93], ratings can take on a variety of forms:

• Numerical ratings, such as the 1-5 stars provided in the book recommender associated with Amazon.com.
• Ordinal ratings, such as “strongly agree, agree, neutral, disagree, strongly disagree”, where the user is asked to select the term that best indicates her opinion regarding an item (usually via questionnaire).
• Binary ratings, which model choices in which the user is simply asked to decide if a certain item is good or bad.
• Unary ratings, which can indicate that a user has observed or purchased an item, or otherwise rated the item positively. In such cases, the absence of a rating indicates that we have no information relating the user to the item (perhaps she purchased the item somewhere else).

Another form of user evaluation consists of tags associated by the user with the items the system presents. For instance, in the MovieLens RS (http://movielens.umn.edu) tags represent how MovieLens users feel about a movie, e.g., “too long” or “acting”. Chapter 19 focuses on these types of transactions.

In transactions collecting implicit ratings, the system aims to infer the user’s opinion based on the user’s actions. For example, if a user enters the keyword “Yoga” at
Amazon.com she will be provided with a long list of books. In return, the user may click on a certain book on the list in order to receive additional information. At this point, the system may infer that the user is somewhat interested in that book.

In conversational systems, i.e., systems that support an interactive process, the transaction model is more refined. In these systems user requests alternate with system actions (see Chapter 13). That is, the user may request a recommendation and the system may produce a suggestion list. But the system can also request additional user preferences to provide the user with better results. Here, in the transaction model, the system collects the various requests and responses, and may eventually learn to modify its interaction strategy by observing the outcome of the recommendation process [60].

1.4 Recommendation Techniques

In order to implement its core function, identifying the useful items for the user, a RS must predict that an item is worth recommending. In order to do this, the system must be able to predict the utility of some of the items, or at least compare the utility of some items, and then decide what items to recommend based on this comparison. The prediction step may not be explicit in the recommendation algorithm, but we can still apply this unifying model to describe the general role of a RS. Here our goal is to provide the reader with a unifying perspective rather than an account of all the different recommendation approaches that will be illustrated in this handbook.

To illustrate the prediction step of a RS, consider, for instance, a simple, non-personalized recommendation algorithm that recommends just the most popular songs.
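A non-personalized popularity recommender of this kind can be sketched in a few lines. This is a minimal illustration under stated assumptions: the input is taken to be a list of hypothetical (user, song) "like" pairs, the function name `most_popular` is invented for the example, and ties are broken alphabetically only to make the result deterministic.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def most_popular(likes: Iterable[Tuple[str, str]], k: int) -> List[str]:
    """Return the k songs liked by the largest number of distinct users.

    `likes` holds hypothetical (user_id, song_id) pairs; repeated pairs count once.
    """
    counts = Counter(song for _user, song in set(likes))
    # Sort by descending like count, breaking ties by song id for determinism.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [song for song, _count in ranked[:k]]


likes = [
    ("u1", "song_a"), ("u2", "song_a"), ("u3", "song_a"),
    ("u1", "song_b"), ("u2", "song_b"),
    ("u3", "song_c"),
]
top_two = most_popular(likes, k=2)  # every user receives this same list
```

Because nothing about the requesting user enters the computation, the same ranked list is returned to everyone.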
The rationale for using this approach is that in the absence of more precise information about the user’s preferences, a popular song, i.e., something that is liked (high utility) by many users, will probably also be liked by a generic user, at least more than another randomly selected song. Hence the utility of these popular songs is predicted to be reasonably high for this generic user.

This view of the core recommendation computation as the prediction of the utility of an item for a user has been suggested in [3]. They model this degree of utility of the user $u$ for the item $i$ as a (real-valued) function $R(u,i)$, as is normally done in collaborative filtering by considering the ratings of users for items. Then the fundamental task of a collaborative filtering RS is to predict the value of $R$ over pairs of users and items, i.e., to compute $\hat{R}(u,i)$, where we denote with $\hat{R}$ the estimation, computed by the RS, of the true function $R$. Consequently, having computed this prediction for the active user $u$ on a set of items, i.e., $\hat{R}(u,i_1), \ldots, \hat{R}(u,i_N)$, the system will recommend the items $i_{j_1}, \ldots, i_{j_K}$ ($K \le N$) with the largest predicted utility. $K$ is typically a small number, i.e., much smaller than the cardinality of the item data set or of the items on which a user utility prediction can be computed, i.e., RSs “filter” the items that are recommended to users.

As mentioned above, some recommender systems do not fully estimate the utility before making a recommendation but they may apply some heuristics to hypothe-