Preference Learning in Recommender Systems

Marco de Gemmis, Leo Iaquinta, Pasquale Lops, Cataldo Musto, Fedelucio Narducci, and Giovanni Semeraro

Department of Computer Science, University of Bari “Aldo Moro”, Italy
{degemmis,iaquinta,lops,musto,narducci,semeraro}@di.uniba.it

Abstract. As shown by the continuous growth in the number of web sites that embed recommender systems to personalize the experience of users with their content, recommender systems represent one of the most popular applications of principles and techniques coming from Information Filtering (IF). As IF techniques usually perform a progressive removal of non-relevant content according to the information stored in a user profile, recommendation algorithms process information about user interests, acquired in an explicit (e.g., letting users express their opinion about items) or implicit (e.g., studying some behavioral features) way, and exploit these data to generate a list of recommended items. Although each type of filtering method has its own weaknesses and strengths, preference handling is one of the core issues in the design of every recommender system: since these systems aim to guide users in a personalized way to interesting or useful objects in a large space of possible options, it is important for them to capture and model user preferences accurately. The paper provides a general overview of the approaches to learning preference models in the context of recommender systems.

1 Introduction

How many times have you searched for something on the Web without finding what you were looking for? The existence of a large quantity of information, in combination with the dynamic and heterogeneous nature of the Web, makes retrieval a hard task for the average user, who is usually overwhelmed by the sheer amount of information.
In this context (usually referred to as the Information Overload problem), the role of user modeling and personalized information access is becoming crucial: although it is too soon to fully understand the long-term effects of this surplus of information on our habits and daily life, it is clear that users need personalized support in sifting through large amounts of available information according to their interests and preferences.

Information Filtering systems, like Recommender Systems, rely on this idea: they adapt their behavior to individual users by learning their tastes during the interaction, in order to construct a profile that can later be exploited to select relevant items. Nowadays these systems represent the main solution to the information overload problem, because they are able to gather and exploit
heterogeneous information about users, emerging as one of the most useful tools to achieve more intelligent information access. In the workflow of a typical recommendation process, learning user preferences is a primary step: capturing and modeling user interests effectively is a key issue for personalization. User characteristics, gathered through an explicit process (e.g., directly asking the user) or an implicit one (e.g., observing user behavior), can produce a user model that enables adaptivity mechanisms during the interaction with an information system.

The problem of recommending items has been studied extensively, and two main paradigms have emerged. Content-based recommendation systems try to recommend items similar to those a given user has liked in the past, whereas systems designed according to the collaborative recommendation paradigm identify users whose preferences are similar to those of the given user and recommend items they have liked. The literature also describes other noteworthy paradigms: demographic recommenders, which categorize the user on the basis of personal attributes and make recommendations based on demographic classes; knowledge-based systems, which exploit knowledge about how a particular item meets a particular user need; and hybrid systems, which combine different recommendation techniques in an attempt to exploit their advantages while reducing their drawbacks. Each of the above paradigms has particular methods to elicit user interests and preferences: most of them come from the Machine Learning area (probabilistic models, Bayesian or neural networks, decision trees, association rules), but there are also other techniques (so-called heuristics) which learn user profiles by exploiting preferences expressed by similar users (usually referred to as “neighbours”) or by processing textual content describing the items the user liked.
The paper provides a general overview of the approaches to learning preference models in the context of recommender systems, and it is organized as follows. Section 2 introduces general concepts and terminology about recommender systems. Preference learning issues in the area of recommender systems are presented in Section 3, where we also introduce the feedback gathering problem and some machine learning techniques used to acquire and infer user preferences. Conclusions are drawn in the last section.

2 Basics of Recommender Systems

Nowadays it is very important for people to be supported in their decisions, due to the exponential increase of available information. Every day we get advice from other people: “Hey, check out this Web site”, “I saw this book, you will like it”, “That restaurant is very good!”. When making a choice in the absence of decisive first-hand knowledge, choosing as other like-minded people have chosen in the past may be a good strategy. Recommender systems have the same role as human recommendations: they present information that they perceive to be useful and worth trying out. These systems are used in several application domains to support users in making decisions, to help them in managing the exponential increase of information and, in general, to provide a more intelligent form of information access.

The creation and management of personalized recommendations mainly require three distinct and important components: a user profile, an algorithm to update the profile given usage/input information, and an adaptive tool that exploits the profile in order to provide personalization. First, the system needs to be able to store relevant information about users that will be used to infer their preferences and needs. Such information is stored in an individual user profile. Second, if the system has to adapt to the user over time, some mechanism is needed to keep the profile up-to-date. This could happen through explicit data input, implicit recording of user behavior as she interacts with the system, or a combination of the two. Third, the system needs some way to exploit the current profile data in making recommendations to the user. The types of information stored in the profile depend on the goals of the system and the algorithms it employs to provide recommendations. Different approaches to recommendation require different pieces of information about the user, thus the profile structure differs from system to system.

In this section we provide an overview of the main recommendation approaches and their benefits and weaknesses.

2.1 Collaborative Recommender Systems

In Collaborative Filtering (CF) systems, recommendations are based on the evaluations of users who share similar interests. The idea behind these systems is that users who liked the same items in the past probably share the same preferences. Thus, picking a user from such a set, we can suggest to her all the unseen items that other users with similar tastes have liked in the past. Opinions on items can be expressed as explicit user ratings on some scale ranging from bad to good, or as implicit ratings obtained by logging user actions.
As an example of the latter, viewing or skipping items could be interpreted as positive and negative ratings, respectively. CF systems analyze the opinions of other users on items, thus they provide a degree of liking based not on the nature of the item, but on human judgment.

The main advantage of collaborative methods is that items in different product categories can be recommended. Movies, images, art and text items are all represented by the opinions of users and thus they can be recommended by the same system. In CF, a user profile simply consists of the data the user has specified. These data are compared to those of other users to find overlaps in interests among users. For example, the nearest neighbor approach, used in some collaborative recommender systems [20], represents preferences by the items rated (or purchased) by the user. The profile is represented by the user-item matrix [22], where each cell (u,i) holds the rating given by user u to item i. The recommender algorithm performs three tasks: it finds similar users, creates the nearest neighbor set for each user, and infers the degree of liking for an unseen item based on the nearest neighbors' behavior.
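The three tasks of the nearest neighbor approach can be illustrated with a minimal sketch. It is not the algorithm of any cited system: the toy ratings, the function names, the use of Pearson correlation as the similarity measure, and the similarity-weighted average used for prediction are all assumptions made for illustration.

```python
import math

# Toy user-item matrix (hypothetical data): ratings[u][i] is the rating
# of user u on item i, on an assumed 1-5 scale.
ratings = {
    "alice": {"item1": 5, "item2": 3, "item3": 4},
    "bob":   {"item1": 4, "item2": 2, "item3": 5, "item4": 4},
    "carol": {"item1": 1, "item2": 5, "item4": 2},
    "dave":  {"item1": 5, "item2": 3, "item3": 4, "item4": 5},
}


def pearson(a, b):
    """Task 1: similarity of two users over their co-rated items."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = (math.sqrt(sum((a[i] - mean_a) ** 2 for i in common))
           * math.sqrt(sum((b[i] - mean_b) ** 2 for i in common)))
    return num / den if den else 0.0


def nearest_neighbors(user, k):
    """Task 2: the k most similar users with positive similarity."""
    sims = [(pearson(ratings[user], ratings[v]), v)
            for v in ratings if v != user]
    positive = [(s, v) for s, v in sims if s > 0]
    return sorted(positive, reverse=True)[:k]


def predict(user, item, k=2):
    """Task 3: similarity-weighted average of the neighbors' ratings."""
    raters = [(s, v) for s, v in nearest_neighbors(user, len(ratings))
              if item in ratings[v]][:k]
    if not raters:
        return None  # no similar user has rated the item
    return (sum(s * ratings[v][item] for s, v in raters)
            / sum(s for s, _ in raters))


prediction = predict("alice", "item4")
print(nearest_neighbors("alice", 2), prediction)
```

In this toy data, the user whose ratings point in the opposite direction (carol) is excluded from the neighborhood, so the prediction for the unseen item is driven only by the two like-minded users.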
Terveen and Hill [38] claim that three essentials are needed to support CF: many people must participate (increasing the likelihood that any one person will find other users with similar preferences), there must be an easy way to represent user interests in the system, and the algorithms must be able to match people with similar interests. These three elements are not easy to achieve, and they give rise to the main shortcomings of CF systems. The main limitations of collaborative systems are the following [4, 18]:

– New user problem - In order to make accurate recommendations, the system must first learn the preferences of the user from her ratings.
– New item problem (early rater) - Until new items are rated by a substantial number of users, the recommender system is not able to recommend them.
– Sparsity problem - The number of ratings obtained is usually very small compared to the number of ratings to be predicted, and the success of the collaborative recommender system depends on the availability of a critical mass of users. One way to overcome the problem of rating sparsity is to use user profile information when calculating user similarity. That is, two users could be considered similar not only if they rated the same items similarly, but also if they belong to the same demographic segment. For example, Pazzani uses gender, age, area code, education, and employment information of users in a restaurant recommendation application [25].
– Grey sheep problem (unusual user) - In a small or even medium community of users, there are individuals who would not benefit from pure CF systems because their opinions do not consistently agree or disagree with any group of people. These individuals will rarely, if ever, receive accurate predictions, even after the initial start-up phase for the user and the system [11]. The majority of users fall into the class of the so-called “white sheep”, those who have high correlation with many other users and for whom, in theory, it will therefore be easy to find recommendations. The opposite type of people are the “black sheep”, those for whom there are few or no people with whom they correlate. This makes it very difficult to make recommendations for them. On the positive side, for statistical reasons, as the number of users of a system increases, the chance of finding other people with similar tastes increases and so better recommendations can be provided.
– Scalability problem - CF systems require data from a large number of users before being effective, as well as a large amount of data from each user. Therefore, the computational resources required to find users with similar tastes become a critical issue.
– Lack of transparency problem - Collaborative systems today are black boxes, computerized oracles which give advice but cannot be questioned. A user is given no indicators to consult in order to decide when to trust a recommendation and when to doubt one. These problems have prevented acceptance of collaborative systems in all but low-risk content domains, since they are untrustworthy for high-risk content domains.
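The idea mentioned under the sparsity problem, combining rating-based similarity with demographic information, can be sketched as follows. This is only an illustration and not Pazzani's actual method: the linear blend, the weight `alpha`, the function names, and the attribute values are all hypothetical.

```python
import math


def rating_similarity(ra, rb):
    """Cosine similarity over co-rated items; 0.0 when there is no overlap."""
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    dot = sum(ra[i] * rb[i] for i in common)
    norm_a = math.sqrt(sum(ra[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(rb[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def demographic_similarity(da, db):
    """Fraction of shared demographic attributes with identical values."""
    keys = set(da) & set(db)
    if not keys:
        return 0.0
    return sum(da[k] == db[k] for k in keys) / len(keys)


def hybrid_similarity(ra, rb, da, db, alpha=0.5):
    """Assumed linear blend of the two similarities (alpha is hypothetical)."""
    return (alpha * rating_similarity(ra, rb)
            + (1 - alpha) * demographic_similarity(da, db))


# Two users with no co-rated items: pure rating similarity is 0,
# but the demographic component still provides a signal.
ratings_a = {"item1": 5}
ratings_b = {"item2": 4}
demo_a = {"gender": "f", "age_group": "25-34", "area_code": "080",
          "education": "msc", "employment": "engineer"}
demo_b = {"gender": "f", "age_group": "25-34", "area_code": "080",
          "education": "msc", "employment": "teacher"}

score = hybrid_similarity(ratings_a, ratings_b, demo_a, demo_b)
print(score)
```

With sparse ratings, many user pairs share no rated items; under the assumptions above, the demographic term keeps such pairs comparable instead of collapsing their similarity to zero.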
2.2 Content-based Recommender Systems

Unlike CF systems, where user opinions are a key element in learning user preferences and finding items to suggest, in content-based (CB) recommenders the ratings expressed by a single user have no role in the recommendations provided to other users. The core of this approach is the processing of the content describing the items to be recommended. The items can be very different depending on the number and type of attributes used to describe them. Each item can be described by the same small number of attributes with a known set of values, but this is not appropriate for items, such as Web pages, news or documents, described by means of unstructured text. In this case there are no attributes with well-defined values, and the use of document modeling techniques with roots in Information Retrieval [30, 3] and Information Filtering [5] research is desirable.

A method to represent unstructured data is the Vector Space Model (VSM). The VSM [34] is a spatial representation of text documents. In this model, each document is represented by a vector in an n-dimensional space, where each dimension corresponds to a term from the overall vocabulary of a given document collection. Formally, every document is represented as a vector of term weights, where each weight indicates the degree of association between the document and the term. The CB approach can be applied only in domains where some textual metadata describing the items can be provided.

A CB recommender learns a profile of the user interests based on some features of the objects the user rated. Afterwards the system exploits the user profile to suggest relevant items by matching the profile representation against that of the items to be recommended. The result of this matching is a binary or continuous relevance judgment, the latter case resulting in a ranked list of potentially interesting items.
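A minimal sketch of this profile-matching process in the VSM is given here. The item descriptions, the TF-IDF weighting scheme, the profile built as the mean of the liked item vectors, and the use of cosine similarity for matching are assumptions chosen for illustration; the text above does not prescribe a specific weighting or matching function.

```python
import math
from collections import Counter

# Hypothetical item descriptions (unstructured text).
items = {
    "a": "space mission to mars",
    "b": "mars rover explores crater",
    "c": "space telescope observes distant mars storm",
    "d": "pasta recipe with tomato sauce",
}


def tf_idf_vectors(docs):
    """Represent each document as a sparse vector of TF-IDF term weights."""
    tokenized = {d: text.split() for d, text in docs.items()}
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized.values() for t in set(toks))
    return {d: {t: tf * math.log(n / df[t]) for t, tf in Counter(toks).items()}
            for d, toks in tokenized.items()}


def mean_vector(vectors, liked):
    """Assumed profile: the average of the vectors of the liked items."""
    profile = {}
    for d in liked:
        for t, w in vectors[d].items():
            profile[t] = profile.get(t, 0.0) + w / len(liked)
    return profile


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = tf_idf_vectors(items)
profile = mean_vector(vectors, liked=["a", "b"])

# Continuous relevance judgment: rank the unseen items by similarity.
ranked = sorted(["c", "d"], key=lambda i: cosine(profile, vectors[i]),
                reverse=True)
print(ranked)
```

Because the profile and the items live in the same term space, the ranking is driven only by shared terms: the unseen item that shares vocabulary with the liked items scores above the one that shares none.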
If data are represented by the VSM, the matching might be realized by computing the cosine similarity between the prototype vector and the item vectors. Many systems ask users for feedback on the recommended items so that the matching can be performed according to the relevance feedback.

The CB paradigm has several advantages when compared to the CF one:

– User independence - CB recommenders exploit solely the ratings provided by the active user to build her own profile.
– Transparency - Explanations of recommendations can be provided by listing the content features or descriptions that caused an item to be recommended.
– New item - CB recommenders are capable of recommending items not yet rated by any user.

On the other hand, CB systems have several shortcomings:

– Limited content analysis - CB techniques are limited by the features that are associated, either automatically or manually, with the items. No CB system can provide good suggestions if the content does not contain enough information to distinguish items the user likes from items the user does not like. Some representations capture only certain aspects of the content, but there are many others that would influence a user’s experience. For instance