Hybrid Recommender Systems: Survey and Experiments

ROBIN BURKE
Department of Information Systems and Decision Sciences, California State University, Fullerton, CA 92834, USA

User Modeling and User-Adapted Interaction; Nov 2002; 12, 4; ABI/INFORM Global; pg. 331. © 2002 Kluwer Academic Publishers.

(Received 23 January 2000; accepted in revised form 24 September 2001)

Abstract. Recommender systems represent user preferences for the purpose of suggesting items to purchase or examine. They have become fundamental applications in electronic commerce and information access, providing suggestions that effectively prune large information spaces so that users are directed toward those items that best meet their needs and preferences. A variety of techniques have been proposed for performing recommendation, including content-based, collaborative, knowledge-based and other techniques. To improve performance, these methods have sometimes been combined in hybrid recommenders. This paper surveys the landscape of actual and possible hybrid recommenders, and introduces a novel hybrid, EntreeC, a system that combines knowledge-based recommendation and collaborative filtering to recommend restaurants. Further, we show that semantic ratings obtained from the knowledge-based part of the system enhance the effectiveness of collaborative filtering.

Key words: case-based reasoning, collaborative filtering, electronic commerce

1. Introduction

Recommender systems were originally defined as ones in which 'people provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients' (Resnick & Varian, 1997). The term now has a broader connotation, describing any system that produces individualized recommendations as output or has the effect of guiding the user in a personalized way to interesting or useful objects in a large space of possible options.
Such systems have an obvious appeal in an environment where the amount of on-line information vastly outstrips any individual's capability to survey it. Recommender systems are now an integral part of some e-commerce sites such as Amazon.com and CDNow (Schafer, Konstan & Riedl, 1999).

[Footnote: The managing editor for this paper was Ingrid Zukerman.]

It is the criteria of 'individualized' and 'interesting and useful' that separate the recommender system from information retrieval systems or search engines. The semantics of a search engine are 'matching': the system is supposed to return all those items that match the query, ranked by degree of match. Techniques such
as relevance feedback enable a search engine to refine its representation of the user's query, and represent a simple form of recommendation. The next-generation search engine Google blurs this distinction, incorporating 'authoritativeness' criteria into its ranking (defined recursively as the sum of the authoritativeness of pages linking to a given page) in order to return more useful results (Brin & Page, 1998).

One common thread in recommender systems research is the need to combine recommendation techniques to achieve peak performance. All of the known recommendation techniques have strengths and weaknesses, and many researchers have chosen to combine techniques in different ways. This article surveys the different recommendation techniques being researched, analyzing them in terms of the data that supports the recommendations and the algorithms that operate on that data, and examines the range of hybridization techniques that have been proposed. This analysis points to a number of possible hybrids that have yet to be explored. Finally, we discuss how adding a hybrid with collaborative filtering improved the performance of our knowledge-based recommender system Entree. In addition, we show that semantic ratings made available by the knowledge-based portion of the system provide an additional boost to the hybrid's performance.

1.1. RECOMMENDATION TECHNIQUES

Recommendation techniques have a number of possible classifications (Resnick & Varian, 1997; Schafer, Konstan & Riedl, 1999; Terveen & Hill, 2001). Of interest in this discussion is not the type of interface or the properties of the user's interaction with the recommender, but rather the sources of data on which recommendation is based and the use to which that data is put.
Specifically, recommender systems have (i) background data, the information that the system has before the recommendation process begins, (ii) input data, the information that the user must communicate to the system in order to generate a recommendation, and (iii) an algorithm that combines background and input data to arrive at its suggestions. On this basis, we can distinguish five different recommendation techniques as shown in Table I. Assume that I is the set of items over which recommendations might be made, U is the set of users whose preferences are known, u is the user for whom recommendations need to be generated, and i is some item for which we would like to predict u's preference.

[Footnote: Google URL: http://www.google.com]

Collaborative recommendation is probably the most familiar, most widely implemented and most mature of the technologies. Collaborative recommender systems aggregate ratings or recommendations of objects, recognize commonalities between users on the basis of their ratings, and generate new recommendations based on inter-user comparisons. A typical user profile in a collaborative system consists of a vector of items and their ratings, continuously augmented as the user interacts with the system over time. Some systems used time-based discounting of ratings to account for drift in user interests (Billsus & Pazzani, 2000; Schwab et al., 2001). In some cases, ratings may be binary (like/dislike) or real-valued indicating degree
of preference. Some of the most important systems using this technique are GroupLens (Resnick et al., 1994), Ringo/Firefly (Shardanand & Maes, 1995), Tapestry (Goldberg et al., 1992) and Recommender (Hill et al., 1995). These systems can be either memory-based, comparing users against each other directly using correlation or other measures, or model-based, in which a model is derived from the historical rating data and used to make predictions (Breese et al., 1998). Model-based recommenders have used a variety of learning techniques including neural networks (Jennings & Higuchi, 1993), latent semantic indexing (Foltz, 1990), and Bayesian networks (Condliff et al., 1999).

The greatest strength of collaborative techniques is that they are completely independent of any machine-readable representation of the objects being recommended, and work well for complex objects such as music and movies where variations in taste are responsible for much of the variation in preferences. Schafer, Konstan and Riedl (1999) call this 'people-to-people correlation'.

Table I. Recommendation techniques

| Technique | Background | Input | Process |
|---|---|---|---|
| Collaborative | Ratings from U of items in I. | Ratings from u of items in I. | Identify users in U similar to u, and extrapolate from their ratings of i. |
| Content-based | Features of items in I. | u's ratings of items in I. | Generate a classifier that fits u's rating behavior and use it on i. |
| Demographic | Demographic information about U and their ratings of items in I. | Demographic information about u. | Identify users that are demographically similar to u, and extrapolate from their ratings of i. |
| Utility-based | Features of items in I. | A utility function over items in I that describes u's preferences. | Apply the function to the items and determine i's rank. |
| Knowledge-based | Features of items in I. Knowledge of how these items meet a user's needs. | A description of u's needs or interests. | Infer a match between i and u's need. |

Demographic recommender systems aim to categorize the user based on personal attributes and make recommendations based on demographic classes. An early example of this kind of system was Grundy (Rich, 1979), which recommended books based on personal information gathered through an interactive dialogue.
The user's responses were matched against a library of manually assembled user stereotypes. Some more recent recommender systems have also taken this approach. Krulwich (1997), for example, uses demographic groups from marketing research to suggest a range of products and services. A short survey is used to gather the data for user categorization. In other systems, machine learning is used to arrive at a classifier based on demographic data (Pazzani, 1999). The representation of demographic
information in a user model can vary greatly. Rich's system used hand-crafted attributes with numeric confidence values. Pazzani's model uses Winnow to extract features from users' home pages that are predictive of liking certain restaurants. Demographic techniques form 'people-to-people' correlations like collaborative ones, but use different data. The benefit of a demographic approach is that it may not require a history of user ratings of the type needed by collaborative and content-based techniques.

Content-based recommendation is an outgrowth and continuation of information filtering research (Belkin & Croft, 1992). In a content-based system, the objects of interest are defined by their associated features. For example, text recommendation systems like the newsgroup filtering system NewsWeeder (Lang, 1995) use the words of their texts as features. A content-based recommender learns a profile of the user's interests based on the features present in objects the user has rated. Schafer, Konstan and Riedl call this 'item-to-item correlation'. The type of user profile derived by a content-based recommender depends on the learning method employed. Decision trees, neural nets, and vector-based representations have all been used. As in the collaborative case, content-based user profiles are long-term models and are updated as more evidence about user preferences is observed.

Utility-based and knowledge-based recommenders do not attempt to build long-term generalizations about their users, but rather base their advice on an evaluation of the match between a user's need and the set of options available. Utility-based recommenders make suggestions based on a computation of the utility of each object for the user. Of course, the central problem is how to create a utility function for each user.
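As a rough illustration of utility-based ranking, the sketch below scores each item with a weighted additive utility and sorts by score. The additive form, the feature names (`cheapness`, `fast_delivery`), the item names, and the weights are all assumptions made for this example; the paper does not prescribe a particular utility function.

```python
def utility(item, weights):
    """Weighted sum of an item's feature scores (each assumed to lie in [0, 1])."""
    return sum(weight * item[feature] for feature, weight in weights.items())


def rank(items, weights):
    """Return item names ordered from highest to lowest utility for one user."""
    return sorted(items, key=lambda name: utility(items[name], weights), reverse=True)


# Hypothetical catalogue: higher scores are better on both features.
items = {
    "laptop_x": {"cheapness": 0.9, "fast_delivery": 0.2},
    "laptop_y": {"cheapness": 0.5, "fast_delivery": 0.9},
}

# Two hypothetical user-specific utility functions, expressed as feature weights.
patient_user = {"cheapness": 0.8, "fast_delivery": 0.2}
urgent_user = {"cheapness": 0.2, "fast_delivery": 0.8}

print(rank(items, patient_user))
print(rank(items, urgent_user))
```

In this sketch the user profile is nothing more than the weight dictionary, so two users looking at the same catalogue can receive different rankings without any rating history.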
Tete-a-Tete and the e-commerce site PersonaLogic each have different techniques for arriving at a user-specific utility function and applying it to the objects under consideration (Guttman, 1998). The user profile therefore is the utility function that the system has derived for the user, and the system employs constraint satisfaction techniques to locate the best match. The benefit of utility-based recommendation is that it can factor non-product attributes, such as vendor reliability and product availability, into the utility computation, making it possible, for example, to trade off price against delivery schedule for a user who has an immediate need.

[Footnote: For example, see the college guides available at http://www.personalogic.aol.com/go/grad-]

Knowledge-based recommendation attempts to suggest objects based on inferences about a user's needs and preferences. In some sense, all recommendation techniques could be described as doing some kind of inference. Knowledge-based approaches are distinguished in that they have functional knowledge: they have knowledge about how a particular item meets a particular user need, and can therefore reason about the relationship between a need and a possible recommendation. The user profile can be any knowledge structure that supports this inference. In the simplest case, as in Google, it may simply be the query that the user has formulated. In others,
it may be a more detailed representation of the user's needs (Towle & Quinn, 2000). The Entree system (described below) and several other recent systems (for example, Schmitt & Bergmann, 1999) employ techniques from case-based reasoning for knowledge-based recommendation. Schafer, Konstan and Riedl call knowledge-based recommendation the 'Editor's choice' method.

The knowledge used by a knowledge-based recommender can also take many forms. Google uses information about the links between web pages to infer popularity and authoritative value (Brin & Page, 1998). Entree uses knowledge of cuisines to infer similarity between restaurants. Utility-based approaches calculate a utility value for objects to be recommended, and in principle, such calculations could be based on functional knowledge. However, existing systems do not use such inference, requiring users to do their own mapping between their needs and the features of products, either in the form of preference functions for each feature in the case of Tete-a-Tete or answers to a detailed questionnaire in the case of PersonaLogic.

2. Comparing recommendation techniques

All recommendation techniques have strengths and weaknesses discussed below and summarized in Table II. Perhaps the best known is the 'ramp-up' problem (Konstan et al., 1998). This term actually refers to two distinct but related problems.

New User: Because recommendations follow from a comparison between the target user and other users based solely on the accumulation of ratings, a user with few ratings becomes difficult to categorize.

Table II. Tradeoffs between recommendation techniques

| Technique | Pluses | Minuses |
|---|---|---|
| Collaborative filtering (CF) | A. Can identify cross-genre niches. B. Domain knowledge not needed. C. Adaptive: quality improves over time. D. Implicit feedback sufficient. | I. New user ramp-up problem. J. New item ramp-up problem. K. 'Gray sheep' problem. L. Quality dependent on large historical data set. M. Stability vs. plasticity problem. |
| Content-based (CN) | B, C, D | I, L, M |
| Demographic (DM) | A, B, C | I, K, L, M. N. Must gather demographic information. |
| Utility-based (UT) | E. No ramp-up required. F. Sensitive to changes of preference. G. Can include non-product features. | O. User must input utility function. P. Suggestion ability static (does not learn). |
| Knowledge-based (KB) | E, F, G. H. Can map from user needs to products. | P. Q. Knowledge engineering required. |
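The new-user ramp-up problem can be made concrete with a small memory-based collaborative sketch. Everything specific here is an assumption for illustration: Pearson correlation as the similarity measure, a minimum of two co-rated items, the use of positively correlated neighbors only, a plain weighted average as the prediction rule, and the toy ratings themselves. It is not the paper's own implementation.

```python
from math import sqrt


def pearson(a, b, min_overlap=2):
    """Pearson correlation over co-rated items; None if it cannot be computed."""
    common = sorted(set(a) & set(b))
    if len(common) < min_overlap:
        return None  # too little shared evidence to compare the two users
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else None


def predict(ratings, u, item, min_overlap=2):
    """Similarity-weighted average of neighbors' ratings of `item`.

    Only positively correlated neighbors are used (an assumed simplification).
    Returns None when no usable neighbor exists.
    """
    num = den = 0.0
    for v, profile in ratings.items():
        if v == u or item not in profile:
            continue
        w = pearson(ratings[u], profile, min_overlap)
        if w is None or w <= 0:
            continue
        num += w * profile[item]
        den += w
    return num / den if den else None


# Hypothetical rating data on a 1-5 scale.
ratings = {
    "alice": {"a": 5, "b": 1, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 5, "d": 5},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
    "newbie": {"a": 5},
}

print(predict(ratings, "alice", "d"))   # alice has enough ratings to find a neighbor
print(predict(ratings, "newbie", "d"))  # newbie has one rating, so no neighbor qualifies
```

With a single rating, the hypothetical user `newbie` shares at most one item with anyone else, so no correlation can be computed and no prediction is produced. This is the difficulty the New User item describes.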