Interaction Design for Recommender Systems

Kirsten Swearingen, Rashmi Sinha
School of Information Management & Systems, University of California, Berkeley, CA
kirstens@sims.berkeley.edu, sinha@sims.berkeley.edu

ABSTRACT
Recommender systems act as personalized decision guides for users, aiding them in decision making about matters related to personal taste. Research has focused mostly on the algorithms that drive the system, with little understanding of design issues from the user's perspective. The goal of our research is to study users' interactions with recommender systems in order to develop general design guidelines. We have studied users' interactions with 11 online recommender systems. Our studies have highlighted the role of transparency (understanding of system logic), familiar recommendations, and information about recommended items in the user's interaction with the system. Our results also indicate that there are multiple models for successful recommender systems.

Keywords
Evaluation, Information Retrieval, Usability Studies, User Studies, World Wide Web

INTRODUCTION
In everyday life, people must often rely on incomplete information when deciding which books to read, movies to watch or music to purchase. When presented with a number of unfamiliar alternatives, people tend to seek out recommendations from friends or expert reviews in newspapers and magazines to aid them in decision-making. In recent years, online recommender systems have begun providing a technological proxy for this social recommendation process. Most recommender systems work by asking users to rate some sample items. Collaborative filtering algorithms, which often form the backbone of such systems, use this input to match the current user with others who share similar tastes. Recommender systems have gained increasing popularity on the web, both in research systems (e.g. GroupLens [1] and MovieLens [2]) and online commerce sites (e.g.
Amazon.com and CDNow.com) that offer recommender systems as one way for consumers to find products they might like to purchase. Typically, the effectiveness of recommender systems has been indexed by statistical accuracy metrics such as Mean Absolute Error (MAE) [3]. However, satisfaction with a recommender system is only partly determined by the accuracy of the algorithm behind it [2]. What factors lead to satisfaction with a recommender system? What encourages users to reveal their tastes to online systems, and act upon the recommendations provided by such systems? While there is a lot of research on the accuracy of recommender system algorithms, there is little focus on interaction design for recommender systems.

To design an effective interaction, one must consider two questions: (1) what user needs are satisfied by interacting with the system; and (2) what specific system features lead to satisfaction of those needs. Our research studies have attempted to answer both of these questions. Below is a brief overview of our study methodology and main findings. Subsequently we discuss the results in greater detail and offer design guidelines based on those results.

OVERVIEW OF OUR RESEARCH PROGRAM
More than 20 different book, movie and music recommender systems are currently available online. Though the basic interaction paradigm is similar (the user provides some input and the system processes that information to generate a list of recommendations), recommender systems differ in the specifics of the interaction (e.g., amount and type of input the user is required to give, familiarity of recommendations, transparency of system logic, number of recommendations). Our approach has been to sample a variety of interaction models in order to identify best practices and generate guidelines for designers of recommender systems. For 6 of the 11 systems tested, we also compared users' liking for systems' recommendations with their liking for recommendations provided by their friends.
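The collaborative filtering backbone and the MAE metric mentioned above can be made concrete with a small sketch. This is a generic, minimal illustration of user-based collaborative filtering, not the algorithm of any system studied here; the function names and toy data are our own.

```python
import math

def pearson(u, v, ratings):
    """Pearson correlation between users u and v over co-rated items.
    ratings maps user -> {item: rating}."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mu_u = sum(ratings[u][i] for i in common) / len(common)
    mu_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu_u) * (ratings[v][i] - mu_v) for i in common)
    den = math.sqrt(sum((ratings[u][i] - mu_u) ** 2 for i in common)
                    * sum((ratings[v][i] - mu_v) ** 2 for i in common))
    return num / den if den else 0.0

def predict(user, item, ratings):
    """Predict user's rating of item as a similarity-weighted average of
    ratings from other users (with positive similarity) who rated it."""
    neighbours = [(pearson(user, v, ratings), ratings[v][item])
                  for v in ratings if v != user and item in ratings[v]]
    num = sum(sim * r for sim, r in neighbours if sim > 0)
    den = sum(sim for sim, r in neighbours if sim > 0)
    return num / den if den else None

def mean_absolute_error(pairs):
    """MAE over (predicted, actual) pairs: the accuracy metric cited above."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)
```

Statistical accuracy in this sense says nothing about how the prediction is presented to the user, which is the gap the studies below address.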
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires specific permission and/or a fee. DIS2002, London © Copyright 2002 ACM 1-58113-2-9-0/00/0008 $5.00
Our study methodology incorporates a mix of quantitative and qualitative techniques. For both of our studies we asked users to interact with several recommender systems, presented in random order. Users provided input to the systems and received a set of recommendations. We then asked users to rate 10 recommendations from each system, evaluating aspects such as: liking; action towards item (would they buy it / download it / do nothing); transparency (whether they understood why the system recommended that item); and familiarity (any previous experience of the item). Users were also asked to rate the system as a whole on a number of dimensions: usefulness, trustworthiness, and ease of use. For Study 1, we also asked users to evaluate recommendations provided by three of their friends using similar criteria. We recorded user behaviour and comments while they interacted with each system. At the end of each session, we asked users to name the system they preferred and explain their reasoning. Study 1 involved 20 participants and Study 2 involved 12. All participants were regular Internet users, and ranged in age from 19 to 44 years. Below, we describe our research studies in greater detail.

Study 1, Part 1: What user needs do recommender systems satisfy that a friend cannot?
Since the goal of most recommender systems is to replace (or at least augment) the social recommendation process (also called word-of-mouth), we began by directly comparing the two ways of receiving recommendations (friends and online recommender systems—see Figure 1) [4]. Do users like receiving recommendations from an online system? How do the recommendations provided by online systems differ from those provided by a user's friends? The results of our study indicated that users preferred recommendations made by their friends to those made by online systems.
Though users preferred recommendations made by friends, they expressed a high level of overall satisfaction with the online recommenders and indicated that they found the systems useful and intended to use them again [5]. This seemed to be due in part to the ability of recommender systems to suggest items that users had not previously heard of. In the words of one user, "I'm impressed with the types of movies that came back—there were movies I hadn't seen—more interesting, more obscure. The system pulls from a large database—no one person can know about all the movies I might like." The results of this study offer insight into the popularity of recommender systems. While users are happy with the age-old ways of getting recommendations, they like the breadth that online systems offer. Recommender systems allow users a unique opportunity to explore their tastes, and learn about new items.

[Figure 1: User can choose between online recommender systems and social recommendations (from friends)]

Study 1, Part 2: Interface Analysis of Book and Movie Recommender Systems
The next question we asked was: What constitutes a satisfying interaction with recommender systems? To address this question, we conducted an exploratory study examining the interfaces of three book and three movie recommender systems:
· Amazon.com (books and movies)
· RatingZone's QuickPicks (books)
· Sleeper (books)
· Moviecritic.com (movies)
· Reel.com (movies)
A recommender system may take input from users implicitly or explicitly, or a combination of the two [6]; our study focused on systems that relied upon explicit input. Within this subset of recommenders, we chose systems that offered a wide variety of interaction paradigms to the user: differences in interfaces such as layout, navigation, color,
graphics, and user instructions; types of input required; and information displayed with recommendations (see Figure 2 for illustration, and the Appendix for a full system comparison chart). Our findings in this study suggested that, from a user's perspective, an effective recommender system inspires trust in the system; has system logic that is at least somewhat transparent; points users towards new, not-yet-experienced items; provides details about recommended items, including pictures and community ratings; and finally, provides ways to refine recommendations by including or excluding particular genres. Users expressed willingness to provide more input to the system in return for more effective recommendations.

Study 2: Interface Analysis of Music Recommender Systems
The goal of our second study was to verify the findings from Study 1, and extend them to another recommendation domain—that of music. In Study 1 we had focused on specific aspects of the interface (number of input items, number of results, etc.). In Study 2 we considered the systems more holistically, seeking in particular to answer the question "what leads a user to trust the system's recommendations?"

In this study, we chose to examine music recommender systems for two reasons. First, with the increasing availability and usage of online music, we anticipate that music recommender systems will increase in popularity. Second, and more importantly, music recommenders allow users to sample the item recommended—most systems provide access to a 30-second audio sample. This gave us the unique opportunity to evaluate the efficacy of recommendations in the lab setting. Users could sample the audio clip during the test session. Thus, their evaluations of the recommended items are based upon direct experience rather than an abstract estimate of liking.
We examined five music recommender systems:
· Amazon's Recommendations Explorer
· CDNow
· MoodLogic Filters Browser
· SongExplorer
· Media Unbound (5-minute version)
From this study, we found that trust was affected by several aspects of the users' interactions with the systems, in addition to the accuracy of the recommendations themselves: transparency of system logic, familiarity of the items recommended, and the process for receiving recommendations.

INTERACTION DESIGN FOR RECOMMENDER SYSTEMS
Our analysis of recommender systems is divided into three parts. User interaction with such systems typically involves some input to the system; the system processes this input; and the user receives the output, or recommendations. First we take recommender systems apart and analyse the input and output phases. What characteristics of these two phases distinguish recommender systems? Which of these design options do users prefer, and why? User interaction with recommender systems can also be conceptualised on a more gestalt or holistic level. What overall system features lead to satisfaction with recommendations? How do users decide whether to trust recommendations? What kinds of recommendations do they find the most useful? For each of these questions, we describe pertinent study results (both quantitative and qualitative), and suggest design options.

1) TAKING THINGS APART: INPUT TO THE SYSTEM
Recommender systems differ widely in terms of the type and amount of input users must provide in order to generate recommendations. Some recommender systems use an open-ended technique, asking users to indicate their favorite author, musician, or actor. Other systems ask users to rate a series of given items (books, songs, or movies) on a Likert scale, while still others use a hybrid technique, first asking general questions about taste (e.g., what phrase best indicates how you feel about FM radio?) followed by ratings of individual items, followed by item comparisons (e.g.
do you like this song more or less than this other song?).

[Figure 2: Interaction Paradigms for Amazon (Books) & RatingZone]
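The open-ended, Likert, and hybrid techniques described above deliver preference information at very different resolutions, so a system mixing them must reduce them to a common scale. The sketch below shows one plausible set of mappings; the choices (including treating an unchecked "like" box as unknown rather than dislike) are our own assumptions, not taken from any system studied.

```python
def from_likert(value, points):
    """Map a rating on an n-point Likert scale (1..points) onto [0, 1]."""
    return (value - 1) / (points - 1)

def from_continuous(fraction):
    """A click on a Like-to-Dislike rating bar already yields a fraction;
    clamp it to [0, 1] to guard against out-of-range input."""
    return max(0.0, min(1.0, fraction))

def from_binary(liked):
    """A 'check the box if you like it' question carries one bit: an
    unchecked box is treated here as unknown (0.5), not as dislike."""
    return 1.0 if liked else 0.5
```

An open-ended "name a favorite artist" answer carries no numeric rating at all; it would instead seed the system's similarity lookup directly.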
How many Items to Rate?
A few systems ask the user to enter only 1 piece of information to receive recommendations, while others require a minimum commitment of at least 30 ratings. Our quantitative and qualitative results indicate that users do not mind giving a little more input to the system in order to receive more accurate suggestions. Across both of our studies, 39% of the users felt that the input required by systems was not enough, in contrast to only 9.4% of our users who thought that the input required was too much. Table 1 shows users' opinions regarding the amount of input for music recommender systems (Study 2). Even for a system like MediaUnbound that required answers to 34 questions, only 8% of users regarded this as too much. Users indicated that their opinion of required input was influenced by the kind of recommendations they received. For systems whose recommendations were perceived as too simplistic (Amazon) or inaccurate (SongExplorer), most (>50%) users thought that the input was not enough.

Design Suggestion: Designers of recommender systems are often faced with a choice between enhancing ease of use (by asking users to rate fewer items) or enhancing the accuracy of the algorithms (by asking users to provide more ratings). Our suggestion is that it is fine to ask users for a few more ratings if that leads to substantial increases in accuracy. Users dislike bad recommendations more than they dislike providing a few additional ratings.

What kind of rating process?
In the systems we studied, there were four types of rating input formats: (a) Open-ended: Name an artist / writer you like. When asked to name one "favorite" artist, some users found themselves stumped. With only one opportunity to provide input to the system, they felt pressure to choose with extreme caution. (b) Ratings on a Likert scale: Users were asked to rate items on a 5-10 point scale ranging from Like to Dislike. This could become repetitive and boring.
At SongExplorer, MovieCritic, and RatingZone, users expressed irritation at having to page through lists of items in order to provide the requisite number of ratings. Another manifestation of a Likert scale was a continuous rating bar ranging from Like to Dislike. Users liked the rating bar since they could click anywhere to indicate degree of liking for an item. The Sleeper system used such a scale (see Figure 3). (c) Binary liking: For this type of question, users were simply asked to check a box if they liked an item. This was simple to do, but could become repetitive and boring as well. (d) Hybrid rating process: Such systems incorporated features from all the above types of questions as appropriate. MediaUnbound used such a process and also provided continuous feedback to the user, keeping him / her engaged.

Another aspect of the input process was the set of items that was rated. Often users had little or no experience of the item, leading to frustration with the rating process. One user commented at RatingZone, "I'm worried because I haven't read many of these—I don't know what I'm going to get back," while at SongExplorer, another user observed, "The [items to be rated] are all so obvious. I feel like I'm more sophisticated than the system is going to give me credit for."

Design Suggestion: It is important to design an easy and engaging process that keeps users from getting bored or frustrated. A mix of different types of questions, and continuous feedback during the input phase, can help achieve this goal.

Filtering by Genre
Several recommender systems ask users whether they want recommendations from a particular genre. For example, MovieCritic allows users to set a variety of genre filters. Without being asked, almost all of the users volunteered favorable comments on these filters—they liked being able to quickly set the "include" and "exclude" options on a list of about 20 genres. However, we discovered two possible problems with genre filtering.
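The include/exclude controls users praised can be modelled as simple set operations. The sketch below is our own illustration, not MovieCritic's actual implementation; note that letting an item pass on any of its genres avoids forcing users whose tastes bridge genres to pick just one.

```python
def filter_by_genre(recommendations, include=None, exclude=frozenset()):
    """Apply include/exclude genre filters to a recommendation list.

    Each recommendation is a (title, set_of_genres) pair. An item is kept
    if none of its genres is excluded and, when any include filters are
    set, at least one of its genres is included."""
    out = []
    for title, genres in recommendations:
        if genres & set(exclude):          # any excluded genre vetoes the item
            continue
        if include and not (genres & set(include)):
            continue                       # include filters set, none matched
        out.append(title)
    return out
```

A design wrinkle the studies surfaced: the genre labels themselves must match users' mental models, or the filter removes items the user would actually have wanted.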
Table 1: Input ratings (from Study 2). How users felt about the number of ratings.

System         No. of Input Ratings   Not Enough   Just Right   Too Much
Amazon         4-20                   67%          33%          0.0%
CDNow          3                      67%          33%          0.0%
MoodLogic      ~4                     45%          55%          0.0%
SongExplorer   20                     58%          25%          8.3%
MediaUnbound   34                     17%          75%          8.3%

[Figure 3: Input Rating Scales for Sleeper & Amazon (Music)]

Several users commented that
they did not like being forced to name the single genre they preferred, feeling that their tastes bridged several genres. Other users were unsure exactly what kinds of music the genre represented, since the system's categorization into genres did not map to their mental models. MediaUnbound and SongExplorer, two of the music recommender systems, faced such genre filtering problems (see Figure 4).

Genre is a tricky thing in recommendations. On the one hand, recommender systems offer a way for users to move beyond genre-based book / movie / music exploration. On the other hand, genres do work well as shorthand for many of a user's likes and dislikes, and therefore help focus the recommendations. Over the course of the past year, we have observed that nearly all the major recommender systems have added a question about genre preferences.

Design Suggestion: Offer filter-like controls over genres, but make them as simple and self-explanatory as possible. Users should be given the option of choosing more than one genre. Also, a few lines of explanation of each genre should be provided, so that users understand what kind of music / books / movies the genre label represents.

2) TAKING THINGS APART: OUTPUT FROM THE SYSTEM

Ease of Getting More Recommendations

Recommender systems vary in the number of recommendations they generate. Amazon suggests 15 items in the initial set, while other sites show 10 items per screen, for as many screens as the user wishes to view. Users appear to be sensitive to the number of recommendations. However, the sheer number is less important than the ease of generating additional sets of recommendations. Some systems permit users to modify their recommendations simply by rating additional items. Other systems, however, require the user to repeat the entire rating process to see new recommendations.
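The refinement loop described above, where each additional rating updates the recommendation set without restarting the input process, can be sketched as follows. This is a toy illustration with an invented genre-overlap heuristic, not the collaborative-filtering algorithms these systems actually use; the catalog and class names are assumptions:

```python
# Hypothetical sketch: refining recommendations by rating more items.
# Ratings accumulate in a profile, so recommend() always reflects the
# latest input and users never repeat the whole rating process.
# Scoring is a toy genre-overlap count, not real collaborative filtering.

CATALOG = {
    "Kind of Blue":   {"Jazz"},
    "A Love Supreme": {"Jazz"},
    "Nevermind":      {"Rock"},
    "OK Computer":    {"Rock", "Alternative"},
}

class Recommender:
    def __init__(self):
        self.ratings = {}              # item -> score in [-1, 1]

    def rate(self, item, score):
        # Each new rating simply extends the stored profile.
        self.ratings[item] = score

    def recommend(self, n=3):
        liked_genres = set()
        for item, score in self.ratings.items():
            if score > 0:
                liked_genres |= CATALOG[item]
        candidates = [
            (len(CATALOG[i] & liked_genres), i)
            for i in CATALOG if i not in self.ratings
        ]
        candidates.sort(reverse=True)
        return [i for overlap, i in candidates[:n] if overlap > 0]

r = Recommender()
r.rate("Kind of Blue", 1.0)
print(r.recommend())       # ['A Love Supreme']
r.rate("OK Computer", 1.0) # one more rating refines the set in place
print(r.recommend())       # ['Nevermind', 'A Love Supreme']
```

The design point is in `rate()`: because input is incremental rather than all-or-nothing, the recommendation set never becomes a dead end.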
Users perceive the system as easier to use if they can generate new sets of recommendations without a lot of effort.

Design Suggestion: Users should not perceive the recommendation set as a dead end. This is important regardless of whether they like the recommendations or not. If they like the recommendations, they might be interested in looking at more; if they dislike the recommendations, they might be interested in refining their ratings in order to generate new recommendation sets.

Information about Recommended Items

The presence of longer descriptions of individual items correlates positively with both the perceived usefulness and the perceived ease of use of the recommender system (Study 1). This indicates that users like to have more information about the recommended item (book / movie description, author / actor / musician, plot summary, genre information, reviews by other users).

This finding was reinforced by the difference between the two versions of RatingZone. The first version of RatingZone's Quick Picks showed only the book title and author name in the list of recommendations; user evaluations were almost wholly negative as a result. The second version of RatingZone changed this situation very simply, by providing a link to item-specific information at Amazon.com. Figure 5 shows the difference in perceived usefulness between the two versions of the same system. (Note: Error bars in Figures 5 - 9 represent standard errors.)

A different problem occurred at MovieCritic, where detailed information was offered but users had trouble finding it: the item information was located several mouse clicks away, and the site had poor navigation design.

We have noticed that users find several types of information key in making up their minds. We use music systems as an example to describe the types of information users found useful.

Basic Item Information: This includes the song and album titles, the artist's name, genre information, and when the album was released.
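The RatingZone fix described above suggests a simple data-model lesson: each recommendation should bundle its basic metadata with a one-click link to fuller information. A minimal sketch, with invented field and class names (the URL is a placeholder, not a real page):

```python
# Hypothetical sketch: a recommended item carrying basic metadata
# plus a link to a fuller description, mirroring how RatingZone's
# second version linked each title to its Amazon.com page.

from dataclasses import dataclass

@dataclass
class RecommendedItem:
    title: str
    artist: str
    genre: str
    released: int
    details_url: str = ""   # one click away, not several

    def summary(self):
        # Compact line for the recommendation list itself.
        return f"{self.title} by {self.artist} ({self.genre}, {self.released})"

item = RecommendedItem("Kind of Blue", "Miles Davis", "Jazz", 1959,
                       "https://example.com/kind-of-blue")
print(item.summary())   # Kind of Blue by Miles Davis (Jazz, 1959)
```

Keeping `details_url` on the item itself avoids the MovieCritic problem, where the information existed but sat several mouse clicks away behind poor navigation.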
Figure 4: Genre Filtering in MediaUnbound

Users