In a recommender system, selecting and tuning the appropriate recommender algorithm for both the user and the user's current information seeking task will generate a more useful recommendation list than a generic or un-tuned algorithm. In this dissertation, we develop new theoretical models, run offline simulation experiments, and conduct user studies in support of this thesis. Before outlining our research approach, there are two points we must first consider:

1. Importance of external validation criteria. As recommenders move into domains with additional validation criteria, the importance of considering the user's information seeking task will increase.

2. The current state of recommender metrics. Current predictive accuracy and decision support metrics are poor at evaluating the suitability of a recommender algorithm for an information seeking task [86].

External Validation

One of the criticisms of recommender systems up until this point is that they have mostly appeared in low-cost entertainment domains where decisions are based on taste, such as movies, music, television, books, jokes, etc. This is a valid criticism: if recommenders are going to be accepted by a larger audience, they need to generate high quality recommendations in domains where taste is not the main deciding factor in consuming an item.

We posit that users come to a recommender as part of an information seeking task. More important, we believe the importance of the information seeking task varies from domain to domain. In entertainment-based domains, the decision of whether to consume a recommendation is a taste-based decision. The user determines if she likes the item, and then acts accordingly. The information seeking task is one-dimensional: how much do I like it?
If we move into a non-entertainment domain, e.g. the domain of peer-reviewed research papers, we claim the decision of whether to consume the item becomes more complex. While taste may be one component, other non-taste criteria must also be satisfied. For example, if I am looking to add references to a paper I am writing, high quality paper recommendations from a different research area will not help me, independent of how much I might enjoy reading them. My external, non-taste criteria place constraints on what I choose to consume.

Until this point, recommenders have dealt with one criterion at a time, historically a taste-based criterion. Users may have any number of criteria when visiting a recommender system, independent of domain. For example, a user of a movie recommender may be limited to items released on VHS tape. To understand context, recommenders need to generate recommendations based on all of a user's criteria. We claim these external criteria are representations of the user's information seeking task. Thus, as the number and/or importance of these criteria increase, the more important our thesis becomes. If we want to tailor recommendations to a user's information seeking task, we first need to understand what these criteria are for a given user in a given domain.
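To make the idea of external criteria concrete, the following sketch is a minimal illustration of this point, not from the dissertation; the item fields, the VHS constraint, and the predicted-rating scores are assumptions chosen for the example. It shows a recommender keeping only the candidates that satisfy the user's non-taste constraint before ranking by taste:

```python
# Hypothetical illustration: an external, non-taste criterion acts as a hard
# constraint on candidate items before taste-based ranking.
candidates = [
    {"title": "The Matrix", "format": "DVD", "predicted_rating": 4.8},
    {"title": "Casablanca", "format": "VHS", "predicted_rating": 4.5},
    {"title": "Vertigo", "format": "VHS", "predicted_rating": 4.2},
]

# The user's information seeking task supplies the criterion,
# e.g. "only items released on VHS tape".
def satisfies_criteria(item):
    return item["format"] == "VHS"

recommendations = sorted(
    (item for item in candidates if satisfies_criteria(item)),
    key=lambda item: item["predicted_rating"],
    reverse=True,
)
print([item["title"] for item in recommendations])  # ['Casablanca', 'Vertigo']
```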
The Curse of Accuracy

Researchers and practitioners have long used accuracy to judge the goodness of recommender algorithms. Many metrics have been proposed and used to measure accuracy, including ROC curves [55], modifications to precision and recall [131], Breese's Half-Life metric [13], and, most commonly used, Mean Absolute Error (MAE) [13, 55, 135]. For example, [51] provides an analysis of 432 variants of the User-User Collaborative Filtering algorithm run against all accuracy metrics.
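For reference, MAE is the average absolute difference between the recommender's predictions and the withheld ratings; for N withheld ratings r_i with predictions p_i (the notation here is ours, not the dissertation's):

\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| p_i - r_i \right|

Lower values mean the predictions track the withheld ratings more closely; the score says nothing about whether the recommended items are ones the user has not already seen.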
While accuracy is an important component of a recommender algorithm, focusing solely on it leads to two different problems.

The first problem comes from the way accuracy-centric analysis views the recommendation process. At a high level, this process is one where a user, either with or without establishing a user model, makes a request of the recommender in the form of a basket of items and ratings. The recommender algorithm performs a computation based on this basket and returns a recommendation list to the user. In this setting, each request sent to the recommender is an independent event done in isolation of all other recommendation events. While this may be true for the algorithm itself, it is not true for the user.

Users see each recommendation in the context of other recommendations: in a list. Independently good recommendations may create a bad recommendation list. For example, if we assume that Tolkien's Lord of the Rings would be a good book recommendation, would a list containing ten different editions of that book (e.g. hardcover, paperback, split into three volumes, combined into one volume, etc.) be a good recommendation list? As we argue in Chapter 4, we believe a recommendation list needs to be evaluated as a single recommendation entity. Moreover, the recommendation process is iterative. One recommendation list is evaluated in the context of previous lists the user has seen. The user will have different opinions of a consistent algorithm compared to an inconsistent one. A metric that judges independent events, even recommendation lists, will miss this temporal component of the recommendation process. The fact that users return to recommenders is an important aspect of the recommendation process.

The second problem is an artifact of the standard approach used to measure accuracy. The leave-n-out methodology [131] works by splitting collected ratings data into test and train datasets, removing n ratings from each test user, and recording how well the recommender can predict back the removed ratings. In essence, you hide a portion of the data and check how well the recommender reconstructs it. Leave-n-out is commonly used in machine learning to test the accuracy of classification algorithms [97].
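The following is a minimal sketch of this methodology, ours rather than the dissertation's; the data layout, function names, and the item-mean stand-in predictor are illustrative assumptions. It hides n ratings per user, asks a predictor for them back, and scores the result with MAE:

```python
import random
from collections import defaultdict

def leave_n_out_mae(ratings, predict, n=1, seed=0):
    """Hide n ratings per user, predict them back, and report MAE.

    ratings: list of (user, item, rating) triples
    predict: callable(train, user, item) -> predicted rating
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item, rating in ratings:
        by_user[user].append((item, rating))

    train, test = [], []
    for user, user_ratings in by_user.items():
        rng.shuffle(user_ratings)
        hidden, kept = user_ratings[:n], user_ratings[n:]
        test.extend((user, item, r) for item, r in hidden)
        train.extend((user, item, r) for item, r in kept)

    # Mean Absolute Error over the hidden (withheld) ratings only.
    errors = [abs(predict(train, user, item) - r) for user, item, r in test]
    return sum(errors) / len(errors)

def item_mean_predictor(train, user, item):
    # Stand-in for a real recommender: predict the item's mean training
    # rating, falling back to the global mean for unseen items.
    item_ratings = [r for _, i, r in train if i == item]
    pool = item_ratings or [r for _, _, r in train]
    return sum(pool) / len(pool)

ratings = [("alice", "LOTR", 5), ("alice", "Dune", 3), ("alice", "Emma", 4),
           ("bob", "LOTR", 4), ("bob", "Dune", 2), ("bob", "Emma", 5)]
print(leave_n_out_mae(ratings, item_mean_predictor, n=1))
```

Note what the score rewards: reconstructing ratings the user already supplied. Nothing in the protocol asks whether the recommender can surface items the user has never seen, which is exactly the problem discussed next.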
If we approach leave-n-out from the user's perspective, it is equivalent to recommending items the user has already rated. This is not as useful as it could be. For example, when looking for recommendations on new places to visit on vacation, a travel guide book only containing information on places you have visited before is not helpful. Moreover, if the guide recommended some places you've never been, say Beijing and Prague, then the leave-n-out methodology penalizes it for failing to recommend instead the places you have been, even if you'd like Beijing better than Boston or Prague better than Paris. Even if the book correctly ordered the places such that your favorites were first, this book is useless for planning a vacation. Accuracy metrics which reward algorithms for generating recommendations of withheld items are not measuring how well an algorithm performs on the task users care about: generating recommendations for items they have not seen.

The problem is rooted in the difference between a classifier and a recommender. Classifiers segment spaces into their most probable classes. In recommendation terms, a classifier would find, and recommend, the items the user is most likely to rate next. While these items would have high "ratability" for that user, there is no guarantee that these items will help users with their information seeking tasks. For instance, an online music store used a User-based collaborative filtering algorithm to generate recommendations. The most common recommendation was for the Beatles' "White Album". From an accuracy perspective, these recommendations were dead-on: most users like that album very much. From a usefulness perspective, though, the recommendations were a complete failure: every user either already owned the "White Album", or had specifically chosen not to own it. Even though it was highly ratable, White Album recommendations were almost never acted on by users, because they added almost no value.

In order to tailor recommendation lists not only to a user, but to a user's information seeking task, we will need to judge recommender algorithms using a variety of metrics, each of which measures a different property of that algorithm and each of which corresponds to different aspects of a user's information seeking task. When we have that information, we can select and tune the appropriate algorithm(s) for each user and task. In short, we need to re-think how to generate a 'good' recommendation list.

Building Bridges

In Figure 1-1, we show the current state of recommender systems, the state of the world before this dissertation. Between the user and the recommender is a space we call the gap of intention.
We argue that not only is the information channel between user and recommender too narrow, but that the two sides may not understand the cues currently transmitted across this channel. Information the user wishes to send concerning her intentions for using the recommender is lost to the gap. In much the same way, contextual information may not come back to the user; the recommender's intentions also fall into the gap.

Figure 1-1: The Intention Gap between Users and Recommenders

Our solution to this problem is to build a bridge: provide an organizing language the two sides can use to communicate their intentions, and categorize recommender algorithms in terms of this language abstraction. As shown in Figure 1-2, our bridge is a process model connecting users and their needs to recommender algorithms. It adds two nodes, Human-Recommender Interaction theory (HRI) and a new set of recommender metrics, as well as a set of processes connecting those nodes.

HRI is a new framework and methodology for analyzing both user information seeking tasks and recommendation algorithms in the context of a recommender system, with the end goal of generating useful recommendation lists. HRI was developed by re-examining the recommendation process from the end user's perspective and categorizing

* Transparency in the recommendation process has long been advocated [137], but few existing recommenders provide insight as to how or why items were recommended.