We demonstrate the promise of the proposed algorithm through an offline simulation and a comparative data analysis with other approaches.

Chapter 6: The Idea of Items Influence In Recommender Systems
We explain the idea of items' influence in recommender systems.
We propose a set of item influence measures, mostly based on information theory, as we focus on the goal of learning user profiles in recommender systems. We also show how a LOO-based measure can be used for the same goal.
We demonstrate the value of the approaches by performing offline analyses and give possible explanations of their performance differences.
We list a set of intuitive criteria that a good item influence measure should meet in order to learn user preferences effectively in recommender systems.

Chapter 7: Learning Preferences of New Users
Chapter 7 is based in part on (Rashid et al. 2002) and in part on new work.
We provide an offline experimental framework to study the efficacy of the new user preference learning methods.
We consider a set of information-theoretic item influence measures from chapter 6 to investigate their effectiveness in learning new user preferences.
Drawing on prior user profile learning techniques, we categorize our methods and establish the importance of investigating them.
An online experiment performed with more than 400 users helps compare the item influence approaches in a real-world setting.

Chapter 8: Motivating Contributions of Users

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission
We propose and implement a user interface design to convey item influence information through the concept of value; that is, more influential items are portrayed as more valuable (Rashid et al. 2006b).
We demonstrate that if users are guided to evaluate items that are valuable, they are willing to evaluate more. This finding has implications for designing systems that draw more user participation.
We show, through an online experiment designed using various social science theories, that users place differing value on their contributions depending on who the beneficiaries are.

4 Thesis Roadmap

In this section we briefly outline the organization of this thesis, which is broadly divided into five parts. Part I contains the introduction and the preliminary materials, such as discussions of the recommendation algorithms used, evaluation metrics, experimental platform, datasets, and so on. Part II contains three chapters about user influence in recommender systems. In the first chapter on user influence (chapter 3), we present a set of criteria suitable for user influence measures in recommender systems, discuss prior work on influence, and introduce a new method of influence based on the leave-one-out (LOO) technique. In the second chapter on user influence (chapter 4), we define and analyze a generic influence measure based on the LOO technique. In the third chapter on user influence (chapter 5), we analyze another LOO-based influence measure to find reliable early evaluators. Part III contains one chapter about item influence in recommender systems, where the goal of item influence is to improve recommendation accuracy. Part IV contains two chapters about applications of item influence that involve carrying out online experiments using influence. Finally, Part V has the final chapter of this thesis, which describes how this work can be extended into a number of different research directions.
Chapter 2: Experimental Platform

Contents
2.1 Introduction
2.2 Collaborative Filtering Algorithms Considered
2.3 Experimental Platform Datasets
2.4 Evaluation Metrics
2.5 Summary

2.1 Introduction

Our goal is to study influence as it unfolds in collaborative filtering-based recommender systems. As explained in the last chapter, we investigate influence on two CF algorithms. These two algorithms are presented here. In order to compute influence and perform offline analysis, we draw our datasets from MOVIELENS, a movie recommender that employs a CF algorithm. Further, we carry out our online experiments involving influence on MOVIELENS. We introduce MOVIELENS and describe the datasets in this chapter. Finally, since we primarily analyze the effects of influence by means of recommendation quality throughout
this thesis, we present a few evaluation metrics to assess recommendation accuracy.

2.2 Collaborative Filtering Algorithms Considered

Breese et al. (Breese, Heckerman, and Kadie 1998) divide CF algorithms into two classes: model-based and memory-based. A memory-based algorithm such as USER-BASED kNN (Resnick et al. 1994) utilizes the entire database of user preferences when computing recommendations. These algorithms tend to be simple to implement and require little to no training cost. They can also easily take new preference data into account. However, their online performance tends to be slow as the size of the user and item sets grows, which makes these algorithms unsuitable for large systems. One workaround is to consider only a subset of the preference data in the calculation, but this can reduce both recommendation quality and the number of items that can be recommended, because data is omitted from the calculation. Another workaround is to perform as much of the computation as possible in an offline setting. However, this may make it difficult to add new users to the system on a real-time basis, which is a basic necessity of most online systems. Furthermore, the storage requirements for the pre-computed data could be high.

An interesting memory-based CF algorithm is proposed by (Sarwar et al. 2001), which we present later in this section. It achieves scalability by leaving most of the expensive computations offline. Its offline computation involves building item-to-item relationships. Since the item space is relatively more stable than the user space in today's large-scale e-commerce sites (Sarwar et al. 2001), new users do not pose a problem for its periodic offline computations. We discuss this further in the description of the algorithm.

A model-based algorithm such as one based on Bayesian networks (Breese, Heckerman, and Kadie 1998) or singular value decomposition (SVD) (Sarwar et al. 2000) computes a model of the preference data and uses it to produce recommendations. Often, the model building process is time-consuming and is only done periodically. The models are compact
and can generate recommendations very quickly. The disadvantage of model-based algorithms is that adding new users, items, or preferences can require recomputing the entire model.

We next present the collaborative filtering algorithms we use in this thesis. Table 2.1 summarizes a comparative analysis of the two CF algorithms.

USER-BASED kNN

This algorithm belongs to the memory-based class of CF algorithms. Predictions under this algorithm are computed as a two-step process. First, the similarities between the target user and all other users who have rated the target item are computed, most commonly using the Pearson correlation coefficient (Herlocker et al. 1999; Resnick et al. 1994). That is,

\[
w_{u_t,u_i} = \frac{\sum_{a \in I} (R_{u_t,a} - \bar{R}_{u_t})(R_{u_i,a} - \bar{R}_{u_i})}{\sqrt{\sum_{a \in I} (R_{u_t,a} - \bar{R}_{u_t})^2}\,\sqrt{\sum_{a \in I} (R_{u_i,a} - \bar{R}_{u_i})^2}} \tag{2.1}
\]

where $I$ is the set of items rated by both of the users, $R_{u,a}$ is the rating of user $u$ on item $a$, and $\bar{R}_u$ is the mean rating of user $u$.

Then the prediction for the target item $a_t$ is computed using at most $k$ closest users found from step one, and by applying a weighted average of deviations from the selected users' means:

\[
P_{u_t,a_t} = \bar{R}_{u_t} + \frac{\sum_{i=1}^{k} (R_{u_i,a_t} - \bar{R}_{u_i})\, w_{u_t,u_i}}{\sum_{i=1}^{k} |w_{u_t,u_i}|}
\]

Note that we follow a number of improvements suggested in (Herlocker et al. 1999), including significance weighting, where an attempt is made to lower the similarity between two users if they have not co-rated enough items. Approaches for significance weighting include (a) dividing the similarity score by a constant (Herlocker et al. 1999), and (b) multiplying the similarity score by a Jaccard coefficient (Cosley et al. 2007). We use the first significance weighting approach in our implementations. Although approach