Contents

Dedication
Abstract
List of tables
List of Figures

I Prologue
  1.1 The Problem Domain: Recommender Systems
    Collaborative Filtering
  1.2 Recommender Systems and Influence
  1.3 Contributions
  1.4 Thesis Roadmap
2 Experimental Platform
  2.1 Introduction
  2.2 Collaborative Filtering Algorithms Considered
  2.3 Experimental Platform Datasets
  2.4 Evaluation metrics

II Influence of Users
3 The Idea of Users' Influence in Recommender Systems
  3.1 Introduction
  3.2 Principles of Influence Measures
  3.3 Influence Based on Earlier Work
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission
    3.3.1 Authority
    3.3.2 Centrality-Based Measures
    3.3.3 ELPNETWORK VALUE
    3.3.4 Di
  3.4 Proposed Approach: LOO-Based Influence
    3.4.1 Computing LOO-based measures on USER-BASED kNN
  3.5 Summary
4 ENIPD: An Algorithm-Independent Measure of Influence
  4.2 ENIPD Idea
  4.3 Computing ENIPD from Data
    4.3.1 Computing ENIPD on any CF Algorithm
  4.4 Qualitative Factors Potentially Affecting ENIPD
  4.5 Building Predictive Models of ENIPD
    4.5.2 Results
      4.5.2.1 Predictive Performance
      4.5.2.2 Relationship Between ENIPD and the factors
  4.6 Discussion
    4.6.1 Dependence of ENIPD on the CF Algorithm
    4.6.2 Applications of ENIPD
      4.6.2.1 Reducing Model Size
      4.6.2.2 Improving Coverage
      4.6.2.3 Enhancing User Participation
  4.7 S
5 ENSI: An Influence Measure to Find Early Evaluators
  5.1 Introduction
  5.2 ENSI: Influence by Reliable
  5.3 Selecting Influencers for Early Evaluation
  5.4 Empirical Study
    5.4.1 Preparing D
    5.4.2 Other approaches compared
    5.4.3 Results
    5.4.4 Discussion
  5.5 Summary

III Influence of Items
6 The Idea of Item Influence in Recommender Systems
  6.2 Approaches for Computing Item Influence
    6.2.1 POPULARITY
    6.2.2 EN
    6.2.3 Information Theoretic Measures
      6.2.3.1 Entropy
      6.2.3.2 ENTROPYO: Entropy Considering Missing values
      6.2.3.3 HELF: Harmonic mean of Entropy and Logarithm of Frequency
      6.2.3.4 IGCN: Information Gain through Clustered Neighbors
  6.3 Empirical Study
    6.3.1 Procedure
      6.3.1.1 Preparing Data
      6.3.1.2 Computing influence measures from data
      6.3.1.3 Evaluation
    6.3.2 Results
  6.4 Discussion
  6.5 Summary

IV Online Experiments with Influence
7 Learning Preferences of New Users
  7.1 Introduction
  7.2 Offline Experiments
    7.2.1 Data
    7.2.2 Procedure
    7.2.3 Results
    7.2.4 Discussion
  7.3 Online Experiments
    7.3.1 Design
    7.3.2 Results and discussion
  7.4 Summary
8 Motivating Contributions of Users
  8.1 Introduction
  8.2 Research Questions
  8.3 Experimental Design
    8.3.1 Experimental Groups
    8.3.2 The Experiment
    8.3.3 Hypotheses
    8.3.4 Post-Survey
  8.4 Methods
    8.4.2 Movie list
    8.4.3 Item value
  8.5 Results and discussion
  8.6 Summary

V Epilogue
  9.1 Future research directions
    9.1.1 Attacks on Recommender Systems
    9.1.2 Early Evaluations by Influencers
    9.1.3 Recommendation Accuracy
    9.1.4 Interface Issues
    9.1.5 Diffusion of Influence
    9.1.6 Other Collaborative Filtering Algorithms

Bibliography
List of tables

2.1 Comparison between USER-BASED kNN and ITEM-BASED kNN CF algorithms
2.2 Properties of the datasets
3.1 Influence measures compared against influence principles
4.1 Relationship between ENIPD and # of ratings of the users
4.2 Squared correlation coefficient between the actual values and predicted values of ENIPD
4.3 Weights of the features near SVM modeling
6.1 Showing a limitation of entropy
6.2 ENTROPYO computation of the item a, which has been voted 200 times and the votes are uniformly distributed across the rating scale of (1-5)
6.3 ENTROPYO computation of the item b, which has been voted 3,000 times and the votes are unanimously 5
6.4 Effect of applying a log transformation to items' rating frequency
6.5 Average percentage of overlapped items between the pairs of item-influence measures
6.6 Properties of the user-specific top 20 items by various item-influence measures
7.1 Table of notations
7.2 Showing the strengths and weaknesses of the item influence approaches in
7.3 Group-wise participations of the subjects
7.4 Effectiveness of the learned user profiles according to the accuracy of the initial recommendations on two CF algorithms
8.1 Contrasts testing the four hypotheses