Push-Poll Recommender System Supporting Word of Mouth Andrew Webster and Julita Vassileva Department of Computer Science University of Saskatchewan, Saskatoon SK, S7N 5C9, Canada asw292(@mail usask. ca, jiv@cs. usask.ca Abstract. Recommender systems produce social networks as a side ef- ect of predicting what users will like. However, the potential for these social networks to aid in recommending items is largely ignored. We pro- pose a recommender system that works directly with these networks to distribute and recommend items: the informal exchange of information (word of mouth communication) is supported rather than replaced. The paper describes the push-poll approach and evaluates its performance at predicting user ratings for movies against a collaborative filtering al- gorithm. Overall, the push-poll approach performs significantly better while being computationally efficient and suitable for dynamic domains (e. g recommending items from RSS feeds) 1 Introduction The main advantages of recommender systems are that recommendations are supplied on demand and are made from a massive item collection. For example, Amazon. com can be queried in the middle of the night for book/music rec- ommendations from a list of millions [1]. However, recommending an item and giving a clear explanation of why it was recommended is not forthcoming in these distinctively "black box"systems 2. We see the lack of believable explanations a symptom of a wider limitation: the control and execution of a distributed process by a central authority. Recent recommender systems tend to be placed between people as authoritative intermediaries. It appears to the user that she is engaged in a dialog with the system-not her peers-about what to view next although in fact the system may be associating her with other like-minded users in order to predict items of interest. These user-to-user associations, or conn tions, are left embedded within the system, and their full potential for improving commendations is largely ignored Early recommender systems were said to simulate the informal, verbal hange of information known as word of mouth communication 3. Revisiting this conceptualization, we propose a recommender system that directly exploit user-to-user connections as the primary method to distribute and recommend nformation items: word of mouth processes are supported rather than replaced Our approach models the implicit social networks that normally develop around C. Conati
Push-Poll Recommender System: Supporting Word of Mouth Andrew Webster and Julita Vassileva Department of Computer Science University of Saskatchewan, Saskatoon SK, S7N 5C9, Canada asw292@mail.usask.ca, jiv@cs.usask.ca Abstract. Recommender systems produce social networks as a side effect of predicting what users will like. However, the potential for these social networks to aid in recommending items is largely ignored. We propose a recommender system that works directly with these networks to distribute and recommend items: the informal exchange of information (word of mouth communication) is supported rather than replaced. The paper describes the push-poll approach and evaluates its performance at predicting user ratings for movies against a collaborative filtering algorithm. Overall, the push-poll approach performs significantly better while being computationally efficient and suitable for dynamic domains (e.g. recommending items from RSS feeds). 1 Introduction The main advantages of recommender systems are that recommendations are supplied on demand and are made from a massive item collection. For example, Amazon.com can be queried in the middle of the night for book/music recommendations from a list of millions [1]. However, recommending an item and giving a clear explanation of why it was recommended is not forthcoming in these distinctively “black box” systems [2]. We see the lack of believable explanations as a symptom of a wider limitation: the control and execution of a distributed process by a central authority. Recent recommender systems tend to be placed between people as authoritative intermediaries. It appears to the user that she is engaged in a dialog with the system–not her peers–about what to view next, although in fact the system may be associating her with other like-minded users in order to predict items of interest. These user-to-user associations, or connections, are left embedded within the system, and their full potential for improving recommendations is largely ignored. Early recommender systems were said to simulate the informal, verbal exchange of information known as word of mouth communication [3]. Revisiting this conceptualization, we propose a recommender system that directly exploits user-to-user connections as the primary method to distribute and recommend information items: word of mouth processes are supported rather than replaced. Our approach models the implicit social networks that normally develop around C. Conati, K. McCoy, and G. Paliouras (Eds.): UM 2007, LNAI 4511, pp. 278–287, 2007. c Springer-Verlag Berlin Heidelberg 2007
Push-Poll Recommender System 279 shared topics of interests, or channels. Recommendation is achieved by two terrelated processes: push and poll. Push seeds an item into the social network associated with the channel the item is most appropriate for. The item then spreads according to diffusion of innovation models 4. Poll queries adjacent sers whether the item should be activated (i.e. recommended) for the current user,given a certain activation threshold. Feedback from users reshapes the net- work, affecting the spread and activation of subsequent items. The next section reviews related work on social networks and how information Hows in them. The proposed approach, push-poll, is described in Section 4. In Section 5, its performance is evaluated against a common collaborative filtering algorithm(which is reviewed in Section 3). We conclude in Section 6 that there are immediate advantages from directly working with social networks 2 Related work Users' activity within a recommender system has been acknowledged as"induc ing an implicit social network and influences the connectivities in this network" 5. Social networks [6 are conceptualized as graphs that represent people and the relationships between them as nodes and edges, respectively. edges, or connec- tions, traditionally denote the existence of social relationships(e.g. friendship but can also indicate more general relationships, e. g. users who have shared i terests. Thus, recommender systems produce social networks, i.e. user-to-user connections, as a consequence of predicting user-to-item connections It has been emphasized that the recommendation process naturally involves bringing people together and how these connections are determined is a signifi cant, but neglected, aspect of recommender systems research. User modeling, either direct(e.g. using explicit input like item ratings)or indirect(e. g data min- ing e-mail logs), and the computed similarity between user models was seen as the primary means to obtain social networks that are exploitable by the recom- mendation process either through structural analysis or by embedding additional information into connections between users. For an example of the former ap- proach, recommender systems in general were evaluated in light of the network structure created between users under certain conditions 5]. One condition that was analyzed was the minimum number of shared items users must rate in order to all be connected together(identifying this number would help strike a bal- ance between ensuring good recommendations and not alienating with too luch work). For an example of the latter approach, explicit indication of trust between users was collected, embedded into the computed social network, and used to generate improved movie recommendations 8 The study of information propagation through social networks is another re- lated area of research. The spread and adoption of social innovations within real-world communities 9 is of particular relevance as ensuing models can be applied to online environments. For example, a model for the spread of discus- ion topics in weblogs, or blogs, is presented in 10 and the identification of
Push-Poll Recommender System 279 shared topics of interests, or channels. Recommendation is achieved by two interrelated processes: push and poll. Push seeds an item into the social network associated with the channel the item is most appropriate for. The item then spreads according to diffusion of innovation models [4]. Poll queries adjacent users whether the item should be activated (i.e. recommended) for the current user, given a certain activation threshold. Feedback from users reshapes the network, affecting the spread and activation of subsequent items. The next section reviews related work on social networks and how information flows in them. The proposed approach, push-poll, is described in Section 4. In Section 5, its performance is evaluated against a common collaborative filtering algorithm (which is reviewed in Section 3). We conclude in Section 6 that there are immediate advantages from directly working with social networks. 2 Related Work Users’ activity within a recommender system has been acknowledged as “inducing an implicit social network and [influences] the connectivities in this network” [5]. Social networks [6] are conceptualized as graphs that represent people and the relationships between them as nodes and edges, respectively. Edges, or connections, traditionally denote the existence of social relationships (e.g. friendship) but can also indicate more general relationships, e.g. users who have shared interests. Thus, recommender systems produce social networks, i.e. user-to-user connections, as a consequence of predicting user-to-item connections. It has been emphasized that the recommendation process naturally involves bringing people together and how these connections are determined is a signifi- cant, but neglected, aspect of recommender systems research [7]. User modeling, either direct (e.g. using explicit input like item ratings) or indirect (e.g. data mining e-mail logs), and the computed similarity between user models was seen as the primary means to obtain social networks that are exploitable by the recommendation process either through structural analysis or by embedding additional information into connections between users. For an example of the former approach, recommender systems in general were evaluated in light of the network structure created between users under certain conditions [5]. One condition that was analyzed was the minimum number of shared items users must rate in order to all be connected together (identifying this number would help strike a balance between ensuring good recommendations and not alienating users with too much work). For an example of the latter approach, explicit indication of trust between users was collected, embedded into the computed social network, and used to generate improved movie recommendations [8]. The study of information propagation through social networks is another related area of research. The spread and adoption of social innovations within real-world communities [9] is of particular relevance as ensuing models can be applied to online environments. For example, a model for the spread of discussion topics in weblogs, or blogs, is presented in [10] and the identification of a
80 A. Webster and. vassileva minimal set of people whose adoption of a new product would maximize the spread of that product through the given social network is described in 11 3 Collaborative Filtering We begin with a brief overview of the collaborative filtering(CF)algorithm used to evaluate the performance of push-poll in Section 5. CF operates on the user-item matriT, R, where entry re s indicates the rating score user c E Cm) has given item s E[s1, $2, .., Sn). Each row represents all rat- ings a particular user has made, and each column represents all ratings a par- ticular item has collected. Often, rating scores follow a numerical scale(e.g. 1 to 5 stars)and are explicit, but they also may be inferred from item purchases and other implicit user actions [12. The ultimate goal is to predict the score of mpty cells for the active user, the user currently requesting recommendations CF algorithms are divided into two categories: memory-based and model-based. We focus on a memory-based algorithm because of its wide use and satisfactory prediction accuracy. For a review of CF, we refer to [13 3.1 Memory-Based Algorithm Memory-based CF algorithms rely on exploiting gaps within the user-item ma- trix. The intuition is that users who have similar preferences will generally rate items in a similar manner. Therefore. if the active user c has not rated item s but the recommender system can find similar or correlated users (i.e. neighbours) who have, then a rating score can be predicted using(1) re;=7e+k∑sim(c,()×(res-ie) sim(c, c) ∑ses(rs-i)2∑s∈s(r6s-Fe)2 C is the set of neighbours for the active user and implies there are some number of rated items in common between the active user and each neighbour. Users tend to use ratings scales differently. For example, on a l to 5 rating scale, the active user may seldom rate l or 5 while a neighbour only rates 1 and 5. Therefore, the average rating of the active user and current neighbour (Fc and Fe, respectively) e used to smooth out this inconsistency The Pearson coefficient(2) correlates the degree of similarity sim(c, e) be ween two users where Sce is the set of items both users have rated in common The degree of similarity ranges from-1 (perfect negative correlation)to +1(per- fect positive correlation). The similarity value is then used by equation(1)as the impact weight each neighbour has in determining the final predicted value (typically the N most similar neighbours are used ). Thus, a neighbour with a
280 A. Webster and J. Vassileva minimal set of people whose adoption of a new product would maximize the spread of that product through the given social network is described in [11]. 3 Collaborative Filtering We begin with a brief overview of the collaborative filtering (CF) algorithm used to evaluate the performance of push-poll in Section 5. CF operates on the user-item matrix, R, where entry rc,s indicates the rating score user c ∈ {c1, c2,...,cm} has given item s ∈ {s1, s2,...,sn}. Each row represents all ratings a particular user has made, and each column represents all ratings a particular item has collected. Often, rating scores follow a numerical scale (e.g. 1 to 5 stars) and are explicit, but they also may be inferred from item purchases and other implicit user actions [12]. The ultimate goal is to predict the score of empty cells for the active user, the user currently requesting recommendations. CF algorithms are divided into two categories: memory-based and model-based. We focus on a memory-based algorithm because of its wide use and satisfactory prediction accuracy. For a review of CF, we refer to [13]. 3.1 Memory-Based Algorithm Memory-based CF algorithms rely on exploiting gaps within the user-item matrix. The intuition is that users who have similar preferences will generally rate items in a similar manner. Therefore, if the active user c has not rated item s, but the recommender system can find similar or correlated users (i.e. neighbours) who have, then a rating score can be predicted using (1). rc,s = ¯rc + k c´∈Cˆ sim(c, ´c) × (r´c,s − ¯r´c) (1) sim(c, ´c) = s∈Sc´c (rc,s − ¯rc)(r´c,s − ¯r´c) s∈Sc´c (rc,s − ¯rc)2 s∈Sc´c (r´c,s − ¯r´c)2 (2) Cˆ is the set of neighbours for the active user and implies there are some number of rated items in common between the active user and each neighbour. Users tend to use ratings scales differently. For example, on a 1 to 5 rating scale, the active user may seldom rate 1 or 5 while a neighbour only rates 1 and 5. Therefore, the average rating of the active user and current neighbour (¯rc and ¯rc´, respectively) are used to smooth out this inconsistency. The Pearson coefficient (2) correlates the degree of similarity sim(c, ´c) between two users where Scc´ is the set of items both users have rated in common. The degree of similarity ranges from -1 (perfect negative correlation) to +1 (perfect positive correlation). The similarity value is then used by equation (1) as the impact weight each neighbour has in determining the final predicted value (typically the N most similar neighbours are used). Thus, a neighbour with a
Push-Poll Recommender System 281 similarity value I will have a large influence in moving the predicted score to- wards her (relative) rating. Finally, k is a normalizing factor and is the inverse summation of the absolute similarity value An advantage of CF is that nothing needs to be known about the items, for items that are difficult to analyze(e. g. video) this is clearly beneficial. However, cf performs poorly under sparse rating information, especially for new items and does not scale well [14 4 Push-Poll Approach Push-poll is envisioned to operate in a massive, highly dynamic environment and a potential application is recommending items from Rich Site Summary Ss) feeds 15. RSS is a popular method to publish content to the Web and is often used by blogs and news services to alert subscribers to new entries. RSS items follow well-known XML formats and usually include a headline, a shor description, and a URL to the full item of interest. The breadth of topics and overwhelming number of feeds presents an exciting challenge for a recommender ystem that must manage many new and diverse items per day. Our objective is to treat recommendation as an intuitive process that results from user interaction and follows how information propagates by word of mouth in the real world. We achieve this by modeling(inferring, representing explicitly and maintaining)social networks centred on specific interests and"shepherding relevant items through these networks 4.1 Interest-Based Channels A dedicated process looks for new items by routinely cycling through a list of RSS feeds(determined by developers and /or users). Once detected, new items re analyzed and separated into channels according to their content. Thus, push- poll is a hybrid recommender system [16 as it combines CF with content-based We suggest that RSS items require only a trivial amount of content-based analysis in order to be classified. So-called collaborative tagging systems [17] demonstrate successful item classification by having users provide manual clas- sification through a set of freely-chosen keywords, or tags, rather than relying on automated analysis or domain experts. The reoccurrence of certain tags points to a consensus regarding the items content. Term extraction of the RSs items headline and description would enable a rough guess as to what channel(s) the item initially "fits"into-a channel is simply represented as a unique set of tags The overlap between the item's tags and the channels tags defines the potential fit of the item. Later, tagging by users would overrule the systems tag set and trigger a re-examination, possibly causing the item to be introduced into other channels The organization of channels inherits the flexibility of collaborative tagging and allows users to freely define their interests and the scope of their interests
Push-Poll Recommender System 281 similarity value 1 will have a large influence in moving the predicted score towards her (relative) rating. Finally, k is a normalizing factor and is the inverse summation of the absolute similarity values. An advantage of CF is that nothing needs to be known about the items, for items that are difficult to analyze (e.g. video) this is clearly beneficial. However, CF performs poorly under sparse rating information, especially for new items and users, and does not scale well [14]. 4 Push-Poll Approach Push-poll is envisioned to operate in a massive, highly dynamic environment, and a potential application is recommending items from Rich Site Summary (RSS) feeds [15]. RSS is a popular method to publish content to the Web and is often used by blogs and news services to alert subscribers to new entries. RSS items follow well-known XML formats and usually include a headline, a short description, and a URL to the full item of interest. The breadth of topics and overwhelming number of feeds presents an exciting challenge for a recommender system that must manage many new and diverse items per day. Our objective is to treat recommendation as an intuitive process that results from user interaction and follows how information propagates by word of mouth in the real world. We achieve this by modeling (inferring, representing explicitly and maintaining) social networks centred on specific interests and “shepherding” relevant items through these networks. 4.1 Interest-Based Channels A dedicated process looks for new items by routinely cycling through a list of RSS feeds (determined by developers and/or users). Once detected, new items are analyzed and separated into channels according to their content. Thus, pushpoll is a hybrid recommender system [16] as it combines CF with content-based analysis. We suggest that RSS items require only a trivial amount of content-based analysis in order to be classified. So-called collaborative tagging systems [17] demonstrate successful item classification by having users provide manual classification through a set of freely-chosen keywords, or tags, rather than relying on automated analysis or domain experts. The reoccurrence of certain tags points to a consensus regarding the item’s content. Term extraction of the RSS item’s headline and description would enable a rough guess as to what channel(s) the item initially “fits” into–a channel is simply represented as a unique set of tags. The overlap between the item’s tags and the channel’s tags defines the potential fit of the item. Later, tagging by users would overrule the system’s tag set and trigger a re-examination, possibly causing the item to be introduced into other channels. The organization of channels inherits the flexibility of collaborative tagging and allows users to freely define their interests and the scope of their interests
82 A. Webster and. vassileva For example, consider a user interested in foreign policy. She may create a chan nel with a general interpretation using the tag set foreign, policy. Or, she may desire a narrower focus and conjoin (American, middleeast to the previous set 4.2 Push(Diffusion) After a new item has been matched to a channel, it is seeded into the corre- sponding network of users who subscribe to that channel (i. e, one user may create a public channel that is shared with others). For now, we assume this network already exists. Users are represented as nodes and a directed edge be- tween nodes describes that one user influences another with a specific strength. Influence strength is the edge weight(ranging from -1 to +1)between a pair of ser nodes and is related to the similarity between the users(Section 3.1). It determines how items will propagate through the social networks as explained by the Independent Cascade model [18 that captures the probability a person will chose to adopt an item depending on how many of her social contacts have Iready adopted it(note, the item could be a new hairstyle, gadget, etc. We use the Independent Cascade model to spread items across the network nodify the terminology to illustrate that users have no voluntary control over whether they adopt an item or not. Instead of"adopting an item, a user infected with it, and infection is a condition where the item becomes a candidate for activation(Section 4.3). For each new RSS item, some users are targeted to be initial seeds, i. e the nodes that are automatically infected. We suggest some criteria for determining a potential seed: the user provides quick feedback(e.g rates often)and acts as an authority (i. e, exerts strong, direct influence on many users). However, we leave seed determination as future work At the start of the push process, all seed nodes try to infect their"contacts or neighbour nodes(i.e. the nodes at the end of outgoing edges), with the item. A seed node u infects a neighbour node v with probability pu g-the absolute value of the influence strength from node u to u Infected nodes have a single attempt that will either succeed or fail at infecting a neighbour node. Success or failure is independent of all previous attempts to infect the node in question. Not this assumption is relaxed in the General Cascade model [11]. After the seed nodes cannot induce any new infections, all newly infected nodes try to infect their neighbours, and this breadth-first cycle repeats until no new infections are possible. Ultimately, depending on their direct /indirect connections to seed users. some users in the network will be infected while others will not 4.3 Poll(Activation) If a user is infected with an item, the item is left in the users respective chan- nel queue. Poll is the process that ultimately activates(i. e. recommends)these queued items, and it is based on the Threshold Model of collective behaviour 19. The model is similar to Independent Cascade and describes that node u has an intrinsic threshold level 0v, s E [0, 1 for adopting item s and a set of contacts
282 A. Webster and J. Vassileva For example, consider a user interested in foreign policy. She may create a channel with a general interpretation using the tag set {foreign, policy}. Or, she may desire a narrower focus and conjoin {American, middleeast} to the previous set. 4.2 Push (Diffusion) After a new item has been matched to a channel, it is seeded into the corresponding network of users who subscribe to that channel (i.e., one user may create a public channel that is shared with others). For now, we assume this network already exists. Users are represented as nodes and a directed edge between nodes describes that one user influences another with a specific strength. Influence strength is the edge weight (ranging from -1 to +1) between a pair of user nodes and is related to the similarity between the users (Section 3.1). It determines how items will propagate through the social networks as explained by the Independent Cascade model [18] that captures the probability a person will chose to adopt an item depending on how many of her social contacts have already adopted it (note, the item could be a new hairstyle, gadget, etc.). We use the Independent Cascade model to spread items across the network but modify the terminology to illustrate that users have no voluntary control over whether they adopt an item or not. Instead of “adopting” an item, a user is infected with it, and infection is a condition where the item becomes a candidate for activation (Section 4.3). For each new RSS item, some users are targeted to be initial seeds, i.e. the nodes that are automatically infected. We suggest some criteria for determining a potential seed: the user provides quick feedback (e.g. rates often) and acts as an authority (i.e., exerts strong, direct influence on many users). However, we leave seed determination as future work. At the start of the push process, all seed nodes try to infect their “contacts”, or neighbour nodes (i.e. the nodes at the end of outgoing edges), with the item. A seed node u infects a neighbour node v with probability pu,v–the absolute value of the influence strength from node u to v. Infected nodes have a single attempt that will either succeed or fail at infecting a neighbour node. Success or failure is independent of all previous attempts to infect the node in question. Note, this assumption is relaxed in the General Cascade model [11]. After the seed nodes cannot induce any new infections, all newly infected nodes try to infect their neighbours, and this breadth-first cycle repeats until no new infections are possible. Ultimately, depending on their direct/indirect connections to seed users, some users in the network will be infected while others will not. 4.3 Poll (Activation) If a user is infected with an item, the item is left in the user’s respective channel queue. Poll is the process that ultimately activates (i.e. recommends) these queued items, and it is based on the Threshold Model of Collective Behaviour [19]. The model is similar to Independent Cascade and describes that node v has an intrinsic threshold level θv,s ∈ [0, 1] for adopting item s and a set of contacts