Utilizing various sparsity measures for enhancing accuracy of collaborative recommender systems based on local and global similarities

Deepa Anand ⇑, Kamal K. Bharadwaj
School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi 110 067, India

Keywords: Collaborative filtering; Recommender systems; Similarity measures; Sparsity measures

Abstract: Collaborative filtering is a popular recommendation technique, which suggests items to users by exploiting past user-item interactions involving affinities between pairs of users or items. In spite of their huge success, collaborative techniques suffer from a range of problems, the most fundamental being that of data sparsity. When the rating matrix is sparse, local similarity measures yield a poor neighborhood set, thus affecting the recommendation quality. In such cases global similarity measures can be used to enrich the neighborhood set by considering transitive relationships among users even in the absence of any common experiences. In this work we propose a recommender system framework utilizing both local and global similarities, taking into account not only the overall sparsity in the rating data, but also sparsity at the user-item level. Several schemes are proposed, based on various sparsity measures pertaining to the active user, for the estimation of the parameter α, which allows varying the importance given to the global user similarity with regard to the local user similarity. Furthermore, we propose an automatic scheme for weighting the various sparsity measures, through an evolutionary approach, to obtain a unified measure of sparsity (UMS). In order to take maximum possible advantage of the various sparsity measures relating to an active user, a scheme based on the UMS is suggested for estimating α.
Experimental results demonstrate that the proposed estimates of α markedly outperform the schemes for which α is kept constant across all predictions (fixed-α schemes), on the accuracy of predicted ratings. © 2010 Elsevier Ltd. All rights reserved.

1. Introduction

The explosive growth of the web has led to the problem of ‘information overload’: the overwhelming plethora of choices and options available to a user, often varying in quality. The need for a solution to this abundance of information, and the drive to bridge the gap between vendor and customer in e-commerce, have led to the popularity of Web personalization. Recommender systems are the most notable application of Web personalization. Recommender systems are personalization tools which enable users to be presented with information suiting their interests that is novel, serendipitous and relevant, without being explicitly asked for, thus supporting ‘‘discovery” rather than ‘‘search”. Recommender systems have become ubiquitous, with their presence everywhere from recommending books and CDs (Amazon.com; Linden, Smith, & York, 2003), music [last.fm (www.last.fm)] and movies [MovieLens (www.movielens.umn.edu)] to recommending high-risk products such as mutual funds and vacations.

Among the different types of recommender systems, collaborative filtering is the most widely used and effective recommendation technique. Collaborative filtering (Goldberg, Nichols, Oki, & Terry, 1992) is the automation of ‘‘word of mouth” (Shardanand & Maes, 1995), where opinions gleaned from people who share similar tastes as the active user are used in the decision making process. It is based on the assumption that users who have agreed in the past tend to agree in the future.
The greatest strength of collaborative techniques is that they are completely independent of any machine-readable representation of the objects being recommended, and they work well for complex objects such as music and movies, where variations in taste are responsible for much of the variation in preferences (Burke, 2002). Collaborative filtering algorithms can be classified as memory-based or model-based algorithms (Breese, Heckerman, & Kadie, 1998). Memory-based algorithms (Candillier, Meyer, & Fessant, 2008; Russell & Yoon, 2008) are heuristics-based algorithms, which utilize the entire rating history to arrive at predictions. These include the commonly implemented classes of user-based and item-based CF methods. Model-based recommender systems (Al-Shamri et al., 2007; Bell, Koren, & Volinsky, 2007) build a user model in an off-line learning phase and then apply this model on-line for recommendation. The accuracy offered by memory-based RS, since they examine the entire rating database for prediction, together with their simplicity, lends to their popularity.

Expert Systems with Applications 38 (2011) 5101–5109. 0957-4174/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2010.09.141. ⇑ Corresponding author. Tel.: +91 11 9810253296; fax: +91 11 26717528. E-mail address: deepanand209@gmail.com (D. Anand).
Sparsity of ratings data is one of the primary challenges to CF systems. The large number of users and items in any e-commerce recommendation system restricts the number of items rated by any user to a small subset of the total number of items. Effective prediction of ratings from a small number of examples is important (Adomavicius & Tuzhilin, 2005). Local similarity measures such as the Pearson similarity measure (Resnick, Iacovou, Suchak, Bergstrom, & Riedl, 1994) and the Cosine similarity measure (Sarwar, Karypis, Konstan, & Riedl, 2001) base the similarity computation step between two users on the set of common items rated by both users. In the presence of sparseness in the data, many users do not share common items, thus rendering CF useless. Even if users are deemed similar to other users, the similarity might be based on a small set of common experiences. The user must rate a sufficient number of items to get a reasonable profile overlap in order for the system to establish his neighbors. The poor quality of recommendations at high sparsity levels is well established (Grčar, Mladenić, Fortuna, & Grobelnik, 2006).

Over the years, a variety of solutions to the data sparsity problem have been proposed. Luo, Niu, Shen, and Ullrich (2008) propose a novel approach of combining predictions from locally similar neighbors, i.e. users who have co-rated items, and globally similar neighbors. Two users are considered to be globally similar if they are connected through locally similar neighbors. It was established experimentally that when the data is very sparse, globally similar neighbors provide better predictions, whereas in the case of low sparsity, predictions from local neighbors are superior. In Luo et al. (2008) a parameter α was utilized to balance the contribution of the global neighborhood against the local neighborhood. The value taken by the parameter α was static, i.e. the importance given to local/global predictions remained the same across all predictions.
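The fixed-weight combination just described can be sketched as follows. This is a minimal illustration of the blending rule only; the function and variable names are ours, not from Luo et al. (2008):

```python
def combine_predictions(pred_local, pred_global, alpha):
    """Blend predictions from local and global neighborhoods.

    alpha = 0 uses only the locally similar neighbors;
    alpha = 1 uses only the globally similar neighbors.
    In a fixed-alpha scheme, alpha is one manually chosen constant
    applied to every prediction, regardless of per-user sparsity.
    """
    return (1 - alpha) * pred_local + alpha * pred_global

# Sparser data favors a larger alpha (more weight to global neighbors).
rating = combine_predictions(pred_local=3.0, pred_global=4.0, alpha=0.5)
print(rating)  # 3.5
```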
Also, the significance of local or global predictions had to be manually set to reflect the actual contribution of the local/global neighborhood to accuracy. A promising extension to the above scheme would be to automate the α estimation by considering not only the overall sparsity but also the sparsity at the user/item level. Intuitively, a user rating a very small subset of items, or a user with unusual tastes who has rated a reasonably large number of items, might have a poor quality local neighborhood even if the ratings matrix is dense.

In this work we propose various α estimation schemes based on the overall sparsity in the ratings data as well as sparsity based on the active user and the item whose rating needs to be predicted. The dependency of the α value on the user and item ensures that the local and global neighbors are weighed differently for every user and for every prediction. The experimental results support our ideas and demonstrate that the proposed methods are superior to the fixed-α scheme (Luo et al., 2008).

The remainder of the paper is organized as follows: Section 2 provides a literature overview of the variety of techniques which have been employed to solve the data sparsity problem and an overview of the various existing local and global similarity measures. Several new α estimation schemes based on the user and item are introduced in Section 3, while Section 4 suggests an automatic weighting scheme for combining the various sparsity measures to arrive at a unified measure of sparsity (UMS) to integrate local and global predictions. Section 5 presents an experimental evaluation of the proposed schemes and compares them with the schemes proposed in Luo et al. (2008). Finally, Section 6 winds up with the conclusions and points out some directions for future research.

2. Literature review

In spite of more than a decade of research on recommender systems, sparsity remains the major hurdle.
We first take a look at the various solutions to the sparsity problem that have been proposed in the literature and then examine the various local and global similarity measures.

2.1. Sparsity

Several model-based techniques based on dimensionality reduction, such as SVD (Billsus & Pazzani, 1998; Rosenstein & Lochbaum, 2000) and Latent Semantic Indexing (Sarwar, Karypis, Konstan, & Riedl, 2000), have been applied in order to alleviate the sparsity problem. Suryavanshi, Shiri, and Mudur (2005) propose a two-level model-based technique, employing relational fuzzy subtractive clustering and association rule mining at the first and second levels respectively, to produce recommendations which are less prone to the effects of sparsity. To aid the computation of comparability between users' preferences, Al-Shamri and Bharadwaj (2008) proposed building a concise and representative hybrid user model, which enables quantification of user closeness irrespective of any shared experiences, and thus aids predictions which are accurate in the presence of sparsity.

Hybridization of collaborative filtering with other filtering mechanisms is another way to help solve the problem of data sparsity. Other data sources such as the item contents (Melville, Mooney, & Nagarajan, 2002; Popescul, Ungar, Pennock, & Lawrence, 2001) have been combined with collaborative filtering to enhance existing user data, and thus provide personalized suggestions. Rules generated through the Apriori algorithm (O'Sullivan, Wilson, & Smyth, 2002) have been used to address the sparsity problem by enabling similarity computation between pairs of users having no common ratings. A recursive algorithm (Zhang & Pu, 2007) is proposed, which allows nearest-neighbor users to join the prediction process even if they have not rated the given item, by recursively predicting the scores for neighbors who have not rated the item and integrating them into the prediction for the active user.
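The dimensionality-reduction idea mentioned above can be sketched as follows: a truncated SVD of the (mean-filled) rating matrix yields a low-rank approximation whose entries serve as estimates for the missing ratings. This is only a minimal NumPy illustration of the general technique, not the exact procedure of the cited works:

```python
import numpy as np

# Toy user-item rating matrix; 0 marks a missing rating.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

# Replace missing entries with the per-user mean of observed ratings.
mask = R > 0
filled = R.copy()
for u in range(R.shape[0]):
    filled[u, ~mask[u]] = R[u, mask[u]].mean()

# Rank-2 truncated SVD: keep only the two largest singular values.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# R_hat is dense: every user-item cell now holds an estimated rating,
# so neighbors can be found even for users with no co-rated items.
print(R_hat.shape)  # (4, 4)
```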
Trustworthiness of users is an important factor, worth consideration, when assessing the neighbors who contribute to the prediction. Trust as a metric can replace or complement similarity metrics (Al-Shamri & Bharadwaj, 2008; Massa & Avesani, 2004) in order to enhance the neighborhood set, by adding contributions from users who may not have commonly rated items with the active user, but may be trustworthy. Many criteria for similarity which allow computation of user likeness for users who may not share any common experiences have been proposed. A similarity metric, Generalized Cosine Max, which does not need ratings of common items by users, is presented in Anand, Kearney, and Shapcott (2007). This similarity measure, which uses item similarity within the calculation of user similarity, is shown to work well when the rating data is sparse. Aggarwal, Wolf, Wu, and Yu (1999) proposed a graph-theoretic approach to collaborative filtering, which enables users having no direct resemblance with the active user to participate in the prediction.

2.2. Similarity measures

One of the important steps in collaborative filtering is the neighborhood formation step. Several similarity measures have been proposed in the literature in order to identify users with similar inclinations. However, most of the popular measures base the similarity computation on the local information available, i.e. on ratings common to both users. Global similarity measures can complement local similarity measures in order to improve accuracy and coverage in sparse-data scenarios. This section presents an overview of the different local and global similarity measures proposed in the literature and concludes with a framework for combining both in order to exploit the advantages offered by both approaches.
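The notion of a local similarity computed only over co-rated items can be sketched as follows, using the Pearson measure as an example. This is an illustrative sketch (names and the zero-similarity fallback are ours); it also shows why the local measure breaks down under sparsity, namely when two users have an empty intersection:

```python
import math

def pearson_local(ratings_x, ratings_y):
    """Pearson correlation computed over co-rated items only.

    ratings_x, ratings_y: dicts mapping item id -> rating.
    Returns 0.0 when the users share no items, the situation in
    which only a global similarity measure can relate them.
    """
    common = set(ratings_x) & set(ratings_y)
    if not common:
        return 0.0
    # Mean over each user's own ratings.
    mean_x = sum(ratings_x.values()) / len(ratings_x)
    mean_y = sum(ratings_y.values()) / len(ratings_y)
    num = sum((ratings_x[i] - mean_x) * (ratings_y[i] - mean_y) for i in common)
    den_x = math.sqrt(sum((ratings_x[i] - mean_x) ** 2 for i in common))
    den_y = math.sqrt(sum((ratings_y[i] - mean_y) ** 2 for i in common))
    if den_x == 0 or den_y == 0:
        return 0.0
    return num / (den_x * den_y)

x = {"m1": 5, "m2": 3, "m3": 4}
y = {"m2": 2, "m3": 5, "m4": 1}
z = {"m5": 4}
print(pearson_local(x, y))  # based on the two co-rated items m2, m3
print(pearson_local(x, z))  # 0.0 -- no co-rated items
```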
2.2.1. Local similarity measures

Similarity measurement between users plays an elemental role in both user-based and item-based algorithms. The most commonly used measurement techniques of the similarity between users are the Pearson correlation (Resnick et al., 1994) and Cosine similarity (Breese et al., 1998) algorithms. Typically the similarity computation is based on finding the similarity between the rating vectors containing ratings of items rated in common by both users. The Pearson correlation coefficient defines the similarity between users x and y as:

sim(x, y) = \frac{\sum_{i \in S_{xy}} (r_{x,i} - \bar{r}_x)(r_{y,i} - \bar{r}_y)}{\sqrt{\sum_{i \in S_{xy}} (r_{x,i} - \bar{r}_x)^2 \sum_{i \in S_{xy}} (r_{y,i} - \bar{r}_y)^2}}    (1)

where S_{xy} is the set of items which users x and y have co-rated and \bar{r}_x is the mean rating of user x. Another popular similarity metric used for CF, the Cosine similarity, is defined as:

sim(x, y) = \frac{\sum_{i \in S_{xy}} r_{x,i}\, r_{y,i}}{\sqrt{\sum_{i \in S_{xy}} r_{x,i}^2 \sum_{i \in S_{xy}} r_{y,i}^2}}    (2)

When the preference information is binary, i.e. like or do not like an item, the Jaccard coefficient is used to measure user similarity. The Jaccard coefficient is defined as:

sim(x, y) = \frac{|R_x \cap R_y|}{|R_x \cup R_y|}    (3)

where R_x is the set of elements liked by user x.

Lathia, Hailes, and Capra (2007) use concordance-based methods to measure the association between a pair of users. The closeness among a pair of users is estimated based on the number of concordant, discordant and tied pairs of common ratings. Candillier et al. (2008) introduce several weighted similarity measures for user-based and item-based collaborative filtering.
The method of Candillier et al. (2008) uses the Jaccard similarity as a weighting scheme and combines it with other similarity measures, such as the Pearson correlation coefficient, to emphasize the similarity of users who share appreciation of several items with the active user. Luo et al. (2008) exploited the fact that two users whose opinions about an item contrast with the average attitude towards the item, but are comparable to each other, can be deemed more similar than users who go along with the popular opinion about the item. They stressed the discriminative power of a rating by measuring its "surprisal", which carries information about how distinct a particular rating is from the average rating for the item. The ratings for an item were assumed to follow the Laplacian distribution, and the surprisal of a rating was computed as:

I(r_{p,i}) = -\ln f(r = r_{p,i} \mid \hat{\mu}_i, \hat{b}_i) = \ln(2\hat{b}_i) + \frac{|r_{p,i} - \hat{\mu}_i|}{\hat{b}_i}    (4)

where r_{p,i} is the rating of item i by user p, and \hat{\mu}_i and \hat{b}_i are the maximum likelihood estimates of the location and scale parameters, respectively. Given the surprisal of all ratings, user p's surprisal vector S_p is defined as:

S_p = [s_{p,1}, \ldots, s_{p,N}]^T = [\operatorname{sgn}(r_{p,1} - \hat{\mu}_1) \cdot I(r_{p,1}), \ldots, \operatorname{sgn}(r_{p,N} - \hat{\mu}_N) \cdot I(r_{p,N})]^T, \quad p = 1, \ldots, M    (5)

where \operatorname{sgn}(r_{p,i} - \hat{\mu}_i) represents whether the preference of user p is positive or negative with respect to the average attitude for the item. The similarity between two users is then the vector space similarity between the users' surprisal vectors. To discount the high similarity between two users based on a small set of co-rated items, the following significance weighting scheme is proposed:

sim'_L(u_p, u_q) = \frac{\min(|I_{u_p} \cap I_{u_q}|, \gamma)}{\gamma} \cdot sim_L(u_p, u_q)    (6)

where |I_{u_p} \cap I_{u_q}| is the number of items co-rated by users p and q, and \gamma is the minimum number of items that need to be rated in common by both users. This method is termed surprisal-based vector similarity with significance weighting (SVSS).

2.2.2.
Global similarity measures

Global similarity measures enable similarity computation between pairs of users who may not share any common encounters. Several algorithms utilizing predictions from global neighbors have been proposed in the literature. An approach to capturing transitive similarity between users, discovery of hidden similarity (DHS), is proposed in Lee, Yang, and Park (2004). The similarities are captured not only through movie ratings but also through user similarities. A new technique of computing user similarities using a Markov-chain model of a random walk is introduced in Fouss, Pirotte, Renders, and Saerens (2007). Utilizing a bipartite graph with users and items as nodes, where users are connected to the items they have experienced, the quantities "average commute time" and "average first passage time" are used as similarity measures between two users. These quantities have the nice property of increasing when the number of paths connecting two nodes increases and when the lengths of the paths decrease. Other item-oriented approaches based on random walks (Gori & Pucci, 2007; Yıldırım & Krishnamoorthy, 2008) infer correlations between items using item graphs to recommend items to the active user. A new collaborative filtering approach (Desrosiers & Karypis, 2008) computes global similarities between pairs of items and users, based on the solution to a system of linear equations relating user similarities to item similarities. The new approach helps make accurate predictions in the presence of sparsity and also takes into account content-based similarities between users.

2.2.3. Framework combining local and global similarities

Luo et al. (2008) utilized the maximin distance between users in a user graph as the global similarity between two users. Two users are connected by an edge if they have positive local similarity, where the local similarity is the SVSS similarity discussed in Section 2.2.1.
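The maximin distance used as the global similarity is a widest-path computation over the user graph: the best path between two users is the one whose weakest edge is strongest. A minimal sketch using a Floyd-Warshall style relaxation follows; the nested-dict graph format is an assumption for illustration:

```python
def maximin_similarity(local_sim):
    """Global similarity via maximin (widest-path) distance over the user
    graph whose edge weights are positive local similarities.
    local_sim: dict mapping each user to a dict of {neighbour: similarity}."""
    users = list(local_sim)
    g = {u: dict(local_sim[u]) for u in users}
    # Floyd-Warshall style relaxation: try every intermediate user k and
    # keep the path whose minimum edge weight is largest.
    for k in users:
        for i in users:
            for j in users:
                if i == j:
                    continue
                via_k = min(g[i].get(k, 0.0), g[k].get(j, 0.0))
                if via_k > g[i].get(j, 0.0):
                    g[i][j] = via_k
    return g
```

By construction the result never falls below the direct local similarity, consistent with the property sim_G(x, y) ≥ sim_L(x, y) noted in Section 3.1.3.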
A novel technique of combining the predictions from locally and globally similar neighbors (Luo et al., 2008) allowed the combined framework to attain quality predictions in both sparse as well as dense data scenarios. The final prediction was a linear combination of predictions from both sets of neighbors and was defined as follows:

predR = (1 - α) \cdot predR_L + α \cdot predR_G    (7)

where predR_L is the predicted rating obtained through the local neighbors and predR_G is the predicted rating obtained through the global neighbors. α is the weight given to the prediction from the global neighborhood set. Luo et al. (2008) empirically established the dependence of α on the sparsity of the rating data. It was shown that with the increase in the sparsity of the ratings matrix, an increase in α led to more accurate predictions. These schemes of combining local and global similarities, referred to as LS & GS in Luo et al. (2008), were fixed-α schemes, since the weight given to predictions from local and global neighbors was fixed and needed to be manually set depending on the data sparsity. The superiority of LS & GS over several prediction algorithms such as effective missing data prediction (EMDP), the similarity fusion algorithm (SF), user-based filtering using Pearson correlation (UPCC) and item-based filtering using Pearson correlation (IPCC) was established experimentally by setting α to 0.5. This scheme will henceforth be referred to as the fixed-α scheme 1. An alternative basis for comparison would be a fixed-α,
with α set to a value which results in the least MAE. This scheme of estimating α will hereafter be referred to as the fixed-α scheme 2.

3. Novel schemes for estimating parameter α

A framework combining predictions from local and global neighbors was discussed in Section 2. The parameter α serves to adjust the weight that we give to global neighbors with regard to the weight that we give to the local neighbors. When the ratings matrix is dense, the local neighborhood set is generally rich enough to enable prediction for the active user, in which case the predictions from the local neighborhood should be weighed more. However, when the ratings matrix is sparse, the meager local neighborhood set generated may lead to low quality recommendations, and it therefore needs to be enriched by globally similar neighbors for better predictions. In this section we propose several sparsity measures which enable the global neighbors to be weighed according to the active user and the item whose vote needs to be predicted. To the best of our knowledge there has been no previous attempt at quantifying the various facets of sparseness in data and using them in the prediction process. The "sparsity problem" in collaborative filtering refers to the inability to find a sufficient quantity of good quality neighbors to aid in the prediction process, due to insufficient overlap of ratings between the active user and his neighbors. This can happen when the ratings matrix is sparse, or the number of participating users is not large. Even when the data is dense enough to allow quality predictions for most users, some users may not have rated enough items, or may have rated items not rated by most people, with the result that such users get poor quality predictions. Users whose local neighborhood set is sparse can thus be aided by using predictions from the global neighborhood set. Thus, intuitively, the value of α should depend on the user and the item whose rating is to be predicted.
The proposed work introduces several estimates for α, which capture the various aspects of sparseness in the data.

3.1. Individual sparsity measures

In this section we introduce several sparsity measures, namely, OS (overall sparsity measure), USS (user specific sparsity measure), LGR (local global ratio), UIS1 (user-item based sparsity measure 1) and UIS2 (user-item based sparsity measure 2).

3.1.1. Overall sparsity measure (OS)

The overall sparsity measure captures the level of sparsity in the entire rating matrix. This sparsity measure is universal, i.e. the α computed is fixed for all users. The overall sparsity measure is defined as:

OS = 1 - \frac{nR}{nUsers \cdot nItems}    (8)

where nR is the total number of ratings which exist, nUsers is the total number of users in the system and nItems is the total number of items.

3.1.2. User specific sparsity measure (USS)

The user specific sparsity measure is based on the intuition that users who have rated very few items are less likely to get reliable local neighbors and thus need to depend more on the global neighborhood set. The USS is user-dependent, but remains the same across all items whose rating needs to be predicted. The USS for user u is defined as:

USS(u) = 1 - \frac{n_u}{\max_{u' \in U}(n_{u'})}    (9)

where n_u is the number of items rated by user u.

3.1.3. User and item specific sparsity measures

Sometimes the superiority of global predictions over local ones also depends on the type of items rated by the user. For example, the local neighbors for a user with eclectic tastes, who has rated several items which have not been experienced by a large majority of users, may be scarce. When the local neighborhood is meager, global neighbors can contribute to an improvement in the accuracy of predictions. This means that the sparsity measure should take into account not only the active user, but also the item for which the rating needs to be predicted.
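The two measures defined so far (Eqs. (8) and (9)) reduce to one-liners; a sketch with illustrative argument names:

```python
def overall_sparsity(n_ratings, n_users, n_items):
    """OS (Eq. (8)): the fraction of the ratings matrix left empty."""
    return 1 - n_ratings / (n_users * n_items)

def user_specific_sparsity(n_rated_by_user, max_rated_by_any_user):
    """USS (Eq. (9)): users who rated few items, relative to the most
    active user, lean more on the global neighbourhood."""
    return 1 - n_rated_by_user / max_rated_by_any_user
```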
Three sparsity measures, which capture sparsity at the user-item level, are introduced below.

Local global ratio (LGR). One simple measure is to find the ratio of the number of local neighbors to the number of global neighbors who have rated the item. It is to be noted that the number of global neighbors always exceeds or is equal to the number of local neighbors. This is due to the fact that sim_G(x, y) ≥ sim_L(x, y). The LGR for a user u and item i is defined as:

LGR(u, i) = 1 - \frac{|L_{u,i}|}{|G_{u,i}|}    (10)

where L_{u,i} is the set of local neighbors of user u who have rated item i and G_{u,i} is the set of global neighbors of user u who have rated item i.

User-item specific sparsity measure 1 (UIS1). The UIS1 measure bases the sparsity measurement on the ratio of the number of local neighbors who have rated a particular item to the total number of people who have rated the item, i.e. the UIS1 for a user u and item i is defined as:

UIS1(u, i) = 1 - \frac{|L_{u,i}|}{|N_i|}    (11)

where N_i is the set of users who have rated item i.

User-item specific sparsity measure 2 (UIS2). The UIS2 measure computation is founded on the ratio of the number of local neighbors who have rated a particular item to the total number of users in the local neighborhood set, i.e. the UIS2 for a user u and item i is defined as:

UIS2(u, i) = 1 - \frac{|L_{u,i}|}{|L_u|}    (12)

where L_u is the set of local neighbors for user u.

3.2. Unified measure of sparsity (UMS) - GA approach

The various measures of sparsity, described in Section 3, encapsulate the diverse factors that come into play when deciding the extent to which the local and global predictions should influence the final prediction. The performance of each sparsity measure is dataset dependent and hence the quality offered varies across several datasets. The sparsity measure which best reflects the balance between local and global predictions may differ from user to user and may also evolve over time.
The idea of unifying the various sparsity measures to derive a single weight which works best for the active user is propitious. The sparsity measures can be coalesced into a unified measure of sparsity (UMS) by considering a weighted average of all the sparsity measures, where a sparsity measure more representative of the user's preference for local/global predictions receives a higher weight.
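A sketch of the user-item level measures (Eqs. (10)-(12)) together with the weighted coalescing just described; the set-valued arguments and the assumption that the weights are already normalized are illustrative:

```python
def lgr(local_raters, global_raters):
    """LGR (Eq. (10)): 1 - |L_ui| / |G_ui|."""
    return 1 - len(local_raters) / len(global_raters)

def uis1(local_raters, raters_of_item):
    """UIS1 (Eq. (11)): 1 - |L_ui| / |N_i|."""
    return 1 - len(local_raters) / len(raters_of_item)

def uis2(local_raters, local_neighbourhood):
    """UIS2 (Eq. (12)): 1 - |L_ui| / |L_u|."""
    return 1 - len(local_raters) / len(local_neighbourhood)

def unified_sparsity(weights, measures):
    """Weighted average of the five sparsity measures (OS, USS, LGR,
    UIS1, UIS2); the weights are assumed to sum to one."""
    return sum(w * m for w, m in zip(weights, measures))
```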
UMS = w_1 \cdot OS + w_2 \cdot USS + w_3 \cdot LGR + w_4 \cdot UIS1 + w_5 \cdot UIS2    (13)

where \sum_{i=1}^{5} w_i = 1. Genetic algorithms can be effectively employed to learn the weights of each of the five proposed α-estimates for every user, in an offline learning process. Such a set of weights can then be used to compute the value of α during prediction for the active user. The next subsection discusses the genetic operators and the fitness function utilized to generate the set of weights.

3.2.1. Automatic weighting of sparsity measures using a real-valued genetic algorithm

Genetic algorithms base their operation on the Darwinian principle of "survival of the fittest" and utilize artificial evolution to obtain enhanced solutions with each iteration. The GA process starts with a population of candidate solutions known as chromosomes or genotypes. Each chromosome in the population has an associated fitness, and these scores are used in a competition to determine which chromosomes are used to form new ones (De Jong, 1988). As the population evolves from one generation of chromosomes to a new population, by using a kind of natural selection together with the genetics-inspired operations of crossover, mutation and inversion (Mitchell, 1998), the solutions tend to approach the global optimum regardless of the topography of the fitness landscape. The traditional chromosome representation technique of using bit strings suffers from slow convergence due to the large chromosome structure, especially in optimization problems involving several parameters. Practical experience favors utilization of the most natural representation of solutions rather than enforcing a binary vector representation for many problems (Bäck, Fogel, & Michalewicz, 1997). Since the weights for each of the sparsity measures are real numbers, a floating point representation of chromosomes is most suited for our purpose. Thus each chromosome consists of a vector of five real numbers representing the weights corresponding to the five different sparsity measures.
The population then evolves over several generations to arrive at the weight vector which is the fittest.

3.2.2. Genetic operators

Genetic operators drive the search process in GA. Crossover and mutation are the two basic genetic operators. While crossover allows the creation of two new individuals by letting two parent chromosomes exchange meaningful information, mutation is used to maintain the genetic diversity of the population by introducing a completely new member into the population. Crossover and mutation operators for real-valued GA were introduced by Michalewicz (1992). Let a chromosome be represented by n real numbers (genes), with the ith gene lying in the range [a_i, b_i]. Given two parent chromosomes X = <x_1, x_2, ..., x_n> and Y = <y_1, y_2, ..., y_n>, some of the different types of crossover and mutation for real-valued GA are discussed below (Houck, Joines, & Kay, 1995; Michalewicz, 1992).

Uniform mutation. A gene i is randomly selected and its value is modified to a uniform random number between a_i and b_i:

x'_j = U(a_i, b_i), if j = i; x_j, otherwise    (14)

Arithmetic crossover. Arithmetic crossover generates two new complementary offspring as linear combinations of the parent chromosomes:

X' = sX + (1 - s)Y
Y' = (1 - s)X + sY    (15)

where s = U(0, 1) and X' and Y' are the generated offspring. Uniform mutation and arithmetic crossover are used as the basic mutation and crossover operators, respectively, in our experiments.

3.2.3. Fitness function

The fitness function is the factor which drives the GA process towards convergence to the optimal solution. Chromosomes which are more optimal are allowed to breed and mix their genetic material by any of several techniques, producing a new generation that will be even better. Choosing a fitness function for finding the set of weights for the sparsity measures requires that each weight vector be evaluated via the error it produces while predicting each rating in the user's training set.
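The two operators of Section 3.2.2 (Eqs. (14) and (15)) can be sketched as follows; the list-of-floats chromosome encoding mirrors the representation above, while the argument names are illustrative:

```python
import random

def uniform_mutation(chromosome, bounds, rng=random):
    """Uniform mutation (Eq. (14)): pick one gene at random and reset it
    to a uniform random value within that gene's bounds."""
    i = rng.randrange(len(chromosome))
    child = list(chromosome)
    lo, hi = bounds[i]
    child[i] = rng.uniform(lo, hi)
    return child

def arithmetic_crossover(x, y, rng=random):
    """Arithmetic crossover (Eq. (15)): two complementary offspring as
    linear combinations of the parents, with s drawn from U(0, 1)."""
    s = rng.random()
    off1 = [s * a + (1 - s) * b for a, b in zip(x, y)]
    off2 = [(1 - s) * a + s * b for a, b in zip(x, y)]
    return off1, off2
```

Since the offspring are convex combinations, parents whose weights sum to one produce offspring that also sum to one; after mutation, the weight vector would presumably need re-normalization (an assumption about the overall pipeline).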
An individual whose error is lesser is a fitter individual. The weights are learnt for the active user by trying to predict each rating in the user's training set, using the particular weight vector to combine the five sparsity measures, and determining how well the unified sparsity measure balances local and global predictions. The fitness function is the average error among all such predictions for the active user:

fitness = \frac{1}{|T|} \sum_{i \in T} |r_{a,i} - pr_{a,i}|    (16)

where T is the set of all ratings in the training set of the user, r_{a,i} is the actual rating for item i by the active user, and pr_{a,i} is the predicted score computed using the combined framework. The weights to be applied to each of the measures of sparsity (w_1, w_2, ..., w_5) to arrive at the unified sparsity measure (Eq. (13)) can be computed by employing the real-valued GA as described in this section.

4. Proposed recommender system framework

The proposed estimates of α can be utilized to integrate predictions from locally and globally similar users and arrive at the final predicted vote for the active user. The main steps of the proposed recommender system framework are given below:

Step 1: Compute local user similarities.
Compute the SVSS similarity between all pairs of users, using formula (6). Let sim_L(x, y) refer to the local similarity between users x and y as computed in this step.

Step 2: Compute the global user similarities.
Compute the global similarities between all pairs of users, by first constructing the user graph based on local similarities, and then finding the maximin distance between users, as described in Section 2. Let sim_G(x, y) refer to the global similarity between users x and y as computed in this step.

Step 3: Obtain predicted ratings using local and global neighbors.
The predicted rating for an item i for active user u is based on Resnick's prediction formula (Resnick et al., 1994).
pr_{i,k} = \bar{r}_i + \frac{\sum_{j \in N(i)} sim_{i,j} \cdot (r_{j,k} - \bar{r}_j)}{\sum_{j \in N(i)} |sim_{i,j}|}    (17)

where \bar{r}_i is the mean rating for user i, sim_{i,j} is the similarity between users i and j, and N(i) is the neighborhood of user i. The above formula can be used to arrive at predictions from the local neighborhood by setting sim_{i,j} to sim_L(i, j) and N(i) to the local neighborhood for user i. A similar method can be adopted to arrive at predictions from the global neighborhood. Let pr^L_{i,k} and pr^G_{i,k} be the predicted ratings using local and global similarities, respectively.

Step 4: Merge the local and global ratings.
The predicted ratings from local and global neighborhoods are combined using the following formula:
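A sketch of Steps 3 and 4: Resnick's formula (Eq. (17)) over a hypothetical neighbour structure, with the final combination assumed to mirror Eq. (7), taking α from one of the proposed sparsity estimates:

```python
def resnick_prediction(user_mean, neighbours):
    """Resnick's formula (Eq. (17)); neighbours is an iterable of
    (similarity, neighbour_rating_for_item, neighbour_mean) triples."""
    den = sum(abs(sim) for sim, _, _ in neighbours)
    if den == 0:
        return user_mean  # no usable neighbours: fall back to the mean
    num = sum(sim * (r - mean) for sim, r, mean in neighbours)
    return user_mean + num / den

def merge_predictions(pred_local, pred_global, alpha):
    """Step 4, assumed to follow Eq. (7): weight the global prediction
    by alpha and the local one by (1 - alpha)."""
    return (1 - alpha) * pred_local + alpha * pred_global
```

Running Step 3 once with the local neighbourhood and once with the global one yields pr^L and pr^G, which Step 4 blends with the per-user, per-item α.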