Journal of Artificial Intelligence 2 (2): 40-55, 2009 ISSN1994-5450 e 2009 Asian Network for Scientific Information Recommender System based on Collaborative Behavior of Ants P Bedi, 'R Sharma and HKaur Department of Computer Science, New Academic Block, Adjoining Arts Faculty Building, University of Delhi, DeIhi-110007, India 'Department of Computer Science, Hans Raj College University of Delhi, Delhi-110007, India Abstract: This study uses collaborative filtering approach and proposes an Ant Recommender System(ARS) based on collaborative behavior of ants for generating Top-N recommendations. Present proposed system ARS works in two phases. In the first phase, opinions from users collected in the form of user-item rating matrix are clustered offline using ant based clustering algorithm into predetermined number of clusters and stored in the database for future recommendations. In the second phase, the recommendations are generated online for the active user. The pheromone updating strategy of ants is combined with similarity measure for choosing the clusters with good quality ratings. This helps in improving the quality of recommendations for the active user. The performance of ARS is evaluated using Jester dataset available on the website of University of California, Berkeley and compared with traditional collaborative filtering based recommender system Key words: Swarm intelligence, ant colony optimization, recommender system, collaborative filter somone updatin INTRODUCTION Recommender Systems have gained wide-spread acceptance and attracted increased public interest during the last decade, leveling the ground for new sales opportunities in E-Commerce (Herlocker et al, 1999; Adomavicius and Tuzhilin, 2005). They serve as crucial tools to overcome the information overload, sifting through very large sets of information and selecting information that is relevant for the active user. The term"active user"refers to the person for whom recommendations are made Recommender systems work by collecting data from users, using explicit and/or implicit methods and then comparing the collected data to the active user data to find a list of recommendations for the active user. The prevalent three approaches to building recommender systems are collaborative filtering, content based approach and hybrid methods. Collaborative Filtering(CF)collects opinions from users in the form of ratings on various items. The recommendations produced are based only on the opinions of users similar to the active user(neighbors ). Content based approach suggests items that are similar to the ones the active user has shown a preference for in the past rather than on the preferences of other users. Content based matching requires a representation of the items in terms of features (attributes). To improve performance, mostly CF is combined with one or more recommendation techniques i. e,, content based, knowledge-based, demographic or utility basedis called hybrid recommender systems orresponding Author: Dr Punan Bedi, Department of Computer Science, New Academic Block, Adjoining Arts Faculty Building, University of Delhi, Delhi-110007, Indi Fax;011-27662553
rti.htel,2(2):40-55,2009 Collaborative Filtering(CF)takes its roots from something humans have been doing for centuries viz., sharing opinions with others. It brings together the opinons of large interconnected communities on the web(Schafer et al., 2007). Many collaborative systems have been developed in the academia and the industry. Using stereotypes, the grundy system was developed to build individual user models and use them to recommend relevant books to each user(Rich, 1979). Later on, the tapest ystem relied on each user to identify like-minded users manually( Goldberg et al., 1992). Group Lens (Konstan et al., 1997; Resnick et al., 1994) in the domain of Usenet newsgroup articles, Bellcore's video recommender(hill ef al., 1995)in the domain of movies and Ringo (Shardanand and Maes, 1995) in the domain of music and musical artists were the first systems to use collaborative filtering algorithms to automate predictions. Other examples of collaborative recommender systems include the book recommendation system from Amazon. com, the PHOAKS system that helps people find relevant information on the www (Terveen et al., 1997)and the Jester system that recommends jokes Goldberg et al, 2001). Automated collaborative filtering approach exposing the reasoning and data behind the recommendation is explained by Herlocker et al.(2000). Different item-based recommendation algorithms are analyzed by Sarwar et al. (2001). Walker et al. (2004) reviews collaborative filtering approach to recommendations and also implemented Altered Vista. Huang and Zeng(2005)represent input data as a bipartite graph and develops its topological measures to capture patterns that exist in the input data relevant to recommendations. Schafer et al. (2007) discusses the core concepts of collaborative filtering, its primary uses for users of the adaptive web, the theory and practice of CF algorithms, design decisions regarding rating systems and acquisition of ratings Researchers have also applied Swarm Intelligence techniques to recommender systems. Ujjin and Bentley(2003)have described Particle Swarm Optimization(PSO) recommender system in which PSO algorithm has been employed to learn personal preferences of users and provide tailored suggestions They demonstrated that PSO system outperformed a non-adaptive approach based on the Pearson algorithm and obtained higher prediction accuracy than the Genetic Algorithm system and Pearson algorithm in most cases. A system called CASIS Lorenzi ef al., 2005)has been developed combining case based reasoning approach with a metaphor from colonies of social insects, namely the honey bee dance. This combination has been used in the retrieval step of the recommendation cycle. Web-based system user interface hybrid recommendation method based on ant colony metaphor is presented in (Sobecki, 2008). The applied hybrid recommendation method employs demographic, content bas and collaborative filtering approaches. In this study, ant colony metaphor is used for selecting the most optimal path in the user interface graph that specifies the user interface parameters for the Clustering algorithrms have been used by researchers to quickly locate a user's neighbors. Clusters of users similar to the active user are quickly discovered and nearest neighbors can be selected from the most similar clusters. A variety of clustering techniques exist in the literature, but k-means is ommonly used clustering technique. The k-means algorithm implementation for cluster analysis as data mining approach has been discussed in several research papers(Kuo et al, 2005; Vrahatis et al 2002). Saatchi and Hung(2005) described hybridization of the ACO with k-means algorithm to solve image clustering problems. Shelokar et al.(2004) presents an ACO methodology for optimally clustering objects and compares the performance of the algorithm with other popular heuristic methods viz. genetic algorithm, simulated annealing etc. They claim that, ant based clustering algorithm optimally clusters n objects into K clusters and their computational simulations reveal very encouraging results in terms of the quality of solution found, the average number of functio evaluations and the processing time required. The data clustering technique developed by Shelokar et ad (2004)is applied by Kekec et al. (2006). Hand and Meyer(2007) have categorized ant based clustering methods into two main groups viz, (1)methods that directly mimics the clustering ehavior observed in real ant colonies and (2)methods that are less directly inspired by nature i.e., the clustering task is reformulated as an optimization task and general purpose ant based optimization heuristics are utilized to find good or near-optimal clustering
rti.htel,2(2):40-55,2009 The success of the collaborative filtering technology depends on the process of locating people with similar profiles (neighbors). We believe that an approach for generating recommendations based a promising research area. In the real ant systems, ants tend to lay a substance called pheromone hile walking from their nests to food source and vice versa. Ants are attracted by pheromones coming from fellow type ants and repulsed by pheromone of non-fellow type ants. Paths that are marked by stronger amount of pheromone are chosen with higher probability than those that have weaker amount of pheromone deposit. This collaborative behavior between identical type ants has an analogy with the collaborative world as people mostly collect opinions from their like-minded friends, neighbors etc One of the main limitations of CF based recommender systems is that it provides recommendations based solely on the opinion of the users whose preferences best matches the taste of active user. A case may arise when an item has not been rated well in the cluster whose similarity matches best with active user but is rated well in other clusters which have similarity closer to his taste. In such a scenario, combinning pheromone information with similarity measure for choosing clusters provides active user with good set of altemative recommendations. In addition to his taste, clusters that are marked by stronger amount of pheromone have the higher probability of being chosen than those that have weaker amount of pheromone deposit. Also, one of the fundamental challenges for recommender systems is to improve the quality of the recommendations. In such a scenario, pheromone updating step guides the search to a better recommendation. The positive feedback in the form of pheromone deposition results in achieving an emergent, unified behavior for the recommender system as a whole and produces a robust system capable of finding improved quality recommendations. In this study, we have tried to improve the quality of recommendation based on the pheromone density and pheromone updating strategy PROPOSED ANT RECOMMENDER SYSTEM Ant Recommender System(ARS)works in two phases. Phase I is the preprocessing phase, which is offline. In this phase, the background data in the form of user-item rating matrix is collected and clustered using an ant based clustering algorithm into predetermined number of clusters. Once the clusters are obtained, the cluster data along with their centroids are stored in the database for future recommendations. Phase II is online in which the recommendation process takes place for the active user. Here, the pheromone deposition/evaporation technique known from ant algorithms is combined with similarity measure for choosing the best clusters for making the recommendations. Rating quality of each item unrated by active user is computed in the chosen clusters. To generate recommendations, clusters are further selected from the chosen clusters based on rating quality of item Recommendations are then made by computing the weighted average of the ratings of items in the elected clusters. Pheromone is updated based on recommendations made to active user and past recommendations. This helps in improving the quality of recommendations for future users. Figure 1 shows the steps where ant colony metaphor has been applied to the traditional collaborative filtering based recommendation process The working of ARS is described below in detail with the help of Jester data set example Phase 1: Preprocessing Phase Step 1: Normalization of Background Data i.e., Rating Matrix User-item ratings taken from jester dataset rated in the scale of-10 to +10 is normalized in the scale 0 to l, where 0 indicates that the item is not rated by the corresponding user. To ease the discussion, running example shown in Table I is used, where U -U,o are users and J, -Jo are items Jokes)rated/ unrated by the user
rti.htel,2(2):40-55,2009 colony Rating neighbors recommendation for the active user Fig. 1: Ant based collaborative filtering proce Table 1: Running example of rating matrix from Jester dataset after normalization in the range 0 to 1 0.150.94 0.000.92 091038 0.93 0.94 0.840.67 0.13 0.2 0.35 0.14 0.44 indicates not rated Step 2: Clustering the Rating Matrix Using ant Based Clustering Algorithm The rating matrix containing the user-item data is clustered into K clusters using ant based clustering technique( Shelokar et al., 2004). Each agent(ant) discovers a possible partition ofitems a given dataset using some metric like Euclidean distance. Clustering information of items (jokes associated with an ant is accumulated in the global information hub (pheromone trail matrix) and is used by other agents to construct possible clustering solution string S and iteratively improve them. Each element of s contains the cluster number which corresponds to one of the items. For example, solution string S for N (items)=10 and K( clusters)=3 is as follor The first element in S indicates that first item (i.e., N=1)is assigned cluster number 2, second m is assigned cluster number l and so on. by agents using modified pheromone trail information available from previous iteration, (2)performing local search operation on the newly generated solutions and (3) updating pheromone trail matrix. The algorithm works for a fixed number of iterations and the solution having the lowest fitness function value(F)represents an optimal or near-optimal partitioning of items into subsets in a given dataset Clustering the rating matrix helps locate simmilar user profiles. Some intermediate results during clustering process for dataset shown in Table 1, Table 2 depicts pheromone trail matrix generated during run of ant based clustering algorithm. Table 3 depicts solution string S generated by R=10 ants along with fitness function F in sorted order of F during run of ant based algorithm. Table 4
rti.htel,2(2):40-55,2009 Table 2: Pheromone trail matrix generated during run of ant based clustering algorithm 0.009 0.0099 0.2735 0.2735 0.3997 0.1361 0.0099 0.273 0.3997 00099 0.0099 Table 3: Solution String S generated by R=10 ants along with Fitness function F in sorted order of F during run of ant based clustering algorithm String t 11211111 llli 95892 String U1 Table 5: Users in each cluster and the centroid U3, U4, U5, U7. U1o UU, Uo shows the best solution S, obtained after a fixed number of iterations which indicates that U3, U4, US, U, and Uo belong to cluster 1, U2 and Us belong to cluster 2, UI, U6 and U, belong to cluster 3 Step 3: Computing Centroid of Each Cluster Once the clusters are created, centroid of each cluster can be computed using Euclidean distance measure. Here, preferences of each user are compared with every other user in the cluster and the user with maximum similarity with other users becomes the centroid of that cluster. Table 5 shows the users in each cluster along with the centroid. step Storing the clusters and their centroid in database for future recommendations are made in Step 4: Initializing Pheromone for Each Cluster Initially, at time t= 1, the pheromone attached to each cluster depends on the density of the uster i.e., relative mmber of users in the cluster. If the relative number of users is more, pheromone attached is more and vice versa. The equation for initializing pheromone for cluster i is given below No of uscrs pher (t) (1) total no