Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Genetic Algorithms for Feature Weighting in multi-criteria Recommender Systems Chein-Shung hwang Dept of Information Managemen Chinese culture university aipei, Taiwan hwang@faculty. pccu. edu dor: 10.4156/jcit vol5. issue813 Abstract Recommender systems have been emerging as a powerful technique of e-commerce. The majority of existing recommender systems uses an overall rating value on items for evaluating user's preference opinions. Because users might express their opinions based on some specific features of the item, recommender systems solely based on a single criterion could produce recommendations that do not meet user needs. In this paper, we propose a mechanism for integrating multiple criteria into th Collaborative Filtering (CF) algorithm. Specifically, we present the implementation of Genetic Algorithms (GA) for optimal feature weighting. The proposed system consists of two main parts. First, the weight of each user toward each feature is computed by using GAs. The feature weights are then incorporated into the collaborative filtering process to provide recommendations. Empirical studies have shown that our weighting scheme can be incorporated to improve the performance of multi criteria cF Keywords: Genetic Algorithms, Collaborative Filtering, Multiple Criteria, Recommender Systems 1 Introduction Although the rapid development and popularity of the Internet have brought convenience for people to access a variety of information, products, and services, the internet has also caused a scenario where people have difficulty obtaining relevant information. This scenario is commonly referred to as the problem of"information overload". Recommender systems [1, 2] are emergent to solve the information overload challenges. Recommender systems are personalized information filtering technologies used to suggest items to users that they might like or find interesting. In recent years, recommender systems ave been successfully applied in a broad range of applications, including recommending movies books. news articles. music. etc. Recommender systems can be based on content-based filtering, collaborative filtering, and hybrid Itering [3, 4]. Content-based recommendation suggests items to users that are similar to those items that they were interested in previously. Collaborative filtering( CF)recommends items based on the information about similar items or users. Hybrid recommendation combines both algorithms into one hybrid approach to gain better performance and avoid drawbacks from each individual one p The majority of existing recommender systems uses an overall rating value on items for evaluating s'preferences. The overall rating depends on one single criterion that usually represents the overall preference of users to a particular item. As users might express their opinions based on some specific features of the item, recommender systems merely based on a single criterion could produce recommendations that do not meet user needs. For example, in a movie recommender system, two sers A and B both assign a single-criterion rating of 12(out of 13)for Avatar. The recommender systems will conclude they have the same tastes even if A likes its story and B likes its visuals. This is called a"without distinction of interest problem. Furthermore, even if both users like the same movi features(e.g. actors, visuals, etc. ) they might select different movies. This is because people usually select a movie based on different movie features This situation is referred to as an"unsuitable weight feature" problem [5] Many recommender systems [6-8 have been developed to tackle the above-mentioned problems by using multiple criteria. Roux et al. [6] described a course recommender system
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang Dept. of Information Management Chinese Culture University Taipei, Taiwan cshwang@faculty.pccu.edu.tw doi: 10.4156/jcit.vol5.issue8.13 Abstract Recommender systems have been emerging as a powerful technique of e-commerce. The majority of existing recommender systems uses an overall rating value on items for evaluating user’s preference opinions. Because users might express their opinions based on some specific features of the item, recommender systems solely based on a single criterion could produce recommendations that do not meet user needs. In this paper, we propose a mechanism for integrating multiple criteria into the Collaborative Filtering (CF) algorithm. Specifically, we present the implementation of Genetic Algorithms (GA) for optimal feature weighting. The proposed system consists of two main parts. First, the weight of each user toward each feature is computed by using GAs. The feature weights are then incorporated into the collaborative filtering process to provide recommendations. Empirical studies have shown that our weighting scheme can be incorporated to improve the performance of multicriteria CF. Keywords: Genetic Algorithms, Collaborative Filtering, Multiple Criteria, Recommender Systems 1. Introduction Although the rapid development and popularity of the Internet have brought convenience for people to access a variety of information, products, and services, the internet has also caused a scenario where people have difficulty obtaining relevant information. This scenario is commonly referred to as the problem of “information overload”. Recommender systems [1, 2] are emergent to solve the information overload challenges. Recommender systems are personalized information filtering technologies used to suggest items to users that they might like or find interesting. In recent years, recommender systems have been successfully applied in a broad range of applications, including recommending movies, books, news articles, music, etc. Recommender systems can be based on content-based filtering, collaborative filtering, and hybrid filtering [3, 4]. Content-based recommendation suggests items to users that are similar to those items that they were interested in previously. Collaborative filtering (CF) recommends items based on the information about similar items or users. Hybrid recommendation combines both algorithms into one hybrid approach to gain better performance and avoid drawbacks from each individual one. The majority of existing recommender systems uses an overall rating value on items for evaluating users’ preferences. The overall rating depends on one single criterion that usually represents the overall preference of users to a particular item. As users might express their opinions based on some specific features of the item, recommender systems merely based on a single criterion could produce recommendations that do not meet user needs. For example, in a movie recommender system, two users A and B both assign a single-criterion rating of 12 (out of 13) for Avatar. The recommender systems will conclude they have the same tastes even if A likes its story and B likes its visuals. This is called a “without distinction of interest” problem. Furthermore, even if both users like the same movie features (e.g. actors, visuals, etc.), they might select different movies. This is because people usually select a movie based on different movie features. This situation is referred to as an “unsuitable weight feature” problem [5]. Many recommender systems [6-8] have been developed to tackle the above-mentioned problems by using multiple criteria. Roux et al. [6] described a course recommender system 126
Journal of Convergence Information Technology Volume 5. Number 8 October 2010 based on the multiple criteria decision-making model and the CF method. In this system, for each decision maker, a user-specified set of feature weights was required in order to aggregate measures from each feature. Teng et al. [7 viewed the multi-criteria recommendation problem s a data query problem and utilized the skyline query to eliminate conflicting criteria. Since the recommendation problem was no longer viewed as an optimization problem, the feature weights were not taken into account. UTA-Rec [8] was a utility-based recommender system that worked by employing the preference disaggregation principle. This system implemented the UTA*[91 algorithm to model a user's preference in terms of a set of additive utility functions based on multiple criteria. The criteria weights could be represented by the trade-off among the criteria utility values. However, performance might have been affected when there were only a very small number of ratings available for the active user Genetic algorithms (GAs) are adaptive algorithms based on the Darwinian principle of natural selection and have been successfully applied to a wide range of optimization problems including design, scheduling, routing, control, etc. In this paper, we propose a multi-criteria recommender ystem, which uses GAs for feature weighting. The proposed system consists of three modules. In the MCP (Multi-Criteria Prediction Module), the user-based CF algorithm is used to estimate the ratings for each individual criterion In the AG(Aggregation Module, the ga is used to find an optimal set of feature weights for each active user. Finally, in the rC(Recommendation Module), a list of recommendations is derived and presented. This optimal weighting scheme enables the recommender system to make more accurate predictions of users' likes and dislikes, and hence will provide better recommendations to users 2. Genetic algorithms GAs [1oI[ll are stochastic search techniques that guide a population of solutions using the principles of evolution and natural genetics. Extensive research has been performed exploiting the robust properties of genetic algorithms and demonstrating their capabilities across a broad spectrum of optimization problems, including feature selection and weighting tasks. GAs are modeled loosely on the principles of the evolution via natural selection, employing a population of individuals that undergo selection in the presence of variation-inducing operators, such as mutation and crossover. A fitness function is used to evaluate individuals, and reproductive success varies with fitness A general algorithm is started with a set of solutions(represented by chromosomes) called population. An initial population is created from a random selection of solutions. At every evolutionary step(generation), the solutions in the current population are evaluated according to some predefined quality criterion, referred to as the fitness or fitness function. Solutions from one population are used to form a new population(next generation). Solutions(parents) are selected according to their fitness-the more suitable they are the more chances they have to reproduce. These solutions then"reproduce"to create one or more new solution (offspring), after which the offspring are randomly produced by crossover or mutation. The process of fitness-dependent selection and application of genetic operators to generate successive generations of solutions is repeated many times until a satisfactory solution found. The basic steps of GAs are outlined as follows 1. [Initialization Randomly generate an initial population of solutions and evaluate the fitness function mplete. On] Create a new population by repeating the following steps until the new population 2. 1 [ Selection Select two parent solutions from a population according to their fitness(the better itness, the greater the chance to be selected) 2.2 [ Crossover] With a crossover probability cross over the parents to form a new offspring. If no crossover is performed, the offspring is an exact copy of the parents 3 Mutation With a mutation probability, mutate new offspring at each locus(position in the chromosome) 2.4 [Acceptance] Place new offspring in a new populatic Evaluation] Compute the fitness values for the new population of N solutions 4. [Test] If the stopping criterion is met, stop, and return the best solution in current popula
Journal of Convergence Information Technology Volume 5, Number 8, October 2010 based on the multiple criteria decision-making model and the CF method. In this system, for each decision maker, a user-specified set of feature weights was required in order to aggregate measures from each feature. Teng et al. [7] viewed the multi-criteria recommendation problem as a data query problem and utilized the skyline query to eliminate conflicting criteria. Since the recommendation problem was no longer viewed as an optimization problem, the feature weights were not taken into account. UTA-Rec [8] was a utility-based recommender system that worked by employing the preference disaggregation principle. This system implemented the UTA* [9] algorithm to model a user’s preference in terms of a set of additive utility functions based on multiple criteria. The criteria weights could be represented by the trade-off among the criteria utility values. However, performance might have been affected when there were only a very small number of ratings available for the active user. Genetic algorithms (GAs) are adaptive algorithms based on the Darwinian principle of natural selection and have been successfully applied to a wide range of optimization problems including design, scheduling, routing, control, etc. In this paper, we propose a multi-criteria recommender system, which uses GAs for feature weighting. The proposed system consists of three modules. In the MCP (Multi-Criteria Prediction Module), the user-based CF algorithm is used to estimate the ratings for each individual criterion. In the AG (Aggregation Module), the GA is used to find an optimal set of feature weights for each active user. Finally, in the RC (Recommendation Module), a list of recommendations is derived and presented. This optimal weighting scheme enables the recommender system to make more accurate predictions of users' likes and dislikes, and hence will provide better recommendations to users. 2. Genetic algorithms GAs [10][11] are stochastic search techniques that guide a population of solutions using the principles of evolution and natural genetics. Extensive research has been performed exploiting the robust properties of genetic algorithms and demonstrating their capabilities across a broad spectrum of optimization problems, including feature selection and weighting tasks. GAs are modeled loosely on the principles of the evolution via natural selection, employing a population of individuals that undergo selection in the presence of variation-inducing operators, such as mutation and crossover. A fitness function is used to evaluate individuals, and reproductive success varies with fitness. A general algorithm is started with a set of solutions (represented by chromosomes) called population. An initial population is created from a random selection of solutions. At every evolutionary step (generation), the solutions in the current population are evaluated according to some predefined quality criterion, referred to as the fitness or fitness function. Solutions from one population are used to form a new population (next generation). Solutions (parents) are selected according to their fitness - the more suitable they are the more chances they have to reproduce. These solutions then "reproduce" to create one or more new solution (offspring), after which the offspring are randomly produced by crossover or mutation. The process of fitness-dependent selection and application of genetic operators to generate successive generations of solutions is repeated many times until a satisfactory solution is found. The basic steps of GAs are outlined as follows: 1. [Initialization] Randomly generate an initial population of solutions and evaluate the fitness function. 2. [New population] Create a new population by repeating the following steps until the new population is complete. 2.1 [Selection] Select two parent solutions from a population according to their fitness (the better fitness, the greater the chance to be selected). 2.2 [Crossover] With a crossover probability cross over the parents to form a new offspring. If no crossover is performed, the offspring is an exact copy of the parents. 2.3 [Mutation] With a mutation probability, mutate new offspring at each locus (position in the chromosome). 2.4 [Acceptance] Place new offspring in a new population. 3. [Evaluation] Compute the fitness values for the new population of N solutions. 4. [Test] If the stopping criterion is met, stop, and return the best solution in current population. 127
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung hwang 5. Loop] Go to step 2 3. Traditional collaborative filtering CF recommends items based on the historical ratings data of similar users. There are two major approaches for CF: user-based (also called memory-based)and model-based. User-based CF identifies users whose interests are similar to an active user and recommends items they like. Model-based CF infers a compact model and then uses it for recommendations. Here we describe the user-based CF In a typical CF scenario, there is a list of n users U=( uI, u2,..., u), a list of m items trix of ratings R=Ir], where r, represents the rating of user u, for item i, CF systems usually consist of two phases: neighborhood formation and prediction computation. In neighborhood formation, the similarity between the active user ua and other users u, is computed. There are two common methods to compute the similarity between users: Pearson correlation and cosine similarity. Pearson correlation measures the extent to which two variables linearly relate with each other. The Pearson correlation between users u and u. is defined as ∑(a-F)(-7) ∑(-元)2∑(-元) where /(a, r)represents the items that both user u, and u, have rated and ra and Fr represent the average rating of users u, and u,, respectively. The cosine similarity views the rating of each user as a multidimensional vector and computes the cosine value between the two vectors i∈(a,r) Once the similarities between active user u, and other users u, are computed, a subset of the nearest neighbors of u, are chosen based on the similarity measure, and the ratings of neighbors are ggregated to generate predictions for the active user. We consider two popular ways to average the ng. The first is the weighted sum approach Pa.j and the second is the adjusted weighted sum approach r∈N(a) I sima I where Pa. represents the prediction for the active user u, for item i, and M(a) represents the set of
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang 5. [Loop] Go to step 2. 3. Traditional collaborative filtering CF recommends items based on the historical ratings data of similar users. There are two major approaches for CF: user-based (also called memory-based) and model-based. User-based CF identifies users whose interests are similar to an active user and recommends items they like. Model-based CF infers a compact model and then uses it for recommendations. Here we describe the user-based CF approach [12]. In a typical CF scenario, there is a list of n users U={ uuu n ,,, 21 }, a list of m items I={ m ,,, iii 21 }, and a matrix of ratings R=[ ij r ] , where ij r represents the rating of user ui for item j i . CF systems usually consist of two phases: neighborhood formation and prediction computation. In neighborhood formation, the similarity between the active user a u and other users r u is computed. There are two common methods to compute the similarity between users: Pearson correlation and cosine similarity. Pearson correlation measures the extent to which two variables linearly relate with each other. The Pearson correlation between users ua and ur is defined as ∑∑ ∑ ∈ ∈ ∈ − − −− = ),( , ),( , ),( , , , )()( ( )( ) raIi rir raIi aia raIi riraia ra rrrr rrrr sim 2 2 (1) where raI ),( represents the items that both user ua and ur have rated and ar and rr represent the average rating of users ua and ur , respectively. The cosine similarity views the rating of each user as a multidimensional vector and computes the cosine value between the two vectors. ∑∑ ∑ ∈ ∈ ∈ × = ),( , ),( , ),( ,, , raIi ir raIi ia raIi iria ra rr rr sim 2 2 (2) Once the similarities between active user ua and other users ur are computed, a subset of the nearest neighbors of ua are chosen based on the similarity measure, and the ratings of neighbors are aggregated to generate predictions for the active user. We consider two popular ways to average the rating. The first is the weighted sum approach: ∑ ∑ ∈ ∈ × = )( , )( ,, , || aNr ra aNr jrra ja sim rsim p (3) and the second is the adjusted weighted sum approach: ∑ ∑ ∈ ∈ −× += )( , )( ,, , || )( aNr ra aNr rjrra aja sim rrsim rp (4) where p , ja represents the prediction for the active user ua for item j i and N(a) represents the set of 128
Journal of Convergence Information Technology Volume 5. Number 8 October 2010 nearest neighbors of u 4. Multi-criteria collaborative filtering The CF technique has been successfully applied to solve the recommendation problem with single riterion. When the problem is extended to be multiple conflicting criteria, new techniques are needed in order to effectively incorporate the multi-criteria rating information into the recommendation process. Particularly, in this section describe how to incorporate the multi-criteria rating information into the CF process following the aggregation function-based approach proposed in [13] Aggregation Module(AG), and the Recommendation Model(RC)as shown in Figure dule(MP), the The proposed system consists of three main modules: the Multi-Criteria Prediction Mo The overall system can be viewed as a blackbox which takes, as input, the ratings matrix, and produces, as output, a recommendation list. The ratings matrix R contains the rating set ru drg, i,,..,"y) standing for the ratings of the user u, for item iy, where rg represents the overall rating value and i represents the rating value for criterion k. The MP module uses the single-criterion CF algorithm to produce the recommendation list for each individual criterion. The AG module computes the user's preference for each criterion in terms of weight using GA algorithms. The rC module produces an overall recommendation list based on the results from MP and AG modules Multi-Criteria Agge空ator Prediction Module TOp-N Recommendation List Figure 1. System architecture 4. 1. Aggregation module In the multi-criteria rating environment, different people may place different emphases on these interrelated features. The goal of AG is to find the relationship between the overall rating and the underlying multi-criteria ratings for each user. More specifically, given the ratings data of a user, AG computes his/her preference model in terms of feature weights using GAs In this study, each chromosome x in the ga process is expressed as a set of feature weights where W, =(wI, w2, w,, w4)with 4 genes. Each gene is represented as a movie feature weight and encod with 8 bits. The Ga begins with a random population of chromosomes. For an active user l,, each chromosome is assigned alternately and tested by the fitness function. The fitness function measures the prediction accuracy of products using the weights as defined in the current chromosome (5) Fitness(x)=l-
Journal of Convergence Information Technology Volume 5, Number 8, October 2010 nearest neighbors of ua . 4. Multi-criteria collaborative filtering The CF technique has been successfully applied to solve the recommendation problem with single criterion. When the problem is extended to be multiple conflicting criteria, new techniques are needed in order to effectively incorporate the multi-criteria rating information into the recommendation process. Particularly, in this section, we describe how to incorporate the multi-criteria rating information into the CF process following the aggregation function-based approach proposed in [13]. The proposed system consists of three main modules: the Multi-Criteria Prediction Module (MP), the Aggregation Module (AG), and the Recommendation Model (RC) as shown in Figure 1. The overall system can be viewed as a blackbox which takes, as input, the ratings matrix, and produces, as output, a recommendation list. The ratings matrix R contains the rating set rij = { k ijijij ,,, rrr 10 } standing for the ratings of the user ui for item ij, where 0 ij r represents the overall rating value and k ij r represents the rating value for criterion k. The MP module uses the single-criterion CF algorithm to produce the recommendation list for each individual criterion. The AG module computes the user’s preference for each criterion in terms of weight using GA algorithms. The RC module produces an overall recommendation list based on the results from MP and AG modules. Figure 1. System architecture 4. 1. Aggregation Module In the multi-criteria rating environment, different people may place different emphases on these interrelated features. The goal of AG is to find the relationship between the overall rating and the underlying multi-criteria ratings for each user. More specifically, given the ratings data of a user, AG computes his/her preference model in terms of feature weights using GAs. In this study, each chromosome x in the GA process is expressed as a set of feature weights where ),,,( 4321 = wwwwW xxxxx with 4 genes. Each gene is represented as a movie feature weight and encoded with 8 bits. The GA begins with a random population of chromosomes. For an active user ua , each chromosome is assigned alternately and tested by the fitness function. The fitness function measures the prediction accuracy of products using the weights as defined in the current chromosome. l rp itness xF l j ∑ jaja = − −= 1 00 1 ,, )( (5) 129
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung hwang Each chromosome x is tested repeatedly, I times, by a leave-one-out cross validation. Pa. and ra. are the predicted and the actual overall ratings of user ua to item ij, respectively The overall prediction can be made by two different approaches, the filter-based and the wrapper based approach, depending on whether the method uses feedback from the subsequent MP module The filter-based method contains algorithms that use no input other than users'own ratings to calculate the overall predictions, whereas the wrapper-based approach uses feedback (i.e. the predicted ratings) from the MP module to facilitate the prediction For the filter-based approach, the predicted overall rating is calculated as a weighted sum o individual original ratings Wr where r. is the actual rating of user u, to item i, for criterion k For the wrapper-based approach, the predicted overall rating is calculated as a weighted sum of individual predicted ratings generated by the MP module n2×p w where Pa, is the predicted rating of user u, to item i for criterion k For each generation evolution, chromosomes for the next generation are selected using the roulette wheel selection scheme to implement proportionate random selection. All of the chromosomes are then paired up using the single-point crossover strategy with a probability Pe After the crossover, for each of the genes of the chromosomes, the gene is mutated with a probability Pm. The algorithm continues to evolve until a maximum number of generations is reached. The chromosome with the highest fitnes value is then selected as the optimal set of feature weights for active user u 4. 2. Multi-Criteria Prediction Module In the MP module, we decompose k-dimensional multi-criteria rating space into k single-rating recommendation problems and use a traditional user-based CF technique to estimate ratings for each criterion. Thus, we decompose the multi-criteria rating matrix R into k single-criterion rating matrices R. The CF algorithm is then implemented k times, one for each criterion. For each R",we first compute the similarity simar between active user u, and other users u, using the Pearson Correlation Method and then estimate the rating value pa for unrated item i using an adjusted 4.3. Recommendation model The goal of the RC module is to produce a recommendation list for an active user. RC modules first edict each unknown overall rating Pay directly by using the feature weight function estimated in the AG module and the multi-criteria rating value Pa estimated in the MP module
Genetic Algorithms for Feature Weighting in Multi-criteria Recommender Systems Chein-Shung Hwang Each chromosome x is tested repeatedly, l times, by a leave-one-out cross validation. 0 ja p , and 0 jar , are the predicted and the actual overall ratings of user ua to item ij, respectively. The overall prediction can be made by two different approaches, the filter-based and the wrapperbased approach, depending on whether the method uses feedback from the subsequent MP module. The filter-based method contains algorithms that use no input other than users’ own ratings to calculate the overall predictions, whereas the wrapper-based approach uses feedback (i.e. the predicted ratings) from the MP module to facilitate the prediction. For the filter-based approach, the predicted overall rating is calculated as a weighted sum of individual original ratings. ∑ ∑ = = × = 4 1 4 0 1 k k x k k ja k x ja w rw p , , (6) where k jar , is the actual rating of user a u to item ij for criterion k. For the wrapper-based approach, the predicted overall rating is calculated as a weighted sum of individual predicted ratings generated by the MP module. ∑ ∑ = = × = 4 1 4 0 1 k k x k k ja k x ja w pw p , , (7) where k ja p , is the predicted rating of user ua to item ij for criterion k. For each generation evolution, chromosomes for the next generation are selected using the roulette wheel selection scheme to implement proportionate random selection. All of the chromosomes are then paired up using the single-point crossover strategy with a probability pc. After the crossover, for each of the genes of the chromosomes, the gene is mutated with a probability pm. The algorithm continues to evolve until a maximum number of generations is reached. The chromosome with the highest fitness value is then selected as the optimal set of feature weights for active user ua . 4. 2. Multi-Criteria Prediction Module In the MP module, we decompose k-dimensional multi-criteria rating space into k single-rating recommendation problems and use a traditional user-based CF technique to estimate ratings for each criterion. Thus, we decompose the multi-criteria rating matrix R into k single-criterion rating matrices k R . The CF algorithm is then implemented k times, one for each criterion. For each k R , we first compute the similarity k ar sim between active user ua and other users ur using the Pearson Correlation Method and then estimate the rating value k aj p for unrated item j i using an adjusted weighted sum approach. 4. 3. Recommendation Model The goal of the RC module is to produce a recommendation list for an active user. RC modules first predict each unknown overall rating 0 aj p directly by using the feature weight function estimated in the AG module and the multi-criteria rating value k aj p estimated in the MP module. 130