Dynamic Models of Expert Groups to Recommend Web Documents

DaeEun Kim (1) and Sea Woo Kim (2)

(1) Division of Informatics, University of Edinburgh, 5 Forrest Hill, Edinburgh EH1 2QL, United Kingdom. daeeun@dai.ed.ac.uk
(2) Manna Information System, Bangbae-dong 915-9, Seocho-gu, Seoul 137-060, Korea. seawoo@unitel.co.kr

Abstract. Recently, most recommender systems have been developed to recommend items or documents based on the preferences of a particular user, but they have difficulty deriving preferences for users who have not rated many documents. In this paper we use dynamic expert groups, formed automatically, to recommend domain-specific documents to unspecified users. The group members have dynamic authority weights depending on their performance in the ranking evaluations. Human evaluation of web pages is very effective for finding relevant information in a specific domain. In addition, we have tested several effectiveness measures on rank order to determine whether the current top-ranked lists recommended by experts are reliable. We show simulation results to check the feasibility of dynamic expert group models for recommender systems.

1 Introduction

The development of recommender systems has emerged as an important issue in Internet applications and has drawn attention in both the academic and commercial fields. An example of this application is recommending new products or items of interest to online customers, using customer preferences.

Recommender systems can be broadly categorized into content-based and collaborative filtering systems [6, 13, 16, 17]. Content-based filtering methods use textual descriptions of the documents or items to be recommended. A user's profile is associated with the content of the documents that the user has already rated. The features of documents are extracted with information retrieval, pattern recognition, or machine learning techniques. The content-based system then recommends documents that match the user's profile or tendency [4, 17]. In contrast, collaborative filtering systems are based on user ratings rather than on features of the documents [1, 17, 16]. These systems predict the ratings of a user over given documents or items, depending on the ratings of other users with tastes similar to that user.
Collaborative filtering systems, such as GroupLens [13, 9], can be part of recommender systems for online shopping sites. They recommend items to users based on the history of products that similar users have ordered before or have been interested in.

Most recommender systems have focused on recommendations for a particular user, based on an analysis of that user's preferences. Such systems require the user to judge many items before the user's preferences can be obtained. In general, many online customers or users are interested in other users' opinions or ratings of items in a certain category before they become used to searching for items of interest themselves. For instance, customers in E-commerce like to see top-ranked lists, aggregated from the rating scores of many users, for the items that retailers provide before they purchase specific items. However, recommender systems still have difficulty providing relevant rating information before they receive a large number of user evaluations.

In this paper, we use a method in which web documents are evaluated by a representative board of human agents [7]; we call it an expert group. This is different from automatic recommender systems based on software agents or feature extraction. We suggest that dynamic expert groups be automatically created among users to evaluate domain-specific documents for web page ranking, and that the group members have dynamic authority weights depending on their performance in the ranking evaluations. This method is quite effective in recommending web documents or items that many users have not yet evaluated. A voting board of experts with expertise in a domain category is operated to evaluate the documents. For this kind of problem, it is not feasible to replace the human agents with intelligent software agents.

Our recommender system with dynamic expert groups may be extended to search engine design and image retrieval problems. Many search engines find relevant information and estimate its importance using automatic citation analysis over the general subject of queries. The connectivity of hypertext documents has been a good measure for automatic web citation. This method works on the assumption that a site which is cited many times is popular and important. Many automatic page ranking systems have used this citation metric to decide the relative importance of web documents. The IBM HITS system maintains a hub and an authority score for every document [8]. A method called PageRank has been suggested to compute a ranking for every web document based on the web connectivity graph [2] with random walk traversal. It also considers relative importance by checking the ranks of citing documents: a document is ranked as highly important when it has backlinks from documents with high authority, such as the Yahoo home page.

However, automatic citation analysis has the limitation that it does not reflect importance well from the viewpoint of human evaluation. There are many cases where simple citation counting does not reflect our common-sense concept of importance [2]. In this paper we explore a new ranking technique based on human interactions to handle these problems.
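To make the citation-analysis baseline concrete, the following is a minimal power-iteration sketch of the PageRank idea cited above [2]. It is a textbook rendering under the usual damping assumption, not part of the authors' system; the toy link matrix, damping factor, and tolerance are our own illustrative choices.

import numpy as np

def pagerank(adj, damping=0.85, tol=1e-9, max_iter=100):
    """Power-iteration PageRank: rank mass flows along links, with a
    damped uniform jump modelling the random walk traversal."""
    n = adj.shape[0]
    out = adj.sum(axis=1)
    M = np.zeros((n, n))
    for i in range(n):
        # Column i = transition probabilities out of page i;
        # a dangling page (no outlinks) jumps anywhere uniformly.
        M[:, i] = adj[i] / out[i] if out[i] > 0 else 1.0 / n
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        nxt = (1.0 - damping) / n + damping * (M @ rank)
        if np.abs(nxt - rank).sum() < tol:
            return nxt
        rank = nxt
    return rank

# Toy graph: page 2 is cited by pages 0 and 1.
links = np.array([[0, 0, 1],
                  [0, 0, 1],
                  [1, 0, 0]], dtype=float)
print(pagerank(links))

In the toy graph the page with two backlinks receives the highest score, which is exactly the counting behavior that, as argued above, can diverge from human judgments of importance.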
2 Method

2.1 Dynamic Authority Weights of Experts

[Fig. 1. Search engine diagram. Components: web crawler, indexer, database file, search engine, meta-search engine, monitor, ranking engine, category ranking DB, expert group, users.]

We define a group of people with high authority and much expertise in a special field as an expert group. This expert group is automatically established to evaluate web documents in a specific category. Fig. 1 shows the framework of a search engine combined with our recommender system. A meta-search engine is run to collect good web documents from conventional search engines (e.g. Yahoo, AltaVista, Excite, InfoSeek). The addresses of the documents cited in the search engines are stored in the document DB. Each web document entry records how many of the search engines behind the meta-search engine refer to the document, and how many times online users have accessed the document through the search engine.

For every category there is a list of top-ranked documents rated by an expert group, sorted by score. Authoritative web pages are determined by the human expert members. The experts directly examine the content of candidate web pages, namely those that are highly referenced among web documents or accessed by many users. The method of employing an expert group is based on the idea that, for a given decision task requiring expert knowledge, many experts may be better than one if their individual judgments are properly combined.
In our system, experts decide whether a web document should be classified as a recommended document for a given category. A simple way is majority voting [11, 10], where each expert casts a binary vote for a web document and a document obtaining at least half of the votes is placed on the top-ranked list.

An alternative is a weighted linear combination: a weighted linear sum of the expert votes yields the collaborative net-effect rating of each document. In this paper we take an adaptive weighted linear combination method, where the individual contributions of the members in the expert group are weighted by their judgment performance. The evaluations of all the experts are summed with weighted linear combinations, and the expert rating results change dynamically with each expert's performance. Our approach to expert group decisions is similar to the classifier committee concept in automatic text categorization [10, 15], although those methods use classifiers based on various statistical or learning techniques instead of human interactions and decisions. This weighted measure is useful even when the number of experts is not fixed.

It remains to decide how to choose experts and set their authority weights. We define a rating score matrix X = [\chi_{ij}], where the i-th expert rates a web document d_j with a score \chi_{ij}. For each web document d_j, the voting score of the expert committee is given as follows:

    V(d_j) = \sum_{i=1}^{N_e} r_i \chi_{ij} = \sum_{i=1}^{N_e} \frac{w_i}{\sum_{k=1}^{N_e} w_k} \chi_{ij}

where N_e is the number of experts for the given category, r_i is the relative authority of the i-th expert in the expert pool, and w_i is the authority weight of the i-th expert. We require w_i to be positive at all times. The weight w_i is a dynamic factor representing the expert's authority to evaluate documents; a higher authority weight makes the expert more influential in the voting decision.

We define the error measure E as the squared sum of the differences between the desired voting scores and the actual voting scores:

    E = \frac{1}{2} \sum_{j=1}^{n} [V(d_j) - V'(d_j)]^2 = \frac{1}{2} \sum_{j=1}^{n} \left\{ \sum_{i=1}^{N_e} \frac{w_i}{\sum_{k=1}^{N_e} w_k} \chi_{ij} - V'(d_j) \right\}^2

where n is the number of documents evaluated by users and V'(d_j) is the users' voting score for an expert-voted document d_j. We assume V'(d_j) is the average over all user scores, but in reality it is rarely possible to receive feedback from every user. The authority weight of each expert is therefore changed every session, a fixed period of time, and V'(d_j) is approximated, justified by the central limit theorem, by the average of the user ratings received during that session.
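As a minimal sketch of the two definitions above (the array names and toy values are our own; the paper specifies only the formulas), the committee score V(d_j) and the error measure E can be computed as follows:

import numpy as np

# chi[i, j] is expert i's rating of document j; w[i] is expert i's
# authority weight; v_user[j] approximates V'(d_j) by the session
# average of user ratings. All values here are illustrative.
chi = np.array([[4.0, 2.0, 5.0],
                [3.0, 4.0, 4.0],
                [5.0, 3.0, 5.0]])
w = np.array([1.0, 1.5, 0.5])          # authority weights, kept positive
v_user = np.array([4.0, 3.0, 5.0])

r = w / w.sum()                        # relative authorities r_i = w_i / sum_k w_k
v = r @ chi                            # committee voting scores V(d_j)
E = 0.5 * np.sum((v - v_user) ** 2)    # squared-error measure E
print("V(d_j):", v, " E:", E)

With equal weights the committee score reduces to the plain average of the expert ratings; the update rule derived next shifts weight toward experts whose ratings track user feedback.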
We use a gradient-descent method on the error measure E with respect to each weight w_i; the gradient is given by

    \frac{\partial E}{\partial w_i} = \frac{\partial}{\partial w_i} \left( \frac{1}{2} \sum_{j=1}^{n} [V(d_j) - V'(d_j)]^2 \right) = \sum_{j=1}^{n} \frac{[\chi_{ij} - V(d_j)] \Delta_j}{S}

where S = \sum_{k=1}^{N_e} w_k is the sum of the weights and \Delta_j = V(d_j) - V'(d_j) is the difference between the predicted voting score and the users' rating score for document d_j during a session.

We apply a scheme similar to error back-propagation in multilayer perceptrons [5]. When we update the experts' weights from user feedback about a web document d_j, each weight is changed per session by the following dynamic equation:

    w_i(t+1) = w_i(t) - \eta \frac{[\chi_{ij} - V(d_j)] \Delta_j}{S} + \alpha (w_i(t) - w_i(t-1))

where \eta is a learning rate proportional to the number of user ratings per session and \alpha is the momentum constant.

The above equation determines how authority weights are rewarded or penalized for their share of the responsibility for the error. The weight change depends on the correlation between an expert's deviation from the committee score and the prediction error: when an expert's voted score and the desired score are both larger than the weighted average voting score, or both smaller, the expert is rewarded; otherwise the expert is penalized. Thus some experts receive rewards and others penalties, depending on the weighted average voting score of the expert group.

2.2 Evaluation Effectiveness

Once dynamic authority weights have been assigned to the experts of a category, the expert group ratings form an ordered ranking list. We need to determine whether this ranking list is reliable. Reliable ranking means that good experts have been selected into the expert group pool and that they recommend relevant documents or items to general users. We evaluate the prediction performance of expert groups in terms of effectiveness, that is, a measure of the agreement between expert groups and users in ranking a test set of web documents. We assume that many users evaluate the top-ranked lists, in contrast to the small number of experts in a category group.

We suggest several effectiveness measures related to the agreement in rank order between expert ratings and user ratings: the rank order window measure, the rank function measure, Spearman's correlation measure, and the F_\beta measure with rank order partition.
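To make the session update and one of these agreement measures concrete, the sketch below applies the weight update equation of Sect. 2.1 document by document and then scores expert-user agreement with Spearman's rank correlation. The toy ratings, the values \eta = 0.05 and \alpha = 0.5, and the small positivity floor on the weights are our illustrative assumptions; the Spearman formula is the standard tie-free one, and the paper's window, rank function, and F_\beta measures are not reproduced here.

import numpy as np

def update_weights(w, w_prev, chi_j, v_user_j, eta=0.05, alpha=0.5):
    """One application of the session update
    w_i(t+1) = w_i(t) - eta*(chi_ij - V(d_j))*Delta_j/S + alpha*(w_i(t) - w_i(t-1))."""
    s = w.sum()
    v_j = (w / s) @ chi_j              # committee score V(d_j)
    delta_j = v_j - v_user_j           # prediction error Delta_j
    w_new = w - eta * (chi_j - v_j) * delta_j / s + alpha * (w - w_prev)
    return np.maximum(w_new, 1e-6)     # keep authority weights positive

def spearman_rho(x, y):
    """Spearman's rank correlation between two score vectors (ties ignored)."""
    rank = lambda a: np.argsort(np.argsort(a)).astype(float)
    d = rank(x) - rank(y)
    n = len(x)
    return 1.0 - 6.0 * np.sum(d ** 2) / (n * (n * n - 1))

# Toy session: 2 experts rate 4 documents; v_user holds the session
# averages of user ratings for the same documents.
chi = np.array([[4.0, 2.0, 5.0, 3.0],
                [2.0, 4.0, 3.0, 5.0]])
v_user = np.array([3.8, 2.4, 4.6, 3.2])
w, w_prev = np.ones(2), np.ones(2)
for j in range(chi.shape[1]):          # one update per rated document
    w, w_prev = update_weights(w, w_prev, chi[:, j], v_user[j]), w
print("weights:", w, " rho:", spearman_rho((w / w.sum()) @ chi, v_user))

In this toy session the first expert's ratings track the user averages far more closely, so its authority weight grows and the committee ranking moves toward the users' ranking, which the rising Spearman coefficient reflects.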