otherwise, a lower authority weight. If the authority weight becomes negative, the corresponding expert will be dropped from the representative board and a new member will be chosen from among the users who have the highest participation in evaluating expert opinions. If there is more than one user who has provided the most frequent feedback, one user will be randomly chosen from among them. In this way, the constitution of the expert group is dynamically changed.

We define a rating score matrix $X = [x_{ij}]$, where $x_{ij}$ is the score the $i$-th expert gives to a web document $d_j$. For each web document $d_j$, the voting score of an expert committee is given as follows:

\[
V(d_j) = \sum_{i=1}^{N_e} r_i x_{ij} = \sum_{i=1}^{N_e} \frac{w_i}{\sum_{k=1}^{N_e} w_k}\, x_{ij}
\]

where $N_e$ is the number of experts for a given category, $r_i$ is the relative authority of the $i$-th expert member of the expert pool, and $w_i$ is the authority weight of the $i$-th expert member. We suppose $w_i$ should always be positive. The weight $w_i$ is a dynamic factor, and it represents each expert's authority to evaluate documents. A higher authority weight indicates that the expert has more influence in a voting decision.

We define the error measure $E$ as a squared sum of differences between desired voting scores and actual voting scores, as follows:

\[
E = \frac{1}{2} \sum_{j=1}^{n} \left( V(d_j) - V'(d_j) \right)^2
  = \frac{1}{2} \sum_{j=1}^{n} \left\{ \sum_{i=1}^{N_e} \frac{w_i}{\sum_{k=1}^{N_e} w_k}\, x_{ij} - V'(d_j) \right\}^2
\]

where $n$ is the number of documents evaluated by users and $V'(d_j)$ is the users' voting score for an expert-voted document $d_j$. We assume $V'(d_j)$ is the average over all user scores, but in reality it is rarely possible to receive feedback from all users. The authority weight for each expert is changed every session, which is a given period of time, and at the same time $V'(d_j)$ can be approximated, by the central limit theorem, with a set of $\tilde{V}'(d_j)$, the average user rating during the given session.

We use a gradient-descent method over the error measure $E$ with respect to a weight $w_i$, and the gradient is given by:

\[
\frac{\partial E}{\partial w_i}
  = \frac{\partial}{\partial w_i} \left( \frac{1}{2} \sum_{j=1}^{n} \left( V(d_j) - \tilde{V}'(d_j) \right)^2 \right)
  = \sum_{j=1}^{n} \left( x_{ij} - V(d_j) \right) \frac{\Delta_j}{S}
\]

where $S = \sum_{k=1}^{N_e} w_k$ is the sum of weights, and $\Delta_j = V(d_j) - \tilde{V}'(d_j)$ is the difference between the predicted voting score and the users' rating score during a session for a document $d_j$.
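As a concrete illustration of the committee score and this gradient, the following minimal sketch (Python, not from the paper) computes $V(d_j)$, $E$, and $\partial E / \partial w_i$; the class and method names, the array layout with one row per expert and one column per document, and the use of the session-average user rating as $\tilde{V}'(d_j)$ are assumptions made for this sketch.

```python
import numpy as np

class ExpertCommittee:
    """Committee of experts whose votes are combined by authority weights."""

    def __init__(self, num_experts, init_weight=1.0):
        # w_i: dynamic authority weight of each expert (assumed positive)
        self.w = np.full(num_experts, init_weight, dtype=float)

    def voting_scores(self, x):
        # V(d_j) = sum_i (w_i / sum_k w_k) * x_ij, computed for every document j;
        # x has one row per expert and one column per document
        r = self.w / self.w.sum()          # relative authorities r_i
        return r @ x

    def error(self, x, user_avg):
        # E = 1/2 * sum_j (V(d_j) - V~'(d_j))^2, with the session-average user
        # rating V~'(d_j) standing in for the unobservable all-user average
        v = self.voting_scores(x)
        return 0.5 * np.sum((v - user_avg) ** 2)

    def gradient(self, x, user_avg):
        # dE/dw_i = sum_j (x_ij - V(d_j)) * Delta_j / S,
        # where S = sum_k w_k and Delta_j = V(d_j) - V~'(d_j)
        v = self.voting_scores(x)
        delta = v - user_avg
        return ((x - v) * delta).sum(axis=1) / self.w.sum()
```

For example, with a 3 × 5 rating matrix x and a vector of five session-average user ratings, committee.gradient(x, user_avg) returns one partial derivative per expert, which the dynamic update equation below uses to raise or lower each authority weight.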
We apply a scheme similar to the error back-propagation used for multilayer perceptrons (Haykin, 1999). If we update the weights of experts with the feedback of users about a web document $d_j$, the weight is changed each session by the following dynamic equation:

\[
w_i(t+1) = w_i(t) - \eta \left( x_{ij} - V(d_j) \right) \frac{\Delta_j}{S} + \alpha \left( w_i(t) - w_i(t-1) \right)
\]

where $\eta$ is a learning rate proportional to the number of user ratings per session and $\alpha$ is the momentum constant.

The above equation describes how to reward or penalise the authority weights of experts for their share of responsibility for any error. According to the equation, the weight change involves the correlation between a voting score difference among experts and the error difference. For example, when both an expert-voted score and the desirable-rank score are larger than the weighted average voting score, or both of them are smaller than the average score, the expert is rewarded; otherwise, the expert is penalised. In this case some experts receive rewards and others receive penalties, depending on the weighted average voting score of the expert group.

Evaluation of effectiveness

When dynamic authority weights are assigned to experts for a category, the expert group ratings can form an ordered ranking list. We need to determine whether the given ranking list is reliable. Reliable ranking means that good experts have been selected for an expert group and that they recommend relevant documents or items to general users. We evaluate the prediction performance of expert groups in terms of effectiveness, that is, a measure of the agreement between expert groups and users in ranking a test set of web documents. We assume there are many users evaluating the top-ranked lists, in contrast to the small number of experts in a category group.

We suggest several effectiveness measures that are related to the agreement in rank order between expert ratings and user ratings: the rank order window measure, the rank function measure, and the $F_\beta$ measure with rank order partition. We compared these with Spearman's correlation measure, which is a common measure in the information retrieval field.

Rank order window measure. Given a sample query or category, we can represent the effectiveness as the proportion of top-ranked documents that user ratings place in the same, or a very close, position as the expert group does. Given top-ranked web documents $D = \{d_1, d_2, \ldots, d_n\}$, we can define the effectiveness $\Lambda_\delta$ with rank order window $\delta(d_k)$ as:

\[
\Lambda_\delta = \frac{\sum_{k=1}^{n} S(d_k)}{n}
\]

\[
S(d_k) = 1 - \frac{1}{\delta(d_k)} \min\left( \delta(d_k),\; \frac{\sum_{i=\mu(d_k)-\delta(d_k)}^{\mu(d_k)+\delta(d_k)} \left| \mu(d_k) - Q(d_i) \right|}{2\delta(d_k)+1} \right)
\]

where $d_k$ is the $k$-th web document from the test set for a given category, and $\delta(d_k)$ is the width of the window centred at the rank $\mu(d_k)$ assigned by the ratings of experts for $d_k$. $Q(d_k)$ is the rank position of the average rating score of users for a document $d_k$. $S(d_k)$ calculates the rate of the rank order difference in the window $[\mu(d_k) - \delta(d_k),\, \mu(d_k) + \delta(d_k)]$.

For this measure, we directly compare the rank of documents that the expert group provides with the rank given by users. For each of the top-ranked documents the experts recommend, we calculate how much the rank position is changed by user feedback. However, we check the position change only within a window of the given size.
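To make the computation concrete, the sketch below (Python, not from the paper) evaluates $\Lambda_\delta$ from expert ranks $\mu(d_k)$, user ranks $Q(d_k)$, and window widths $\delta(d_k)$, following the formula as reconstructed above; taking $d_i$ as the $i$-th document of the test list, one-based ranks, the clipping of the window at the list boundaries, and the absolute rank difference inside the sum are assumptions of this sketch.

```python
import numpy as np

def rank_order_window_measure(mu, q, delta):
    """Lambda_delta: mean of S(d_k) over the n top-ranked documents.

    All arrays are indexed by document, i.e. position k holds the value for d_k:
    mu[k]    expert-group rank of document d_k (1-based)
    q[k]     rank of the average user rating of document d_k (1-based)
    delta[k] window width for document d_k
    """
    n = len(mu)
    scores = np.empty(n)
    for k in range(n):
        d = delta[k]
        # document indices inside the window [mu_k - delta_k, mu_k + delta_k],
        # clipped to the valid range 1..n (an assumption for the list boundaries)
        lo, hi = max(1, mu[k] - d), min(n, mu[k] + d)
        # average absolute rank difference over the 2*delta_k + 1 window positions
        avg_diff = sum(abs(mu[k] - q[i - 1]) for i in range(lo, hi + 1)) / (2 * d + 1)
        # S(d_k) = 1 - min(delta_k, avg_diff) / delta_k, so each S(d_k) lies in [0, 1]
        scores[k] = 1.0 - min(d, avg_diff) / d
    return scores.mean()

# Example call with ten documents and a window width of 2 for every document
expert_ranks = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
user_ranks   = np.array([2, 1, 3, 5, 4, 6, 8, 7, 9, 10])
print(rank_order_window_measure(expert_ranks, user_ranks, np.full(10, 2)))
```

In this reading, the min with $\delta(d_k)$ caps the contribution of any badly displaced document, so each $S(d_k)$ stays in $[0, 1]$ and $\Lambda_\delta$ can be read as an agreement rate over the top-ranked list.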