counterpart in the relationship. A low ability value should result from the existence of conflicting data, and this should make the observer unable to fill in the uncertainty gap. When there are not enough observations to distinguish rating trends, data might appear to be highly conflicting. We propose the following formula to model uncertainty from prediction error:

$$u = \frac{1}{k} \sum_{x=1}^{k} \frac{|p_x - r_x|}{m} \tag{4.1}$$

where $k$ is the number of common experiences (ratings) of the two entities that form a relationship, $p_x$ is the predicted rating of item $x$ calculated using some prediction formula, and $r_x$ is the actual rating that the entity has given to item $x$. $m$ represents the maximum value that a rating can take and it is used here as a measure of rating. As can be seen, uncertainty is inversely proportional to the number of experiences. This agrees with the definition of uncertainty we presented in the previous section.

The logical reasoning for deriving formula (4.1) for uncertainty is the following. Uncertainty is proportional to the prediction error for every single experience of a user; therefore the numerator represents the absolute error between the predicted value (using a rating prediction formula) and the real (rated) value. The denominator $m$ normalizes the error to the range 0-1. The summation includes all the experiences ($k$ in number) of a particular user. Finally, the division by the total number of experiences ($k$) gives the average normalized error. In the sum we take every pair of common ratings and try to predict what the rating $p_x$ would be. It is therefore assumed that, for every prediction calculation, all ratings exist except the real rating of the item that is to be predicted. Unlike Beta mapping [14], where $u$ tends to 0 as the number of experiences grows, in our model the trend remains quite uncertain because $u$ is also dependent on the average prediction error.
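As an illustration of formula (4.1), the following is a minimal sketch in Python. It assumes that the predicted ratings have already been produced by some prediction formula, which is left outside the sketch. The function name `uncertainty_from_error` and its parameter names are hypothetical and do not come from the model itself.

```python
from typing import Sequence


def uncertainty_from_error(
    predicted: Sequence[float],
    actual: Sequence[float],
    max_rating: float,
) -> float:
    """Average normalized absolute prediction error, following (4.1).

    predicted  -- p_x values, assumed to come from some external
                  prediction formula (not implemented here)
    actual     -- r_x values, the ratings the entity really gave
    max_rating -- m, the maximum value a rating can take
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    k = len(actual)  # number of common experiences
    if k == 0:
        raise ValueError("at least one common experience is required")
    if max_rating <= 0:
        raise ValueError("max_rating must be positive")
    # Sum of |p_x - r_x| / m over all k experiences, divided by k.
    total_error = sum(abs(p - r) for p, r in zip(predicted, actual))
    return total_error / (k * max_rating)
```

For example, with predictions `[4.0, 3.0, 5.0, 2.0]`, real ratings `[5.0, 3.0, 3.0, 2.0]` and `max_rating = 5.0`, the absolute errors sum to 3, so the sketch gives u = 3 / (4 × 5) = 0.15.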
In the extreme case where there is high controversy in the data, $u$ will reach a value close to 1, leaving little room for belief and disbelief. Another interesting characteristic of our model is the asymmetry in the trust relationships produced, which adheres to the natural form of relationships, since the levels of trust that two entities place in each other are not necessarily the same.

As regards the other two properties, $b$ (belief) and $d$ (disbelief), we set them up in such a way that they are dependent on the value of the Correlation Coefficient CC. We made the following two assumptions:

• The belief (disbelief) property reaches its maximum value $(1-u)$ when CC = 1 (or CC = -1 respectively)
• The belief (disbelief) property reaches its minimum value (0) when CC = -1 (or CC = 1 respectively)

which are expressed by the two formulae:

$$b = \frac{(1-u)}{2}\,(1 + CC) \tag{4.2}$$

$$d = \frac{(1-u)}{2}\,(1 - CC) \tag{4.3}$$

G. Pitsilis et al.
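The linear mapping of formulae (4.2) and (4.3) can be sketched as follows. This is only an illustration under the definitions given above; the function name `opinion_from_cc` and the tuple layout `(b, d, u)` are assumptions made for the example.

```python
from typing import Tuple


def opinion_from_cc(cc: float, u: float) -> Tuple[float, float, float]:
    """Return (belief, disbelief, uncertainty) from CC and u.

    Implements b = (1 - u) / 2 * (1 + CC)   (4.2)
           and d = (1 - u) / 2 * (1 - CC)   (4.3)
    """
    if not -1.0 <= cc <= 1.0:
        raise ValueError("CC must lie in [-1, 1]")
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    half_rest = (1.0 - u) / 2.0  # half of what is left after uncertainty
    b = half_rest * (1.0 + cc)
    d = half_rest * (1.0 - cc)
    return b, d, u
```

With CC = 1 and u = 0.2 the sketch returns b = 0.8 and d = 0, and with CC = -1 the two values swap. In every case b + d + u = 1, so the three properties together cover the whole opinion space.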
Modeling Trust for Recommender Systems using Similarity Metrics
As can be seen, the ratio of belief and disbelief is shaped by the CC value. In this way, a positive Correlation Coefficient would be expected to strengthen the belief property at the expense of disbelief. In the same way, disbelief appears to be stronger than belief between entities that are negatively correlated (CC < 0).

These two formulae can be used in the opposite way too: for estimating how similar the two entities should consider each other, given their trust properties. The asymmetry in the trust relationships is mainly responsible for having unequal similarities between the original one and the one derived from the backward application of the formula. The different points of view are responsible for this difference, as well as the formula used to work out the predictions $p_x$ in (4.1). The formulas proposed in [15], as well as Resnick's [19] empirical one built for the GroupLens CF system, can be used for the calculation of $p_x$.

As we can see in this proposed model, belief/disbelief increases/decreases linearly with the Correlation Coefficient and, in terms of computational complexity, the uncertainty formula is $O(n^2)$. This seems to be a significant drawback of this method, because the calculation of uncertainty requires the prediction formula to run $n$ times, which in turn requires the calculation of the similarity value $k$ times. This has to be repeated whenever a new score is entered by either of the two parties.

4.2 The new proposed model

Since the above formula is found to be computationally intensive, we came up with other, less complex alternative formulas for modeling the same notions. The first thing that we changed was the calculation of uncertainty. In contrast to the old approach, in the new design it is calculated exclusively from the quantity of experiences, similarly to what is done in the beta pdf mapping in Josang's approach [14]. However, in our new model we propose that every pair of common scores is counted as a different experience, and for the uncertainty calculation we use the formula $u = (n+1)^{-1}$, where $n$ is the number of common scores.

As to belief and disbelief, we tried various associations with CC such as linear, non-linear and circular. Amongst the pros of the alternative formulas is the significantly lower complexity, $O(n)$, which means lower calculation time since it is now dependent only on the number of common ratings. For a linear approach to shaping belief and disbelief, the formulae used should be the same as before in the original model, expressed in (4.2) and (4.3). For non-linear approaches we tried equations whose curves have various skewnesses. The belief property alternatives are expressed in Table 1. To save space, the formulas from which disbelief ($d$) is derived are not presented, but for all cases $d$ is considered as the remainder, since $d = 1 - b - u$, and it is symmetric to belief. In addition to the two assumptions we made for the linear mapping shown in the previous paragraph, we included a third, which is:

• A zero correlation coefficient (CC = 0) should mean that belief equals disbelief.
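A minimal sketch of the new model with the linear mapping is given below. It assumes u = 1/(n+1), belief from (4.2) and disbelief taken as the remainder. It also includes the backward use of the formulae, written here as CC = (b - d)/(b + d), which follows algebraically from (4.2) and (4.3). All function names are hypothetical and only the linear alternative is covered.

```python
from typing import Tuple


def uncertainty_from_count(n: int) -> float:
    """u = 1 / (n + 1), where n is the number of common scores."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 / (n + 1)


def linear_opinion(cc: float, n: int) -> Tuple[float, float, float]:
    """Return (b, d, u) for the linear alternative of the new model."""
    if not -1.0 <= cc <= 1.0:
        raise ValueError("CC must lie in [-1, 1]")
    u = uncertainty_from_count(n)
    b = (1.0 - u) / 2.0 * (1.0 + cc)  # same form as (4.2)
    d = 1.0 - b - u                   # disbelief as the remainder
    return b, d, u


def cc_from_opinion(b: float, d: float) -> float:
    """Backward application: recover CC from belief and disbelief.

    Subtracting (4.3) from (4.2) gives b - d = (1 - u) * CC, and
    adding them gives b + d = 1 - u, hence CC = (b - d) / (b + d).
    """
    if b + d == 0.0:
        # u = 1 (e.g. n = 0): the opinion carries no similarity information.
        raise ValueError("CC is undefined when b + d = 0")
    return (b - d) / (b + d)
```

For instance, `linear_opinion(0.0, 9)` gives u = 0.1 and b = d = 0.45, which satisfies the third assumption, and `cc_from_opinion` applied to the result of `linear_opinion(0.5, 4)` recovers CC = 0.5.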