Results. More than 150 participants registered for the mailing list which enabled them to download the dataset. In the end, we received 42 submissions – 21 for each of Tasks 1 and 2. Additionally, 24 participants submitted a paper that can be found in the proceedings at hand. We used the F1-Measure, common in Information Retrieval, to evaluate the submitted recommendations. To this end, we first computed for each post in the test data precision and recall by comparing the first five recommended tags against the tags the user had originally assigned to this post. Then we averaged precision and recall over all posts in the test data and used the resulting precision and recall to compute the F1-Measure as f1m = (2 · precision · recall) / (precision + recall). The winning team of Task 1 has an f1m of 0.18740; the second and third follow with 0.18001 and 0.17975. For Task 2, the winner achieved an f1m of 0.35594, followed by 0.33185 and 0.32461. The winner of Task 3 will be announced at the conference and later on the website of the challenge. Lipczak et al. from Dalhousie University, Halifax, Canada (cf. page 157) are the winners of Task 1. With a method based on the combination of tags from the resource's title, tags assigned to the resource by other users, and tags in the user's profile, they reached an f1m of 0.18740 in Task 1 and additionally achieved the third place in Task 2 with an f1m of 0.32461. The system is composed of six recommenders, and the basic idea is to augment the tags from the title by related tags extracted from two tag-tag co-occurrence graphs and from the user's profile, and then rescore and merge them. The winners of Task 2, Rendle and Schmidt-Thieme from the University of Hildesheim, Germany (cf. page 235), achieved an f1m of 0.35594 with a statistical method based on factor models. To this end, they factorize the folksonomy structure to find latent interactions between users, resources and tags.
Using a variant of the stochastic gradient descent algorithm, the authors optimize an adaptation of the Bayesian Personal Ranking criterion. Finally, they estimate how many tags to recommend to further improve precision. The second of Task 1 (Mrosek et al., page 189) harvests tags from sources like Delicious, Google Scholar, and CiteULike. They also employ the full-text of web pages and PDFs. The third (Ju and Hwang, page 109) merges tags which have been earlier assigned to the resource or used by the user, as well as resource descriptions, by a weighting scheme. Finally, the second of Task 2 (Balby Marinho et al., page 7) uses relational classification methods in a semi-supervised learning scenario to recommend tags. We thank all participants of the challenge for their contributions and the organizers of the ECML PKDD 2009 conference for their support. Furthermore, we want to thank our sponsors Nokia (http://www.nokia.com/) and Tagora (http://www.tagora-project.eu/) for supporting the challenge by awarding prizes for the winners of each task. We are looking forward to a very exciting and interesting workshop. Kassel, August 2009. Folke Eisterlehner, Andreas Hotho, Robert Jäschke
Relational Classification for Personalized Tag Recommendation Leandro Balby Marinho, Christine Preisach, and Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Samelsonplatz 1, University of Hildesheim, D-31141 Hildesheim, Germany {marinho,preisach,schmidt-thieme}@ismll.uni-hildesheim.de http://www.ismll.uni-hildesheim.com/ Abstract. Folksonomy data is relational by nature, and therefore methods that directly exploit these relations are prominent for the tag recommendation problem. Relational methods have been successfully applied to areas in which entities are linked in an explicit manner, like hypertext documents and scientific publications. For approaching the graph-based tag recommendation task of the ECML PKDD Discovery Challenge 2009, we propose to turn the folksonomy graph into a homogeneous post graph and use relational classification techniques for predicting tags. Our approach features adherence to multiple kinds of relations, semi-supervised learning and fast predictions. 1 Introduction One might want tag recommendations for several reasons, as for example: simplifying the tagging process for the user, exposing different facets of a resource and helping the tag vocabulary to converge. Given that users are free to tag, i.e., the same resource can be tagged differently by different people, it is important to personalize the recommended tags for an individual user. Tagging data forms a ternary relation between users, resources and tags, differently from typical recommender systems in which the relation is usually binary between users and resources. The best methods presented so far explore this ternary relation to compute tag predictions, either by means of tensor factorization [8] or PageRank [3], on the hypergraph induced by the ternary relational data. We, on the other hand, propose to explore the underlying relational graph between posts by means of relational classification. 
In this paper we describe our approaches for addressing the graph-based tag recommendation task of the ECML PKDD Discovery Challenge 2009. We present two basic algorithms: PWA* (probabilistic weighted average), an iterative relational classification algorithm enhanced with relaxation labelling, and WA* (weighted average), an iterative relational classification method without relaxation labelling. These methods feature: adherence to multiple kinds of relations, training-free operation, fast predictions, and semi-supervised classification. Semi-supervised classification is particularly appealing because it allows us to possibly benefit from the information contained in the test dataset. Furthermore, we propose to combine these methods through unweighted voting.
The paper is organized as follows. Section 2 presents the notation used throughout the paper. In Section 3 we show how we turned the folksonomy into a post relational graph. Section 4 introduces the individual classifiers and the ensemble technique we used. In Section 5 we elaborate on the evaluation and experiments conducted for tuning the parameters of our models, and report the results obtained on the test dataset released for the challenge. The paper closes with conclusions and directions for future work. 2 Notation Folksonomy data usually comprises a set of users U, a set of resources R, a set of tags T, and a set Y of ternary relations between them, i.e., Y ⊆ U × R × T. Let X := {(u, r) | ∃t ∈ T : (u, r, t) ∈ Y} be the set of all unique user/resource combinations in the data, where each pair is called a post. For convenience, let T(x = (u, r)) := {t ∈ T | (u, r, t) ∈ Y} be the set of all tags assigned to a given post x ∈ X. We consider train/test splits based on posts, i.e., Xtrain, Xtest ⊂ X disjoint and covering all of X: Xtrain ∪̇ Xtest = X. For training, the learner has access to the set Xtrain of training posts and their true tags T|Xtrain. The tag recommendation task is then to predict, for a given x ∈ Xtest, a set T̂(x) ⊆ T of tags that are most likely to be used by the resp. user for the resp. resource. 3 Relation Engineering We propose to represent folksonomy data as a homogeneous, undirected relational graph over the post set, i.e., G := (X, E), in which edges are annotated with a weight w : X × X → R denoting the strength of the relation. Besides making the input data more compact – we have only a binary relation R ⊆ X × X between objects of the same type – this representation will allow us to trivially cast the problem of personalized tag recommendation as a relational classification problem. Relational classifiers usually consider, in addition to the typical attribute-value data of objects, relational information.
A scientific paper, for example, can be connected to another paper that has been written by the same author or because they share common citations. It has been shown in many classification problems that relational classifiers perform better than purely attribute-based classifiers [1, 4, 6]. In our case, we assume that posts are related to each other if they share the same user: Ruser := {(x, x′) ∈ X × X | user(x) = user(x′)}, the same resource: Rres := {(x, x′) ∈ X × X | res(x) = res(x′)}, or either the same user or resource: Rres user := Ruser ∪ Rres (see Figure 1). For convenience, let user(x) and res(x) denote the user and resource of post x, respectively. Thus, each post is connected to other posts either in terms of other users that tagged the same resource, or the resources tagged by the same user. Weights are discussed in Section 4.
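As an illustration, the post relations described above can be materialized with simple indices over the post list. This is a minimal sketch assuming posts are stored as (user, resource) pairs; the data layout and function names are our own, not the authors' implementation:

```python
from collections import defaultdict

# Each post is a (user, resource) pair.
posts = [("u1", "r1"), ("u1", "r2"), ("u2", "r1"), ("u3", "r2")]

def build_neighbors(posts):
    """Return, for every post x, the posts sharing its user (Ruser),
    its resource (Rres), and either of the two (Rres user)."""
    by_user = defaultdict(set)
    by_res = defaultdict(set)
    for x in posts:
        user, res = x
        by_user[user].add(x)
        by_res[res].add(x)
    # A post is never its own neighbor, so subtract it from each index.
    r_user = {x: by_user[x[0]] - {x} for x in posts}
    r_res = {x: by_res[x[1]] - {x} for x in posts}
    r_both = {x: r_user[x] | r_res[x] for x in posts}
    return r_user, r_res, r_both

r_user, r_res, r_both = build_neighbors(posts)
```

For the post ("u1", "r1") this yields ("u2", "r1") as a resource neighbor and ("u1", "r2") as a user neighbor, matching the two edge types shown in Figure 1.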
Fig. 1. Ruser (top left), Rres (bottom left) and Rres user (right) of a given test post (nodes in grey). Note that it may happen that some of the related posts belong themselves to the test dataset, allowing us to possibly profit from the unlabeled information of test nodes through, e.g., collective inference (see Section 4). Thus, differently from other approaches (e.g., [3, 8]) that are restricted only to Xtrain, we can also exploit the set Xtest of test posts, but of course not their associated true tags. Now, for a given x ∈ Xtest, one can use the tagging information of related instances to estimate T̂(x). A simple way to do that is, e.g., through tag frequencies of related posts:

P(t|x) := |{x′ ∈ Nx | t ∈ T(x′)}| / |Nx|,  x ∈ X, t ∈ T   (1)

where Nx is the neighborhood of x:

Nx := {x′ ∈ X | (x, x′) ∈ R, T(x′) ≠ ∅}   (2)

In Section 4 we will present the actual relational classifiers we have used to approach the challenge. 4 Relational Classification for Tag Recommendation We extract the relational information by adapting simple statistical relational methods, usually used for classification of hypertext documents, scientific publications or movies, to the tag recommendation scenario. The aim is to recommend tags to users by using the neighborhood encoded in the homogeneous graph G(X, E). To this end we described a very simple method in eq. (1), where the probability for a tag t ∈ T given a node x (post) is computed by counting the frequency of neighboring posts x′ ∈ Nx that have used the same tag t. In this case the strength of the relations is not taken into account, i.e., all considered neighbors of x have the same influence on the probability of tag t
given x. But this is not an optimal solution: the more similar posts are to each other, the higher the weight of this edge should be. Hence, a more suitable relational method for tag recommendation is the weighted average (WA), which sums up all the weights of posts x′ ∈ Nx that share the same tag t ∈ T and normalizes this by the sum over all weights in the neighborhood:

P(t|x) = ( Σ x′∈Nx : t∈T(x′) w(x, x′) ) / ( Σ x′∈Nx w(x, x′) )   (3)

Thus, WA only considers neighbors that belong to the training set. A more sophisticated relational method that takes probabilities into account is the probabilistic weighted average (PWA); it calculates the probability of t given x by building the weighted average of the tag probabilities of neighbor nodes x′ ∈ Nx:

P(t|x) = ( Σ x′∈Nx w(x, x′) P(t|x′) ) / ( Σ x′∈Nx w(x, x′) )   (4)

where P(t|x′) = 1 for x′ ∈ Xtrain, i.e., we are only exploiting nodes contained in the training set (see eq. (2)). We will see in the next paragraph how one can exploit these probabilities in a more clever way. Both approaches have been introduced in [5] and applied to relational datasets. Since we want to recommend more than one tag, we need to cast the tag recommendation problem as a multilabel classification problem, i.e., assign one or more classes to a test node. We accomplish this by sorting the calculated probabilities P(t|x) for all x ∈ Xtest and recommending the top n tags with the highest probabilities. The proposed relational methods can either be applied on Rres user, i.e., the union of the user and resource relations, or on each relation Ruser, Rres individually. If applied on each relation, the results can be combined by using ensemble techniques. 4.1 Semi-Supervised Learning As mentioned before, we would additionally like to exploit unlabeled information contained in the graph and use the test nodes that have not been tagged yet, but are related to other nodes.
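Before turning to collective inference, the base WA classifier of eq. (3) and the top-n multilabel step can be sketched as follows. This is a minimal illustration under stated assumptions: the uniform weight function and the toy neighborhood are placeholders, not the weighting scheme used in the challenge:

```python
def weighted_average(x, neighbors, tags, w):
    """WA (eq. 3): score tag t by the weight mass of labeled neighbors
    of x that carry t, normalized by the total neighborhood weight."""
    scores = {}
    # Per eq. (2), only neighbors with a non-empty tag set count.
    labeled = [n for n in neighbors[x] if tags.get(n)]
    total = sum(w(x, n) for n in labeled)
    if total == 0:
        return scores
    for n in labeled:
        for t in tags[n]:
            scores[t] = scores.get(t, 0.0) + w(x, n)
    return {t: s / total for t, s in scores.items()}

def top_n(scores, n=5):
    """Multilabel prediction: recommend the n highest-scoring tags."""
    return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:n]]

# Toy example with uniform weights; neighbor "c" is an untagged test
# post and is therefore ignored by WA.
neighbors = {"x": ["a", "b", "c"]}
tags = {"a": {"web"}, "b": {"web", "python"}, "c": set()}
uniform = lambda x, y: 1.0
scores = weighted_average("x", neighbors, tags, uniform)
```

Here "web" scores 1.0 (both labeled neighbors carry it) and "python" 0.5, so `top_n(scores, 1)` recommends "web". PWA (eq. 4) differs only in replacing the indicator "t ∈ T(x′)" by the neighbor's own probability P(t|x′), which is what lets the iterative, semi-supervised variant below propagate estimates through untagged neighbors such as "c".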
This can be achieved by applying collective inference methods: iterative procedures which classify related nodes simultaneously and exploit relational autocorrelation and unlabeled data. Relational autocorrelation is the correlation between a variable of an entity and the same variable (here the class) of a related entity, i.e., connected entities are likely to have the same classes assigned. Collective classification is semi-supervised by nature, since one exploits the unlabeled part of the data. One of these semi-supervised methods is relaxation labeling [1]; it can be formally expressed as:

P(t|x)^(i+1) = M( { P(t|x′)^(i) | x′ ∈ Nx } )   (5)

We first initialize the unlabeled nodes with the prior probability calculated using the train set, then compute the probability of tag t given x iteratively, using a relational classification method M based on the neighborhood Nx in the inner loop. The procedure stops when the algorithm converges (i.e., the difference of the tag probability between