Y. Blanco-Fernández et al. / Knowledge-Based Systems 21 (2008) 305–320
3.3. The user profiles

Our proposal models the users' preferences by reusing the knowledge formalized in the OWL ontology (hence, our profiles are called ontology-profiles). In contrast with other works that combine user modeling with the use of ontologies [23], our approach requires a more detailed semantic model of the user preferences. This model stores the programs that the user finds interesting or unappealing (named positive and negative preferences, respectively), along with their semantic descriptions. These descriptions cover both the main semantic characteristics of the programs and the hierarchical genres they belong to (e.g. action movies, sport, music, nature documentaries). Since this information is already defined in the TV ontology, our approach does not store it again in each user profile. In practice, a user profile contains the IDs of the programs that are (un)appealing to him/her. From these IDs, the recommender system can locate the programs in the OWL ontology and access their complete semantic descriptions. This way, AVATAR knows the attributes and genres of each program pointed to from the user profile.

Along with the program IDs, the profiles store the levels of interest for each instance and for each class related to the user preferences (see footnote 9). The degrees of interest (named DOI indexes) belong to the range [-1, 1] and can be specified by the user or inferred automatically by the recommender system. Specifically, the value of the DOI index corresponding to a program recommended by AVATAR depends on several factors, such as the user's reply to the suggestion (accept or reject), the percentage of the program viewed by the user, and the time elapsed until the user decides to view the recommended program [6].

The DOI of each program is also used to set the DOI indexes of: (i) each one of its semantic characteristics (actors, presenters, topics, etc.), and (ii) the genres under which this program is classified in the ontology.
• The semantic characteristics of a given program inherit the DOI index of this program. If a characteristic is related to several programs, we calculate its DOI index by averaging their respective levels of interest.

• To compute the DOI indexes of each class (genre) included in the profile, we first compute the DOI of each leaf class as the average of the DOI indexes assigned to the programs in the profile that belong to that class. Then, we propagate these values through the genre hierarchy until reaching its root class. For that purpose, we adopt the approach proposed in [43], which leads to Eq. (5):

$$\mathrm{DOI}(C_m) = \frac{\mathrm{DOI}(C_{m+1})}{1 + \#\mathrm{sib}(C_{m+1})} \qquad (5)$$

where $C_m$ is the superclass of $C_{m+1}$ and $\#\mathrm{sib}(C_{m+1})$ denotes the number of siblings of the class $C_{m+1}$ in the hierarchy of genres defined in the TV ontology.

As a result of Eq. (5), this approach leads to higher DOI indexes for the superclasses closer to the leaf class whose value is being propagated, and lower ones for the classes that are closer to the root of the hierarchy. In addition, the higher the DOI of a given class and the lower the number of its siblings, the higher the DOI index propagated to its superclass. Since a class can be the superclass of several classes, every time its DOI is updated by Eq. (5) we average the indexes of all of its subclasses defined in the profile, and so compute its final DOI index.

Our ontology-based model enables intelligent reasoning about the semantics of the user preferences and makes it possible to discover, by inferring semantic associations, enhanced recommendations that would go unnoticed in existing systems. This reasoning favors a more effective matching between the viewer profile and the programs available in AVATAR, beyond the syntactic matching techniques described in Section 2.1.

Example 4.
Let U be a Finnish viewer who has viewed the following programs: (i) the movie Camaron, about a famous flamenco artist; (ii) an advertisement for the travel agency Finnish Tours; (iii) the movie Cast Away, starring Tom Hanks; and (iv) the movie Saving Private Ryan, which also features this actor and is about World War II (henceforth WWII). Taking into account the knowledge represented in Fig. 1, our approach builds U's profile from the instances that refer to these programs, their main semantic attributes and the hierarchy of genres they belong to, as shown in Fig. 2 (see footnote 10). The DOI indexes (in brackets in Fig. 2) stored in this profile show that U liked all of the programs except the war movie Saving Private Ryan.

4. Semantic associations

In order to find programs semantically related to the viewer preferences, our methodology locates in the OWL ontology the programs he/she liked in the past, and explores the property sequences starting from them. As these sequences are traversed and new instances are reached, we predict their relevance for the user. Once the irrelevant instances have been filtered out, we get a set of property sequences from which the semantic associations (between specific TV programs) must be inferred and the final recommendations are elaborated.

Footnote 9: These instances identify specific TV programs and their semantic attributes, whereas the classes refer to the genres under which the programs are classified in the ontology.

Footnote 10: For clarity, Fig. 2 shows the classes and instances that identify U's preferences; actually, $P_U$ only contains the IDs of the four programs and the DOI indexes for each one of these classes and instances (which are stored in the OWL ontology and are pointed to from the profile).
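The DOI computation described in Section 3.3 can be illustrated with a minimal Python sketch. It is not the AVATAR implementation: the function names, the dictionary-based encoding of the hierarchy and the toy genre tree are assumptions made for illustration, and the class DOI values are not those of Fig. 2. The sketch also assumes that the siblings of a class are the other direct subclasses of its superclass in the ontology, that leaf classes in the profile have no subclasses, and that a superclass averages the values it receives from its subclasses through Eq. (5).

```python
from collections import Counter, defaultdict
from statistics import mean


def attribute_doi(program_dois):
    """DOI of a semantic characteristic: average over its related programs."""
    return mean(program_dois)


def class_dois(parent, leaf_doi):
    """Propagate leaf-class DOIs up the genre hierarchy using Eq. (5).

    parent   -- maps each class to its superclass (the root maps to None)
    leaf_doi -- maps each leaf class present in the profile to its DOI
    """
    def depth(c):
        d = 0
        while parent[c] is not None:
            c, d = parent[c], d + 1
        return d

    # Number of direct subclasses of each class in the (toy) ontology.
    n_children = Counter(p for p in parent.values() if p is not None)

    doi = dict(leaf_doi)
    # Work level by level, from the deepest leaf class towards the root.
    for d in range(max(map(depth, leaf_doi)), 0, -1):
        received = defaultdict(list)
        for c in [c for c in doi if depth(c) == d]:
            sup = parent[c]
            n_sib = n_children[sup] - 1                 # siblings of c
            received[sup].append(doi[c] / (1 + n_sib))  # Eq. (5)
        for sup, values in received.items():
            doi[sup] = mean(values)                     # average over subclasses
    return doi


# Hypothetical genre hierarchy (class -> superclass); not taken from Fig. 1.
parent = {
    "TV Contents": None,
    "Movies": "TV Contents",
    "Tourism Contents": "TV Contents",
    "Romance Movies": "Movies",
    "Adventure Movies": "Movies",
    "War Movies": "Movies",
}
# Hypothetical leaf-class DOIs (averages of the DOIs of their programs).
leaf_doi = {
    "Romance Movies": 0.9,
    "Adventure Movies": 1.0,
    "War Movies": -1.0,
    "Tourism Contents": 1.0,
}

doi = class_dois(parent, leaf_doi)
tom_hanks = attribute_doi([1.0, -1.0])  # Cast Away (1), Saving Private Ryan (-1)
```

In this toy hierarchy each movie genre has two siblings, so Movies receives 0.9/3, 1/3 and -1/3 and averages them, while the actor's DOI is the plain average of the two programs he appears in, which agrees with the value 0 shown for Tom Hanks in Fig. 2.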
In order to identify the semantic associations between two instances as described in SemDis, it is necessary to explore the property sequences in which both of them are included, and the relationships between these sequences. Let x and y be two instances in our OWL knowledge base that refer to two specific TV programs. In addition, let $PS_1 = [P_0, \ldots, P_n]$ and $PS_2 = [Q_0, \ldots, Q_m]$ be two property sequences, instantiated by $ps_1$ and $ps_2$, respectively. Our approach adopts the following relationship between two property sequences.

Joined property sequences: $PS_1$ and $PS_2$ are joined (denoted by $PS_1 \bowtie_\rho PS_2$) if they contain at least one common class, named join node. Using the function formalized in expression 3, $PS_1 \bowtie_\rho PS_2$ holds if $PS_1.\mathrm{NodesOfPS}() \cap PS_2.\mathrm{NodesOfPS}() = \{C\}$ is a nonempty set, where C is the join node.

From the above relation, the authors of [1] define several semantic associations.

(i) Association ρ-path: ρ-pathAssociated(x, y) is true if there exists a property sequence ps where x is the origin and y is the terminus, or vice versa.

Example 5. From the sequence Camaron–Flamenco dancing–Carmen in Fig. 1, ρ-pathAssociated(Camaron, Carmen) is true because both movies are about the same topic.

(ii) Association ρ-join: Let $PS_1$ and $PS_2$ be two joined property sequences with a join node C (i.e. $PS_1 \bowtie_\rho PS_2$), and let $ps_1$ and $ps_2$ be two instances of $PS_1$ and $PS_2$, respectively. Then, ρ-joinAssociated(x, y) is true if x is the origin of $ps_1$ and y is the origin of $ps_2$, or x is the terminus of $ps_1$ and y is the terminus of $ps_2$.

Example 6. From the property sequences $ps_1$: Saving Private Ryan–WWII and $ps_2$: Belle Epoque–Spanish Civil War, it is possible to infer the semantic association ρ-joinAssociated(Saving Private Ryan, Belle Epoque) by means of the join class War Topics. This association is detected because both movies are set in wars.

5. Filtering and inference methodology

As we mentioned in previous sections, our approach must only uncover associations relevant for the user, i.e. associations that relate in a significant way his/her positive preferences to the finally suggested TV programs. The ranking process of semantic associations has also been addressed in the SemDis project. There, the authors proposed a mechanism for uncovering all the associations between two user-specified instances, and for ranking them according to their relevance for a query [18]. The applica

[Figure 2: graph of classes and instances with their DOI indexes in brackets. Edge types: rdfs:subClassOf, rdf:typeOf and owl:ObjectProperty, the latter labeled [1] hasActor, [2] hasDirector, [3] hasTopic. Classes: TV Contents, Movies (0.8), Romance Movies (0.9), Drama Movies (0.3), Adventure Movies (1), Tourism Contents (1). Instances: Camaron (0.9), Cast Away (1), Saving Private Ryan (-1), Travelling Adv. (1), Oscar Jaenada (0.9), Tom Hanks (0), Robert Zemeckis (1), Flamenco dancing (0.9), World War II (-1).]

Fig. 2. The user ontology-profile $P_U$.
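The ρ-path and ρ-join associations defined in Section 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SemDis or AVATAR code: a property sequence instance is modeled as a list of (instance, class) nodes, `Node`, `nodes_of_ps`, `rho_path_associated` and `rho_join_associated` are hypothetical names, and every class label other than War Topics is invented for the example.

```python
from typing import NamedTuple


class Node(NamedTuple):
    instance: str  # instance in the knowledge base
    cls: str       # class of that instance (labels are assumptions)


def nodes_of_ps(ps):
    """Classes traversed by a property sequence (stand-in for NodesOfPS())."""
    return {node.cls for node in ps}


def rho_path_associated(x, y, sequences):
    """True if some sequence has x as origin and y as terminus, or vice versa."""
    return any(
        (ps[0].instance, ps[-1].instance) in ((x, y), (y, x))
        for ps in sequences
    )


def rho_join_associated(x, y, ps1, ps2):
    """True if ps1 and ps2 share a join class and x, y are both their
    origins or both their termini."""
    if not nodes_of_ps(ps1) & nodes_of_ps(ps2):
        return False  # the sequences are not joined
    both_origins = ps1[0].instance == x and ps2[0].instance == y
    both_termini = ps1[-1].instance == x and ps2[-1].instance == y
    return both_origins or both_termini


# Example 5: Camaron - Flamenco dancing - Carmen.
ps_flamenco = [
    Node("Camaron", "Romance Movies"),
    Node("Flamenco dancing", "Dance Topics"),
    Node("Carmen", "Musical Movies"),
]

# Example 6: two sequences joined through the class War Topics.
ps1 = [Node("Saving Private Ryan", "War Movies"), Node("WWII", "War Topics")]
ps2 = [Node("Belle Epoque", "Comedy Movies"),
       Node("Spanish Civil War", "War Topics")]

path_ok = rho_path_associated("Camaron", "Carmen", [ps_flamenco])
join_ok = rho_join_associated("Saving Private Ryan", "Belle Epoque", ps1, ps2)
```

Here the join test treats any shared class as a join node, which matches the "at least one common class" wording; restricting the intersection to exactly one class would be a stricter reading of the set equality given in the definition.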