of these methods is that all the recommendation capabilities are incorporated into the recommendation process in a straightforward way. However, they implicitly assume that the relative value of the different techniques is more or less uniform across the space of items, which is not always true. For example, from the earlier discussion on the limitations of collaborative filtering, it is known that a CF system will be weaker for those items with a small number of ratings.
• Switched hybrid recommenders. These systems use some criterion to switch between recommendation techniques. The benefit of these methods is that the suggestions can be sensitive to the strengths and weaknesses of the constituent recommendation techniques. However, they introduce additional complexity into the recommendation process, since the switching criteria must be determined, which adds another level of parameterization.
• Mixed hybrid recommenders. These systems present together the suggestions given by the different recommendation techniques. The benefit of these methods is that they directly exploit the benefits of both content-based and collaborative recommendations. However, they require a ranking of items, or the selection of a single best suggestion, which entails developing some kind of combination technique.
• Hybrid recommenders based on feature combination. These systems merge content/collaborative suggestions by treating the collaborative information as simply additional features associated with each item, and using content-based techniques over the augmented dataset. The benefit of these methods is that collaborative data is considered without relying on it exclusively, thus reducing the sensitivity of the recommendations to the number of ratings.
• Cascade hybrid recommenders. These systems involve a staged process. A first recommendation technique produces a coarse ranking of candidates. Afterwards, a second recommendation technique takes the filtered candidate set and refines the final suggestions. The benefit of these methods is that they avoid employing the second, lower-priority technique on items that are already well differentiated by the first technique, or that are so poorly rated that they will never be recommended. In this way, cascade recommenders are more efficient than, for example, a weighted hybrid recommender, which has to apply all its techniques to all items. In addition, the cascade approach is by its nature tolerant to noise in the low-priority technique, since the recommendations given by the high-priority recommender can only be refined.
• Meta-level hybrid recommenders. These systems combine two recommendation techniques by using the entire model generated by one as the input for another. The benefit of these methods, especially for a content-based/collaborative hybrid approach, is that the learned (content-based) model is a compressed representation of the user’s interests, and the (collaborative) recommendation mechanism that follows can operate on this information-dense representation more easily than on the initial raw data.
• Hybrid recommenders based on feature augmentation. These systems, similarly to cascade hybrids, involve a staged process. A first recommendation technique produces a rating or classification of each item. Afterwards, a second recommendation technique takes the obtained information and incorporates it into its recommendation process. Note that these approaches differ from cascade ones, since in the latter the first recommendation technique has no influence over the second. The benefit of these methods is that they offer a way to improve the performance of core recommendation techniques by enriching their inputs, without modifying their internal model.
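To make the differences between some of the hybridisation strategies listed above more concrete, the following is a minimal sketch of the weighted, switched and cascade schemes. It is only an illustration under stated assumptions: the score tables, the threshold `min_ratings`, the weight `w`, the cut-off `keep` and all function names are hypothetical and do not come from any particular system. The cascade is simplified so that the first technique shortlists candidates and the second one only reorders that shortlist.

```python
from typing import Dict, List

# Hypothetical toy scores in [0, 1], standing in for two real recommenders.
CONTENT: Dict[str, float] = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.1}
COLLAB: Dict[str, float] = {"a": 0.2, "b": 0.8, "c": 0.6, "d": 0.9}
# Hypothetical number of ratings available for each item.
NUM_RATINGS: Dict[str, int] = {"a": 1, "b": 50, "c": 30, "d": 2}


def content_score(user: str, item: str) -> float:
    """Content-based score (the user is ignored in this toy example)."""
    return CONTENT[item]


def collab_score(user: str, item: str) -> float:
    """Collaborative score (the user is ignored in this toy example)."""
    return COLLAB[item]


def weighted(user: str, item: str, w: float = 0.5) -> float:
    """Weighted hybrid: fixed linear combination of both scores."""
    return w * content_score(user, item) + (1 - w) * collab_score(user, item)


def switched(user: str, item: str, min_ratings: int = 5) -> float:
    """Switched hybrid: trust CF only for items with enough ratings."""
    if NUM_RATINGS[item] >= min_ratings:
        return collab_score(user, item)
    return content_score(user, item)


def cascade(user: str, items: List[str], keep: int = 2) -> List[str]:
    """Cascade hybrid: the first technique shortlists, the second refines."""
    shortlist = sorted(items, key=lambda i: content_score(user, i),
                       reverse=True)[:keep]
    # The second technique is applied to the shortlist only.
    return sorted(shortlist, key=lambda i: collab_score(user, i),
                  reverse=True)


if __name__ == "__main__":
    print(switched("u", "a"), switched("u", "b"))
    print(cascade("u", ["a", "b", "c", "d"]))
```

Note how the switched variant falls back to content-based scores for the sparsely rated item, and how the cascade never evaluates the second technique on items discarded by the first.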
Apart from the specific weaknesses of both content-based and collaborative recommendation approaches, there exist other general limitations in current recommender systems.
• Poor understanding of users and items. Most recommender systems produce ratings that are based on limited information about users and items, as captured by user and item profiles, and do not take full advantage of information from users’ behaviour, transactional histories and other available data. For example, classical collaborative filtering methods rely exclusively on the ratings information to make recommendations. Although some progress has been made on incorporating user and item profiles into some of the methods since the early days of recommender systems, these profiles tend to be quite simple and do not utilise more advanced profiling techniques. In addition to using traditional profile features such as keywords and simple demographics, more advanced profiling techniques based on data mining could be used to find recommendation rules, behaviour and usage patterns, etc.
• No contextual information within the recommendation process. Traditional recommender systems operate on the two-dimensional User × Item space. That is, they make their recommendations based only on the user and item information, and do not take into consideration additional contextual information that may be crucial in some applications. However, in many situations, the utility of a certain item to a user may depend significantly on time, the people with whom the item will be consumed or
shared, and under which circumstances. For example, a user can have significantly different preferences for the types of movies she wants to see when she is going out to a movie theatre with her boyfriend on a Saturday night, as opposed to watching a rental movie at home with her parents on a Wednesday evening. Using multidimensional settings, the inclusion of knowledge about the user’s task/environment into the recommendation algorithm can lead to better recommendations.
• Non-flexible recommendations. In general, recommendation methods are inflexible in the sense that they support a predefined and fixed set of recommendations. Moreover, most of them only recommend individual items to individual users, and do not deal with aggregation of items and/or users. Group recommendations [36][37][44] are still open to investigation and innovation.
• No support for multi-criteria ratings. Most current recommender systems deal with single-criterion ratings. However, it is important to be able to provide aggregated recommendations in a number of applications, such as recommending brands or categories of items to certain segments of users. In some applications, it is crucial to incorporate multi-criteria ratings into recommendation methods. Multi-criteria ratings have been extensively studied in the Operations Research community.
• Scalability problem. Nearest neighbour-based algorithms require computation that grows with the number of users and the number of items. With millions of users and items, a typical web-based recommender system running existing algorithms will suffer serious scalability problems. Efficient techniques are thus needed for such systems. There exist a number of dimensionality reduction techniques [50], such as Singular Value Decomposition (SVD) [21][32], and efficient clustering strategies, such as co-clustering [24].
• Intrusiveness. Many recommender systems are intrusive in the sense that they require explicit feedback from the user, often at a significant level of user involvement. Some non-intrusive methods of getting user feedback have been presented in the literature. However, non-intrusive ratings are often inaccurate and cannot fully replace explicit ratings provided by the user. Therefore, the problem of minimising intrusiveness while maintaining certain levels of recommendation accuracy needs to be addressed.
• Need for explainability. Recommender systems should be able to explain the recommendations they present to the user [26]: causes, inferences performed from the user profile, considered constraints, etc.

2.4. Our proposal

As explained in the introduction, and as shall be described in detail in the next sections, we propose a multilayered approach to hybrid recommendation, based on the automatic identification of communities of interest (CoI) from semantic user preferences stored in well-structured ontology-based user profiles. Our method builds and compares profiles of user interests for semantic topics and specific concepts in order to find similarities among users. The issue of finding hidden links between users and items based on the similarity of the user preferences/interests (expressed by means of opinions, comparatives or ratings of items) and the item content features is the essence of the hybrid recommender systems already presented. But in contrast to classic collaborative strategies, the comparison is done in our approach by splitting the user profiles into clusters of cohesive interests and, based on this, several layers of CoI are found. This provides a richer model of interpersonal links, which better represents the way people find common interests in real life. According to the taxonomy of hybrid recommender systems given previously, our approach adopts the so-called “collaborative via content” paradigm [45] and can be categorised as a meta-level hybrid recommender.
The users’ interests are represented as semantic concepts of domain ontologies, and a collaborative recommendation mechanism is then applied which takes into account the similarities between such content-based user profiles.

Our proposal addresses some of the limitations of current recommender systems, including both content-based and collaborative filtering strategies. As we will show, the semantic relations between concepts and instances of the knowledge ontologies are exploited in our approach to reduce the impact of problems such as restricted content analysis, preference/rating sparsity, cold-start, content overspecialisation, or portfolio effects. Moreover, through our mechanism for identifying multilayered communities of interest, we are able to discover relations between users at different levels, increasing the possibilities of finding similarities for those users without very common/popular interests (the gray sheep problem). In addition, our user profile representation and content retrieval mechanism are open to new strategies for group-oriented, context-aware, query-driven and multi-criteria recommendations, research fields which we have already started to investigate.

We shall show results obtained from empirical evaluations of the model. As explained in the final sections, we conducted experiments with two different repositories: one obtained manually from real users, and one generated automatically by merging information from the IMDb and MovieLens repositories.
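The “collaborative via content” idea described above can be illustrated with a minimal sketch: users are compared through their content-based profiles rather than through co-rated items, and the resulting similarities weight the neighbours' ratings. This is only a generic illustration under stated assumptions, not the multilayered model proposed here: it deliberately omits the clustering of profiles into layers of CoI, and the names `cosine` and `predict`, the toy profiles and the similarity-weighted average are all hypothetical choices.

```python
import math
from typing import Dict, Optional

Profile = Dict[str, float]  # concept -> interest weight in [0, 1]


def cosine(a: Profile, b: Profile) -> float:
    """Cosine similarity between two sparse concept-weight profiles."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(target: str, item: str,
            profiles: Dict[str, Profile],
            ratings: Dict[str, Dict[str, float]]) -> Optional[float]:
    """Similarity-weighted average of the ratings given by other users.

    Neighbours are weighted by the similarity of their content-based
    profiles, so no co-rated items are required.
    """
    num = den = 0.0
    for other, profile in profiles.items():
        if other == target or item not in ratings.get(other, {}):
            continue
        sim = cosine(profiles[target], profile)
        num += sim * ratings[other][item]
        den += sim
    return num / den if den else None


# Hypothetical toy data: Ana has rated nothing yet.
profiles = {
    "ana": {"comedy": 1.0, "drama": 0.2},
    "ben": {"comedy": 0.9, "drama": 0.1},
    "cat": {"horror": 1.0},
}
ratings = {"ben": {"m1": 5.0}, "cat": {"m1": 1.0}}

if __name__ == "__main__":
    print(predict("ana", "m1", profiles, ratings))
```

In this toy example Ana shares no ratings with anyone, yet a prediction is still possible because her profile is close to Ben's and orthogonal to Cat's, which is the kind of sparsity relief the meta-level scheme aims for.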
3. Ontology-based recommendations

In this section, we present our approach to the semantic description of user preferences and items in terms of concepts and instances defined in domain ontologies. We also present a basic content-based recommendation model that is used as the baseline approach for the experiments performed with our hybrid recommendation models.

3.1. Knowledge representation

In contrast to other strategies in personalised content retrieval, our approach makes use of explicit user profiles (as opposed to, e.g., sets of preferred documents). Working within an ontology-based personalisation framework [56], user preferences are represented as vectors $\mathbf{u}_m = (u_{m,1}, u_{m,2}, \ldots, u_{m,K})$, where $u_{m,k} \in [0,1]$ measures the intensity of the interest of user $u_m \in \mathcal{U}$ for concept $c_k \in \mathcal{O}$ (a class or an instance) in a domain ontology $\mathcal{O}$, $K$ being the total number of concepts in the ontology. Similarly, the items $d_n \in \mathcal{D}$ in the retrieval space are assumed to be annotated by vectors $\mathbf{d}_n = (d_{n,1}, d_{n,2}, \ldots, d_{n,K})$ of concept weights, in the same vector space as user preferences. Based on this common logical representation, measures of user interest for content items can be computed by comparing preference and annotation vectors, and these measures can be used to prioritise, filter and rank contents (a collection, a catalogue, a search result) in a personal way. Figure 5 shows our twofold-space ontology-based knowledge representation, in which $M$ and $N$ are respectively the number of users and items registered in the system.

Figure 5. Ontology-based user profiles and item descriptions

The ontology-based representation is richer and less ambiguous than a keyword-based or item-based model. It provides an adequate grounding for the representation of coarse to fine-grained user interests (e.g. interest for items such as a sports team, an actor, a stock value), and can be a key enabler to deal with the subtleties of user preferences.
An ontology provides further formal, computer-processable meaning on the concepts (who is coaching a team, an actor’s filmography, financial data on a stock), and makes it available for the personalisation system to take advantage of. Furthermore, ontology standards, such as RDF¹ and OWL², support inference mechanisms that can be used to enhance personalisation, so that, for instance, a user interested in animals (superclass of cat) is also recommended items about cats. Conversely, a user interested in lizards and snakes can be inferred, with a certain confidence, to be interested in reptiles. Similarly, a user fascinated by the life of actors and actresses can be recommended items in which, for example, the name of Brad Pitt appears, because that person could be an instance of the class Actor. Also, a user keen on Spain can be assumed to like Madrid, through the transitive relation locatedIn. These characteristics are exploited in our recommendation models.

3.2. Content-based recommendation model

With the presented knowledge representation, we use a retrieval model (component ‘Item retrieval’ in Figure 6) that works in two phases [16]. In the first one, a formal ontology-based query is issued by some form of query interface (e.g. NLP-based) formalising a user information need. The query is processed, outputting a set of ontology concepts that satisfy it. From this point, the second phase is based on an adaptation of the classic vector-space IR model [4][49], where the axes of the space are the concepts of $\mathcal{O}$ instead of text keywords. The query and each item are thus represented by vectors $\mathbf{q}$ and $\mathbf{d}$, so that the satisfaction of a query by an item can be computed by the cosine measure between them.

Figure 6. Personalised ontology-based content retrieval

The problem, of course, is how to build the $\mathbf{q}$ and $\mathbf{d}$ vectors. For more details, see [15][57].
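For reference, the cosine measure mentioned above for the second retrieval phase has the standard vector-space form shown below. The symbol $\mathrm{sim}$ and the small numerical example are ours, introduced only for illustration; the example assumes a hypothetical ontology with $K = 3$ concepts.

```latex
\mathrm{sim}(\mathbf{d}, \mathbf{q})
  = \cos(\mathbf{d}, \mathbf{q})
  = \frac{\mathbf{d} \cdot \mathbf{q}}{\lVert \mathbf{d} \rVert \, \lVert \mathbf{q} \rVert}
  = \frac{\sum_{k=1}^{K} d_k \, q_k}
         {\sqrt{\sum_{k=1}^{K} d_k^2} \, \sqrt{\sum_{k=1}^{K} q_k^2}}

% Worked example with K = 3 (hypothetical weights):
% q = (1, 0, 1), d = (0.5, 0.5, 0)
% d . q = 0.5, ||q|| = \sqrt{2}, ||d|| = \sqrt{0.5}
\mathrm{sim}(\mathbf{d}, \mathbf{q})
  = \frac{0.5}{\sqrt{0.5}\,\sqrt{2}} = 0.5
```

Since all weights are non-negative, the measure lies in $[0, 1]$, and an item annotated with none of the query concepts scores 0.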
Here we set aside the construction of the $\mathbf{q}$ and $\mathbf{d}$ vectors, and continue explaining our content retrieval process with its personalisation phase (component ‘Personalised Ranking’ in Figure 6). Once a user profile is obtained, our notion of content retrieval is based on a matching algorithm that provides a personal relevance measure $\mathrm{pref}(d, u)$ of an item $d$ for a user $u$. This measure is set according to the semantic preferences of the user and the semantic annotations of the item, based again on a cosine-based vector similarity:

¹ Resource Description Framework, www.w3.org/RDF
² Web Ontology Language, www.w3.org/2004/OWL