RECOMMENDER SYSTEMS multicriteria User Modeling in Recommender Systems Kleanthi Lakiotaki and Nikolaos F. Matsatsinis, Technical University of Crete Alexis Tsoukias, Universite Paris Dauphine R recommender systems are software applications that attempt to reduce information overload by recommending items of interest to end users a bybrid based on their preferences, possibly giving movie, song, or other product sug- recommender gestions. In this sense, an accurate recommender system ideally will be able to systems framework act on the user's behalf I To achieve this goal, We form these profiles with a specially de it must gain knowledge of the user's value signed disaggregation-aggregation approach creates user- system and decision policy. Most existing that uses the UTA"(Utilites Additives recommender systems use a collaborative- algorithm profile groups filtering approach, some are based on a We demonstrate this proposed method content-based approach, and many have at- ology as a movie recommender system and before applying tempted to combine these two methods into test its performance with real user data hybrid frameworks. (See related research Furthermore, through a comparison study a collaborative for representative review of recommender systems.) collaborative-filtering methodologies, we filtering algorithm Multiple-criteria decision analysis (MCDA) clearly prove that creating user profile is a well-established field of decision sci- groups according to the proposed metho by incorporating techniques from n the support them in the decision-making tion process tegral part of a recommenda- ence that aims at analyzing and mod ology is an eling decision makers' value systems to process. This article presents a hybrid meth- Personalization multiple-criteria odological framework that combines tech- Personalization and customization are con niques from the MCDA field and, more sidered increasingly important elements of decision-analysis specifically, from the disaggregation- marketing applications. By personalization aggregation approach to model users' and customization, we usually refer to ex- (MCDA) field preferences together with the collaborative- ploiting information about a user (who filtering technique to identify the most- might be a customer, individual, or group)to preferred unknown items for every user. better design products and services targeted 1541-1672/11 0◎2011|EEE IEEE INTELLIGENT SYSTEMS Computer
64 1541-1672/11/$26.00 © 2011 IEEE IEEE INTELLIGENT SYSTEMS Published by the IEEE Computer Society R e c o m m e n d e r S y s t e m s Multicriteria User Modeling in Recommender Systems Kleanthi Lakiotaki and Nikolaos F. Matsatsinis, Technical University of Crete Alexis Tsoukiàs, Université Paris Dauphine A hybrid recommender systems framework creates userprofile groups before applying a collaborativefiltering algorithm by incorporating techniques from the multiple-criteria decision-analysis (MCDA) field. act on the user’s behalf.1 To achieve this goal, it must gain knowledge of the user’s value system and decision policy. Most existing recommender systems use a collaborativefiltering approach, some are based on a content-based approach, and many have attempted to combine these two methods into hybrid frameworks. (See related research for representative review of recommender systems.2) Multiple-criteria decision analysis (MCDA) is a well-established field of decision science that aims at analyzing and modeling decision makers’ value systems to support them in the decision-making process. This article presents a hybrid methodological framework that combines techniques from the MCDA field and, more specifically, from the disaggregationaggregation approach to model users’ preferences together with the collaborativefiltering technique to identify the mostpreferred unknown items for every user. We form these profiles with a specially designed disaggregation-aggregation approach that uses the UTA* (Utilités Additives) algorithm. We demonstrate this proposed methodology as a movie recommender system and test its performance with real user data. Furthermore, through a comparison study with other single- and multiple-criteria collaborative-filtering methodologies, we clearly prove that creating user profile groups according to the proposed methodology is an integral part of a recommendation process. Personalization Personalization and customization are considered increasingly important elements of marketing applications.3 By personalization and customization, we usually refer to exploiting information about a user (who might be a customer, individual, or group) to better design products and services targeted Recommender systems are software applications that attempt to reduce information overload by recommending items of interest to end users based on their preferences, possibly giving movie, song, or other product suggestions. In this sense, an accurate recommender system ideally will be able to IS-26-02-Lakio.indd 64 18/03/11 3:30 PM
Related Work in Decision Science n the 18th century, Nicolas de Condorcet (1743-1794) constructs user profiles by exploiting preference infor divided the decision process into three stages: mation that is integrated into a user value system and using these profiles to identify recommendation pattern the first discussion phase, where the principles that uld serve as the basis for the decision are discussed A In the past, researchers have attempted to detect cross links between the MCDa methodological frameworks and the second discussion phase, in which the question is other scientific fields and disciplines, such as al and ma clarified, with opinions approached and combined with chine learning to enhance the preference modeling capa each other to a small number of more general opinions, bilities of MCDa approaches and to improve their overall and alternatives are determined; and performance and efficiency. However, careful consideration the third phase, which consists of the actual choice is necessary to incorporate multicriteria ratings in an ex- between these alternatives isting recommendation process or to design new recom- mendation techniques to achieve maximum Later, Herbet A Simon adjusted to make them suitable for decisions in organizations, by proaches Zhang and his colleagues recently introduced multicriteria placing them into the three intelligence, design, and choice rating in recommender systems from a statistical machine hases. ' In a more recent paper, Alexis Tsoukias introduced a descriptive model of the decision-aiding process that in- The UTARec system, a predecessor of our proposed sys- volves a set of activities occurring between a decision maker and an analyst, who develops a formal model to help the lecision-analysis techniques in recommender systems. g decision maker face a problematic situation. 2 This model However, UTARec constituted only an experimental proof considers the decision-aiding process as a cognition process of the multicriteria algorithm efficiency to predict real user introducing schematically the cognitive artifacts aimed at ratings and served as a stepping stone for the integrated supporting the decision makers decision process. Within hybrid multicriteria recommender system we present here. such a model, a recommendation results from the construc R creTeS tion of an evaluation model resulting from a problem formu- lation that formally represents a specific problem situation. he New Science of ma Multiple-criteria decision analysis(MCDA), a well- rentice Hall. 1977. established field of decision science, comes in a variety of 2. A. Tsoukias, "On the Concept of Decision Aiding Process: An Operational Perspective " Annals of Operations Research, theories, methodologies, and techniques. 3 MCDA aims assisting a decision maker in dealing with the ubiquitous teria. A common approach states that mCDa is a method ser Profiling in Recommender Systems, "ACM Trans. logy enabling the construction of a reliable and convin Information Systems, vol 22, no 1, 2004, pp. 54-88 ing model when several alternatives need to be assessed 5.L. against multiple attributes under different problem state- for User Modeling, " User Modeling and User-Adapted ments(such as choosing, ranking, or classifying raction, vol. 11, nos. 1-2, 2001. 6. S Berkovsky, T. Kuflik, and F Ricci, "Mediation of User Models User modeling is a cross-disciplinary research field that recommender Systems, attempts to construct models of human behavior within User Modeling and User-Adapted Interaction, vol 18. a specific computer environment. Some approaches of o.3,2008,pp.245-286. modeling user preferences have already been applied to 7. G. Adomavicius and Y.o. Kwon. "New Recommendation recommender systems and mainly adopt techniques and Techniques for Multicriteria Rating Systems, "IEEE Intelligent methodologies from the larger fields of Al, knowledge Systems, vol 22, no 3, 2007, pp 48-55. engineering, or data mining, such as ontological user pro- Y Zhang et al, "Applying Probabilistic Latent Semantic filing, or from statistics. 5 Yet, new user-modeling ideas and approaches appear throughout literature, indicating 9. K. Lakiotaki s Tsafarakis, and N. Matsatsinis,"UTA-Rec that it has emerged as an important functional tool to en- A Recommender System Based on Multiple Criteria Analysis, hance recommender system performance. However, to o Proc. ACM Conf Recommender Systems, ACM Press, 2008. knowledge, we are the first to propose an approach that pp.219-226. to that user. Recommender systems complete and accurate user profiles, information for the recommendation that assist users in discovering the and thus, user modeling has become algorithm. This overall rating depends items closest to their preferences con- a key area in the development of such only on a single criterion that usually stitute an intrinsically important technologi represents the overall preference of tool of personalization technologies. So far, the majority of existing rec- user u on item i. However, previous Nevertheless, the effectiveness of per- ommender systems obtain an over- work has argued that recommender sonalized services highly depends on all numerical rating rui as input systems researchers should focus MARCHIAPRIL 2011 computer. org/intelligent
March/April 2011 www.computer.org/intelligent 65 to that user. Recommender systems that assist users in discovering the items closest to their preferences constitute an intrinsically important tool of personalization technologies. Nevertheless, the effectiveness of personalized services highly depends on complete and accurate user profiles, and thus, user modeling has become a key area in the development of such technologies. So far, the majority of existing recommender systems obtain an overall numerical rating rui as input information for the recommendation algorithm. This overall rating depends only on a single criterion that usually represents the overall preference of user u on item i. However, previous work has argued that recommender systems researchers should focus more I n the 18th century, Nicolas de Condorcet (1743–1794) divided the decision process into three stages: • the first discussion phase, where the principles that would serve as the basis for the decision are discussed; • the second discussion phase, in which the question is clarified, with opinions approached and combined with each other to a small number of more general opinions, and alternatives are determined; and • the third phase, which consists of the actual choice between these alternatives. Later, Herbet A. Simon adjusted the existent approaches to make them suitable for decisions in organizations, by placing them into the three intelligence, design, and choice phases.1 In a more recent paper, Alexis Tsoukiàs introduced a descriptive model of the decision-aiding process that involves a set of activities occurring between a decision maker and an analyst, who develops a formal model to help the decision maker face a problematic situation.2 This model considers the decision-aiding process as a cognition process, introducing schematically the cognitive artifacts aimed at supporting the decision maker’s decision process. Within such a model, a recommendation results from the construction of an evaluation model resulting from a problem formulation that formally represents a specific problem situation. Multiple-criteria decision analysis (MCDA), a wellestablished field of decision science, comes in a variety of theories, methodologies, and techniques.3 MCDA aims at assisting a decision maker in dealing with the ubiquitous difficulties in seeking compromise or consensus between conflicting interests and goals, represented by multiple criteria. A common approach states that MCDA is a methodology enabling the construction of a reliable and convincing model when several alternatives need to be assessed against multiple attributes under different problem statements (such as choosing, ranking, or classifying). User modeling is a cross-disciplinary research field that attempts to construct models of human behavior within a specific computer environment. Some approaches of modeling user preferences have already been applied to recommender systems and mainly adopt techniques and methodologies from the larger fields of AI, knowledge engineering, or data mining, such as ontological user profiling,4 or from statistics.5 Yet, new user-modeling ideas and approaches appear throughout literature, indicating that it has emerged as an important functional tool to enhance recommender system performance.6 However, to our knowledge, we are the first to propose an approach that constructs user profiles by exploiting preference infor mation that is integrated into a user value system and using these profiles to identify recommendation patterns. In the past, researchers have attempted to detect cross links between the MCDA methodological frameworks and other scientific fields and disciplines, such as AI and machine learning, to enhance the preference modeling capabilities of MCDA approaches and to improve their overall performance and efficiency. However, careful consideration is necessary to incorporate multicriteria ratings in an existing recommendation process or to design new recommendation techniques to achieve maximum accuracy.7 Yin Zhang and his colleagues recently introduced multicriteria rating in recommender systems from a statistical machinelearning perspective.8 The UTARec system, a predecessor of our proposed system, is an initial demonstration of applying multicriteria decision-analysis techniques in recommender systems.9 However, UTARec constituted only an experimental proof of the multicriteria algorithm efficiency to predict real user ratings and served as a stepping stone for the integrated hybrid multicriteria recommender system we present here. References 1. H.A. Simon, The New Science of Management Decision, Prentice Hall, 1977. 2. A. Tsoukiàs, “On the Concept of Decision Aiding Process: An Operational Perspective,“ Annals of Operations Research, vol. 154, no. 1, 2007, pp. 3–27. 3. J. Figueira, S. Greco, and M. Ehrgott, eds., Multiple Criteria Decision Analysis: State of the Art Surveys, Springer, 2005. 4. S.E. Middleton, N.R. Shadbolt, and D.C.D. Roure, “Ontological User Profiling in Recommender Systems,” ACM Trans. Information Systems, vol. 22, no. 1, 2004, pp. 54–88. 5. I. Zukerman and D.W. Albrecht, “Predictive Statistical Models for User Modeling,” User Modeling and User-Adapted Interaction, vol. 11, nos. 1–2, 2001. 6. S. Berkovsky, T. Kuflik, and F. Ricci, “Mediation of User Models for Enhanced Personalization in Recommender Systems,” User Modeling and User-Adapted Interaction, vol. 18, no. 3, 2008, pp. 245–286. 7. G. Adomavicius and Y.O. Kwon, “New Recommendation Techniques for Multicriteria Rating Systems,” IEEE Intelligent Systems, vol. 22, no. 3, 2007, pp. 48–55. 8. Y. Zhang et al., “Applying Probabilistic Latent Semantic Analysis to Multi-criteria Recommender System,” AI Comm., vol. 22, no. 2, 2009, pp. 97–107. 9. K. Lakiotaki, S. Tsafarakis, and N. Matsatsinis, “UTA-Rec: A Recommender System Based on Multiple Criteria Analysis,” Proc. ACM Conf. Recommender Systems, ACM Press, 2008, pp. 219–226. Related Work in Decision Science IS-26-02-Lakio.indd 65 18/03/11 3:30 PM
RECOMMENDER SYSTEMS on the user-oriented perspective, indi- to learn the decision maker's pref. With the completion of this ste cating that people are not satisfied by erences. Both decision-support and we form a data matrix that acts as an existing recommender systems recommender systems try to assist input for the second phase to more effective person- the decision maker and user, respec- alization services is the ability to de- tively, throughout the decision- Second Phase: Multicriteria velop a system able to understand not making process. This decision might User Modeling only what people like, but why they vary from a simple purchase of an The multicriteria input data ma like it. In other words, an accurate item to more sophisticated manage- trix is next analyzed and processed modeling of a user's value system and rial matters throughout the systems second an effective preference representation sign of a recommendation algorithm To begin, we discuss the cr.ork phase leading to the formation of a schema will potentially lead to the de- Methodological Framework single k-dimensional vector for every user, which we call the significance with increased performance. Such a all framework of our proposed ap- weight vector or merely the weight system could understand how users proach together with the systems vector. During this second phase, we think about items by considering the individual components. Figure 1 apply the UTA, algorithm, 6 one of knowledge about the underlying at- summarizes our systems overall the most representative and widely tributes that attract users to choose process structure, and the follow- applied disaggregation-aggregation Preferences,not just patterns, ensur- involved ections outline the steps framework algorithms, to analyze the user's cognitive decision policy. ing a more sophisticated understand The UTA" algorithm adopts the pref- ing of the user. First Phase: Data Acquisition erence disaggregation principle, the In decision theory, the MCDA field For the data-acquisition procedure, philosophy of which is to assess emerged from the fact that real-world we must first gather two types of data infer preference models from given decision-making problems are intrin- attained from user statements. The preferential structures sically multidimensional. MCDa first type is preference data given as Figure 2 shows the complete se aims at aiding the decision maker numerical ratings, and the second ries of step in the disaggregation- in the decision-making process, deals with preference statements in aggregation approach. The first step concerning a set of objects, actions, the form of a ranking order, or the determines the problem statement. alternatives, and items evaluated on weak preference order. Among the various problem state multiple points of view, which are To acquire user-preference infor- ments in decision-aiding theory, three roughly referred to as criteria(attri- mation, every user u, E U, where t= 1, are most appropriate for this domain: butes, features, variables, and so on). 2, . n, and n is the total number of By "aiding, "we mean that MCDA users asked to evaluate a set of items choosing one or more potential ac- supports decision makers but does A; E AR, named the reference set Ar. tions from a set of actions(alterna not substitute them during the deci- For every alternative A; E AR, i= 1, tives)A sion process. (See the "Related Work 2, -. m, where m is the length of Ar, ranking those alternatives in a in Decision Science"sidebar for pre- the user u provides a rating rui, for ev- descending order, and vious research. ery criterion ci, j= 1, 2, .. k, where k sorting them into predefined or Under such a perspective, we face is the total number of criteria, follow- dered categories the recommendation process ing a predefined measurement scale decision problem and exploit tech- (that is, from one to five). In addition There are various ways to present niques from decision theory and the to these individual evaluations, we recommendations to the end user. We MCDA field to accurately build a ask users to rank all the alternatives can offer users the best item(choos- erences. Following this framework, descending order and thus provide, p model representing the user's pref- that belong to the reference set in a ing), present the top N items as a recommendation list (ranking),or a potential user in a recommender weak preference order. Indifference classify the items into categories, such system corresponds to the decision relations are acceptable in the rank- as highly recommended, fairly rec- maker in a decision process. We can ing order and are considered accord- ommended, and not recommended thus consider recommender systems ingly during the multicriteria user-(sorting). Accordingly, a recommenda as decision-support systems that need modeling phase. tion problem can equivalently belong ww.computer.org/intelligent IEEE INTELLIGENT SYSTEMS
66 www.computer.org/intelligent IEEE INTELLIGENT SYSTEMS R e c o m m e n d e r S y s t e m s on the user-oriented perspective, indicating that people are not satisfied by existing recommender systems.4 The key to more effective personalization services is the ability to develop a system able to understand not only what people like, but why they like it. In other words, an accurate modeling of a user’s value system and an effective preference representation schema will potentially lead to the design of a recommendation algorithm with increased performance. Such a system could understand how users think about items by considering the knowledge about the underlying attributes that attract users to choose particular items and hence recognize preferences, not just patterns, ensuring a more sophisticated understanding of the user. In decision theory, the MCDA field emerged from the fact that real-world decision-making problems are intrinsically multidimensional.5 MCDA aims at aiding the decision maker in the decision-making process, concerning a set of objects, actions, alternatives, and items evaluated on multiple points of view, which are roughly referred to as criteria (attributes, features, variables, and so on). By “aiding,” we mean that MCDA supports decision makers but does not substitute them during the decision process. (See the “Related Work in Decision Science” sidebar for previous research.) Under such a perspective, we face the recommendation process as a decision problem and exploit techniques from decision theory and the MCDA field to accurately build a model representing the user’s preferences. Following this framework, a potential user in a recommender system corresponds to the decision maker in a decision process. We can thus consider recommender systems as decision-support systems that need to learn the decision maker’s preferences. Both decision-support and recommender systems try to assist the decision maker and user, respectively, throughout the decisionmaking process. This decision might vary from a simple purchase of an item to more sophisticated managerial matters. Methodological Framework To begin, we discuss the overall framework of our proposed approach together with the system’s individual components. Figure 1 summarizes our system’s overall process structure, and the following subsections outline the steps involved. First Phase: Data Acquisition For the data-acquisition procedure, we must first gather two types of data attained from user statements. The first type is preference data given as numerical ratings, and the second deals with preference statements in the form of a ranking order, or the weak preference order. To acquire user-preference information, every user ut ∈ U, where t = 1, 2, …, n, and n is the total number of users asked to evaluate a set of items Ai ∈ AR, named the reference set AR. For every alternative Ai ∈ AR, i = 1, 2, …, m, where m is the length of AR, the user u provides a rating rui, for every criterion cj, j = 1, 2, …, k, where k is the total number of criteria, following a predefined measurement scale (that is, from one to five). In addition to these individual evaluations, we ask users to rank all the alternatives that belong to the reference set in a descending order and thus provide a weak preference order. Indifference relations are acceptable in the ranking order and are considered accordingly during the multicriteria usermodeling phase. With the completion of this step, we form a data matrix that acts as an input for the second phase. Second Phase: Multicriteria User Modeling The multicriteria input data matrix is next analyzed and processed throughout the system’s second phase, leading to the formation of a single k-dimensional vector for every user, which we call the significance weight vector or merely the weight vector. During this second phase, we apply the UTA* algorithm,6 one of the most representative and widely applied disaggregation-aggregation framework algorithms, to analyze the user’s cognitive decision policy. The UTA* algorithm adopts the preference disaggregation principle, the philosophy of which is to assess and infer preference models from given preferential structures. Figure 2 shows the complete series of step in the disaggregationaggregation approach. The first step determines the problem statement. Among the various problem statements in decision-aiding theory,7 three are most appropriate for this domain: • choosing one or more potential actions from a set of actions (alternatives) A, • ranking those alternatives in a descending order, and • sorting them into predefined ordered categories. There are various ways to present recommendations to the end user. We can offer users the best item (choosing), present the top N items as a recommendation list (ranking), or classify the items into categories, such as highly recommended, fairly recommended, and not recommended (sorting). Accordingly, a recommendation problem can equivalently belong IS-26-02-Lakio.indd 66 18/03/11 3:30 PM
Fourth phase First phase Start value fur New user? ommend Introduce error ystem Yes? recommend nformation for recommender Solve line feedback Multicriteria MCRE data matrⅸ algorithm Stability analysis MCDA analysis Clusterin Significance similar Profile A Profile B Third phase Figure 1. Proposed system's build-up architecture. We follow a four-step sequential procedure to form groups of users with similar preferences and apply collaborative filtering inside each group to predict user ratings to one of the first three problem at this point, in the user modeling requirements are available elsewhere. 5) Although the UTA method per- but ultimately we are trying to pre- as follow. alued hinc. be a nondecreas tatements,depending on its design phase, the goal is to model the users Each criterion mus value system using the UTA method, ing, real tion defined on a formed in this step using the UTa dict ratings for unknown items algorithm belongs to the ranking- Following the disaggregation- g; A[8, lcr/a>g(a)ER(1) problem statement, this does not aggregation methodological schema, imply that the recommendation prob- the modeling process of level two lem should also belong to the same must conclude with a consistent where [8 8, l is the criterion evalua- problem statement. To elucidate the family of criteria [g1, g2,.,g]. tion scale; gr and g; are the worst and inconsistency that seems to emerge (More details on the criterion family the best level of the ith criterion, MARCHIAPRIL 2011 computer. org/intelligent
March/April 2011 www.computer.org/intelligent 67 to one of the first three problem statements, depending on its design architecture. Although the UTA method performed in this step using the UTA* algorithm belongs to the rankingproblem statement, this does not imply that the recommendation problem should also belong to the same problem statement. To elucidate the inconsistency that seems to emerge at this point, in the user modeling phase, the goal is to model the user’s value system using the UTA method, but ultimately we are trying to predict ratings for unknown items. Following the disaggregationaggregation methodological schema, the modeling process of level two must conclude with a consistent family of criteria {g1, g2, …, gk}. (More details on the criterion family requirements are available elsewhere.5) Each criterion must be a nondecreasing, real-valued function defined on A as follows: g A g g a j j j : , [ ] ( ) * → ⊂ → ∈ * g α (1) where [ , ] * g g* j j is the criterion evaluation scale; gj* and gj * are the worst and the best level of the jth criterion, Express global value function Introduce error functions Solve linear program Stability analysis Clustering algorithm MCDA analysis Significance weights Create discrete groups of similar preferences Multicriteria data matrix Collect preferential information for recommender system Marginal utility functions Profile A Profile B Third phase Fourth phase First phase Second phase Profile N ... Show recommender system Start UTA* algorithm End Feedback correction algorithm MCRF algorithm Provide feedback Need recommendation ? New user? No? No? No? Yes? Yes? Yes? | Ruser – rsystem | > MAE Figure 1. Proposed system’s build-up architecture. We follow a four-step sequential procedure to form groups of users with similar preferences and apply collaborative filtering inside each group to predict user ratings. IS-26-02-Lakio.indd 67 18/03/11 3:30 PM
RECOMMENDER SYSTEMS (level 2) g Decision data and users global judgment policy Instead of randomly selecting initia values for all cluster centers this al (level 1) rithm crement by optimally adding one new cluster Consistency of the preference model and user's Preference model center at each stage the one that min- global judgment policy construction imizes a certain clustering criterion Suppose we are given a data set {x1x2,…,xn,xn∈Rd. Thek-clustering Decision problem divides this data set into k disjointed groups called clusters C1, C2,., Ck by optimizing a certain The disaggregation-aggregation approach. According to this approach clustering criterion. We have adopted ision maker's value system is first decomposed into decision variables that the most widely used clustering crite- additively composed to build a decision model, which is as consistent as rion: the sum of squared error(SSE) with the actual decision makers value system. between each data point x;(i =1, 2,,nm)and the centroid m,(i=1, respectively; gia)is the evaluation or with each criterion, approximated by 2,.,k)of the subset Ci, which con performance of action a on the jth cri- linear segments, as well as the crite- tains x This clustering criterion terion; and g(a) is the vector of perfor- ria significance weights (trade-offs depends on the cluster centers m mances of action a on the k criteria. among the criteria values). The latter, m2,..., mk, and is shown in Equation 2 The multicriteria data input matrix expressed as a weight vector, serves is processed by the UTA algorithm each user's value system information- through an iterative ordinal regression representation schema and provides SSE( procedure(Analytical details and an the required user-modeling data to ∑∑(c川x-mP illustrative example of the UTA'algo- proceed to the next phase rithm are available elsewhere. 6) The UTA algorithm considers as Third Phase: Clustering Applied to the set of user weight input a weak-order preference struc- Generally, a clustering algorithm vectors, global k-means labels every ture on a set of actions, together with divides the original data set into dis- user to a specific group the performances of the alternatives jointed groups. Clustering is an unsu on all attributes, and returns as out- pervised process that aims to group Fourth Phase: Recommendation put a set of additive value functions data objects based only on informa- Following the formation of user groups based on multiple criteria. It does this tion found in the data that describes with similar preferences(user-profile in such a way that the resulting struc- the objects and their relationships. clusters), we can provide users with ture will be as consistent as possible The objects within a group should be curate item recommendations by im- with the initial structure given by the similar (or related)as much as possi- plementing the collaborative-filtering user. This is accomplished using spe- ble to one another and should be dif- philosophy inside each user group cial linear programming techniques. ferent from (or unrelated to) the ob- The multidimensional MRCF The UTA algorithm follows four jects in other groups. Most existing (MRCF-dim)approach we apply here basic steps during which all the nec- clustering algorithms are sensitive to is based on multidimensional distance essary parameters to estimate global initial parameters, such as the num- metrics. First, it calculates the dis value functions for each item and ber of clusters and initial centroid tance between two users, u and u, for user are calculated. (See"The UTA positions. To limit these shortcom- the same item, according to Equation 3 Algorithm"sidebar for more details. ings, we use global k-means, a deter Thus, a value is assessed for each al- ministic approach to the traditional ternative that belongs to the reference k-means clustering algorithm, in this dmar=,Eram-Irnl set, quantifying its value to each user third phase. Global k-means does and ensuring consistency with the us- not depend on any initial parameter er's value system. UTA"'s output in- values and employs the k-means al- where ru is the rating vector of user u volves the value functions associated gorithm as a local search procedure. and r is the rating vector of user u ww.computer.org/intelligent IEEE INTELLIGENT SYSTEMS
68 www.computer.org/intelligent IEEE INTELLIGENT SYSTEMS R e c o m m e n d e r S y s t e m s respectively; gj (a) is the evaluation or performance of action a on the jth criterion; and g(a) is the vector of performances of action a on the k criteria. The multicriteria data input matrix is processed by the UTA* algorithm through an iterative ordinal regression procedure. (Analytical details and an illustrative example of the UTA* algorithm are available elsewhere.6) The UTA* algorithm considers as input a weak-order preference structure on a set of actions, together with the performances of the alternatives on all attributes, and returns as output a set of additive value functions based on multiple criteria. It does this in such a way that the resulting structure will be as consistent as possible with the initial structure given by the user. This is accomplished using special linear programming techniques. The UTA* algorithm follows four basic steps during which all the necessary parameters to estimate global value functions for each item and user are calculated. (See “The UTA* Algorithm” sidebar for more details.) Thus, a value is assessed for each alternative that belongs to the reference set, quantifying its value to each user and ensuring consistency with the user’s value system. UTA*’s output involves the value functions associated with each criterion, approximated by linear segments, as well as the criteria significance weights (trade-offs among the criteria values). The latter, expressed as a weight vector, serves as each user’s value system informationrepresentation schema and provides the required user-modeling data to proceed to the next phase. Third Phase: Clustering Generally, a clustering algorithm divides the original data set into disjointed groups. Clustering is an unsupervised process that aims to group data objects based only on information found in the data that describes the objects and their relationships. The objects within a group should be similar (or related) as much as possible to one another and should be different from (or unrelated to) the objects in other groups. Most existing clustering algorithms are sensitive to initial parameters, such as the number of clusters and initial centroid positions. To limit these shortcomings, we use global k-means,8 a deterministic approach to the traditional k-means clustering algorithm, in this third phase. Global k-means does not depend on any initial parameter values and employs the k-means algorithm as a local search procedure. Instead of randomly selecting initial values for all cluster centers, this algorithm acts in an incremental way by optimally adding one new cluster center at each stage, the one that minimizes a certain clustering criterion. Suppose we are given a data set {x1, x2, …, xn}, xn∈Rd. The k-clustering problem divides this data set into k disjointed groups called clusters C1, C2, …, Ck by optimizing a certain clustering criterion. We have adopted the most widely used clustering criterion: the sum of squared error (SSE) between each data point xi (i = 1, 2, …, n) and the centroid mj (j = 1, 2, …, k) of the subset Cj, which contains xi. This clustering criterion depends on the cluster centers m1, m2, …, mk, and is shown in Equation 2. SSE ( , ,..., ) ( ) | | m m m I x C x m K i j j K i N i j 1 2 11 = ∈ − == ∑∑ 2 (2) Applied to the set of user weight vectors, global k-means labels every user to a specific group. Fourth Phase: Recommendation Following the formation of user groups with similar preferences (user-profile clusters), we can provide users with accurate item recommendations by implementing the collaborative-filtering philosophy inside each user group. The multidimensional MRCF (MRCF-dim) approach we apply here is based on multidimensional distance metrics. First, it calculates the distance between two users, u and u′, for the same item, according to Equation 3: duu un u n n k ′ ′ = + = − ∑( ) r r 1 1 2 (3) where ru is the rating vector of user u and ru′ is the rating vector of user u′. Criteria modeling (level 2) Problem (level 1) Decision data and user’s global judgment policy Preference model construction Decision Consistency of the preference model and user’s global judgment policy Figure 2. The disaggregation-aggregation approach. According to this approach, the decision maker’s value system is first decomposed into decision variables that are then additively composed to build a decision model, which is as consistent as possible with the actual decision maker’s value system. IS-26-02-Lakio.indd 68 18/03/11 3:31 PM