O. Nasraoui et al. (Eds.): WebKDD 2005, LNAI 4198, pp. 77–95, 2006. © Springer-Verlag Berlin Heidelberg 2006

USER: User-Sensitive Expert Recommendations for Knowledge-Dense Environments

Colin DeLong¹, Prasanna Desikan², and Jaideep Srivastava²

¹ College of Liberal Arts, University of Minnesota, 101 Pleasant St. SE, 55455 Minneapolis, MN, United States of America
delo0041@umn.edu
² Department of Computer Science, University of Minnesota, 200 Union Street SE, 55455 Minneapolis, MN, United States of America
{desikan, srivasta}@cs.umn.edu

Abstract. Traditional recommender systems tend to focus on e-commerce applications, recommending products to users from a large catalog of available items. The goal has been to increase sales by tapping into the user’s interests, using information from various data sources to make relevant recommendations. Education, government, and policy websites face parallel challenges, except the product is information and their users may not be aware of what is relevant and what isn’t. Given a large, knowledge-dense website and a non-expert user searching for information, making relevant recommendations becomes a significant challenge. This paper addresses the problem of providing recommendations to non-experts, helping them understand what they need to know, as opposed to what is popular among other users. The approach is user-sensitive in that it adopts a ‘model of learning’ whereby the user’s context is dynamically interpreted as they browse, and that information is then leveraged to improve our recommendations.

1 Introduction

Current recommender systems typically ask the question “What does the user or groups of similar users find interesting?” and make recommendations, usually in the form of products or Web documents, on that basis. In some domains, however, the question of what is a relevant recommendation becomes a function of user interest and expert opinion, a need recommender systems haven’t traditionally dealt with.
This paper is meant to address this very issue: providing expert-driven recommendations. To our knowledge, this topic has been little-explored outside of a few tangentially-related recommender systems [24]. Essentially, the goal of an expert-driven recommendation system is to determine the user’s context and provide recommendations by incorporating a mapping of expert knowledge. The idea is to direct users to what they need to know versus what is popular, as the two are not always in perfect alignment with each other. For instance, an important human resources policy affecting some subset of people in an organization may not receive much traffic online, but that in and of itself does not mean the policy is irrelevant or unimportant. Additionally, with more and more organizations putting more and more expert-oriented
content online, user navigational issues can be exacerbated to the point where such websites are essentially unusable for non-experts. Other examples include any websites with dense policy information, such as those created for academic advising, tax law, and human resources.

This paper, which is an extension and update of a previous paper of ours [25], is a step towards providing a generalized framework for expert-driven recommendations. Our starting point is the “Model of Learning”, the philosophical basis from which our system and its components are constructed. “User Intent” is modeled through usage data analysis, producing sets of sequential association rules, and “Expert Knowledge” is modeled through the structure and hypertext of the website itself, an expansion over our previous work, resulting in a concept graph mapped to individual URLs. Three separate rankings (UsageRank, ExpertRank, and GraphRank) are produced and incorporated into an overall USER Rank, which produces the final list of recommendations. To the best of our knowledge, this approach of providing expert-driven recommendations is novel.

Section 2 briefly discusses the related work in this area. In Section 3 we discuss the key problems addressed by this paper. Section 4 describes our method from a philosophical and technical perspective. In Section 5 we give an overview of the architecture for the recommendation system that has been created for testing and evaluation purposes. We present our experiments and results in Section 6. Finally, we provide conclusions and some future research directions in Section 7.

2 Related Work

Research on recommendation systems started in the early 1990s, when these systems were meant to aggregate past preferences of all users to provide recommendations to users with similar interests [18], [21], [8], [17]. Various possible classifications of recommendation techniques have been proposed [6], [21], [22].
There have also been various survey papers with a focus on particular aspects of recommender systems, such as rating-based recommendations [4], hybrid recommendations [5], and e-commerce applications [22]. In general, recommendation systems can be categorized, based on their approach, into four categories, an extension of the initial categorization noted by earlier work [6], [4]:

• Content-Based Approach: These recommendations are derived from the similarity of a product the user is interested in to other products that the user has bought or referred to in the past, and from the preference level for those products as expressed by the user.
• Collaborative Approach: In such an approach, similarity is measured with respect to the end products purchased by other users. For instance, a shared subset of previously-purchased items between the user and other customers would be a driving factor for recommendations.
• Usage-Behavior Recommendations: Similarity based on usage behavior of users is used as a criterion for recommendations. These recommendations are mapped to correlations in browsing behavior and mined from logs such as those generated by Web servers.
• Hybrid Approach: Any combination of the above three approaches.
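The collaborative idea in the list above can be sketched in a few lines. This is only an illustration, not a method from any of the cited systems: the Jaccard overlap measure, the function names `jaccard` and `recommend`, and the toy `purchases` data are all assumptions introduced here.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap of two item sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target: str, purchases: dict, top_n: int = 3) -> list:
    """Score items the target lacks by the summed similarity of users who own them."""
    owned = purchases[target]
    scores = {}
    for user, items in purchases.items():
        if user == target:
            continue
        sim = jaccard(owned, items)
        if sim == 0.0:
            continue  # users with no shared items contribute nothing
        for item in items - owned:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for determinism
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked][:top_n]


# Hypothetical purchase histories (illustrative data only)
purchases = {
    "ann": {"book", "lamp", "pen"},
    "bob": {"book", "lamp", "desk"},
    "cat": {"pen", "ink"},
    "dan": {"sofa"},
}

print(recommend("ann", purchases))  # items owned by users who overlap with ann
```

Note that a user with no purchase history receives no recommendations under this sketch, which is one face of the sparseness and "pump priming" problems associated with collaborative approaches.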
Content-Based and Collaborative Filtering [6], [10] approaches require the availability of user profiles and their explicitly-solicited opinion for a set of targeted products. Collaborative approaches also suffer from well-known problems such as ‘Identification Requirement, the “Pump Priming” problem, Scalability, Sparseness, and Rating Bias’ [7]. Web mining techniques such as Web Usage Mining overcome some of these problems through the use of the implicitly-influenced opinion of the user, extracting information from Web usage logs for ranking relevant Web pages [11], [20], [14]. The goal of most existing recommendation systems that follow collaborative approaches [13] or hybrid approaches [1], [2], [3] is to recommend products in order to boost sales, as opposed to recommending the ‘products’ (Web documents, in this paper) necessary to the user.

Web page recommendations and personalization techniques in which a user’s preferences are automatically learned through Web usage mining became popular in the late 1990s [14], [19]. Li and Zaiane have illustrated an approach to combine content, structure, and usage for Web page recommendations [11]. Huang et al. make use of transitive associations and a bipartite graph to address the sparsity issue [27] and have conducted thorough research on other graph-based models as well [28]. An extensive study of Web personalization based on Web usage mining can be found in [12]. Ishikawa et al. [9] have empirically studied the effectiveness of Web usage mining as a recommendation system. Most of these recommendation systems have been developed through the use of data obtained from the explicit link structure of the Web and Web usage logs. For this extended paper, we use these same sets of data in our efforts to generalize the expert-recommendation problem and its potential solutions.
Our previous work in this area [25] focused on a richer data set, whereby each Web document was assigned one or more concepts by an expert via a custom content management system (CMS) called Crimson. For this paper, a similar mapping of concepts for Web pages has been assembled using the link structure and link text, but the construction of this mapping is automated, a departure from traditional knowledge-based methods. In our approach, the web graph itself can be thought of as an unprocessed representation of domain knowledge. This information is then leveraged to generate a set of recommendations relevant to the user’s context in a knowledge-dense Web environment.

3 The Problem

The primary reason for the creation of an expert-driven recommender system is to facilitate the navigation of non-expert surfers through websites containing expert-defined content, the number of which is increasing rapidly. Parallel with the growth of the World Wide Web, user demands for complete content availability have also increased. To keep up with expectations of constituent users, expert-defined websites, such as those found in higher education, medicine, and government, are making available vast amounts of technically-oriented content, including policies, procedures, and other knowledge-dense information. Generally, this trend is good for users in that more information can be easily accessed by visiting websites, but difficulties arise when attempting to connect users to the “right” content. Here, two key problems arise.
First, the surfer is often unable to “ask the right question”. Even if expert-defined websites are well-organized, the content language is often not intuitive to non-experts. Non-experts browsing such a website may quickly find themselves lost or uninterested because of this. Additionally, a site search using Google, for instance, will not always return relevant links if the surfer’s query is inconsistent with the website’s language. As such, surfer navigation can become a formidable task. This is further complicated when legal concerns must take precedence over the simplification of content wording, which might otherwise leave out important clauses or restrictions that could be applicable to the surfer.

The second problem, related to the first, is how expert-defined websites can meet educational goals by connecting surfers to content relevant to their questions. The determination of relevance is at the core of all recommender systems. However, for expert-defined websites, such recommendations must depend not only on the surfer’s definition of relevance, but also on the expert’s opinion. Higher education websites face this issue in particular, where a student’s context must be brought into alignment with certain academic outcomes, such as timely graduation. As such, a set of relevant recommendations might include links that Web usage mining alone would have never produced. Relevance in an expert recommender system is, therefore, a function of both surfer intent and expert knowledge. This lays the foundation for the model of learning on which our initial system is built.

4 Method

The philosophical groundwork of learning in general had to be examined prior to the implementation of our candidate expert recommender system. This gave us an intuitive base to work from, presenting a clear picture of what pieces would be necessary in order to build a prototype, and what future work could be done to improve upon it.
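To make concrete the Section 3 claim that relevance is a function of both surfer intent and expert knowledge, the following minimal sketch blends two per-page scores. It is not the USER Rank computation: the linear form, the weight `alpha`, the function name `combined_relevance`, and the example URLs and scores are all hypothetical and chosen only for illustration.

```python
def combined_relevance(usage: dict, expert: dict, alpha: float = 0.5) -> list:
    """Blend usage-derived and expert-derived page scores (both assumed in [0, 1]).

    alpha = 1.0 ranks purely by usage (popularity); alpha = 0.0 purely by expert opinion.
    The linear blend is an assumption made for this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    pages = set(usage) | set(expert)
    scored = {
        page: alpha * usage.get(page, 0.0) + (1.0 - alpha) * expert.get(page, 0.0)
        for page in pages
    }
    # Highest combined score first; ties broken by URL for determinism
    return sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores: a popular page versus a low-traffic but important policy page
usage_scores = {"/campus-events": 0.75, "/13-credit-policy": 0.25}
expert_scores = {"/campus-events": 0.0, "/13-credit-policy": 1.0}

for page, score in combined_relevance(usage_scores, expert_scores, alpha=0.5):
    print(f"{page}: {score:.3f}")
```

With usage alone (`alpha=1.0`) the popular page ranks first, while an even blend moves the policy page to the top, which is the kind of recommendation that usage mining by itself would not have produced.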
4.1 Philosophical

Here, we introduce a model of learning which is ultimately recursive in nature, and represents the core philosophy at the foundation of our recommender system. This model is constructed from interactions typically found in a conversation between an expert and a learner, such as those between a student and their academic advisor.

First, a question is formulated by the learner and asked of the expert, who then leverages their accumulated knowledge in order to answer the question and/or ask other questions which try to discern the intent of the learner. Next, the expert must align the question’s language with the language the answer or answers are defined in. As a result, the learner is able to more properly formulate additional questions and the expert can be more helpful, and more detailed, in their replies.

Using the student/advisor interaction as an example, an advisor may be asked about a policy the student does not understand (such as the 13-credit policy, where students must file a petition in order to take fewer than 13 credits of coursework in a semester). The student’s question might not be worded consistently with the name or content of the policy they are seeking to know more about, especially if it is technically worded for legal reasons. The advisor must then bridge this knowledge gap by trying to discover the intent of the
question and connect it to appropriate answers. Through conversation, the gap is further narrowed as the student’s questions become more consistent with the language the policy is defined in.

Often, the advisor may have questions about the reason for the student’s question (“Why do you want to file a petition?”), which may uncover additional information that can help the advisor in recommending the best course of action for the student. A student may not want to be asked about why they want to take fewer than 13 credits, but it is the advisor’s job to encourage a student’s academic success, even if it means telling them to take a full credit load. This is a vital piece of our model: an expert is teaching the learner what they need to know, versus telling them what they want to know. This is the primary difference between the model presented in this work and those of other recommender systems.

[Figure 1 labels: Learner; Expert; Query Words, Usage Sequence; List of Recommendations]

Fig. 1. Philosophical Model of Learning

Figure 1 is a visual representation of this model as demonstrated through the dynamic of an expert and a learner. Similar to the conversation between the student and their advisor, the model is recursive, a representation of learning at the meta level. We use this model as the foundation for our recommender system, which incorporates both learner intent and expert knowledge in order to generate a list of recommendations.

4.2 Technical

The different perspectives of the expert and user are captured using different models built from the available data (described in more detail below). The problem then reduces to applying the set of ranking methods to the data representing expert and user opinions and finally merging them. The data contains the following descriptive elements: