Can Concept-Based User Modeling Improve Adaptive Visualization?

Jae-wook Ahn and Peter Brusilovsky
School of Information Sciences, University of Pittsburgh, Pittsburgh, PA, 15260
{jahn,peterb}@mail.sis.pitt.edu

P. De Bra, A. Kobsa, and D. Chin (Eds.): UMAP 2010, LNCS 6075, pp. 4–15, 2010. © Springer-Verlag Berlin Heidelberg 2010

Abstract. Adaptive visualization can present user-adaptive information in such a way as to help users to analyze complicated information spaces easily and intuitively. We presented an approach called Adaptive VIBE, which extended the traditional reference point-based visualization algorithm, so that it could adaptively visualize documents of interest. The adaptive visualization was implemented by separating the effects of user models and queries within the document space and we were able to show the potential of the proposed idea. However, adaptive visualization still remained in the simple bag-of-words realm. The keywords used to construct the user models were not effective enough to express the concepts that need to be included in the user models. In this study, we tried to improve the old-fashioned keyword-only user models by adopting more concept-rich named-entities. The evaluation results show the strengths and shortcomings of using named-entities as conceptual elements for visual user models and the potential to improve the effectiveness of personalized information access systems.

1 Introduction

Personalized information access is one of the most important keys to user satisfaction in today’s information environment. Numerous information services and applications are producing new information every second and it is getting more and more complicated to access relevant items in time. Personalization plays a role in that challenge. It tries to solve the problem by understanding a user’s needs and providing tailored information efficiently. There are several approaches for this personalized information access: personalized information retrieval [16], information filtering [10], and adaptive visualization [13, 21].

Among them, adaptive visualization is an attempt to improve information visualization by adding an adaptation component. Through adaptation, users can modify the way in which the system visualizes a collection of documents [21]. It combines algorithm-based personalization with user interfaces in order to better learn about users and to provide personalized information more efficiently. It also shares the spirit of exploratory search [15]. Both attempt to enhance users’ own intelligence by providing more interactive and expressive user interfaces so that
they can achieve better search results. However, adaptive visualization goes further than simple exploratory search, because it actively tries to estimate users’ search context and helps them discover optimal solutions.

In order to implement the adaptive visualization, we extended a well-known visualization framework called VIBE (Visual Information Browsing Environment) [18] and created Adaptive VIBE. VIBE is a reference point (called POI, meaning Point Of Interest)-based visualization method and we extended it to visualize the user models and the personalized search results. We have begun to evaluate this idea [1] and are currently studying user behaviors with the system. However, the user models adopted in the previous study were constructed using the old-fashioned, keyword-based bag-of-words approach. We have long suspected that keyword-based user modeling is limited in dealing with large amounts of data; therefore, we decided to address this problem in the current study by extending the user models and enriching them with semantically richer elements. We chose to use named-entities (NEs, henceforth) as alternatives to the simple keywords. They were expected to be semantically richer than keywords and to represent concepts better.

This paper investigates whether the use of NEs in the user models, especially in the Adaptive VIBE visualization, can lead us to build better personalized information access services. In the next section, the ideas of concept-based user modeling and NE-based information systems are introduced (Sect. 2). In Sect. 3, the proposed adaptive visualization and the concept-based user modeling are described. The following sections explain the methodology and the results of our experiments with the NE-based adaptive visualization. The concluding section discusses the implications of this study and future plans.

2 Concept-Based User Modeling and Named-Entities

Keyword-based user modeling is a traditional approach widely used for content-based personalization and other related areas. Even though this simple bag-of-words approach has been working relatively well, its limitations, such as the polysemy problem and the independence assumption among keywords, have been consistently noted. Therefore, there have been many attempts to build user models that overcome these limitations, and they fall into two categories: network-based and ontology-based user models [8]. Networked user models adopted in projects like [9, 14] were constructed in a way that connected the concept nodes included in the user model networks and tried to represent the relationships among the concepts. Ontology-based approaches [5, 20] incorporated more sophisticated methods. Unlike the network user models, where the relationships were flat, they tried to build user models hierarchically by making use of already-existing ontologies.

Despite all of these efforts, we could still see a chance to enrich the meanings of the user model elements themselves. Therefore, we tried to use NEs as conceptual elements in our user models and to extend the semantics and expressive power of the user models. As a semantic category, NEs act as pointers to
real-world entities such as locations, organizations, people, or events [19]. NEs can provide much richer semantic content than most vocabulary keywords, and many researchers have argued that semantic features can better model essential document content. Therefore, the application of NEs has been considered a way to improve a user’s ability to find and access the right information [19]. They have been studied extensively in various language processing and information access tasks such as document indexing [17] and topic detection and tracking [12]. At the same time, NEs have been successfully adopted by analytic systems such as [4], where user interaction and feedback play a key role similar to that in personalized information access systems.

To our knowledge, however, there has been no attempt to directly incorporate NEs into user model construction. We utilized NEs as conceptual elements for news articles (where NEs can be particularly useful for catching concepts) in one of our previous studies and found that NEs organized into the editor’s 4Ws (Who, What, Where, and When) could assist users in finding relevant information in a non-personalized information retrieval setting [2]. With this experience, we could expect NEs to be high-quality semantic elements which would enhance the user model representation.

3 Adaptive Concept-Based Visualization: The Technology

3.1 Adaptive VIBE Visualization

VIBE was first developed at the University of Pittsburgh and Molde College in Norway [18]. It is a reference point (called POI, Point Of Interest)-based visualization, which displays documents according to their similarity ratios to the POIs, so that more similar documents are located closer to the POIs (for more details about the visualization algorithm, see [11] and [18]). Figure 1 shows examples of the VIBE visualization.
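
To make the idea of placing documents by their similarity ratios concrete, the following is a minimal sketch rather than the algorithm of [11] or [18]. It assumes that a document is drawn at the similarity-weighted centroid of the POI positions, which is one common way to realize ratio-based placement. The function names `radial_layout` and `place_document` are hypothetical and do not come from the VIBE system.

```python
import math


def radial_layout(n_pois, radius=1.0):
    """Place n_pois reference points evenly on a circle (a Radial-style layout)."""
    return [
        (radius * math.cos(2 * math.pi * i / n_pois),
         radius * math.sin(2 * math.pi * i / n_pois))
        for i in range(n_pois)
    ]


def place_document(similarities, poi_positions):
    """Return the (x, y) position of one document.

    Assumption: the position is the centroid of the POI positions,
    weighted by the document's similarity to each POI, so a document
    is pulled toward the POIs it resembles most.
    Returns None when the document is similar to no POI at all.
    """
    total = sum(similarities)
    if total == 0:
        return None
    x = sum(s * px for s, (px, _) in zip(similarities, poi_positions)) / total
    y = sum(s * py for s, (_, py) in zip(similarities, poi_positions)) / total
    return (x, y)
```

Under this assumption, a document with similarities 1 and 3 to two POIs located at (0, 0) and (4, 0) is placed at (3, 0), three quarters of the way toward the more similar POI.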

On top of this general idea, we attempted to add adaptivity by separating the originally-equivalent POIs into multiple groups. The traditional VIBE usually arranged the POIs in a circle, where POIs with different layers of meaning were treated equally (like a round table) and which required further user intervention to organize the different groups of POIs. For Adaptive VIBE, we grouped the different POIs into different locations from the beginning. That is, we separated the two groups of POIs: query POIs and user model POIs. By separating them, we were able to spatially distinguish the documents which were more related to the query from those more related to the user model.

This method is similar to the usual personalized searching method, where documents are re-ranked according to their similarities to user models. The documents more related to user models are brought higher to the top of the ranked list, while less related ones are at the bottom. In Adaptive VIBE, the one-dimensional ranked list is now replaced with a two-dimensional spatial visualization. The documents that used to be scattered all over the screen (located according to their similarities to POIs or query terms) are now organized by whether they are more similar to the query or to the user model. In order to implement this
separation, we added two new adaptive layouts of POIs (Hemisphere and Parallel) to the old circular layout (Radial) as shown in Fig. 1. There, it can be seen that the document space is separated into two parts: the one that is closer to the query side and the other closer to the user model side. This separation is the result of the effect of the user model POIs (using the adaptive layouts).

Fig. 1. Adaptive VIBE layouts (a) Radial, (b) Hemisphere, and (c) Parallel. Yellow (CONVICT and PARDON) and blue (YEAR, POPE, and so on) POIs are query terms and user model keywords, respectively. White squares are retrieved documents.

We could extend the visual user models even further by incorporating conceptual NEs into them. Figure 2 shows an example of this extension. Originally, there were only keyword-based POIs (POPE, YEAR, ESPIONAGE, and CHARGE) but we added five more NE-based POIs to the model (lowercased in the figure). With the addition of these NEs, the user model could express more information. This did not merely increase the number of POIs; it added more meaning to the user model. For example, united_states_of_america is usually split into 4 words and expressed as unite, state, and america (after stemming and stopword removal) in keyword-based approaches. Russia and russian are reduced into one stemmed word, russia. However, these lost meanings were recovered in NE-based user models and we expected that this would help users to access relevant information. The following sections describe the NE-based user model construction process in more detail.

3.2 Named Entity Extraction

We first needed to extract NEs from texts in order to build NE-based user models. For this task, we used software developed by our partner at IBM [7].
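
The loss of meaning in keyword processing, and what an entity-level term preserves, can be illustrated with a small sketch. Everything here is an assumption made for illustration: `crude_stem` is a toy suffix stripper (not the stemmer used in the study), `STOPWORDS` is a tiny hand-made list, and `entity_terms` stands in for the NE annotator with naive substring matching against a hypothetical list of known entities.

```python
STOPWORDS = {"of", "the", "and", "a", "in"}  # tiny illustrative list, not a real stoplist


def crude_stem(word):
    """Toy stemmer (assumption): strips a few suffixes to mimic the examples in the text."""
    if word.endswith("ed") and len(word) > 4:
        return word[:-1]            # united -> unite
    if word.endswith("ian") and len(word) > 5:
        return word[:-1]            # russian -> russia
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]            # states -> state
    return word


def keyword_terms(text):
    """Bag-of-words view: lowercase, drop stopwords, stem each remaining word."""
    return [crude_stem(w) for w in text.lower().split() if w not in STOPWORDS]


def entity_terms(text, known_entities):
    """Entity view: keep each known entity found in the text as one underscore-joined term.

    Naive substring matching is used here only as a placeholder for a real annotator.
    """
    lowered = text.lower()
    return [e.replace(" ", "_") for e in known_entities if e in lowered]


print(keyword_terms("United States of America"))
print(entity_terms("United States of America", ["united states of america"]))
```

The keyword view breaks the phrase into separate stems and merges "Russia" with "Russian", whereas the entity view keeps each entity as a single, distinct term.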
With the help of the NE annotator, we could extract the NEs to construct the user models and calculate the similarity between documents and the entities as inputs to the Adaptive VIBE system. The NE annotation process was based on a statistical maximum-entropy model that recognized 32 types of named, nominal, and pronominal entities (such as person, organization, facility, location, occupation, etc.), and 13 types of events (such as event violence, event communication, etc.). Among them, we selected the nine most frequent entity types.

Fig. 2. Adaptive VIBE enriched with a concept user model. Lowercased elements (pope, prison, russian, russia, united_states_of_america) are NEs.

One very important characteristic of the NE annotator we used was that it could recognize different surface forms of the same entity, both within and across documents. For example, it was able to detect “ski lovers” and “who”, which were pointing to the same group of people, and could give them the same ID within a single document. They were marked as ZBN20001113.0400.0019-E75, which represented the 75th entity in document ZBN20001113.0400.0019. Therefore, those two entities with different forms (“ski lovers” and “who”) could be assessed as having the same meaning (E75) by the system. At the same time, the annotator could do the same thing across documents. It could assign a single ID “XDC:Per:wolfgang schussel” to words and phrases in the text like “Schussel”, “director”, “Chancellor”, and “him”, so that users could grasp the fact that they represented a single person. This capability was considered very promising, since it could deliver the semantics of the entities in the text regardless of the varying textual representations. For more details about the NE annotation algorithm and the selection process, please see [2].

3.3 Construction of Concept-Based User Models Using NEs

As discussed in the previous sections, we assumed that NEs were semantically richer than vocabulary keywords and would contribute greatly to accessing relevant information. This expectation was grounded on our previous study [2], which used NEs as pseudo-facets for browsing information. However, we did not know the best method for constructing NE-based user models. Is it a better approach to use NEs only in user models? What fraction of NEs should be used with keywords, if we choose to mix them?
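
The mixing question can be pictured with a short sketch. It assumes, purely for illustration, that candidate keywords and NEs are each ranked by a count (for example, their frequency in documents the user found relevant) and that the user model takes the top few of each. The source does not state how elements are ranked or selected, and the function name `build_user_model` and its parameters are hypothetical.

```python
from collections import Counter


def build_user_model(keyword_counts, entity_counts, n_keywords, n_entities):
    """Return user model terms: the top n_keywords keywords followed by the top n_entities NEs.

    Assumption: both inputs map a term to a count, and a higher count
    means the term is a stronger candidate for the user model.
    """
    top_keywords = [term for term, _ in Counter(keyword_counts).most_common(n_keywords)]
    top_entities = [term for term, _ in Counter(entity_counts).most_common(n_entities)]
    return top_keywords + top_entities


# Hypothetical counts, reusing POI labels from Fig. 2.
keywords = {"pope": 5, "year": 3, "charge": 2}
entities = {"russia": 4, "united_states_of_america": 6}

mixed = build_user_model(keywords, entities, n_keywords=2, n_entities=1)
entities_only = build_user_model(keywords, entities, n_keywords=0, n_entities=2)
```

Setting either size to zero yields a keywords-only or an NE-only model, so the same function covers both extremes and any mixture in between.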
Therefore, we prepared seven combinations of keyword and NE mixtures, spanning the spectrum between the extreme “keywords-only” mixture and the “NE-only” mixture. They are as follows: k20n0, k10n0, k5n5, k8n8, k10n10, k0n10, k0n20. Here, kxny represents