CHAPTER 2
RELATED AND PREVIOUS WORK

This dissertation draws on several different fields, most notably recommender systems, information seeking, searching (on the Internet, for example), and citation analysis. In this chapter, we review this related and previous work. A detailed discussion of the recommendation process, recommender algorithms, and recommender metrics appears in Chapter 3.

Recommender Systems and Personalization

Recommender systems help a target user navigate through a complex information space by making suggestions of which bits of information the user should consume (i.e. read, watch, listen to, etc.) based on the system's knowledge of the user, other users in the system, and the information space itself [119]. All recommenders use information about a user (sometimes called a user profile, a user model, or user preferences) to generate recommendations. Recommenders come in three varieties: content-based, collaborative-based (also known as 'social'), and knowledge-based [17]. Moreover, there are hybrid recommenders that combine these varieties. Content-based recommenders use domain content to generate recommendations: information and meta-data gathered from items in the domain (e.g. the text of books you previously purchased). Collaborative-based recommenders use people's opinions of items in the domain to generate recommendations (e.g. users with similar item preferences to you liked this book, thus you may like it too). Knowledge-based recommenders use rules, patterns, or connections between items to generate recommendations (e.g. when you are buying a lamp, it suggests that you also buy some light bulbs). After discussing content-based and collaborative recommending, we will discuss one kind of knowledge-based recommending, case-based reasoning.

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Content-Based Recommenders

Content-based recommenders use information from the items themselves to generate recommendations. For example, in a research paper recommender, text extracted from the papers could be used to generate recommendations. Such recommenders use information retrieval and filtering algorithms to generate recommendations. A complete review of information retrieval (IR) and information filtering (IF) is beyond the scope of this dissertation, but a high-level overview helps place our research in context. We will first discuss the similarities and differences between IR and IF, then review common models used in these systems, and finally discuss the relationship between IR, IF, and content-based recommenders.

In their influential 1992 paper, Belkin and Croft provided a clear argument that information filtering (IF) and information retrieval (IR) were much more closely related than had been previously discussed in their respective communities [7]. In it, they argue that both IR and IF share the same five characteristics: a predefined representation and organization of documents, a representation of a user's current information state, a comparison step in which relevant documents are selected, an evaluation step where the user reviews the selected documents, and a possible iteration on the user's information state. The two key points are the user's information state and the possibility of iterating on this state. In combined IR/IF systems, this information state must be translated into a query that the system can parse. Such queries are composed of keywords describing the user's need [5]. This implied translation could seriously affect the user's ability to find documents that meet her information need. Once given this translation of state, the iteration step becomes essential for helping users meet their information need.
These problems also appear in recommender systems, and we discuss them in this dissertation.

While there are many similarities between IR and IF, there are differences as well. These differences can be expressed in a few salient points:
• IR is focused on returning relevant documents from a static corpus, whereas IF is focused on selecting relevant documents from an incoming stream of documents (a constantly changing corpus).
• IF is concerned with the timeliness of the documents returned, preferring new documents to older ones, whereas IR wants the most relevant.
• IR bases its model on independent queries; IF retains query information in a user information model.
• IF assumes a richer and more well-defined user information state than IR. IF has a model of the user's information need, whereas IR usually has only the user's current query.

There are several kinds of models that IR and IF systems use to generate meaningful recommendations. All models index and categorize information from the documents in the corpus. These indexes are based on content extracted from the documents themselves. The correct data representation is very important, as it can limit the kinds of results that IR/IF systems return [51 With indices in place, one of several models can be employed to return relevant documents. We mention three popular ones.

1. Boolean Retrieval Model
In this model, the user's query is augmented with Boolean operators indicating the user's intention with their query (e.g. "recipes AND hamburger OR recipes AND cheeseburger"). This is an 'exact-match' model where the given query terms must match terms found in relevant documents. This model does not provide for relevance ranking, and it could exclude possibly relevant results. It is, however, simple to implement, and is commonly seen in IR systems.

2. The Vector Space Model
In this model, all documents are evaluated using a multi-dimensional vector
space, where each dimension represents a word in the corpus. Similarities between documents can be computed using cosine similarity measures. One extremely popular version of this model weights the different word dimensions based on the frequency of the word in the document and in the corpus, and is called TF/IDF similarity [127]. We will discuss using TF/IDF as an IR algorithm inside a content-based recommender in Chapter 3.

3. The Probabilistic Model
The probabilistic model uses probability calculations to determine the relevance of a query to different documents. For example, a Bayesian inference network can be used to model the relationships between documents in the corpus. When a query is presented to the system, the network propagates a signal depending on the probabilities between nodes, and returns the nodes representing the documents with the highest probability of being related to the query [7]. We will discuss a different probabilistic model, the Naive Bayes Classifier, for use in recommender systems in Chapter 3.

The relationship between IR/IF systems and content-based recommender systems is one of abstraction and purpose. Recommenders require elements of both kinds of systems, including the ability to search an existing corpus and streams of information, make use of an existing user model, and return the most relevant information. In essence, we argue that a content-based recommender is an abstraction above an IR or an IF system: an information processing system with the explicit goal of generating meaningful recommendations to users. This goal, we believe, is different from the goal of IR or IF systems. In an IR/IF system, the goal is to provide the most relevant documents to meet a user's information need, either from the corpus or from the incoming streams. In a recommender, the goal is similar, but instead of returning the most relevant documents for a user's information need, recommenders return the most salient.
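As an illustration of the Vector Space Model listed above, the following is a minimal sketch, not the implementation used in this dissertation. It assumes whitespace tokenization and a smoothed inverse document frequency, both of which are choices made for this example only; the function names (tokenize, build_idf, tfidf_vector, cosine, rank) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real indexing.
    return text.lower().split()


def build_idf(docs_tokens):
    # Document frequency: the number of documents containing each term.
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    # Assumption: a smoothed IDF variant; other weightings are equally valid.
    return {
        term: math.log((1 + n_docs) / (1 + count)) + 1.0
        for term, count in doc_freq.items()
    }


def tfidf_vector(tokens, idf):
    # Sparse vector: one dimension per term, weighted by term frequency * IDF.
    # Terms that never occur in the corpus are dropped.
    term_freq = Counter(tokens)
    return {t: term_freq[t] * idf[t] for t in term_freq if t in idf}


def cosine(u, v):
    # Cosine of the angle between two sparse vectors; 0.0 if either is empty.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    # Returns (score, document index) pairs, highest score first.
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    query_vec = tfidf_vector(tokenize(query), idf)
    scored = [
        (cosine(query_vec, tfidf_vector(tokens, idf)), index)
        for index, tokens in enumerate(docs_tokens)
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    corpus = [
        "collaborative filtering of netnews articles",
        "hamburger recipes and cheeseburger recipes",
        "citation analysis of research papers",
    ]
    for score, index in rank("cheeseburger recipes", corpus):
        print(round(score, 3), corpus[index])
```

Unlike the Boolean model, this sketch yields a relevance ranking: documents sharing no terms with the query score zero, while the others are ordered by the angle between their weighted term vectors.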
That is, they should return not only relevant documents, but those which have the greatest impact on the user's
perception of completing her information seeking task. This is a point we will discuss in detail when we present Human-Recommender Interaction theory in Chapter 6.

Collaborative Filtering

Collaborative filtering-based recommenders (CF) work by gathering the opinions of users about items in a domain (e.g. movie ratings) and placing this information into a user-item ratings matrix. Algorithms then compare either rows (users) or columns (items) to predict values for empty entries in the matrix. The idea of using the opinions of others to generate item recommendations, combined with the phrase "collaborative filtering", was first discussed by Goldberg et al. [41]. Their vision was that the opinions of others could help a person manage email and electronic documents by providing opinions and annotations to each item, giving the current user extra information about each of the items.

In 1994, Resnick et al. published "GroupLens: An Open Architecture for Collaborative Filtering of Netnews", in which they proposed a k-nearest neighbor algorithm for generating recommendations based on user opinions of netnews articles [120]. This algorithm, now commonly referred to as User-based Collaborative Filtering, was the first algorithm widely used in recommender systems. Herlocker et al. performed a detailed analysis and proposed several important modifications [55]. A more technical discussion will appear in Chapter 3. The first experiments in CF were on netnews articles [120, 143], but this soon expanded into several other domains, such as jokes [42], movies [55, 58], and music [135].

The k-nearest neighbor algorithm was an instance-based machine learning algorithm used for instance classification [97] before becoming known as 'the' CF algorithm. Soon, other machine learning algorithms were also explored in recommenders. The first to be tried were statistical methods such as Bayesian networks [13] and clustering methods [15, 147].
Just as important, these papers brought with them machine learning evaluation methodologies for evaluating recommendation quality, such as k-folding and leave-n-out, the implications of which will be discussed in detail later.
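To make the user-item ratings matrix and the row-wise (user-to-user) comparison described in the Collaborative Filtering discussion concrete, the following is a minimal sketch of a user-based k-nearest neighbor predictor. It is one common formulation and is not claimed to be the exact algorithm of [120] or [55]. The assumptions are ours: similarity is Pearson correlation over co-rated items, only positively correlated neighbors are used, and the prediction is the target user's mean rating plus a similarity-weighted average of neighbor deviations. The names pearson, predict, and the sample ratings are hypothetical.

```python
import math


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(ratings_a, ratings_b):
    # Assumption: correlation is computed over co-rated items only.
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0
    mean_a = mean(ratings_a[i] for i in common)
    mean_b = mean(ratings_b[i] for i in common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den_a = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    den_b = math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    if den_a == 0.0 or den_b == 0.0:
        return 0.0
    return num / (den_a * den_b)


def predict(ratings, user, item, k=2):
    # ratings: sparse user-item matrix stored as {user: {item: rating}}.
    # An empty matrix entry is simply a missing key.
    user_mean = mean(ratings[user].values())
    neighbors = []
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = pearson(ratings[user], other_ratings)
        if similarity > 0.0:  # assumption: ignore uncorrelated or opposed users
            neighbors.append((similarity, other))
    # Keep the k most similar users who rated the item.
    neighbors.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbors[:k]
    if not top:
        return user_mean  # fall back to the user's own average
    weighted = sum(
        sim * (ratings[other][item] - mean(ratings[other].values()))
        for sim, other in top
    )
    return user_mean + weighted / sum(sim for sim, _ in top)


if __name__ == "__main__":
    sample = {
        "alice": {"m1": 5, "m2": 3, "m3": 4},
        "bob": {"m1": 5, "m2": 3, "m3": 4, "m4": 5},
        "carol": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
    }
    print(predict(sample, "alice", "m4"))
```

In the sample data, the hypothetical user bob agrees with alice on every co-rated item while carol disagrees, so only bob's opinion of the unrated item contributes to the prediction.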