Artificial Intelligence Review (2005)24: 319-338 C Springer 2005 DOI10.1007/10462-005-9000-z Retrieval Failure and Recovery in Recommender Systems DAVID MCSHERRY School of Computing and Information Engineering. University of Ulster, Coleraine BT52 ISA, Northern Ireland, UK(Tel: +44(0)28 7032 4130, Fax:+44(0)28 7032 Abstract. In case-based reasoning(CBR) approaches to product recommendation, descriptions of the available products are stored in a case library and retrieved in response to a query representing the users requirements. We present an approach to recovery from the retrieval failures that often occur when the user's requirements are treated as constraints that must be satisfied. Failure to retrieve a matching case trig- gers a recovery process in which the user is invited to select from a recovery set of relaxations (or sub-queries) of her query that are guaranteed to succeed. The sug gested relaxations are ranked according to a simple measure of recovery cost defined n terms of the importance weights assigned to the query attributes. The recovery set for an unsuccessful query also serves as a guide to continued exploration of the prod uct space when none of the cases initially recommended by the system is acceptable to the user Keywords: case-based reasoning, recommender systems, retrieval failure, query 1. Introduction n spite of limitations such as lack of diversity in the recommended cases(e.g. McGinty and Smyth, 2003; McSherry, 2003c), similar y-based retrieval remains the dominant CBR approach to retrieval in recommender systems. However, it is increasingly common for the user's requirements to be treated initially as constraints that the recommended products must satisfy(Goker and Thompson, 2000: Bridge, 2002; Ricci et al., 2002; McSherry, 2004b: Thompson et al 2004). Typically these approaches rely on query relaxation to recover from the retrieval failures that occur when there is no product that satisfies all the user's requirements. We focus here on approaches in which relaxing a query means eliminating one or more constraints from the query rather than requiring the user to revise individual con- straints, for example as in Sermo(Bridge, 2002) In Ricci et al.s (2002)Intelligent Travel Recommender, the user is told how many results she will get, if any, by eliminating each of
DOI 10.1007/s10462-005-9000-z Artificial Intelligence Review (2005) 24:319–338 © Springer 2005 Retrieval Failure and Recovery in Recommender Systems DAVID MCSHERRY School of Computing and Information Engineering, University of Ulster, Coleraine BT52 1SA, Northern Ireland, UK (Tel: +44 (0)28 7032 4130; Fax: +44 (0)28 7032 4916; e-mail: dmg.mcsherry@ulster.ac.uk) Abstract. In case-based reasoning (CBR) approaches to product recommendation, descriptions of the available products are stored in a case library and retrieved in response to a query representing the user’s requirements. We present an approach to recovery from the retrieval failures that often occur when the user’s requirements are treated as constraints that must be satisfied. Failure to retrieve a matching case triggers a recovery process in which the user is invited to select from a recovery set of relaxations (or sub-queries) of her query that are guaranteed to succeed. The suggested relaxations are ranked according to a simple measure of recovery cost defined in terms of the importance weights assigned to the query attributes. The recovery set for an unsuccessful query also serves as a guide to continued exploration of the product space when none of the cases initially recommended by the system is acceptable to the user. Keywords: case-based reasoning, recommender systems, retrieval failure, query relaxation 1. Introduction In spite of limitations such as lack of diversity in the recommended cases (e.g. McGinty and Smyth, 2003; McSherry, 2003c), similarity-based retrieval remains the dominant CBR approach to retrieval in recommender systems. However, it is increasingly common for the user’s requirements to be treated initially as constraints that the recommended products must satisfy (Goker and Thompson, 2000; ¨ Bridge, 2002; Ricci et al., 2002; McSherry, 2004b; Thompson et al., 2004). Typically these approaches rely on query relaxation to recover from the retrieval failures that occur when there is no product that satisfies all the user’s requirements. We focus here on approaches in which relaxing a query means eliminating one or more constraints from the query rather than requiring the user to revise individual constraints, for example as in Sermo (Bridge, 2002). In Ricci et al.’s (2002) Intelligent Travel Recommender, the user is told how many results she will get, if any, by eliminating each of
DAVID MCSHERRY the constraints in her query. The Adaptive Place Advisor(Goker and Thompson, 2000; Thompson et al., 2004)is an in-car recommender system in which the selection of a constraint to be eliminated from an unsuccessful query, a suggestion that the user need not accept, is based on the systems current understanding of the user's personal priorities. The assistance that the user is given in these approaches is a clear improvement on traditional database approaches that force the user to adopt a trial-and-error approach to revising her query when there is no product that satisfies all her requirements(wilke et al., 1998). However, a problem that existing techniques often fail to address is that recovery from retrieval failure may not be possible by eliminating a single constraint(McSherry, 2003b, 2004b) Of course, if queries are incrementally elicited as in the Adap- tive Place Advisor, then recovery is always possible by eliminating the most recently elicited constraint. But often queries are not elicited incrementally, and even if recovery is possible by eliminating a sin- gle constraint, this may be a compromise that the user is not pre- pared to accept. In recent work we presented an approach to recovery from retrieval failure in which queries need not be elicited incremen- tally and there is no assumption that recovery is possible by eliminat ing a single constraint(McSherry, 2004b). The recovery process begins with an explanation of the retrieval failure that draws the user's atten- tion to combinations of features in her query for which there are no matching cases. The user is then guided in the selection of the most useful attribute. and associated constraint to be eliminated from her query at each stage of a mixed-initiative relaxation process. If not pre- pared to compromise on the attribute suggested for elimination at any stage. the user can select another attribute to be eliminated Go In this paper, we present an alternative and more direct approach recovery from retrieval failure in which the user is invited to selec from a recovery set of relaxations of her query that are guaranteed to succeed. We demonstrate the approach in a prototype recommender system called Front Case in which cases retrieved on successful com- pletion of the recovery process are ranked in order of similarity to initial query. In general, retrieved cases are presented to the user will depend on the systems overall recommendation strategy Adapted from techniques for providing cooperative responses to fail- a g database queries(e.g. Kaplan, 1982: Gaasterland et al.,1992: God also influ by recent work on the role of compromise in product recommendation
320 DAVID MCSHERRY the constraints in her query. The Adaptive Place Advisor (Goker and ¨ Thompson, 2000; Thompson et al., 2004) is an in-car recommender system in which the selection of a constraint to be eliminated from an unsuccessful query, a suggestion that the user need not accept, is based on the system’s current understanding of the user’s personal priorities. The assistance that the user is given in these approaches is a clear improvement on traditional database approaches that force the user to adopt a trial-and-error approach to revising her query when there is no product that satisfies all her requirements (Wilke et al., 1998). However, a problem that existing techniques often fail to address is that recovery from retrieval failure may not be possible by eliminating a single constraint (McSherry, 2003b, 2004b). Of course, if queries are incrementally elicited as in the Adaptive Place Advisor, then recovery is always possible by eliminating the most recently elicited constraint. But often queries are not elicited incrementally, and even if recovery is possible by eliminating a single constraint, this may be a compromise that the user is not prepared to accept. In recent work we presented an approach to recovery from retrieval failure in which queries need not be elicited incrementally and there is no assumption that recovery is possible by eliminating a single constraint (McSherry, 2004b). The recovery process begins with an explanation of the retrieval failure that draws the user’s attention to combinations of features in her query for which there are no matching cases. The user is then guided in the selection of the most useful attribute, and associated constraint, to be eliminated from her query at each stage of a mixed-initiative relaxation process. If not prepared to compromise on the attribute suggested for elimination at any stage, the user can select another attribute to be eliminated. In this paper, we present an alternative and more direct approach to recovery from retrieval failure in which the user is invited to select from a recovery set of relaxations of her query that are guaranteed to succeed. We demonstrate the approach in a prototype recommender system called Front Case in which cases retrieved on successful completion of the recovery process are ranked in order of similarity to the user’s initial query. In general, of course, the way in which the retrieved cases are presented to the user will depend on the system’s overall recommendation strategy. Adapted from techniques for providing cooperative responses to failing database queries (e.g. Kaplan, 1982; Gaasterland et al., 1992; Godfrey, 1997, 1998), our direct approach to recovery is also influenced by recent work on the role of compromise in product recommendation
RETRIEVAL FAILURE AND RECOVERY IN RECOMMENDER SYSTEMS 321 (McSherry, 2003a, c, 2004a). Relaxations in the recovery set cover all possible sets of compromises that the user might be prepared to con- sider, at least without knowing the extent of the compromises involved The suggested relaxations are also ranked according to a measure of recovery cost defined in terms of the importance weights assigned to the query attributes. Another important benefit is that the recovery set for an unsuccessful query serves as a guide to continued exploration of the product space when none of the cases initially recommended by the system is acceptable to the user In Sections 2-4, we describe our new approach to recovery fror retrieval failure, investigate its feasibility in terms of cognitive load, and evaluate its performance on a well-known case library in the travel domain. In Section 5, we present a detailed account of the rec- ommendation process in Front Case and the assistance provided to the user when faced with retrieval failure. Related work is discussed in Section 6 and our conclusions are presented in Section 7 2. Recovery from Retrieval Failure We assume that the user's query 2, over a subset atts( @) of the case attributes, is represented as a set of constraints that the retrieved cases must satisfy. Depending on the attribute type, the constraint ca(g) ssociated with a given attribute a E atts(@) may be expressed, for example, in terms of a required value, a maximum or minimum value, or a range of acceptable values. We will refer to latts(@) as the length of th Definition 1: A given query Q1 is a sub-query of another query Q if atts(Q1)C atts(02)and ca(Q1)=Ca(02) for all a e atts(o1 If atts(@1)C atts(@2), we say that Q1 is a proper sub-query of @2 Given a query Q for which a retrieval failure has occurred, the aim of our direct approach to recovery from retrieval failure is to construct a recovery set of successful relaxations or sub-queries of Q that the user may be prepared to accept in spite of the compromises they involve. Recovery by query relaxation is always possible, in the worst case by eliminating all constraints from the unsuccessful query (or demoting them to soft constraints). As a failing query may have many successful sub-queries, our aim is to minimise the size of the
RETRIEVAL FAILURE AND RECOVERY IN RECOMMENDER SYSTEMS 321 (McSherry, 2003a, c, 2004a). Relaxations in the recovery set cover all possible sets of compromises that the user might be prepared to consider, at least without knowing the extent of the compromises involved. The suggested relaxations are also ranked according to a measure of recovery cost defined in terms of the importance weights assigned to the query attributes. Another important benefit is that the recovery set for an unsuccessful query serves as a guide to continued exploration of the product space when none of the cases initially recommended by the system is acceptable to the user. In Sections 2–4, we describe our new approach to recovery from retrieval failure, investigate its feasibility in terms of cognitive load, and evaluate its performance on a well-known case library in the travel domain. In Section 5, we present a detailed account of the recommendation process in Front Case and the assistance provided to the user when faced with retrieval failure. Related work is discussed in Section 6 and our conclusions are presented in Section 7. 2. Recovery from Retrieval Failure We assume that the user’s query Q, over a subset atts(Q) of the case attributes, is represented as a set of constraints that the retrieved cases must satisfy. Depending on the attribute type, the constraint ca(Q) associated with a given attribute a ∈ atts(Q) may be expressed, for example, in terms of a required value, a maximum or minimum value, or a range of acceptable values. We will refer to |atts(Q)| as the length of the query. Definition 1: A given query Q1 is a sub-query of another query Q2 if atts(Q1) ⊆ atts(Q2) and ca(Q1) = ca(Q2) for all a ∈ atts(Q1). If atts(Q1) ⊂ atts(Q2), we say that Q1 is a proper sub-query of Q2. Given a query Q for which a retrieval failure has occurred, the aim of our direct approach to recovery from retrieval failure is to construct a recovery set of successful relaxations or sub-queries of Q that the user may be prepared to accept in spite of the compromises they involve. Recovery by query relaxation is always possible, in the worst case by eliminating all constraints from the unsuccessful query (or demoting them to soft constraints). As a failing query may have many successful sub-queries, our aim is to minimise the size of the
DAVID MCSHERRY recovery set while ensuring that it covers all possible sets of compro- mises that the user might be prepared to consider To illustrate our approach, Figure 1 shows all sub-queries of a query @24 involving four attributes a1, a2, a3, and a4. We denote by 234 the sub-query involving only attributes ar d a4, and by Q4 the sub-query involving only a3 and a4. We use a similar nota- tion for each of the other sub-queries except O, the empty query. We will assume in this example that all sub-queries of Q are successful queries except those that are marked'x'in Figure 1. As any query a sub-query of itself, Q 234 can be seen to have sixteen sub-queries, of which ten are successful Among the successful sub-queries of ol234 an obvious candidate be included in the recovery set is Q 4 as it is the only success- ful sub-query that involves only a single compromise. But if Q included in the recovery set, then there is no need to show the user any of its proper sub-queries. For example, although Q3 is also a successful sub-query, it involves the same compromise as Q4 and an additional compromise that Q 34+ does not involve. It is there fore reasonable to assume that the user would never choose Q in 1234 Q×Q×Q 34 Q× Figure 1. Sub-query lattice for an example query involving four attributes
322 DAVID MCSHERRY recovery set while ensuring that it covers all possible sets of compromises that the user might be prepared to consider. To illustrate our approach, Figure 1 shows all sub-queries of a query Q1234 involving four attributes a1, a2, a3, and a4. We denote by Q134 the sub-query involving only attributes a1, a3, and a4, and by Q34 the sub-query involving only a3 and a4. We use a similar notation for each of the other sub-queries except Ø, the empty query. We will assume in this example that all sub-queries of Q1234 are successful queries except those that are marked ‘×’ in Figure 1. As any query is a sub-query of itself, Q1234 can be seen to have sixteen sub-queries, of which ten are successful. Among the successful sub-queries of Q1234, an obvious candidate to be included in the recovery set is Q134 as it is the only successful sub-query that involves only a single compromise. But if Q134 is included in the recovery set, then there is no need to show the user any of its proper sub-queries. For example, although Q13 is also a successful sub-query, it involves the same compromise as Q134 and an additional compromise that Q134 does not involve. It is therefore reasonable to assume that the user would never choose Q13 in Figure 1. Sub-query lattice for an example query involving four attributes
RETRIEVAL FAILURE AND RECOVERY IN RECOMMENDER SYSTEMS 323 preference to Q>, at least without knowing the extent of the com- promises involved Following the elimination of all the proper sub-queries of Q 34 th only remaining candidates among the successful sub-queries of @234 are 2-4 and Q-. It makes sense to include Q-+in the recovery set as the user may not be prepared to accept the compromise associated with Q 34. Inclusion of Q4 in the recovery set also means that Qcan be excluded as it involves an additional compromise. The successful sub-queries Q 34 and 024 that we have identified as successful relax ations that the user may wish to consider are in fact the maximally successful sub-queries of Q 234 in the sense of the following definition. Definition 2: A successful sub-query Q" of a given query Q is a max- imally successful sub-query of Q if there is no successful sub-query of e of which Q* is a proper sub-query For example, Q4 is a successful sub-query of @234 but not a max imally successful sub-query since @4 is also a successful sub-query In general, the recovery set for a failing query in our approach con- sists of all the maximally successful sub-queries of the failing query Definition 3: Given a failing query o, we define rs(@), the recovery set for g, to be the set of all maximally successful sub-queries of Q For the example Figure 1, rs(0234)=1034, 024. In the domain of personal computers, suppose that @24 is the query: lapto en size= 19. make=X and that its successful sub-queries are as shown in Figure 1. In front Case, the user would be informed that there is no match for her query and invited to select from the following sub-queries, both of which are guaranteed to succeed price <700, screen size= 19, make=X (Q 34) type=laptop, make=X (Q+) If not prepared to compromise on type, user may decide instead to compromise on price and screen size by selecting the sec- ond sub-query. In this case, the constraints associated with screen size are not eliminated in Front Case but demoted to soft con- straints and used to rank the retrieved cases in order of similarity to
RETRIEVAL FAILURE AND RECOVERY IN RECOMMENDER SYSTEMS 323 preference to Q134, at least without knowing the extent of the compromises involved. Following the elimination of all the proper sub-queries of Q134 the only remaining candidates among the successful sub-queries of Q1234 are Q24 and Q2. It makes sense to include Q24 in the recovery set, as the user may not be prepared to accept the compromise associated with Q134. Inclusion of Q24 in the recovery set also means that Q2 can be excluded as it involves an additional compromise. The successful sub-queries Q134 and Q24 that we have identified as successful relaxations that the user may wish to consider are in fact the maximally successful sub-queries of Q1234 in the sense of the following definition. Definition 2: A successful sub-query Q∗ of a given query Q is a maximally successful sub-query of Q if there is no successful sub-query of Q of which Q∗ is a proper sub-query. For example, Q14 is a successful sub-query of Q1234 but not a maximally successful sub-query since Q134 is also a successful sub-query. In general, the recovery set for a failing query in our approach consists of all the maximally successful sub-queries of the failing query. Definition 3: Given a failing query Q, we define rs(Q), the recovery set for Q, to be the set of all maximally successful sub-queries of Q. For the example query in Figure 1, rs(Q1234) = {Q134, Q24}. In the domain of personal computers, suppose that Q1234 is the query: price ≤700,type=laptop, screen size=19, make=X and that its successful sub-queries are as shown in Figure 1. In Front Case, the user would be informed that there is no match for her query and invited to select from the following sub-queries, both of which are guaranteed to succeed: price ≤700,screen size=19, make=X (Q134) type=laptop, make=X (Q24) If not prepared to compromise on type, the user may decide instead to compromise on price and screen size by selecting the second sub-query. In this case, the constraints associated with price and screen size are not eliminated in Front Case but demoted to soft constraints and used to rank the retrieved cases in order of similarity to