ARTICLE N PRESS xpert Systems with Applications xxx(2011)xXX-XXX Contents lists available at SciVerse Science Direct Expert Systems with Applications ELSEVIER journalhomepagewww.elsevier.com/locate/eswa A recommender mechanism based on case-based reasoning Chen-Shu Wang Heng-Li Yang Graduate Institute of information and Logistics Management, National Taipei University of Technology. 1, Sec. 3, Chung-hsiao E Road, Taipei, Taiwan Department of mis, National Chengchi University, 64, Sec. 2, Chihnan Road, Taipei, Taiwan ARTICLE INFO ABSTRACT ase-based reasoning(CBr)algorithm is particularly suitable for solving ill-defined and unstructured decision-making problems in many different areas. The traditional CBR algorithm, however, is inappro priate to deal with complicated problems and therefore needs to be further revised. This study thus pr ultiple stage poses a next-generation CBR(GCBR) model and algorithm. GCBR presents as a new problem-solving Artificial intelligence application decision-making problems by using hierarchical criteria architecture(HCA)problem representation which involves multiple decision objectives on each level of hierarchical, multiple-level decision criteria, thereby enables decision makers to identify problems more precisely. Additionally, the proposed GCBR can also provide decision makers with series of cases in support of these multiple decision-making stages. GCBR furthermore employs a genetic algorithm in its implementation in order to reduce the effort involved in case evaluation. This study found experimentally that using GCBR for making travel-planning recommendations involved approximately 80% effort than traditional CBR, and therefore concluded that GCBR should be the next generation of case-based reasoning algorithms and can be applied to actual case-based recommender mechanism implementation. e 2011 Elsevier Ltd. All rights reserved. 1 Introduction Furthermore, Chang(2005)applied CBr to screening children with delayed development in order to detect their disorder early Case based reasoning( CBR)is a paradigm, concept, and intuitive rough analysis of their symptoms mechanism for solving ill-defined and unstructured problems chances of effective treatment. Both Garrell, Golobardes, Bernado ( Belecheanu, Pawar, Barson, Bredehorst, Weber, 2003). Similar and Llora(1999)and Golobardes, Llora, Salamo, and Marti(2002) to the natural human problem-solving process, CBR retrieves past have used Cbr to diagnose breast cancer based on mammary nces for reuse in regard to target problems. Since such pro- biopsy data and micro calcifications, respectively. Additionally ess is likely to need to revise previous-case solutions before Shimazu, Shibata, and Nihei(2001)applied conversational case applying them, CBR then retains successful problem-solving expe- based algorithm( CCBr) to developing a mentor guide for user riences for further reuse(Aamodt Plaza, 1994). This, then, is tra- helpdesk implementation and Shimazu(2002)applied CCBr to ditional CBr,'s 4R processes of retrieve, reuse, revise, and retain automatic-clerk mechanisms and electronic website shopping CBR is therefore a classical artificial intelligence algorithm. assistance. Researchers have historically tended to solve these Many have applied CBr within various problem-solving domains problems by using such mathematical models as regressions but (Aamodt Plaza, 1994; Kolodner, 1993; Shiu Pal, 2001; Waston, these mathematical models involve too many assumptions to be 1997). Cirovic and Cekic(2002)applied CBr to construction pro pplied effectively to real-world problem solving, and CBr seems jects during their preliminary design phase by retrieving historical to be a feasible alternative cases from a historical project database, storing useful case(s)in Researchers have until recently extended CBR applications their construction knowledge base, and then applying the most mechanisms for making recommendations based on previous similar previous case(s) to improve the quality of construction cases. Yang and Wang(2009a)applied the CBr algorithm to infor- designs. Belecheanu et al. (2003)referred to past records in order mation-system project management as a recommender mecha- to reduce information uncertainty in regard to such industrial nism by offering project managers preferences from previous ticul uirements as those involved in new product development, par- cases to help project managers construct new project plans. They larly when employing the concurrent engineering approach. also applied similar mechanisms to travel-schedule planning Edu cators, furthermore, can integrate CBr recommender mechanism g author into e-learning systems to provide learners with reference- E-mail addresses: wangcsentutedu w (C-S. Wang). yanhenccuedutw certification paths(2009b). Such real-world problems as these are usually difficult to formulate within strict mathematical 0957-4174/s- see front matter o 2011 Elsevier Ltd. All rights reserved doi:10.1016/eswa2011.09.1 Please cite this article in press as: Wang, C-S,& Yang. H.-L A recommender mechanism based on case-based reasoning Expert Systems with Application 2011da06 /jeswa.201109161
A recommender mechanism based on case-based reasoning Chen-Shu Wang a , Heng-Li Yang b,⇑ aGraduate Institute of Information and Logistics Management, National Taipei University of Technology, 1, Sec. 3, Chung-hsiao E. Road, Taipei, Taiwan bDepartment of MIS, National ChengChi University, 64, Sec. 2, Chihnan Road, Taipei, Taiwan article info Keywords: Recommender mechanism Case-based reasoning Multiple stage reasoning Genetic algorithm Artificial intelligence application abstract Case-based reasoning (CBR) algorithm is particularly suitable for solving ill-defined and unstructured decision-making problems in many different areas. The traditional CBR algorithm, however, is inappropriate to deal with complicated problems and therefore needs to be further revised. This study thus proposes a next-generation CBR (GCBR) model and algorithm. GCBR presents as a new problem-solving paradigm that is a case-based recommender mechanism for assisting decision making. GCBR can resolve decision-making problems by using hierarchical criteria architecture (HCA) problem representation which involves multiple decision objectives on each level of hierarchical, multiple-level decision criteria, thereby enables decision makers to identify problems more precisely. Additionally, the proposed GCBR can also provide decision makers with series of cases in support of these multiple decision-making stages. GCBR furthermore employs a genetic algorithm in its implementation in order to reduce the effort involved in case evaluation. This study found experimentally that using GCBR for making travel-planning recommendations involved approximately 80% effort than traditional CBR, and therefore concluded that GCBR should be the next generation of case-based reasoning algorithms and can be applied to actual case-based recommender mechanism implementation. 2011 Elsevier Ltd. All rights reserved. 1. Introduction Case based reasoning (CBR) is a paradigm, concept, and intuitive mechanism for solving ill-defined and unstructured problems (Belecheanu, Pawar, Barson, Bredehorst, & Weber, 2003). Similar to the natural human problem-solving process, CBR retrieves past experiences for reuse in regard to target problems. Since such process is likely to need to revise previous-case solutions before applying them, CBR then retains successful problem-solving experiences for further reuse (Aamodt & Plaza, 1994). This, then, is traditional CBR’s 4R processes of retrieve, reuse, revise, and retain. CBR is therefore a classical artificial intelligence algorithm. Many have applied CBR within various problem-solving domains (Aamodt & Plaza, 1994; Kolodner, 1993; Shiu & Pal, 2001; Waston, 1997). Cirovic and Cekic (2002) applied CBR to construction projects during their preliminary design phase by retrieving historical cases from a historical project database, storing useful case(s) in their construction knowledge base, and then applying the most similar previous case(s) to improve the quality of construction designs. Belecheanu et al. (2003) referred to past records in order to reduce information uncertainty in regard to such industrial requirements as those involved in new product development, particularly when employing the concurrent engineering approach. Furthermore, Chang (2005) applied CBR to screening children with delayed development in order to detect their disorder early through analysis of their symptoms, thereby improving the chances of effective treatment. Both Garrell, Golobardes, Bernado, and Llora (1999) and Golobardes, Llora, Salamo, and Marti (2002) have used CBR to diagnose breast cancer based on mammary biopsy data and micro calcifications, respectively. Additionally, Shimazu, Shibata, and Nihei (2001) applied conversational casebased algorithm (CCBR) to developing a mentor guide for user helpdesk implementation and Shimazu (2002) applied CCBR to automatic-clerk mechanisms and electronic website shopping assistance. Researchers have historically tended to solve these problems by using such mathematical models as regressions but these mathematical models involve too many assumptions to be applied effectively to real-world problem solving, and CBR seems to be a feasible alternative. Researchers have until recently extended CBR applications as mechanisms for making recommendations based on previous cases. Yang and Wang (2009a) applied the CBR algorithm to information-system project management as a recommender mechanism by offering project managers preferences from previous cases to help project managers construct new project plans. They also applied similar mechanisms to travel-schedule planning. Educators, furthermore, can integrate CBR recommender mechanism into e-learning systems to provide learners with referencecertification paths (2009b). Such real-world problems as these are usually difficult to formulate within strict mathematical 0957-4174/$ - see front matter 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.09.161 ⇑ Corresponding author. E-mail addresses: wangcs@ntut.edu.tw (C.-S. Wang), yanh@nccu.edu.tw (H.-L. Yang). Expert Systems with Applications xxx (2011) xxx–xxx Contents lists available at SciVerse ScienceDirect Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa Please cite this article in press as: Wang, C.-S., & Yang, H.-L. A recommender mechanism based on case-based reasoning. Expert Systems with Applications (2011), doi:10.1016/j.eswa.2011.09.161
ARTICLE N PRESS models, and people have often solved them using experiences they enable decision makers to state their problems adequately (Yang obtain by word-of-mouth. Some studies(Adomavicius Kwon, Wang 2008). This follows Adomavicius and tuzhilin s(2005) 2007: Adomavicius Tuzhilin, 2005)have also recommended that recommendation that a next-generation recommender system the next generation of recommender mechanisms should focus on should be able to solve multi-dimensional problem real-life problem solving and applications. Case-based recom- HCA can enhance descriptions of problems involving multiple ender mechanisms are therefore particularly appropriate for objectives by enabling decision makers to describe each decision s solving unstructured problems because people can use the CBr objectives with the appropriate amount of detail. The solutions to style to describe them and should therefore be regarded as a problems described using HCa are therefore more valuable than new problem-solving paradigm. those using other methods because decision makers can represent In order to create such a mechanism it is necessary to review, such problems accurately. Fig. 1 illustrates how describing prob- redefine, and expand both the traditional recommender mecha- lems using HCa allows decision makers to consider them from a nisms and the original CBR algorithms Using the traditional CBr multi-criteria perspective while still reducing each criterion hier- gorithm for complex problems requires retrieving each case for archically until reaching the required level of detail with the the decision makers multiple objectives. As decision-making description remaining sufficiently detailed to represent the prob- problems become increasingly complicated, however, a merely lem. HCa is therefore an improved and enhanced methodology ultiple-objective problem representation becomes too unsophi for presenting decision-making problems. ated to reflect their reality. a revised case-based recommender Fig. 2 illustrates a hypothetical e-learning system problem in mechanism equipped with the ability to address more complicated which learners must use the recommender system to retrieve a real-life problems is therefore necessary, as obtaining actionable similar previous case or cases for an information technology(r) information is particularly valuable for decision makers. Cao and certification examination reference and learning-path suggestion. Zhang(2007)found that existing recommender mechanisms can- By representing this decision-making problem using HCA, learners not provide decision makers with a direction in which to take can set three objectives for their decisions comparing the similarity action, even though recommender mechanisms should be able to of such data in the case-base as those in regard to personal demo- tell decision makers what to do next(Yang, 2007). Based on the pre graphics, capabilities, and learning paths. They may also decide vious cases that CBRs have retrieved a next-generation recom that they can measure personal capabilities with work experience mender mechanism needs to have the ability to provide decision and thereby achieve It certification. They can therefore increase makers with better directions in regard to what actions to take the detail of the target conditions until they consider the problem Furthermore, traditional CBR mechanisms have to evaluate all description to be sufficiently complete the cases in the case base to retrieve those most similar case(s) which makes their efficiency strongly and negatively related to the size of the applicable case base. Consequently, researchers have 3. The proposed GCBR model therefore developed numerous approaches to decreasing the effort nvolved in case evaluation, with K-means being the most popular To address the hCa problem, this study has revised Yang and approaches. K-CBR, which involves integrating CBR with the k- angs(2008) revised CBR algorithm and used it to propose a means approach, first clusters all the cases and only evaluates ase-based recommender mechanism that is a GCBR algorithm. those from the most similar cluster for case retrieval. Chang and Table 1 shows its variables and their definitions and descriptions Lai (2005) then attempted to apply self-organizing maps(SOMs ). The revised GCBR algorithm has three characteristics. One indi and found that SOM-CBR outperformed k-CBR although both k- cates that GCBr is a generalized problem-solving model because of its ability to help solve HCa problems Similar to traditional CBr of the two revised CBR mechanisms are, however, closely related problems, HCA problems can include multiple object to the case representation and indexing approach(Shin Han, gle level. Furthermore, each decisions objectives can be divided 1999), so their superior performances are unstable and cannot be into multiple hierarchical levels. Another characteristic is that GCBR acts as a predictor in support of multi-stage decision making. This study therefore proposes a revised case-based recom GCBR can provide decision makers with a series of cases stage by mender mechanism, to which it refers as the next-generation CBR(GCBR) algorithm. GCBR is also applicable to various real- world applications, particularly case-based recommender mecha Criteria 1.1 nisms, and can serve as a new problem-solving paradigm. GCBR Criteria 1.2 is designed to improve traditional CBR's efficiency and stability regardless of the case representation and indexing approach em- Objective 1 ployed. Section 2 of this paper presents a new method for describ WGT Criteria 1.m1 wgt 1,m1 ing problems. Section 3 presents the proposed GCBr model Section 4 reports an experiment using this model and also presents wgt 2 neria 2.1 a scenario illustrating GCBR application Section 5 presents conclu sions and proposes future research directions. Target O wGT2-Objective2 5 Criteria 2.2 2. Problem description: hierarchical criteria architecture(HCA) Descriptions of decision-making problems involving multiple Objective n Criteria n, 1 objectives become too complicated to represent them adequately Coello, 2000) but if decision makers are unable to conceptualize such problems clearly they are unlikely to devise trustworthy and useful solutions. This study has therefore adopted a new rep- resentation methodology for describing decision-making problems called the hierarchical criteria architecture(HCa)in order to Fig. 1. HCA problem representation met Please cite this article in press as: Wang, C-S, Yang. H.-L A recommender mechanism based on case-based reasoning Expert Systems with Applicatior (201.doi1010160eswa201109.161
models, and people have often solved them using experiences they obtain by word-of-mouth. Some studies (Adomavicius & Kwon, 2007; Adomavicius & Tuzhilin, 2005) have also recommended that the next generation of recommender mechanisms should focus on real-life problem solving and applications. Case-based recommender mechanisms are therefore particularly appropriate for solving unstructured problems because people can use the CBR style to describe them and should therefore be regarded as a new problem-solving paradigm. In order to create such a mechanism it is necessary to review, redefine, and expand both the traditional recommender mechanisms and the original CBR algorithms. Using the traditional CBR algorithm for complex problems requires retrieving each case for the decision makers’ multiple objectives. As decision-making problems become increasingly complicated, however, a merely multiple-objective problem representation becomes too unsophisticated to reflect their reality. A revised case-based recommender mechanism equipped with the ability to address more complicated real-life problems is therefore necessary, as obtaining actionable information is particularly valuable for decision makers. Cao and Zhang (2007) found that existing recommender mechanisms cannot provide decision makers with a direction in which to take action, even though recommender mechanisms should be able to tell decision makers what to do next (Yang, 2007). Based on the previous cases that CBRs have retrieved, a next-generation recommender mechanism needs to have the ability to provide decision makers with better directions in regard to what actions to take. Furthermore, traditional CBR mechanisms have to evaluate all the cases in the case base to retrieve those most similar case(s) which makes their efficiency strongly and negatively related to the size of the applicable case base. Consequently, researchers have therefore developed numerous approaches to decreasing the effort involved in case evaluation, with K-means being the most popular approaches. K-CBR, which involves integrating CBR with the kmeans approach, first clusters all the cases and only evaluates those from the most similar cluster for case retrieval. Chang and Lai (2005) then attempted to apply self-organizing maps (SOMs), and found that SOM-CBR outperformed k-CBR, although both kCBR and SOM-CBR improved CBR’s efficiency. The performances of the two revised CBR mechanisms are, however, closely related to the case representation and indexing approach (Shin & Han, 1999), so their superior performances are unstable and cannot be guaranteed. This study therefore proposes a revised case-based recommender mechanism, to which it refers as the next-generation CBR (GCBR) algorithm. GCBR is also applicable to various realworld applications, particularly case-based recommender mechanisms, and can serve as a new problem-solving paradigm. GCBR is designed to improve traditional CBR’s efficiency and stability regardless of the case representation and indexing approach employed. Section 2 of this paper presents a new method for describing problems. Section 3 presents the proposed GCBR model. Section 4 reports an experiment using this model and also presents a scenario illustrating GCBR application. Section 5 presents conclusions and proposes future research directions. 2. Problem description: hierarchical criteria architecture (HCA) Descriptions of decision-making problems involving multiple objectives become too complicated to represent them adequately (Coello, 2000) but if decision makers are unable to conceptualize such problems clearly they are unlikely to devise trustworthy and useful solutions. This study has therefore adopted a new representation methodology for describing decision-making problems called the hierarchical criteria architecture (HCA) in order to enable decision makers to state their problems adequately (Yang & Wang, 2008). This follows Adomavicius and Tuzhilin’s (2005) recommendation that a next-generation recommender system should be able to solve multi-dimensional problems. HCA can enhance descriptions of problems involving multiple objectives by enabling decision makers to describe each decision’s objectives with the appropriate amount of detail. The solutions to problems described using HCA are therefore more valuable than those using other methods because decision makers can represent such problems accurately. Fig. 1 illustrates how describing problems using HCA allows decision makers to consider them from a multi-criteria perspective while still reducing each criterion hierarchically until reaching the required level of detail with the description remaining sufficiently detailed to represent the problem. HCA is therefore an improved and enhanced methodology for presenting decision-making problems. Fig. 2 illustrates a hypothetical e-learning system problem in which learners must use the recommender system to retrieve a similar previous case or cases for an information technology (IT) certification examination reference and learning-path suggestion. By representing this decision-making problem using HCA, learners can set three objectives for their decisions comparing the similarity of such data in the case-base as those in regard to personal demographics, capabilities, and learning paths. They may also decide that they can measure personal capabilities with work experience and thereby achieve IT certification. They can therefore increase the detail of the target conditions until they consider the problem description to be sufficiently complete. 3. The proposed GCBR model To address the HCA problem, this study has revised Yang and Wang’s (2008) revised CBR algorithm and used it to propose a case-based recommender mechanism that is a GCBR algorithm. Table 1 shows its variables and their definitions and descriptions. The revised GCBR algorithm has three characteristics. One indicates that GCBR is a generalized problem-solving model because of its ability to help solve HCA problems. Similar to traditional CBR problems, HCA problems can include multiple objectives on a single level. Furthermore, each decision’s objectives can be divided into multiple hierarchical levels. Another characteristic is that GCBR acts as a predictor in support of multi-stage decision making. GCBR can provide decision makers with a series of cases stage by Objective 1 Objective 2 Objective n Criteria 1,1 Criteria … 1,2 Criteria 1,m1 Criteria 2,1 Criteria … 2,2 Criteria 2,m2 Criteria n,1 Criteria … n,2 Criteria n,mn … Target WGT1 WGT2 WGT3 wgt 1,1 wgt 1,2 wgt 1,m1 wgt 2,1 wgt 2,2 wgt 2,m2 wgt n,1 wgt n,2 wgt n,mn Fig. 1. HCA problem representation methodology. 2 C.-S. Wang, H.-L. Yang / Expert Systems with Applications xxx (2011) xxx–xxx Please cite this article in press as: Wang, C.-S., & Yang, H.-L. A recommender mechanism based on case-based reasoning. Expert Systems with Applications (2011), doi:10.1016/j.eswa.2011.09.161
ARTICLE N PRESS Degree I wgt=0.5 Its core algorithm evaluates the similarity of the target case(n) with the cases in the case base. The retrieval sub-algorithm evalu- Gendering=0.2 target and each case in the base by summarizing each feature's gap, which describes that case in detail. CBR judges the similarity here by calculating the differ ence between each case and the target, the similarity increasing d capability< it to evaluate the features'similarity, as illustrated in Fig 3. The fet-check-function is set to either total or partial similar according to the feature 's characteristics. For partial similarit serial the check function returns a real number between 0, indicating that they are identical, and 1, indicating that they are totally differ wg=06 ent. The total similar fet-chedk-function, however, returns either 1 or O. For example, if the targets gender feature is female and case, wO=0. 3 is male, the gap between the target and case: would be 1. The dif- erence between case and the target is therefore the gap as calcu Fig. 2. An IT certification recommender problem described by HCA. lated using Eq(1). GCBR then selects the case with the smallest gap as the most feasible solution and provides it to the decision makers for reference Table 1 GCBR variables definitions and description. Variable on and description diference(ci. T)=cosine(etm fetm) (1) The ith case of the case base. i= 1.2..n The case-retrieval sub-algorithm needs to be revised, however, The number of features each because the problems that GCBR addresses involve hierarchical The target case inputted by the decision makers. The levels of criteria. This study therefore proposes the recursive sub provides them with feasible e ference cases according to the target cases condition algorithm fet-check-rewrite, as shown in Fig 3. This algorithm is a To present the jth feature, the recursive one for rewriting fet-check-function in order to allow GCBr fett the jth feature of case i to manage the HCA problem. When the HCA feature level exceeds 1 of target such as in level (fet)>1, the weighted sum of the next level replace weter ecision makers can assign an importan he feature, wget to denote the importance weight of For example, according to the It certification example illus- trated in Fig. 2 the similarity evaluation function should be altered he jth feature: j= 1, 2. difference(CD ifference(ci is an array ecords the degree of as shown in Fig 4, to consider three levels recursively, so its con- ifference between the ith case and the target case(T), sideration of level 3, which is the dash-block area, precedes that hich is evaluated by Eq (5) i= 1, 2,. of level 2, which precedes that of level 1. The Fet-check-rewrite To evaluate the jth feature gap of the ith case and T mechanism is a function that returns the revised feature- 1,2,n,j=1,2,m,byEq(4) check-function to GCBr's core algorithm in order to evaluate the level(fern) urn to the conditio ase s simllar ity with the targ hreshold ecision makers can set the gap threshold. If the ig. 5 compares the gap between the features of specific cases ifference in a case is less than the threshold, it provides with the target using the fet-check-function, as shown in Eq. (2). xt_level(fety) retrieve the next level feature of the jth fear Eq (3)then summarizes the feature gap to evaluate the similarity he numbers of stages, k, that decision makers expect the Fig. 5's Reuse algorithm. Following Yang and Wangs(2009a)proce- recommendation system to require for providing a dure, GCBr then analyzes the retrieved case or cases further using feasible suggestion the knowledge discovery(KDD)mechanism, which includes asso- ciation mining techniques and statistical analyses to produce potential knowledge rules and then provide decision makers with stage suggesting reference cases in each stage. This refined infor- revised case information upon which they can take action. Yang mation is useful for decision makers in revising their solutions. and Wang(2008)claimed that simply presenting the retrieved The third characteristic is that its performance is superior to that case or cases to decision makers is useless because the case filter of the traditional CBR algorithm because it employs a genetic algo- ing performs poorly under loose target conditions. The system rithm( GA)to keep the convergence rate stable, thereby increasing should therefore employ data mining analysis to identify the efficiency of the solution process. 3. 1. GCBR as a generalized problem-solving model Function fet_check_function=fer-check-rewrite (fer if( leveller)>1) The traditional CBR algorithms 4R steps are that CBr retrieves the feasible cases so that decision makers may either reuse the fet_check_finction=wgte,x fer_check_rewrite(next_level(fer) olution of these retrieved cases directly or revise the solution according to real applications: CBR then retains the successful case fet_check_fuanction= wgt s xfer_check_function process is similar to that of ordinary human problem solving, and /ar n or cases and the solution in the case base for further reference. this lany have applied CBr successfully to a variety of contexts during the past few decades. Please cite this article in press as: Wang, C-S,& Yang. H.-L A recommender mechanism based on case-based reasoning Expert Systems with Application 2011da06 /jeswa.201109161
stage suggesting reference cases in each stage. This refined information is useful for decision makers in revising their solutions. The third characteristic is that its performance is superior to that of the traditional CBR algorithm because it employs a genetic algorithm (GA) to keep the convergence rate stable, thereby increasing the efficiency of the solution process. 3.1. GCBR as a generalized problem-solving model The traditional CBR algorithm’s 4R steps are that CBR retrieves the feasible cases so that decision makers may either reuse the solution of these retrieved cases directly or revise the solution according to real applications; CBR then retains the successful case or cases and the solution in the case base for further reference. This process is similar to that of ordinary human problem solving, and many have applied CBR successfully to a variety of contexts during the past few decades. Its core algorithm evaluates the similarity of the target case (T) with the cases in the case base. The retrieval sub-algorithm evaluates the similarity between the target and each case in the case base by summarizing each feature’s gap, which describes that case in detail. CBR judges the similarity here by calculating the difference between each case and the target, the similarity increasing as the difference decreases. Each feature has a fet-check-function to evaluate the features’ similarity, as illustrated in Fig. 3. The fet-check-function is set to either total or partial similar according to the feature’s characteristics. For partial similarity, the check function returns a real number between 0, indicating that they are identical, and 1, indicating that they are totally different. The total similar fet-chedk-function, however, returns either 1 or 0. For example, if the target’s gender feature is female and casei is male, the gap between the target and casei would be 1. The difference between casei and the target is therefore the gap as calculated using Eq. (1). GCBR then selects the case with the smallest gap as the most feasible solution and provides it to the decision makers for reference. differenceð~Ci;~TÞ ¼ cosineðfetCi m !; fetT m !Þ ¼ fetCi m ! fetT m ! fetCi m ! 2 fetT m ! 2 ð1Þ The case-retrieval sub-algorithm needs to be revised, however, because the problems that GCBR addresses involve hierarchical levels of criteria. This study therefore proposes the recursive subalgorithm fet-check-rewrite, as shown in Fig. 3. This algorithm is a recursive one for rewriting fet-check-function in order to allow GCBR to manage the HCA problem. When the HCA feature level exceeds 1, such as in level (fet) > 1, the weighted sum of the next level replaces the fet-check-rewrite. For example, according to the IT certification example illustrated in Fig. 2 the similarity evaluation function should be altered, as shown in Fig. 4, to consider three levels recursively, so its consideration of level 3, which is the dash-block area, precedes that of level 2, which precedes that of level 1. The Fet-check-rewrite mechanism is a function that returns the revised featurecheck-function to GCBR’s core algorithm in order to evaluate the case’s similarity with the target. Fig. 5 compares the gap between the features of specific cases with the target using the fet-check-function, as shown in Eq. (2). Eq. (3) then summarizes the feature gap to evaluate the similarity between the target and each case in the case base. Fig. 6 presents Fig. 5’s Reuse algorithm. Following Yang and Wang’s (2009a) procedure, GCBR then analyzes the retrieved case or cases further using the knowledge discovery (KDD) mechanism, which includes association mining techniques and statistical analyses to produce potential knowledge rules and then provide decision makers with revised case information upon which they can take action. Yang and Wang (2008) claimed that simply presenting the retrieved case or cases to decision makers is useless because the case filtering performs poorly under loose target conditions. The system should therefore employ data mining analysis to identify demographic data capability learning path Degree Gender age Achieved certification Working experience Unit time unit serial wgt=0.2 wgt=0.5 wgt=0.3 wgt=0.5 wgt=0.2 wgt=0.3 wgt=0.4 wgt=0.6 wgt=0.6 wgt=0.4 wgt=0.6 wgt=0.4 target Fig. 2. An IT certification recommender problem described by HCA. Table 1 GCBR variables definitions and description. Variable Definition and description n The number of cases in the case base Ci The ith case of the case base, i = 1,2,...,n fet Features used to describe a case m The number of features each case employs T The target case inputted by the decision makers. The recommender mechanism provides them with feasible reference cases according to the target case’s condition fetj To present the jth feature, that fetCi j the jth feature of case i fetT j the jth feature of target ( i ¼ 1; 2; ... ; n; j ¼ 1; 2; ... ; m wgtfet Decision makers can assign an importance weighting to the feature, wgtfetj , to denote the importance weight of the jth feature; j = 1,2,...,m difference(Ci) difference(Ci) is an array that records the degree of difference between the ith case and the target case (T), which is evaluated by Eq. (5), i = 1,2,...,n gap fetCi j ; fetT j To evaluate the jth feature gap of the ith case and T, i = 1, 2,...,n, j = 1, 2,...,m, by Eq. (4) level(fetj) To return to the condition that decision makers set on the jth feature threshold Decision makers can set the gap threshold. If the difference in a case is less than the threshold, it provides the case as a reference next_level(fetj) To retrieve the next level feature of the jth feature fet_check_function To evaluate the gap between the jth feature and T k The numbers of stages, k, that decision makers expect the recommendation system to require for providing a feasible suggestion Function fet_check_function= fet-check-rewrite (fet) if ( level(fet)>1) fet_check_function= _ _ ( _ ( )) ∑wgt fet check rewrite next level fet fet × else fet_check_function=wgt fet × fet _ check _ function end if; end Fig. 3. Fet-check-rewrite algorithm. C.-S. Wang, H.-L. Yang / Expert Systems with Applications xxx (2011) xxx–xxx 3 Please cite this article in press as: Wang, C.-S., & Yang, H.-L. A recommender mechanism based on case-based reasoning. Expert Systems with Applications (2011), doi:10.1016/j.eswa.2011.09.161
ARTICLE N PRESS C-S. Wang, H.-L Yang/ Expert Systems with Applications xo(2011)xx-XXx knowledge with the potential to assist in decision making. Except for the retrieved cases, KDD results can also provide decision mak- ers with refined information for revising actions in order For j=I to m if level( fer)>1) prove the quality of their decision fet _check function= fet_check_rewrite(fer ) GCBR as a prophet recommender: multiple stages of gap(fer fer)=fet _check_finction Equation(2) A next-generation recommender should also support multiple C)=2, tages of recommendations. People are likely to face multi-stage decision-making problems in such situations as making travel plans for several days, in which they need a recommender mecha- Recommendation_Cases=Reuse(difference, threshold); lism that provides a suggestion package with detailed Potential_rules, Sol_ Suggestion )=KDD( Recommendation_cases ommendations for each stage. Almost all current recommender mechanisms, however, involve only Fig. 5. GCBR's single-stage reasoning algorithm. muir. 7 shows how GCBR provides a series of cases to support multi-stage decision making. Few previous studies have paid mud attention to multi-stage decision making. Smyth, Keane, and CunI Function Recommendation_Cases=Reuse(difference, threshold) Recommendation_case=d ingham(2001)described the technique of hierarchical case-based For i=l to n asoning, which borrows ideas from hierarchical planning and If(difference( Ci)< threshold, es a divide-and-conquer strategy to enable the solution of Recommendation case=Recommendation case+Ci omplex problems by reusing multiple cases at various levels of abstraction along an abstract-to-concrete continuum. The employed this technique to design device-control or process / return the most similar one even if the difference cannot satisfy the control software for industrial applications. Their focus differs, however, from multi-stage decision making in the real world. To find Cg having minimal difference in difference( Ci)i=1, 2. lulti-stage case-based recommender, the target Recommendation case=Ce equirements should be rewritten in each stage according to previ- us actions or responses. The target-rewrite-mechanism is a core End reuse algorithm applicable to g, as Fi ustrates Fig. 6. The single-stage reasoning algorithms reuse algorithm. Fig. 8 illustrates an overall case feature that includes a con- umption feature an accumulation feature a replacement feature, and a feature for other factors. With the first three, the actions or budget feature to obtain the new available budget. It also responses of each stage change the feature values of the next Per- to incorporate the site visited in the previous stage into th forming the recommender for the next stages recommendations been-sightseeing feature, as most travelers do not want to therefore req dly visit the same sites during a short vacation. In order to provide writing the target features, so the target- recommendations for the next stage it should therefore revise the rewrite-mechanism is able to call the fet-changecheck-function heck for any changes necessary. The algorithm then alters these target to recognize the previous stages'actions and responses features and generates a new target the next stage of reasoning ig. 9 presents an example involving sightseeing in which the 3.3. GCBR improving efficiency via GA available budget, which is the consumption feature, for travel plan ning decreases with each stage, while the ever-been-sightseeing The GCBR algorithms overall complexity exceeds that of tradi- factor, which is the accumulation feature, increases accordingly. tional CBr in order to fulfill its reasonings general and forecasting The algorithm therefore needs to alter the target before performing potential, even with the adoption of the revised CBr(2008). The the next case-recommendation stage GCBR must first deduct the traditional CBr's reasoning process compares every case in the case previous stage's sightseeing entrance fees from the available base in order to obtain feasible cases to refer to the decision Level difference(C, T)=0. 2x DI Check (C, T)+0. 5xPC Check (Ch T)+0.3xLP Check (CT) Leve/2 DI Check(ChT=0.5× AD Check.D+0.2× GD Check(C,m+0.3× AG Check(ChT) PC_ Check(Ch T)10. 6x CT_ Check(Ch T)+0.4x wY_ Check(CT) LP Check(CT=0.6× LU Check(C,D+0.4× LT Check(c Level 3 CT Check(C, T)=0.6x CU Check(Ch T)+0.4x CS Check(Ch T) Fig. 4. Example of a similarity evaluation in an HCA problem. Please cite this article in press as: Wang, C-S, Yang. H.-L A recommender mechanism based on case-based reasoning Expert Systems with Applicatior (201).do1010160eswa201109.161
knowledge with the potential to assist in decision making. Except for the retrieved cases, KDD results can also provide decision makers with refined information for revising actions in order to improve the quality of their decisions. 3.2. GCBR as a prophet recommender: multiple stages of recommendations A next-generation recommender should also support multiple stages of recommendations. People are likely to face multi-stage decision-making problems in such situations as making travel plans for several days, in which they need a recommender mechanism that provides a suggestion package with detailed action recommendations for each stage. Almost all current case-based recommender mechanisms, however, involve only single-stage reasoning. Fig. 7 shows how GCBR provides a series of cases to support multi-stage decision making. Few previous studies have paid much attention to multi-stage decision making. Smyth, Keane, and Cunningham (2001) described the technique of hierarchical case-based reasoning, which borrows ideas from hierarchical planning and uses a divide-and-conquer strategy to enable the solution of complex problems by reusing multiple cases at various levels of abstraction along an abstract-to-concrete continuum. They employed this technique to design device-control or processcontrol software for industrial applications. Their focus differs, however, from multi-stage decision making in the real world. To implement a real multi-stage case-based recommender, the target requirements should be rewritten in each stage according to previous actions or responses. The target-rewrite-mechanism is a core algorithm applicable to multiple stages of reasoning, as Fig. 8 illustrates. Fig. 8 illustrates an overall case feature that includes a consumption feature, an accumulation feature, a replacement feature, and a feature for other factors. With the first three, the actions or responses of each stage change the feature values of the next. Performing the recommender for the next stage’s recommendations therefore requires rewriting the target features, so the targetrewrite-mechanism is able to call the fet-changecheck-function to check for any changes necessary. The algorithm then alters these features and generates a new target the next stage of reasoning. Fig. 9 presents an example involving sightseeing in which the available budget, which is the consumption feature, for travel planning decreases with each stage, while the ever-been-sightseeing factor, which is the accumulation feature, increases accordingly. The algorithm therefore needs to alter the target before performing the next case-recommendation stage. GCBR must first deduct the previous stage’s sightseeing entrance fees from the available budget feature to obtain the new available budget. It also needs to incorporate the site visited in the previous stage into the everbeen-sightseeing feature, as most travelers do not want to repeatedly visit the same sites during a short vacation. In order to provide recommendations for the next stage it should therefore revise the target to recognize the previous stages’ actions and responses. 3.3. GCBR improving efficiency via GA The GCBR algorithm’s overall complexity exceeds that of traditional CBR in order to fulfill its reasoning’s general and forecasting potential, even with the adoption of the revised CBR (2008). The traditional CBR’s reasoning process compares every case in the case base in order to obtain feasible cases to refer to the decision Fig. 4. Example of a similarity evaluation in an HCA problem. For i=1 to n For j=1 to m if ( level( T j fet ) >1) _ _ _ _ ( ); j j fet check function fet check rewrite fet = end if; gap fet fet fetj check function T j C j i ( , ) = _ _ Equation (2) next j; ( ) ( , ) 1 T j m j C i j difference C gap fet fet i ∑= = Equation (3) next i; Recommendation_Cases=Reuse(difference,threshold); (Potential_rules, Sol_Suggestion)=KDD(Recommendation_cases); Fig. 5. GCBR’s single-stage reasoning algorithm. Function Recommendation_Cases=Reuse(difference,threshold); Recommendation_case={}; For i=1 to n If (difference(Ci) < threshold) Recommendation_case=Recommendation_case+Ci; End if Next i; If Recommendation_case={}; /* return the most similar one even if the difference cannot satisfy the threshold */ find Cq having minimal difference in difference(Ci) i=1,2,..,n Recommendation_case=Cq; endif End reuse Fig. 6. The single-stage reasoning algorithm’s reuse algorithm. 4 C.-S. Wang, H.-L. Yang / Expert Systems with Applications xxx (2011) xxx–xxx Please cite this article in press as: Wang, C.-S., & Yang, H.-L. A recommender mechanism based on case-based reasoning. Expert Systems with Applications (2011), doi:10.1016/j.eswa.2011.09.161
ARTICLE N PRESS C-S. Wang, H.-L Yang/ Expert Systems with Applications xxx(2011)XXx-XXx Case bases Stage ll.. Stage k evaluate Retain, tarErI' Mechanism New Target Requirer Retrieved cases Target Rewrite k-stages recommender Reuse (Series case(s) Recommendation) Refined recommendation KDD Retrieved cases Mechanism Fig. 7. The multiple stage GCBR process, including KDD. For p=l to k a GA to customize housing plans, Shin and Han(1999)used one to support CBR in order to enhance classification accuracy, and Yang difference( C) Stage Reasoning( Case-Base, T, thresh and Wang(2009a), Yang and wang(2009b)also successfully com- =1,2n bined a ga with cbr to accelerate case evaluation. These approaches integrating GAs with CBR have exhibited superior per- (Potential_rules, Sol_ Suggestion)kDD(Recommendation_ cases) ormance. It therefore seems to be a good method of improving CBR efficiency. For j=l to m If( Type( fer )=consumption) drec∑wxg(rr i=1.2 n elseif (Typed fer )=accumulation) elseif (Typer fet, )=replacement) p(fetf fet) 0. if no gap 1. otherwise This study also implemented gCBR using a genetic algorithm that Nerr j: expressed the HCa problem using goal programming. GCBR Next k employed the goal gap in Eqs. (4)and (5)as a fitness function. It fu ther regarded the gap between casei and t as the survival probabil- Fig 8. The GCBR's multi-stage reasoning algorithm. ity and used it in the evolution of the next generation. If case: has the smallest gap from target, then this study regards it as an out standing gene, and thereby has a higher probability of survival. It Function New Budget= budget changecheck(view, budget) has, furthermore, adopted the robin- wheel selection mechanism If(budget>view. ticket) to perform GA selection. The higher the survival probability, there- New budget=budget-view ticker fore, the higher the possibility that the gene or genes could persist to the final generation. Finally, the best chromosome, which con- sists of the fittest gene, represents a series of retrieved cases for End budger-changecheck: 4. An experiment and an illustrated scenario Fig. 9. The sightseeing example's target-rewriting- mechanism in the budget- changecheck-function algorith We conducted an experiment to validate the general GCBr equiring a decision becomes more com- model's efficiency by validating its characteristics. To do this, we plicated, however, the reasoning process is likely to become esigned an HCa problem, implemented the model on gCBr, and increasingly time-consuming. As the number of features increases, compared its experimental efficiency with that of traditional CBr. furthermore, the evaluation of the similarities between the cases This section also presents a scenario illustrating a proposed it cer- and the target takes up an incr amount of computer mem- tification path to explain the recommender's multiple stages ory, particularly if the problem s description is in the HCa style. GCBR's efficiency therefore needs to improve in order to enable it 4. 1. Experiment 1: travel case recommender o function well Some works have integrated CBR with other artificial intelli We obtained the experimental cases from a free online dataset gence techniques. Juan, Shin, and Perng(2006)combined CBR with called Travel. Each of the 1024 cases had 14 features and the three Please cite this article in press as: Wang, C-S,& Yang. H.-L A recommender mechanism based on case-based reasoning Expert Systems with Application 2011da06 /jeswa.201109161
makers. As the problem requiring a decision becomes more complicated, however, the reasoning process is likely to become increasingly time-consuming. As the number of features increases, furthermore, the evaluation of the similarities between the cases and the target takes up an increasing amount of computer memory, particularly if the problem’s description is in the HCA style. GCBR’s efficiency therefore needs to improve in order to enable it to function well. Some works have integrated CBR with other artificial intelligence techniques. Juan, Shin, and Perng (2006) combined CBR with a GA to customize housing plans, Shin and Han (1999) used one to support CBR in order to enhance classification accuracy, and Yang and Wang (2009a), Yang and Wang (2009b) also successfully combined a GA with CBR to accelerate case evaluation. These approaches integrating GAs with CBR have exhibited superior performance. It therefore seems to be a good method of improving CBR efficiency. differenceðCiÞ ¼ X t wt gap fetCi j ; fetT j ; i ¼ 1; 2; ... ; n ð4Þ gap fetCi j ; fetT j ¼ 0; if no gap 1; otherwise ð5Þ This study also implemented GCBR using a genetic algorithm that expressed the HCA problem using goal programming. GCBR employed the goal gap in Eqs. (4) and (5) as a fitness function. It further regarded the gap between casei and T as the survival probability and used it in the evolution of the next generation. If casei has the smallest gap from target, then this study regards it as an outstanding gene, and thereby has a higher probability of survival. It has, furthermore, adopted the robin-wheel selection mechanism to perform GA selection. The higher the survival probability, therefore, the higher the possibility that the gene or genes could persist to the final generation. Finally, the best chromosome, which consists of the fittest gene, represents a series of retrieved cases for the decision makers’ reference. 4. An experiment and an illustrated scenario We conducted an experiment to validate the general GCBR model’s efficiency by validating its characteristics. To do this, we designed an HCA problem, implemented the model on GCBR, and compared its experimental efficiency with that of traditional CBR. This section also presents a scenario illustrating a proposed IT certification path to explain the recommender’s multiple stages. 4.1. Experiment 1: travel case recommender We obtained the experimental cases from a free online dataset called Travel. Each of the 1024 cases had 14 features and the three Case Bases Evaluate Mechanism Decision Maker Retain Target Requirement Retrieved Cases Target Rewrite Mechanism New Target Requirement KDD Mechanism Refined Recommendation Reuse (Series case(s) Recommendation) Stage I Stage II…Stage k k-stages recommender Retrieved Cases Fig. 7. The multiple stage GCBR process, including KDD. For p=1 to k difference( ) Ci =Single_Stage_Reasoning (Case-Base, T, threshold), i=1,2,…n; Recommendation_Cases=Reuse(difference,thershold); (Potential_rules, Sol_Suggestion)=KDD(Recommendation_cases); For j=1 to m If ( Type( T j fet )=consumption) _ ( )j T j T j fet = fet − Sol suggestion fet elseif (Type( T j fet )=accumulation) fet fet j Sol suggestion T j = + _ ( )j fet elseif (Type( T j fet )=replacement) fet Sol suggestion T j = _ ( )j fet Next j; Next k; Fig. 8. The GCBR’s multi-stage reasoning algorithm. Function New_Budget= budget_changecheck (view, budget) If (budget>view.ticket) New_budget=budget-view.ticket else New_budget=0 End if End budget-changecheck; Fig. 9. The sightseeing example’s target-rewriting-mechanism in the budgetchangecheck-function algorithm. C.-S. Wang, H.-L. Yang / Expert Systems with Applications xxx (2011) xxx–xxx 5 Please cite this article in press as: Wang, C.-S., & Yang, H.-L. A recommender mechanism based on case-based reasoning. Expert Systems with Applications (2011), doi:10.1016/j.eswa.2011.09.161