Applications of recommender systems in target selection
Krishnamoorthy Srikumar; Bharat Bhasker
Journal of Targeting, Measurement and Analysis for Marketing; Oct 2004; 13, 1; ABI/INFORM Global

Received: 25th May, 2004

Krishnamoorthy Srikumar
is at the Indian Institute of Management, Lucknow. He specialises in the area of information technology and systems. He received his BE (Production Engineering) from the University of Madras. His current research interests include data mining, electronic commerce, knowledge management and supply chain management.

Bharat Bhasker
works in the area of information technology and systems. He received his degree (Electronics & Comm. Engineering) from the University of Roorkee, India, and his MS and PhD in Computer Science from Virginia Tech, USA. His key research topics include distributed heterogeneous database management systems, query optimisation in distributed and parallel DBMSs, internet applications in business and electronic commerce, agent-based electronic commerce and data warehousing.

Abstract A typical target selection problem aims at selecting prospects that are more likely to respond to a promotional campaign. A variety of target selection models that address this problem are available in the literature. This paper investigates the use of recommender systems for selecting target customers in internet business. The suggested methodology uses the concepts of collaborative filtering and data mining for effectively selecting the target customers. The methodology is experimentally evaluated on a real-life data set and its benefits demonstrated. The experimental results reveal that the suggested methodology provides better predictive capabilities compared to random target selection methods.
The methodology could be useful for e-commerce managers in devising suitable promotional strategies whenever a new product is introduced into the online store.

Krishnamoorthy Srikumar, Indian Institute of Management, Lucknow 26013, India. Tel: 191 2223 4102. E-mail: srikumar@iiml.ac.in

© Henry Stewart Publications 0967-3237 (2004) Vol. 13, 1, 61-69 Journal of Targeting, Measurement and Analysis for Marketing. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

INTRODUCTION
Database marketing involves collecting and electronically storing information about customers and products in databases. The proliferation of database marketing activities has fuelled the growth of direct marketing, which is typically targeted at a single business or individual consumer. This is in contrast to mass marketing, which is aimed at thousands or even millions of prospective customers. The data that are collected for database marketing initiatives are used to profile customers and develop effective and efficient promotional strategies.

Unlike mass marketing, direct or targeted marketing is aimed at identifying a few groups of customers who are more likely to respond to the promotional campaign. Selecting prospective customers and offering targeted promotion helps in reducing the promotional cost as well as in deriving a realistic improvement in response rates.

Suppose a mass mailing has to be sent to 100,000 customers at a promotional cost of rupees (Rs) 20 per customer. If it is assumed that 3 per cent of the customers respond to the campaign, then the total profit would be Rs. 18 lakhs (one lakh = Rs 100,000), taking profit at Rs 600 per customer. The total cost,
however, of the promotional campaign is Rs. 20 lakhs (promotional cost at Rs 20 per customer), leading to a net loss of Rs. 2 lakhs. On the other hand, if target selection is carried out and 10 per cent of the customers are targeted, the total cost of promotion would be Rs. 2.5 lakhs. Due to target selection, the percentage of respondents is likely to increase. So, assuming a nominal increase of 4 per cent (ie from 3 per cent to 7 per cent), the total profits would be Rs. 4.2 lakhs. This leads to a net gain of Rs. 1.7 lakhs due to the targeted promotional campaign (refer to Table 1 for sample cost computations). As illustrated in Table 1, mass marketing may yield losses while target marketing (which is aimed at a few prospects) provides significant gains.

Table 1: Cost benefits of target marketing against mass marketing

S No.  Description                                  Mass marketing       Target marketing
1      Total number of customers                    100,000              10,000
2      Costs of promotion (at Rs 20 per customer)   20 lakhs             2 lakhs
3      Cost of target selection                     -                    0.5 lakhs
4      Total cost                                   20 lakhs             2.5 lakhs
5      Number of customers responded                3,000 (3 per cent)   700 (7 per cent)
6      Profits (at Rs 600 per customer)             18 lakhs             4.2 lakhs
7      Net gain/(loss)                              (2) lakhs            1.7 lakhs

Related work
The target selection methods in the literature can be broadly classified into two categories: segmentation methods and scoring methods. In the segmentation method, the customer database is split into different groups based on the similarities of their features. An estimate of the response percentage of each group is computed. The customers within the group having the higher response percentage are then selected for targeted mailings. In scoring methods, a separate score is assigned to each individual based on their likelihood of responding to the promotional campaign. The customers with higher scores are then selected for targeted mailings.

The techniques used for target selection in the literature include regression, decision trees, neural networks and fuzzy logic. Bauer and Kaymak explore the use of RFM (recency, frequency and monetary) variables for efficient target selection. RFM variables capture the purchase behaviour with a relatively smaller number of features.

Problem statement and motivation
The traditional approach to the target selection problem makes use of a set of explanatory features, which is built on customers' past history (such as purchase details) or trial campaign results, for building the prediction model. The model is then used to identify the likelihood of customers responding to the promotional campaign. In this paper, a specific class of this target selection problem is addressed, viz identifying prospects when a new product is introduced into a particular category. One approach to handling this problem is to explicitly ask for the customer's interest in a set of product categories and send them targeted promotions as and when a new product is introduced in the
categories of interest to customers. The likely problem in such a simple approach is that not all potential buyers may solicit promotional mails for new product introductions in a category. Here, a structured methodology is presented to identify the prospects that are more likely to respond to a promotional campaign, especially for new product introductions within a category of an online retail store.

A typical recommender system, which is aimed at generating recommendations at product category level, profiles customers and identifies a set of likely products (categories of product) that are of interest to them. These systems also generate Top-N products (categories of product) as recommendations in a ranked order. Apart from providing recommendations, recommender systems can also provide rich insights in identifying prospects that are likely to purchase a new product. Here, the use of such a novel methodology is investigated for this specific class of target selection problem in e-commerce.

The primary contributions of this paper are as follows: first, a novel methodology for target selection in internet business using recommender systems is suggested. The methodology uses basic concepts of collaborative filtering and data mining for effectively selecting the target customers. Secondly, the methodology is experimentally evaluated on a real-life data set from a retailer in India, and its benefits demonstrated. The suggested system can serve as a useful tool for e-commerce managers in devising effective promotional strategies.

The organisation of the rest of this paper is as follows: the following section describes the definitions, notations and assumptions used in this paper. The methodology for target selection is deliberated on in the third section. The section following that is devoted to experimental evaluation of the system and a discussion of its implications. The paper concludes with a summary.

PRELIMINARIES
This section describes the definitions, notations and assumptions used in this paper. The notations used are made distinct by setting them in bold and italics throughout this paper.

P. A set of products (categories of products) in the database is denoted as P = {P1, P2, ..., Pn}, where n is the total number of product categories in the database. The product categories are chosen in such a way that there are only Stock Keeping Units (SKUs) or brand names of products below this level. For example, there is a taxonomy of products available in the database as shown in Figure 1. In Figure 1, at level 1 (root), there is the personal care and grooming category. At level 2, there are the dental care and hair care product categories. At level 3, each of the product categories in level 2 has a set of other product categories. Below this level there are varieties of products (SKUs/brand names of products). So, the total number of products in level 3 (in this example) is taken as the total number of products in the database.

TgtP. The target product is the product for which customers need to be selected and is denoted as TgtP (TgtP ∈ P).

CustomerDB. The customer database, denoted as CustomerDB, consists of customers' purchase details for the products in P. More specifically, the database has data of the form <Ck, Pi> for each customer Ck in C. The Pi's
take on the value of the count of purchases. A better understanding of this notation can be derived with the help of the sample customer database presented in Table 2.

Figure 1 Product taxonomy for personal care and grooming category. Level 1: Personal care and grooming (PCG). Level 2: Dental care (DC), Hair care (HC). Level 3: Toothpastes, Toothbrushes (TB), Dental care others (DCO), Shampoos and conditioners.

S. The campaign size used for target selection is denoted as S.

TrainDB and TestDB. For the target product, TgtP, the customer database (CustomerDB) is split into training and test databases, referred to as TrainDB and TestDB respectively.

SimU. The total number of similar users identified in the collaborative filtering process is denoted as SimU.

Collab_UserDB. The set of similar collaborative users identified in the collaborative filtering step is denoted as Collab_UserDB and consists of a set of collaborative users for the target customer. That is, it is of the form <Ci, Cj> where j = 1 to SimU and Ci ≠ Cj.

SimMetric. The similarity metric used by the system is denoted as SimMetric. In this paper, the cosine similarity metric has been utilised for experimentation.

The notations specific to association rule mining are now discussed. The parameters used in association rule mining, viz minimum support and minimum confidence, are denoted as minsupport and minconf respectively. In this paper, the default minimum support value is tied to the frequent 1-itemset of the target product, TgtP. The default minimum confidence value is chosen as 50 per cent, although the choice can be made flexibly.

MaxRules. The maximum number of rules that is generated is denoted as MaxRules. In this paper, the default value of MaxRules is set as 100,000. This is done to reduce the performance bottleneck of the system. The system, however, can be experimented with various other values.

The rules are scored in this approach as the product of the support and confidence of the rule, that is, score = support × confidence. Lin has utilised this method of scoring in the literature.

N. The total number of products that needs to be generated as recommendations is denoted as N. In this
In this 4 Jounal of Targeting, Measurement and Analysis for Marketing Vol. 13, 1. 61-69 0 Henry Stewart Publications 0967-3237(2004) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission
paper, the default value of N is taken as ten, ie Top-10 products are generated as recommendations.

Table 2: A sample customer database (rows: customers; columns: products P1 to P7; entries: purchase counts)

TARGET SELECTION
The complete pseudo-code for target selection is provided in Figure 2. The methodology for target selection in this approach is described in five steps as follows:

1 Initial processing: From the given customer database, separate out the details of the target product, TgtP. The target product values are converted to binary values (ie 0s and 1s) based on whether the product has been purchased by the customer or not. The customer database is then split into training and test databases. Now, the objective in target selection is to identify the prospects in the test database that are likely to respond to the promotional campaign.

2 Collaborative user identification: For each user Ci in TestDB, identify its collaborative users. Collaborative users are identified using the collaborative filtering process in two steps, viz (1) Compute similarities between Ci and every user in the training database, TrainDB. Similarities are computed using the cosine similarity metric in this paper. (2) Select the SimU users who have higher similarity values. For every user in the test database, TestDB, the prediction for purchase of the target product, TgtP, is then computed. It is computed as the most common value of the collaborative users' target product, TgtP, value in the training database, TrainDB.

3 Rule mining: For a customer Ci in the collaborative user database, Collab_UserDB, extract his/her collaborative users. Then, for the selected users, cull their purchase details from the training database, TrainDB. The resulting data are mined to generate association rules with the following constraints: (1) the rule consequent has a single item, which is TgtP; (2) the maximum number of rules generated is less than or equal to MaxRules. The generated rules are then scored and sorted in descending order of their scores. The score for a rule is computed as the product of its support and confidence values.

4 Response prediction: From the rules generated, extract Top-N products based on their scores. The cumulative score is taken as the response prediction score for that customer. The response prediction score is computed for every customer in the test database, TestDB, by executing steps 2 and 3 above (ie collaborative user identification and rule mining).

5 Target selection: All customers in the test database, TestDB, are sorted in non-increasing order of their response prediction scores. Now, using the