myVU: A Next Generation Recommender system Based on observed Consumer Behavior and Interactive Evolutionary Algorithms Andreas Geyer-Schulz, Michael Hahsler, and Maximillian Jahn' Abteilung fur Informationswirtschaft Institut fur Informationsverarbeitung und Informationswirtschaft Abstract. my vU is a next generation recommender system based on observed consumer behavior and interactive evolutionary algorithms implementing customer relationship man- gement and one-to-one marketing in the educational and scientific broker system of a vir- tual university. my VU provides a personalized, adaptive www-based user interface for all mbers of a virtual university an scientific and educational Web-sites 1 Introduction In this article we describe my VU, a next generation recommender system based on observed consumer behavior and interactive evolutionary algorithms and report first results on its usage. my VU is a prototype of an educational and scientific rec- ommender system for providing routine recommendation and consulting services in mass universities to meet the challenge of reengineering the university posed in Tsichritzis(1999). my VU has been developed with the explicit goal of supporting life-long learning for large parts of the population and, therefore, must address the problems caused by the resulting high heterogeneity in students' background, capa bilities, and previous experience. In this article, however, we emphasize the role of evolution in the design of such systems and concentrate on the evolution strategies embedded into my VU. In section 2 we critically review existing interactive evo- lutionary algorithms and we identify the two main obstacles for their application to web-site personalization and design. Section 3 has the aim of explaining, ho evolutionary algorithms can be integrated in such a system: We treat a virtual uni- versity as an information market and base the evolutionary algorithm shaping the user interface on well-known statistics which are widely used in the retail industry namely ABC-analysis, market-basket analysis(see Blischok(1995))and consumer purchase histories(and -in the near future stochastic consumer choice models gner and Taudes(1987)). These statistics are computed from ap. propriately instrumented Web-server transaction logs. The evolutionary operators, namely fitness-biased selection and various kinds of mutation operators, influence the design and layout of the user-interface and thus influence user behavior. Ob usly, my VU can serve as a model of the business to customer interface of future e-commerce applications. my VU supports customer relationship management and
myVU: A Next Generation Recommender System Based on Observed Consumer Behavior and Interactive Evolutionary Algorithms Andreas Geyer-Schulz , Michael Hahsler , and Maximillian Jahn Abteilung fur¨ Informationswirtschaft, Institut fur¨ Informationsverarbeitung und Informationswirtschaft, Wirtschaftsuniversitat¨ Wien, A-1090 Wien, Austria Abstract. myVU is a next generation recommender system based on observed consumer behavior and interactive evolutionary algorithms implementing customer relationship management and one-to-one marketing in the educational and scientific broker system of a virtual university. myVU provides a personalized, adaptive WWW-based user interface for all members of a virtual university and it delivers routine recommendations for frequently used scientific and educational Web-sites. 1 Introduction In this article we describe myVU, a next generation recommender system based on observed consumer behavior and interactive evolutionary algorithms and report on first results on its usage. myVU is a prototype of an educational and scientific recommender system for providing routine recommendation and consulting services in mass universities to meet the challenge of reengineering the university posed in Tsichritzis (1999). myVU has been developed with the explicit goal of supporting life-long learning for large parts of the population and, therefore, must address the problems caused by the resulting high heterogeneity in students’ background, capabilities, and previous experience. In this article, however, we emphasize the role of evolution in the design of such systems and concentrate on the evolution strategies embedded into myVU. In section 2 we critically review existing interactive evolutionary algorithms and we identify the two main obstacles for their application to web-site personalization and design. Section 3 has the aim of explaining, how evolutionary algorithms can be integrated in such a system: We treat a virtual university as an information market and base the evolutionary algorithm shaping the user interface on well-known statistics which are widely used in the retail industry, namely ABC-analysis, market-basket analysis (see Blischok (1995)) and consumer purchase histories (and – in the near future – stochastic consumer choice models, for a survey see Wagner and Taudes (1987)). These statistics are computed from appropriately instrumented Web-server transaction logs. The evolutionary operators, namely fitness-biased selection and various kinds of mutation operators, influence the design and layout of the user-interface and thus influence user behavior. Obviously, myVU can serve as a model of the business to customer interface of future e-commerce applications. myVU supports customer relationship management and
Geyer-Schulz et al. one-to-one marketing as described in Kelly(1998)to a considerable extent. The evolutionary algorithm described above is common to all my VU recommender ser vices. All my VU recommender services available today(March 2000)are presented in section 4 together with a first analysis of their usage. Finally, we describe possible future improvements of the system 2 Interactive Evolutionary Algorithms for design In the opening lecture of the 1997 Genetic Programming Conference John Koza identified in his outlook on the future of evolutionary algorithms web-site design as one of the most promising(commercial) application areas. However, not much has been achieved in this area in the last three years. To be honest, we must be a little bit more precise. Interactive evolutionary and genetic algorithms have flourished in mu- sicandartsSeee.g.theweb-siteofCraigReynolds(http://www.red.com/cwr/evolve.html for examples on the www An evolutionary algorithm is characterized by a select-and-mutate approach with a population size of 1, a genetic algorithm in addition by a crossover opera- tor and a population size larger than 1. In this paper, however, we will neglect these differences. The reader may think of an evolutionary algorithm as a border line vari- ant of a genetic algorithm with zero probability of crossover and a population size 3 A survey of applications of interactive evolutionary and genetic algorithms for erspective is presented in Takagi(1996a)an akagi(1996b). However, we are not aware of any application of interactive evo- lutionary or genetic algorithms in the area of web-site personalization for e.g. cus- tomer relationship management and one-to-one marketing. Why Before we answer this question, we give a short review of the two generations of nteractive evolutionary and genetic algorithms used today by artists, musicians and engineers. The main feature of the first generation of these algorithms is that the user of such an algorithm is required to explicitely and directly assign a fitness value to each of the objects bred by the algorithm as shown in Figure 1. The ancestor of these algorithms is Richard Dawkins"Blind Watchmaker"which bred simple tree-based forms-so called biomorphs- with an interactive select-and-mutate approach. See Dawkins(1986). Smith's(1991)interactive genetic algorithm for breeding bug like creatures and Caldwell and Johnston's(1991)approach of assisting a witness in building a facial composite of a criminal suspect with an interactive genetic algo- rithm are two additional representatives of this class of algorithms. Combined with a virtual reality environment and bio-sensors as proposed by Hafner and RoBler (1995)interactive evolutionary al gorithms serve as advanced industrial design envi ronments The class of interactive evolutionary algorithms requires that we can phrase the problem as a search through some parameter space, that we can generate new can- didate solutions in near real-time and that the utility of those candidate solutions can easily be compared by humans, but not by means of a precisely defined fitness
2 Geyer-Schulz et al. one-to-one marketing as described in Kelly (1998) to a considerable extent. The evolutionary algorithm described above is common to all myVU recommender services. All myVU recommender services available today (March 2000) are presented in section 4 together with a first analysis of their usage. Finally, we describe possible future improvements of the system. 2 Interactive Evolutionary Algorithms for Design In the opening lecture of the 1997 Genetic Programming Conference John Koza identified in his outlook on the future of evolutionary algorithms web-site design as one of the most promising (commercial) application areas. However, not much has been achieved in this area in the last three years. To be honest, we must be a little bit more precise. Interactive evolutionary and genetic algorithms have flourished in music and arts. See e.g. the web-site of Craig Reynolds(http://www.red.com/cwr/evolve.html) for examples on the WWW. An evolutionary algorithm is characterized by a select-and-mutate approach with a population size of ✁ , a genetic algorithm in addition by a crossover operator and a population size larger than ✁ . In this paper, however, we will neglect these differences. The reader may think of an evolutionary algorithm as a border line variant of a genetic algorithm with zero probability of crossover and a population size of 1. A survey of applications of interactive evolutionary and genetic algorithms for system design from an engineering perspective is presented in Takagi (1996a) and Takagi (1996b). However, we are not aware of any application of interactive evolutionary or genetic algorithms in the area of web-site personalization for e.g. customer relationship management and one-to-one marketing. Why? Before we answer this question, we give a short review of the two generations of interactive evolutionary and genetic algorithms used today by artists, musicians and engineers. The main feature of the first generation of these algorithms is that the user of such an algorithm is required to explicitely and directly assign a fitness value to each of the objects bred by the algorithm as shown in Figure 1. The ancestor of these algorithms is Richard Dawkins’ “Blind Watchmaker” which bred simple tree-based forms – so called biomorphs – with an interactive select-and-mutate approach. See Dawkins (1986). Smith’s (1991) interactive genetic algorithm for breeding bug like creatures and Caldwell and Johnston’s (1991) approach of assisting a witness in building a facial composite of a criminal suspect with an interactive genetic algorithm are two additional representatives of this class of algorithms. Combined with a virtual reality environment and bio-sensors as proposed by Hafner ¨ and Roßler ¨ (1995) interactive evolutionary algorithms serve as advanced industrial design environments. The class of interactive evolutionary algorithms requires that we can phrase the problem as a search through some parameter space, that we can generate new candidate solutions in near real-time and that the utility of those candidate solutions can easily be compared by humans, but not by means of a precisely defined fitness
my VU: A Next Generation Recommender System 日团■ Interactive User assigns Fig. 1. Early interactive genetic algorithms function. Crucial for the success of this approach is that the number of candidate solutions that humans must evaluate is kept as low as possible, because baby-sitting for an interactive-evolutionary algorithm is not a task humans volunteer for This problem- the human fitness bottleneck- is addressed by interactive evo- lutionary algorithms of the second generation with the help of a meta-level genetic own in Figure 2 N 日■ Optimize object for a given fitness function fitness function! Fitness function GA Fig. 2. Interactive(meta) genetic algorithms. The interactive evolutionary algorithm at the meta-level breeds fitness functions which are used by a second genetic algorithm at the base -level to generate new can- didate solutions. For each fitness function, only the best solution is presented to the user for evaluation. The user evolves(either implicitely or explicitely) the fitness function which captures his taste or experience or aesthetics. An example of such an interactive algorithm, where the user explicitely manipulates a set of weights which express the relative importance of design factors as structural configuration harmony with the surrounding environment, and slenderness in the design of aes- thetic bridge structures, is presented in Furuta et al. (1996). Biles et al. (1996)carry the concept one step further. In GenJam, an interactive genetic algorithm for breed ing jazz-solos, the user implicitely chooses between fitness functions in the form of neural networks. By choosing a jazz-solo, he increases the fitness of the associated neural network of evolutionary algorithm which has been employed e.g. on the International netic Art II site by John Mount, Scott Neal Reilly and Michael Witbrock(described
myVU: A Next Generation Recommender System 3 Interactive GA User Population of objects sees breeds assigns fitness Fig. 1. Early interactive genetic algorithms. function. Crucial for the success of this approach is that the number of candidate solutions that humans must evaluate is kept as low as possible, because baby-sitting for an interactive-evolutionary algorithm is not a task humans volunteer for. This problem – the human fitness bottleneck – is addressed by interactive evolutionary algorithms of the second generation with the help of a meta-level genetic algorithm as shown in Figure 2. User sees Population GA Interactive Meta-GA Manipulate fitness function! select best Optimize object for a given fitness function! Fitness function Fig. 2. Interactive (meta) genetic algorithms. The interactive evolutionary algorithm at the meta-level breeds fitness functions which are used by a second genetic algorithm at the base-level to generate new candidate solutions. For each fitness function, only the best solution is presented to the user for evaluation. The user evolves (either implicitely or explicitely) the fitness function which captures his taste or experience or aesthetics. An example of such an interactive algorithm, where the user explicitely manipulates a set of weights which express the relative importance of design factors as structural configuration, harmony with the surrounding environment, and slenderness in the design of aesthetic bridge structures, is presented in Furuta et al. (1996). Biles et al. (1996) carry the concept one step further. In GenJam, an interactive genetic algorithm for breeding jazz-solos, the user implicitely chooses between fitness functions in the form of neural networks. By choosing a jazz-solo, he increases the fitness of the associated neural network. However, the human fitness bottleneck can be tackled with a different kind of evolutionary algorithm which has been employed e.g. on the International Genetic Art II site by John Mount, Scott Neal Reilly and Michael Witbrock (described
Geyer-Schulz et al. athttp://www.cs.cmu.edu/jmount/g3.html)atCarnegieMellon.Inthisapproach many users evaluate the objects bred by the genetic algorithm, the evaluations are collected and aggregated by the fitness function. A new generation of objects is generated as soon as a certain number of evaluations have been collected or a pre- specified period of time has passed. This approach suffers from two draw-backs first. the user receives no immediate feedback and he can not see. how his evalua tions have influenced the outcome of the algorithm and, second, if users have non- homogeneous utility functions and this is the case for heterogeneous user groups the averaging process inherent in the aggregation of the fitness values may lead to art objects whose aesthetic is a disappointment for all users. This effect is even more pronounced with small population sizes The main problems for applying interactive evolutionary algorithms currently ed by 1. The human fitness bottleneck caused by the required explicit evaluation of de- signs by the users 2. The heterogeneity of user groups which leads to non-homogeneous fitness func Bottleneck and Heterogeneous User Group Fitness 3 The Design of my VU: Addressing the humai To address the problems identified in the previous section, the design of my VU is based on the metaphor of an information market(broker) and on the principle of self-assessment of experience. This is an application of the economic principle of self-selection which has been suggested as a rationale for versioning information products by Shapiro and Varian(1999). In my VU we generate recommendations on information products(web-sites) from observed purchasing behavior for infor- mation products. The key idea is identify the information market as a system and the rest of the internet as its environment whenever the user follows an external link from the information market to an information product and crosses the system boundary, this is registered as a purchase incident. Use of internal links in the broker reveals the users' preferences for broker services. Information products in the virtual university have a rich meta-data description with one or more category attributes. In addition, in my VU users have the opportunity to incrementally establish a profile of their experience with each information product category visited(self-assessment of In the following the main loop of the interactive evolutionary algorithm of my VU shown in Figure 3 is explained in more detail. Numbers in parenthesis refer to arcs in Figure 3. In my VU recommendations are usually presented as ranked and labelled ists of suggested information products( web-sites) 1. The user perceives(1)a list of labelled links to other web-sites(e.g. his personal favorites shown in Figure 4)which constitutes an element of the user interface
4 Geyer-Schulz et al. at http://www.cs.cmu.edu/˜jmount/g3.html) at Carnegie Mellon. In this approach, many users evaluate the objects bred by the genetic algorithm, the evaluations are collected and aggregated by the fitness function. A new generation of objects is generated as soon as a certain number of evaluations have been collected or a prespecified period of time has passed. This approach suffers from two draw-backs: first, the user receives no immediate feedback and he can not see, how his evaluations have influenced the outcome of the algorithm and, second, if users have nonhomogeneous utility functions and this is the case for heterogeneous user groups, the averaging process inherent in the aggregation of the fitness values may lead to art objects whose aesthetic is a disappointment for all users. This effect is even more pronounced with small population sizes. The main problems for applying interactive evolutionary algorithms currently used by artists and engineers for personalized web-design are: 1. The human fitness bottleneck caused by the required explicit evaluation of designs by the users. 2. The heterogeneity of user groups which leads to non-homogeneous fitness functions. 3 The Design of myVU: Addressing the Human Fitness Bottleneck and Heterogeneous User Groups To address the problems identified in the previous section, the design of myVU is based on the metaphor of an information market (broker) and on the principle of self-assessment of experience. This is an application of the economic principle of self-selection which has been suggested as a rationale for versioning information products by Shapiro and Varian (1999). In myVU we generate recommendations on information products (web-sites) from observed purchasing behavior for information products. The key idea is identify the information market as a system and the rest of the Internet as its environment. Whenever the user follows an external link from the information market to an information product and crosses the system boundary, this is registered as a purchase incident. Use of internal links in the broker revealsthe users’ preferences for brokerservices. Information products in the virtual university have a rich meta-data description with one or more category attributes. In addition, in myVU users have the opportunity to incrementally establish a profile of their experience with each information product category visited (self-assessment of experience). In the following the main loop of the interactive evolutionary algorithm of myVU shown in Figure 3 is explained in more detail. Numbers in parenthesis refer to arcs in Figure 3. In myVU recommendations are usually presented as ranked and labelled lists of suggested information products (web-sites). 1. The user perceives(1) a list of labelled links to other web-sites (e.g. his personal favorites shown in Figure 4) which constitutes an element of the user interface
my VU: A Next Generation Recommender System Observatic Transaction Aggregation Agents Statistics Services Fig 3. my VU as interactive evolutionary algorithm 2. By following(2)one of these links the user reveals his preference for this web site. The oberservation agents record (4)the purchase incident in the transaction log. The fitness of an information product is proportional to the number of time it has been purchased. In my vU the fitness of an information product is thus not explicitely assigned by a user, it is computed from observed user behavior 3. By incrementally revealing their experience(novice, average, advanced or ex pert) with information product categories they have visited in previous ses- sions, my VU users establish(3)their own experience profile which is used for computing group specific recommendations(6)and for selecting the appropr ate group specific recommender service(7)for the user. This mechanism ad dresses the problem of heterogeneous user groups and has been recommended in Shapiro and Varian(1999) 4. Every night aggregation agents update the retail statistics with the purchase incidents(5)from the transaction log in accordance with the experience level of the user(6). Today, practitioners call this"web-mining. Such statistics include depending on the degree of anonymity chosen by the user) (a)If we can extract anonymous user sessions from the transaction log, market baskets can be analyzed. We compute for each information product yi the frequencies that, if information product yi is in a market basket, the other information products y1, .. i-1, yi+1,. yn are also in the basket. This requency is proportional to the conditional probability P(yiyi) to buy information product yi, if information product yi has been purchased in the same session (b)If we can extract pseudonymous user sessions from the transaction log, we can establish the purchase histories of users and combine these with their experience profiles. (Note, that pseudonymous means, that we do not know the true identity of a user, we only know that it is the same user. )We com- pute for all information products in a category the frequencies with which they are bought by a novice, an average user, an advanced user, and an
myVU: A Next Generation Recommender System 5 Observation Agents Experience Profile Recommendation Agents User (Fitness Function) Transaction Log Aggregation Agents Retail Recommender Statistics Services (Biased Selection) User Interface Discover Services (Mutation) (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) Fig. 3. myVU as interactive evolutionary algorithm. 2. By following (2) one of these links the user reveals his preference for this website. The oberservation agents record (4) the purchase incident in the transaction log. The fitness of an information product is proportional to the number of times it has been purchased. In myVU the fitness of an information product is thus not explicitely assigned by a user, it is computed from observed user behavior. 3. By incrementally revealing their experience (novice, average, advanced or expert) with information product categories they have visited in previous sessions, myVU users establish (3) their own experience profile which is used for computing group specific recommendations (6) and for selecting the appropriate group specific recommender service (7) for the user. This mechanism addresses the problem of heterogeneous user groups and has been recommended in Shapiro and Varian (1999). 4. Every night aggregation agents update the retail statistics with the purchase incidents (5) from the transaction log in accordance with the experience level of the user (6). Today, practitioners call this “web-mining”. Such statistics include (depending on the degree of anonymity chosen by the user): (a) If we can extract anonymous user sessions from the transaction log, market baskets can be analyzed. We compute for each information product ✂☎✄ the frequencies that, if information product ✂☎✄ is in a market basket, the other information products ✂✝✆✟✞✠✞✟✞✠✆ ✂✄☛✡ ✝✆ ✂✄✌☞ ✍✆✠✞✟✞✠✞ ✂✏✎ are also in the basket. This frequency is proportional to the conditional probability ✑✓✒✔✂✝✕✗✖ ✂✄✙✘ to buy information product ✂✝✕ , if information product ✂✄ has been purchased in the same session. (b) If we can extract pseudonymous user sessions from the transaction log, we can establish the purchase histories of users and combine these with their experience profiles. (Note, that pseudonymous means, that we do not know the true identity of a user, we only know that it is the same user.) We compute for all information products in a category the frequencies with which they are bought by a novice, an average user, an advanced user, and an