In summary, each observation in our dataset contains the hotel ID, week ID, number of displays, number of clicks, number of conversions, average screen position (i.e., rank on the result page), average page number, and the corresponding service-, location-, and review-related characteristics for that hotel in that week. For a better understanding of the variables in our setting, we present the definitions and summary statistics of our data variables in Table 1.

4. Hierarchical Bayesian Model

In this section, we discuss how we develop our simultaneous model in a hierarchical Bayesian framework. We then describe how we apply Markov Chain Monte Carlo (MCMC) methods (Rossi and Allenby 2003) to empirically estimate the impact of the search engine ranking mechanism on consumer search and purchase behavior.

Our model is motivated by the work of Ghose and Yang (2009). The general idea is as follows. We propose a simultaneous model of click-through, conversion, and rank. We model click-through and conversion behavior as a function of hotel brand, price, rank, page, sorting criteria, and hotel characteristics (available from either the hotel search summary page or the hotel landing page, depending on the stage in the search process). The rank of a hotel is modeled as a function of hotel brand, price, sorting criteria, hotel characteristics that are available from the hotel landing page, and performance metrics such as the previous conversion rate. Each function contains an unobserved error that is assumed to be normally distributed with mean zero. To capture the unobserved covariation among click-throughs, conversions, and rank, we assume that the three error terms are correlated and follow a multivariate normal distribution with mean zero. More specifically, our model can be described as follows.⁴

⁴ As a robustness check, we also tried a count data model, the Poisson model. The qualitative nature of our results stays consistent. For brevity, we do not describe it in this paper. The results are available upon request.

4.1. Model Setup

First, we define our unit of observation to be the “hotel-week.” Thus, for hotel $j$ in week $t$, assume that there are $n_{jt}$ click-throughs among $N_{jt}$ displays ($n_{jt} \le N_{jt}$ and $N_{jt} > 0$). Meanwhile, assume that
among the $n_{jt}$ click-throughs, there exist $m_{jt}$ conversions ($m_{jt} \le n_{jt}$). We further define the probability of having a click-through to be $p_{jt}$ and the probability of having a conversion conditional on a click-through to be $q_{jt}$.

A consumer’s decision process involves two steps: in the first step, she sees a hotel displayed on the search result page and decides whether to click on it; in the second step, if she clicks on the hotel, she decides whether to purchase it. Accordingly, we would expect to observe three types of events: (1) A consumer sees a hotel but does not click or purchase. The probability of such an event is $1 - p_{jt}$. (2) A consumer sees a hotel and clicks through but does not purchase. The probability of such an event is $p_{jt}(1 - q_{jt})$. (3) A consumer sees a hotel, clicks through, and makes a purchase. The probability of such an event is $p_{jt} q_{jt}$.

Therefore, we can derive the likelihood of observing the joint occurrence of $n_{jt}$ click-throughs and $m_{jt}$ conversions, $(n_{jt}, m_{jt})$, to be the following:

$$
\begin{aligned}
\Pr(n_{jt}, m_{jt}, p_{jt}, q_{jt})
&= C_{N_{jt}}^{n_{jt}} \, p_{jt}^{\,n_{jt}} (1 - p_{jt})^{N_{jt} - n_{jt}} \cdot C_{n_{jt}}^{m_{jt}} \, q_{jt}^{\,m_{jt}} (1 - q_{jt})^{n_{jt} - m_{jt}} \\
&= \frac{N_{jt}!}{m_{jt}! \, (n_{jt} - m_{jt})! \, (N_{jt} - n_{jt})!} \, (p_{jt} q_{jt})^{m_{jt}} \, [p_{jt}(1 - q_{jt})]^{n_{jt} - m_{jt}} \, (1 - p_{jt})^{N_{jt} - n_{jt}}.
\end{aligned}
\tag{1}
$$

4.2. A Simultaneous Equation Model of Click-Through, Conversion, and Rank

We model click-through, conversion, and rank simultaneously in a hierarchical Bayesian framework. In particular, we divide our model into three interacting components.

(1) Click-Through Rate Model

First, we model the probability that a consumer clicks on hotel $j$ in week $t$ as a function of rank order, page number, hotel price, and hotel characteristics that are available from the search result summary page (i.e., hotel class, customer rating, and customer review count). In addition, to control for the size of the local market, we include the total number of hotels in $j$’s city, $H_j$, as a control variable.
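As a numerical aside on equation (1), the following Python sketch evaluates its logarithm for a single hotel-week. It assumes that the click-through probability p and the conditional conversion probability q are supplied as fixed numbers strictly between 0 and 1, whereas in the model they depend on covariates. The function names and argument names are hypothetical and are not taken from the authors' estimation code. The second function evaluates the product-of-binomials form on the first line of equation (1), which should agree with the multinomial form on the second line.

```python
import math


def log_likelihood(N, n, m, p, q):
    """Log of the multinomial form of equation (1) for one hotel-week.

    N: displays, n: click-throughs, m: conversions (0 <= m <= n <= N, N > 0).
    p: click-through probability, q: conversion probability given a click.
    All names are illustrative assumptions, not the paper's code.
    """
    if not (N > 0 and 0 <= m <= n <= N):
        raise ValueError("require N > 0 and 0 <= m <= n <= N")
    if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
        raise ValueError("require 0 < p < 1 and 0 < q < 1")
    # log of N! / (m! (n - m)! (N - n)!), computed via log-gamma for stability
    log_coef = (math.lgamma(N + 1) - math.lgamma(m + 1)
                - math.lgamma(n - m + 1) - math.lgamma(N - n + 1))
    return (log_coef
            + m * math.log(p * q)                 # click and purchase
            + (n - m) * math.log(p * (1.0 - q))   # click, no purchase
            + (N - n) * math.log(1.0 - p))        # no click


def log_likelihood_factored(N, n, m, p, q):
    """Same quantity written as a product of two binomial terms."""
    def log_binom(a, b):
        # log of the binomial coefficient "a choose b"
        return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)

    log_click = log_binom(N, n) + n * math.log(p) + (N - n) * math.log(1.0 - p)
    log_conv = log_binom(n, m) + m * math.log(q) + (n - m) * math.log(1.0 - q)
    return log_click + log_conv
```

Working on the log scale avoids overflow in the factorials when the number of displays is large; summing the log-likelihood over hotel-weeks would give the data term that an MCMC sampler needs, under the assumptions stated above.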
We also include hotel brand dummies to control for unobserved hotel characteristics. Finally, to capture consumers’ particular sorting preferences, we include an additional control