Journal of Economic Literature, Vol. XLII (December 2004)

that employs a standard subject pool of students, an abstract framing, and an imposed[10] set of rules;

* an artefactual field experiment is the same as a conventional lab experiment but with a nonstandard subject pool;[11]
* a framed field experiment is the same as an artefactual field experiment but with field context in either the commodity, task, or information set that the subjects can use;[12]
* a natural field experiment is the same as a framed field experiment but where the environment is one where the subjects naturally undertake these tasks and where the subjects do not know that they are in an experiment.[13]

We recognize that any such taxonomy leaves gaps, and that certain studies may not fall neatly into our classification scheme. Moreover, it is often appropriate to conduct several types of experiments in order to identify the issue of interest. For example, Harrison and List (2003) conducted artefactual field experiments and framed field experiments with the same subject pool, precisely to identify how well the heuristics that might apply naturally in the latter setting "travel" to less context-ridden environments found in the former setting. And List (2004b) conducted artefactual, framed, and natural experiments to investigate the nature and extent of discrimination in the sports-card marketplace.

[10] The fact that the rules are imposed does not imply that the subjects would reject them, individually or socially, if allowed.

[11] To offer an early and a recent example, consider the risk-aversion experiments conducted by Hans Binswanger (1980, 1981) in India, and Harrison, Lau, and Williams (2002), who took the lab experimental design of Maribeth Coller and Melonie Williams (1999) into the field with a representative sample of the Danish population.

[12] For example, the experiments of Peter Bohm (1984b) to elicit valuations for public goods that occurred naturally in the environment of subjects, albeit with unconventional valuation methods; or the Vickrey auctions and "cheap talk" scripts that List (2001) conducted with sports-card collectors, using sports cards as the commodity and at a show where they trade such commodities.

[13] For example, the manipulation of betting markets by Camerer (1998) or the solicitation of charitable contributions by List and Lucking-Reiley (2002).

3. Methodological Importance of Field Experiments

Field experiments are methodologically important because they mechanically force the rest of us to pay attention to issues that great researchers seem to intuitively address. These issues cannot be comfortably forgotten in the field, but they are of more general importance.

The goal of any evaluation method for "treatment effects" is to construct the proper counterfactual, and economists have spent years examining approaches to this problem. Consider five alternative methods of constructing the counterfactual: controlled experiments, natural experiments, propensity score matching (PSM), instrumental variables (IV) estimation, and structural approaches. Define $y_1$ as the outcome with treatment, $y_0$ as the outcome without treatment, and let $T=1$ when treated and $T=0$ when not treated.[14] The treatment effect for unit $i$ can then be measured as $\tau_i = y_{i1} - y_{i0}$. The major problem, however, is one of a missing counterfactual: $\tau_i$ is unknown. If we could observe the outcome for an untreated observation had it been treated, then there is no evaluation problem.
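The missing-counterfactual problem can be illustrated with a small simulation. The following is a minimal sketch, not anything taken from the article: it assumes a constant treatment effect of 3 for every unit and normally distributed untreated outcomes, and all function and variable names are hypothetical. Because the simulator knows both potential outcomes, it can show what an analyst who observes only one of them per unit would recover under random assignment and under self-selection.

```python
import random
from statistics import mean

def simulate_units(n, seed=0):
    """Hypothetical units for which BOTH potential outcomes (y0, y1) are known."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)  # outcome without treatment (assumed distribution)
        y1 = y0 + 3.0              # outcome with treatment; assumed tau_i = 3 for all i
        units.append((y0, y1))
    return units

def observed_difference(units, treated_flags):
    """Difference in mean OBSERVED outcomes: y1 is seen only if T=1, y0 only if T=0."""
    treated = [y1 for (_y0, y1), t in zip(units, treated_flags) if t]
    control = [y0 for (y0, _y1), t in zip(units, treated_flags) if not t]
    return mean(treated) - mean(control)

units = simulate_units(10_000)
rng = random.Random(1)

# Random assignment: T is independent of the potential outcomes.
randomized = [rng.random() < 0.5 for _ in units]

# Self-selection: units with a high untreated outcome opt into treatment.
self_selected = [y0 > 10.0 for (y0, _y1) in units]

est_randomized = observed_difference(units, randomized)
est_selected = observed_difference(units, self_selected)
print(f"randomized: {est_randomized:.2f}, self-selected: {est_selected:.2f}")
```

Under these assumptions the randomized comparison should land close to the true effect of 3, while the self-selected comparison is inflated because the untreated group is no longer a valid stand-in for the treated group's counterfactual.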
"Controlled" experiments, which include laboratory experiments and field experi- ments, represent the most convincing method of creating the counterfactual, since they directly construct a control group via randomization.15 In this case, the population 14 We simplify by considering a binary treatment, but the logic generalizes easily to multiple treatment levels and continuous treatments. Obvious examples from outside economics include dosage levels or stress levels. In eco- nomics, one might have some measure of risk aversion or "other regarding preferences" as a continuous treatment. 15 Experiments are often run in which the control is pro- vided by theory, and the objective is to assess how well the- ory matches behavior. This would seem to rule out a role for randomization, until one recognizes that some implicit or explicit error structure is required in order to test theo- ries meaningfully. We return to this issue in section 8. 1014 This content downloaded from 218.106.182.180 on Sat, 11 Jun 2016 06:18:54 UTC All use subject to http://about.jstor.org/terms
Harrison and List: Field Experiments

average treatment effect is given by $\tau = y^*_1 - y^*_0$, where $y^*_1$ and $y^*_0$ are the treated and nontreated average outcomes after the treatment. We have much more to say about controlled experiments, in particular field experiments, below.

"Natural experiments" consider the treatment itself as an experiment and find a naturally occurring comparison group to mimic the control group: $\tau$ is measured by comparing the difference in outcomes before and after for the treated group with the before and after outcomes for the nontreated group. Estimation of the treatment effect takes the form $Y_{it} = X_{it}\beta + \tau T_{it} + \eta_{it}$, where $i$ indexes the unit of observation, $t$ indexes years, $Y_{it}$ is the outcome in cross-section $i$ at time $t$, $X_{it}$ is a vector of controls, $T_{it}$ is a binary variable, $\eta_{it} = \alpha_i + \lambda_t + \varepsilon_{it}$, and $\tau$ is the difference-in-differences (DID) average treatment effect. If we assume that data exists for two periods, then $\tau = (y^{t*}_{t_1} - y^{t*}_{t_0}) - (y^{u*}_{t_1} - y^{u*}_{t_0})$ where, for example, $y^{t*}_{t_1}$ is the mean outcome for the treated group in period $t_1$ (superscripts $t$ and $u$ denote the treated and untreated groups).

A major identifying assumption in DID estimation is that there are no time-varying, unit-specific shocks to the outcome variable that are correlated with treatment status, and that selection into treatment is independent of temporary individual-specific effect: $E(\eta_{it} \mid X_{it}, D_{it}) = E(\alpha_i \mid X_{it}, D_{it}) + \lambda_t$. If $\varepsilon_{it}$ and $\tau$ are related, DID is inconsistently estimated as $E(\hat{\tau}) = \tau + E(\varepsilon_{it_1} - \varepsilon_{it_0} \mid D=1) - E(\varepsilon_{it_1} - \varepsilon_{it_0} \mid D=0)$.

One alternative method of assessing the impact of the treatment is the method of propensity score matching (PSM) developed in P. Rosenbaum and Donald Rubin (1983). This method has been used extensively in the debate over experimental and nonexperimental evaluation of treatment effects initiated by Lalonde (1986): see Rajeev Dehejia and Sadek Wahba (1999, 2002) and Jeffrey Smith and Petra Todd (2000). The goal of PSM is to make non-experimental data "look like" experimental data.
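The two-period difference-in-differences calculation described above can be written out in a few lines. This is a minimal sketch using made-up numbers, with hypothetical names throughout: the data assume a common time trend of +2, a fixed level difference between groups, and a treatment effect of +3. It computes only the difference of group mean changes and omits the controls in the regression form.

```python
from statistics import mean

def did_estimate(rows):
    """Two-period DID: (treated after - treated before) - (untreated after - untreated before).

    rows: list of (group, period, outcome) tuples, with group in
    {"treated", "untreated"} and period in {0, 1}.
    """
    def cell_mean(group, period):
        return mean(y for g, p, y in rows if g == group and p == period)

    change_treated = cell_mean("treated", 1) - cell_mean("treated", 0)
    change_untreated = cell_mean("untreated", 1) - cell_mean("untreated", 0)
    return change_treated - change_untreated

# Made-up data: both groups drift up by 2 between periods; treatment adds 3.
rows = [
    ("treated", 0, 14.0), ("treated", 0, 16.0),
    ("treated", 1, 19.0), ("treated", 1, 21.0),
    ("untreated", 0, 9.0), ("untreated", 0, 11.0),
    ("untreated", 1, 11.0), ("untreated", 1, 13.0),
]

tau_hat = did_estimate(rows)
print(tau_hat)  # 3.0
```

The level difference between the groups (15 versus 10 before treatment) drops out because only within-group changes are compared, which is the role played by the unit effect in the error structure. The estimate is only as good as the assumption that the untreated group's change of 2 is what the treated group would have experienced without treatment.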
The intuition behind PSM is that if the researcher can select observable factors so that any two individuals with the same value for these factors will display homogeneous responses to the treatment, then the treatment effect can be measured without bias. In effect, one can use statistical methods to identify which two individuals are "more homogeneous lab rats" for the purposes of measuring the treatment effect. More formally, the solution advocated is to find a vector of covariates, $Z$, such that $y_1, y_0 \perp T \mid Z$ and $\mathrm{pr}(T=1 \mid Z) \in (0,1)$, where $\perp$ denotes independence.[16]

Another alternative to the DID model is the use of instrumental variables (IV), which approaches the structural econometric method in the sense that it relies on exclusion restrictions (Joshua D. Angrist, Guido W. Imbens, and Donald B. Rubin 1996; and Joshua D. Angrist and Alan B. Krueger 2001). The IV method, which essentially assumes that some components of the non-experimental data are random, is perhaps the most widely utilized approach to measuring treatment effects (Mark Rosenzweig and Kenneth Wolpin 2000). The crux of the IV approach is to find a variable that is excluded from the outcome equation, but which is related to treatment status and has no direct association with the outcome. The weakness of the IV approach is that such variables do not often exist, or that unpalatable assumptions must be maintained in order for them to be used to identify the treatment effect of interest.

A final alternative to the DID model is structural modeling. Such models often entail a heavy mix of identifying restrictions (e.g.,

[16] If one is interested in estimating the average treatment effect, only the weaker condition $E(y_0 \mid T=1, Z) = E(y_0 \mid T=0, Z) = E(y_0 \mid Z)$ is required. This assumption is called the "conditional independence assumption," and intuitively means that given $Z$, the nontreated outcomes are what the treated outcomes would have been had they not been treated. Or, likewise, that selection occurs only on observables. Note that the dimensionality of the problem, as measured by $Z$, may limit the use of matching. A more feasible alternative is to match on a function of $Z$. Rosenbaum and Rubin (1983, 1984) showed that matching on $p(Z)$ instead of $Z$ is valid. This is usually carried out on the "propensity" to get treated $p(Z)$, or the propensity score, which in turn is often implemented by a simple probit or logit model with $T$ as the dependent variable.
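The idea in footnote 16 of comparing units with the same propensity score can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the cited studies: instead of a probit or logit model, the propensity score is estimated as the treated share within each cell of a single discrete covariate, the data are made up, and all names are hypothetical. Units are then compared within strata of equal estimated score, weighting strata by their number of treated units.

```python
from collections import defaultdict
from statistics import mean

def propensity_scores(data):
    """Estimate p(Z) = pr(T=1 | Z) as the treated share within each cell of a discrete Z."""
    counts = defaultdict(lambda: [0, 0])  # z -> [n_treated, n_total]
    for z, t, _y in data:
        counts[z][0] += t
        counts[z][1] += 1
    return {z: n_t / n for z, (n_t, n) in counts.items()}

def stratified_effect(data):
    """Treated-minus-untreated mean outcome within strata of equal estimated p(Z)."""
    p = propensity_scores(data)
    strata = defaultdict(lambda: {0: [], 1: []})
    for z, t, y in data:
        if 0.0 < p[z] < 1.0:  # overlap condition: pr(T=1 | Z) must lie in (0, 1)
            strata[p[z]][t].append(y)
    n_treated = sum(len(s[1]) for s in strata.values())
    return sum(
        (mean(s[1]) - mean(s[0])) * len(s[1]) / n_treated
        for s in strata.values()
    )

# Made-up data as (Z, T, outcome): "high" units have higher outcomes AND are
# treated more often, so a raw comparison confounds Z with treatment.
data = [
    ("low", 1, 12.0), ("low", 0, 10.0), ("low", 0, 10.0), ("low", 0, 10.0),
    ("high", 1, 22.0), ("high", 1, 22.0), ("high", 1, 22.0), ("high", 0, 20.0),
]

naive = mean(y for _z, t, y in data if t == 1) - mean(y for _z, t, y in data if t == 0)
att = stratified_effect(data)
print(naive, att)  # 7.0 2.0
```

In this toy example the raw difference of 7 mixes the assumed treatment effect of 2 with the fact that treated units come disproportionately from the high-outcome cell; conditioning on the estimated score removes that component. The result depends entirely on selection occurring only on the observed covariate.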
separability), impose structure on technology and preferences (e.g., constant returns to scale or unitary income elasticities), and simplifying assumptions about equilibrium outcomes (e.g., zero-profit conditions defining equilibrium industrial structure). Perhaps the best-known class of such structural models is computable general equilibrium models, which have been extensively applied to evaluate trade policies, for example.[17] It typically relies on complex estimation strategies, but yields structural parameters that are well-suited for ex ante policy simulation, provided one undertakes systematic sensitivity analysis of those parameters.[18] In this sense, structural models have been the cornerstone of nonexperimental evaluation of tax and welfare policies (R. Blundell and Thomas MaCurdy 1999; and Blundell and M. Costas Dias 2002).

[17] For example, the evaluation of the Uruguay Round of multilateral trade liberalization by Harrison, Thomas Rutherford, and David Tarr (1997).

[18] For example, see Harrison and H.D. Vinod (1992).

4. Artefactual Field Experiments

4.1 The Nature of the Subject Pool

A common criticism of the relevance of inferences drawn from laboratory experiments is that one needs to undertake an experiment with "real" people, not students. This criticism is often deflected by experimenters with the following imperative: if you think that the experiment will generate different results with "real" people, then go ahead and run the experiment with real people. A variant of this response is to challenge the critics' assertion that students are not representative. As we will see, this variant is more subtle and constructive than the first response.

The first response, to suggest that the critic run the experiment with real people, is often adequate to get rid of unwanted referees at academic journals. In practice, however, few experimenters ever examine field behavior in a serious and large-sample way. It is relatively easy to say that the experiment could be applied to real people, but to actually do so entails some serious and often unattractive logistical problems.[19]

[19] Or one can use "real" nonhuman species: see John Kagel, Don MacDonald, and Raymond Battalio (1990) and Kagel, Battalio, and Leonard Green (1995) for dramatic demonstrations of the power of economic theory to organize data from the animal kingdom.

A more substantial response to this criticism is to consider what it is about students that is viewed, a priori, as being nonrepresentative of the target population. There are at least two issues here. The first is whether endogenous sample selection or attrition has occurred due to incomplete control over recruitment and retention, so that the observed sample is unreliable in some statistical sense (e.g., generating inconsistent estimates of treatment effects). The second is whether the observed sample can be informative on the behavior of the population, assuming away sample selection issues.

4.2 Sample Selection in the Field

Conventional lab experiments typically use students who are recruited after being told only general statements about the experiment. By and large, recruitment procedures avoid mentioning the nature of the task, or the expected earnings. Most lab experiments are also one-shot, in the sense that they do not involve repeated observations of a sample subject to attrition. Of course, neither of these features is essential. If one wanted to recruit subjects with specific interest in a task, it would be easy to do (e.g., Peter Bohm and Hans Lind 1993). And if one wanted to recruit subjects for several sessions, to generate "super-experienced" subjects[20] or to conduct pre-tests of such things as risk aversion, trust, or "other-regarding preferences,"[21] that could be built into the design as well.

[20] For example, John Kagel and Dan Levin (1986, 1999, 2002).

[21] For example, Cox (2004).

One concern with lab experiments conducted with convenience samples of students
is that students might be self-selected in some way, so that they are a sample that excludes certain individuals with characteristics that are important determinants of underlying population behavior. Although this problem is a severe one, its potential importance in practice should not be overemphasized. It is always possible to simply inspect the sample to see if certain strata of the population are not represented, at least under the tentative assumption that it is only observables that matter. In this case it would behoove the researcher to augment the initial convenience sample with a quota sample, in which the missing strata were surveyed. Thus one tends not to see many convicted mass murderers or brain surgeons in student samples, but we certainly know where to go if we feel the need to include them in our sample.

Another consideration, of increasing importance for experimenters, is the possibility of recruitment biases in our procedures. One aspect of this issue is studied by Rutström (1998). She examines the role of recruitment fees in biasing the samples of subjects that are obtained. The context for her experiment is particularly relevant here since it entails the elicitation of values for a private commodity. She finds that there are some significant biases in the strata of the population recruited as one varies the recruitment fee from zero dollars to two dollars, and then up to ten dollars. An important finding, however, is that most of those biases can be corrected simply by incorporating the relevant characteristics in a statistical model of the behavior of subjects and thereby controlling for them. In other words, it does not matter if one group of subjects in one treatment has 60 percent females and the other sample of subjects in another treatment has only 40 percent females, provided one controls for the difference in gender when pooling the data and examining the key treatment effect. This is a situation in which gender might influence the response or the effect of the treatment, but controlling for gender allows one to remove this recruitment bias from the resulting inference.

Some field experiments face a more serious problem of sample selection that depends on the nature of the task. Once the experiment has begun, it is not as easy as it is in the lab to control information flow about the nature of the task. This is obviously a matter of degree, but can lead to endogenous subject attrition from the experiment. Such attrition is actually informative about subject preferences, since the subject's exit from the experiment indicates that the subject had made a negative evaluation of it (Tomas Philipson and Larry Hedges 1998).

The classic problem of sample selection refers to possible recruitment biases, such that the observed sample is generated by a process that depends on the nature of the experiment. This problem can be serious for any experiment, since a hallmark of virtually every experiment is the use of some randomization, typically to treatment.[22] If the population from which volunteers are being recruited has diverse risk attitudes and plausibly expects the experiment to have some element of randomization, then the observed sample will tend to look less risk-averse than the population. It is easy to imagine how this could then affect behavior differentially in some treatments. James Heckman and Jeffrey Smith (1995) discuss this issue in the context of social experiments, but the concern applies equally to field and lab experiments.

[22] If not to treatment, then randomization often occurs over choices to determine payoff.

4.3 Are Students Different?

This question has been addressed in several studies, including early artefactual field experiments by Sarah Lichtenstein and Paul Slovic (1973), and Penny Burns (1985). Glenn Harrison and James Lesley (1996) (HL) approach this question with a simple statistical framework. Indeed, they do not consider the issue in terms of the relevance
of experimental methods, but rather in terms of the relevance of convenience samples for the contingent valuation method.[23] However, it is easy to see that their methods apply much more generally.

[23] The contingent valuation method refers to the use of hypothetical field surveys to value the environment, by posing a scenario that asks the subject to place a value on an environmental change contingent on a market for it existing. See Cummings and Harrison (1994) for a critical review of the role of experimental economics in this field.

The HL approach may be explained in terms of their attempt to mimic the results of a large-scale national survey conducted for the Exxon Valdez oil-spill litigation. A major national survey was undertaken in this case by Richard Carson et al. (1992) for the attorney general of the state of Alaska. This survey used then-state-of-the-art survey methods but, more importantly for present purposes, used a full probability sample of the nation. HL asked if one can obtain essentially the same results using a convenience sample of students from the University of South Carolina. Using students as a convenience sample is largely a matter of methodological bravado. One could readily obtain convenience samples in other ways, but using students provides a tough test of their approach.

They proceeded by developing a simpler survey instrument than the one used in the original study. The purpose of this is purely to facilitate completion of the survey and is not essential to the use of the method. This survey was then administered to a relatively large sample of students. An important part of the survey, as in any field survey that aims to control for subject attributes, is the collection of a range of standard socioeconomic characteristics of the individual (e.g., sex, age, income, parental income, household size, and marital status). Once these data are collated, a statistical model is developed in order to explain the key responses in the survey. In this case the key response is a simple "yes" or "no" to a single dichotomous choice valuation question. In other words, the subject was asked whether he or she would be willing to pay $X towards a public good, where $X was randomly selected to be $10, $30, $60, or $120. A subject would respond to this question with a "yes," a "no," or a "not sure." A simple statistical model is developed to explain behavior as a function of the observable socioeconomic characteristics.[24]

Assuming that a statistical model has been developed, HL then proceeded to the key stage of their method. This is to assume that the coefficient estimates from the statistical model based on the student sample apply to the population at large. If this is the case, or if this assumption is simply maintained, then the statistical model may be used to predict the behavior of the target population if one can obtain information about the socioeconomic characteristics of the target population.

The essential idea of the HL method is simple and more generally applicable than this example suggests. If students are representative in the sense of allowing the researcher to develop a "good" statistical model of the behavior under study, then one can often use publicly available information on the characteristics of the target population to predict the behavior of that population. Their fundamental point is that the "problem with students" is the lack of variability in their socio-demographic characteristics, not necessarily the unrepresentativeness of their behavioral responses conditional on their socio-demographic characteristics.
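The logic of fitting a model on a convenience sample and then applying it to the target population's characteristics can be sketched in a few lines. The example below is an assumption-laden illustration, not HL's actual specification (the text notes that the exact form of their model is not important here): the "model" is simply the share of "yes" responses within each cell of one made-up characteristic, the sample and population shares are invented, and all names are hypothetical.

```python
from collections import defaultdict

def fit_cell_rates(sample):
    """Model fitted on the convenience sample: share answering yes within each cell."""
    tally = defaultdict(lambda: [0, 0])  # cell -> [n_yes, n_total]
    for cell, said_yes in sample:
        tally[cell][0] += int(said_yes)
        tally[cell][1] += 1
    return {cell: n_yes / n for cell, (n_yes, n) in tally.items()}

def predict_population(cell_rates, population_shares):
    """Apply the sample-based rates to the target population's mix of characteristics."""
    missing = set(population_shares) - set(cell_rates)
    if missing:
        # No variability in the sample for these cells, so nothing can be predicted.
        raise ValueError(f"no sample observations for cells: {sorted(missing)}")
    return sum(share * cell_rates[cell] for cell, share in population_shares.items())

# Made-up student sample: mostly low income, unlike the (hypothetical) population.
students = (
    [("low_income", True)] * 2 + [("low_income", False)] * 6
    + [("high_income", True)] * 1 + [("high_income", False)] * 1
)
population_shares = {"low_income": 0.5, "high_income": 0.5}  # hypothetical shares

raw_student_rate = sum(said_yes for _cell, said_yes in students) / len(students)
cell_rates = fit_cell_rates(students)
predicted = predict_population(cell_rates, population_shares)
print(raw_student_rate, predicted)  # 0.3 0.375
```

The raw student response rate differs from the prediction only because the student sample has a different mix of characteristics; the conditional response rates are assumed to carry over. The `ValueError` branch reflects the point about variability: if a population cell never appears among the students, the fitted model has nothing to say about it.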
To the extent that student samples exhibit limited variability in some key characteristics, such as age, then one might be wary of the veracity of the maintained assumption involved here. However, the sample does not have to look like the population in order for the statistical model to be an adequate one

[24] The exact form of that statistical model is not important for illustrative purposes, although the development of an adequate statistical model is important to the reliability of this method.