2148 The Journal of Finance
horizon results (such as return reversals, the book-to-market effect, and the cashflow-to-price effect) can be largely subsumed within a three-factor model that they interpret as a variant of the APT or ICAPM. However, this position has been controversial, since there is little affirmative evidence that the Fama–French factors correspond to economically meaningful risks. Indeed, several recent papers demonstrate that the contrarian strategies that exploit long-horizon overreaction are not significantly riskier than average.¹⁰ There seems to be more of a consensus that the short-horizon underreaction evidence cannot be explained in terms of risk. Bernard and Thomas (1989) reject risk as an explanation for post-earnings-announcement drift, and Fama and French (1996) remark that the continuation results of Jegadeesh and Titman (1993) constitute the "main embarrassment" for their three-factor model.

¹⁰ See Lakonishok et al. (1994) and MacKinlay (1995). Daniel and Titman (1997) directly dispute the idea that the book-to-market effect can be given a risk interpretation.

II. The Model

A. Price Formation with Newswatchers Only

As mentioned above, our model features two classes of traders, newswatchers and momentum traders. We begin by describing how the model works when only the newswatchers are present. At every time $t$, the newswatchers trade claims on a risky asset. This asset pays a single liquidating dividend at some later time $T$. The ultimate value of this liquidating dividend can be written as $D_T = D_0 + \sum_{j=0}^{T} \epsilon_j$, where all the $\epsilon$'s are independently distributed, mean-zero normal random variables with variance $\sigma^2$. Throughout, we consider the limiting case where $T$ goes to infinity. This simplifies matters by allowing us to focus on steady-state trading strategies, that is, strategies that do not depend on how close we are to the terminal date.¹¹ In order to capture the idea that information moves gradually across the newswatcher population, we divide this population into $z$ equal-sized groups. We also assume that every dividend innovation $\epsilon_j$ can be decomposed into $z$ independent subinnovations, each with the same variance $\sigma^2/z$: $\epsilon_j = \epsilon_j^1 + \cdots + \epsilon_j^z$.

¹¹ A somewhat more natural way to generate an infinite-horizon formulation might be to allow the asset to pay dividends every period. The only reason we push all the dividends out into the infinite future is for notational simplicity. In particular, when we consider the strategies of short-lived momentum traders below, our approach allows us to have these strategies depend only on momentum traders' forecasts of price changes, and we can ignore their forecasts of interim dividend payments.

The timing of information release is then as follows. At time $t$, news about $\epsilon_{t+z-1}$ begins to spread. Specifically, at time $t$, newswatcher group 1 observes $\epsilon_{t+z-1}^1$, group 2 observes $\epsilon_{t+z-1}^2$, and so forth, through group $z$, which observes $\epsilon_{t+z-1}^z$. Thus at time $t$ each subinnovation of $\epsilon_{t+z-1}$ has been seen by a fraction $1/z$ of the total population. Next, at time $t+1$, the groups "rotate," so that group 1 now observes $\epsilon_{t+z-1}^2$, group 2 observes $\epsilon_{t+z-1}^3$, and so forth, through group $z$, which now observes $\epsilon_{t+z-1}^1$. Thus at time $t+1$ the information has spread further, and
each subinnovation of $\epsilon_{t+z-1}$ has been seen by a fraction $2/z$ of the total population. This rotation process continues until time $t+z-1$, at which point every one of the $z$ groups has directly observed each of the subinnovations that comprise $\epsilon_{t+z-1}$. So $\epsilon_{t+z-1}$ has become totally public by time $t+z-1$. Although this formulation may seem unnecessarily awkward, the rotation feature is useful because it implies that even as information moves slowly across the population, on average everybody is equally well-informed.¹² This symmetry makes it transparently simple to solve for prices, as is seen momentarily.

In this context, the parameter $z$ can be thought of as a proxy for the (linear) rate of information flow: higher values of $z$ imply slower information diffusion. Of course, the notion that information spreads slowly is more appropriate for some purposes than others. In particular, this construct is fine if our goal is to capture the sort of underreaction that shows up empirically as unconditional positive correlation in returns at short horizons. However, if we are also interested in capturing phenomena like post-earnings-announcement drift (where there is apparently underreaction even to data that is made available to everyone simultaneously), we need to embellish the model. We discuss this embellishment later; for now it is easiest to think of the model as only speaking to the unconditional evidence on underreaction.

All the newswatchers have constant absolute risk aversion (CARA) utility with the same risk-aversion parameter, and all live until the terminal date $T$. The riskless interest rate is normalized to zero, and the supply of the asset is fixed at $Q$. So far, all these assumptions are completely orthodox. We now make two that are less conventional. First, at every time $t$, newswatchers formulate their asset demands based on the static-optimization notion that they buy and hold until the liquidating dividend at time $T$.¹³ Second, and more critically, while newswatchers can condition on the information sets described above, they do not condition on current or past prices. In other words, our equilibrium concept is a Walrasian equilibrium with private valuations, as opposed to a fully revealing rational expectations equilibrium.

As suggested in the Introduction, these two unconventional assumptions can be motivated based on a simple form of bounded rationality. One can think of the newswatchers as having their hands full just figuring out the implications of the $\epsilon$'s for the terminal dividend $D_T$. This leaves them unable to also use current and past market prices to form more sophisticated forecasts of $D_T$ (our second assumption); it also leaves them unable to make any forecasts of future price changes, and hence unable to implement dynamic strategies (our first assumption).

¹² Contrast this with a simpler setting where group 1 always sees all of $\epsilon_{t+z-1}$ first, then group 2 sees it second, etc. In this case, group 1 newswatchers are better-informed than their peers.

¹³ There is an element of time-inconsistency here, since in fact newswatchers may adjust their positions over time. Ignoring the dynamic nature of newswatcher strategies is more significant when we add momentum traders to the model, so we discuss this issue further in Section II.B.

Underreaction, Momentum Trading, and Overreaction 2149
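The rotation scheme for a single innovation $\epsilon_{t+z-1}$ can be made concrete with a small sketch. It is only an illustration: the function names and the `lag` argument (periods elapsed since time $t$) are hypothetical conventions introduced here, not notation from the paper.

```python
def observed_subinnovation(group: int, lag: int, z: int) -> int:
    """Index (1..z) of the subinnovation that `group` newly observes at time t + lag.

    Groups are numbered 1..z and lag runs from 0 to z - 1 (assumed convention).
    """
    if not (1 <= group <= z and 0 <= lag <= z - 1):
        raise ValueError("need 1 <= group <= z and 0 <= lag <= z - 1")
    # At lag 0 group g sees piece g; each period every group moves on to the next piece.
    return (group - 1 + lag) % z + 1


def fraction_informed(lag: int, z: int) -> float:
    """Share of the population that has seen any given subinnovation by time t + lag."""
    return (lag + 1) / z


def rotation_table(z: int) -> list:
    """Rows are periods t, t+1, ..., t+z-1; columns are groups 1..z."""
    return [[observed_subinnovation(g, lag, z) for g in range(1, z + 1)]
            for lag in range(z)]
```

For $z = 3$, `rotation_table(3)` returns `[[1, 2, 3], [2, 3, 1], [3, 1, 2]]`: each column contains every index exactly once, so by time $t+z-1$ every group has seen every subinnovation, while within any single period each group holds exactly one new piece.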
Given these assumptions, and the symmetry of our setup, the conditional variance of fundamentals is the same for all newswatchers, and the price at time $t$ is given by

$$P_t = D_t + \{(z-1)\epsilon_{t+1} + (z-2)\epsilon_{t+2} + \cdots + \epsilon_{t+z-1}\}/z - \theta Q, \qquad (1)$$

where $\theta$ is a function of newswatchers' risk aversion and the variance of the $\epsilon$'s. For simplicity, we normalize the risk aversion so that $\theta = 1$ hereafter. In words, equation (1) says that the new information works its way linearly into the price over $z$ periods. This implies that there is positive serial correlation of returns over short horizons (of length less than $z$). Note also that prices never overshoot their long-run values, or, equivalently, that there is never any negative serial correlation in returns at any horizon.

Even given the eminently plausible assumption that private information diffuses gradually across the population of newswatchers, the gradual-price-adjustment result in equation (1) hinges critically on the further assumption that newswatchers do not condition on prices. For if they did, and as long as $Q$ is nonstochastic, the logic of Grossman (1976) would imply a fully revealing equilibrium, with a price $P_t^*$, following a random walk given by (for $\theta = 1$):¹⁴

$$P_t^* = D_{t+z-1} - Q. \qquad (2)$$

We should therefore stress that we view the underreaction result embodied in equation (1) to be nothing more than a point of departure. As such, it raises an obvious next question: Even if newswatchers are too busy processing fundamental data to incorporate prices into their forecasts, cannot some other group of traders focus exclusively on price-based forecasting, and in so doing generate an outcome close to the rational expectations equilibrium of equation (2)? It is to this central question that we turn next, by adding the momentum traders into the mix.

¹⁴ Strictly speaking, this result also requires that there be an initial "date 0" at which everybody is symmetrically informed.

B. Adding Momentum Traders to the Model

Momentum traders also have CARA utility. Unlike the newswatchers, however, they have finite horizons. In particular, at every time $t$, a new generation of momentum traders enters the market. Every trader in this generation takes a position, and then holds this position for $j$ periods, that is, until time $t+j$. For modeling purposes, we treat the momentum traders' horizon $j$ as an exogenous parameter.

The momentum traders transact with the newswatchers by means of market orders. They submit quantity orders, not knowing the price at which these orders will be executed. The price is then determined by the competition among the newswatchers, who double as market makers in this setup. Thus, in deciding the size of their orders, the momentum traders at time $t$ must try to predict $(P_{t+j} - P_t)$. To do so, they make forecasts based on past
price changes. We assume that these forecasts take an especially simple form: The only conditioning variable is the cumulative price change over the past $k$ periods; that is, $(P_{t-1} - P_{t-k-1})$.

As it turns out, the exact value of $k$ is not that important, so in what follows we simplify things by setting $k = 1$, and using $(P_{t-1} - P_{t-2}) \equiv \Delta P_{t-1}$ as the time-$t$ forecasting variable.¹⁵ What is more significant is that we restrict the momentum traders to making univariate forecasts based on past price changes. If, in contrast, we allow them to make forecasts using $n$ lags of price changes, giving different weights to each of the $n$ lags, we suspect that for sufficiently large $n$, many of the results we present below would go away. Again, the motivation is a crude notion of bounded rationality: Momentum traders simply do not have the computational horsepower to run complicated multivariate regressions.

With $k = 1$, the order flow from generation-$t$ momentum traders, $F_t$, is of the form

$$F_t = A + \phi \Delta P_{t-1}, \qquad (3)$$

where the constant $A$ and the elasticity parameter $\phi$ have to be determined from optimization on the part of the momentum traders. This order flow must be absorbed by the newswatchers. We assume that the newswatchers treat the order flow as an uninformative supply shock. This is consistent with our prior assumption that the newswatchers do not condition on prices. Given that the order flow is a linear function of past price changes, if we allowed the newswatchers to extract information from it, we would be indirectly allowing them to learn from prices.

To streamline things, the order flow from the newswatchers is the only source of supply variation in the model. Given that there are $j$ generations of momentum traders in the market at any point in time, the aggregate supply $S_t$ absorbed by the newswatchers is given by:

$$S_t = Q - \sum_{i=1}^{j} F_{t+1-i} = Q - jA - \sum_{i=1}^{j} \phi \Delta P_{t-i}. \qquad (4)$$
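Equations (3) and (4) amount to simple bookkeeping, sketched below. The function names and the list layout (`past_dP[i-1]` holds $\Delta P_{t-i}$) are assumptions made for this illustration; $A$ and $\phi$ are passed in as given numbers, not derived from any optimization.

```python
def order_flow(A: float, phi: float, dP_lag1: float) -> float:
    """Equation (3): F_t = A + phi * dP_{t-1} for one generation of momentum traders."""
    return A + phi * dP_lag1


def aggregate_supply(Q: float, A: float, phi: float, past_dP: list) -> float:
    """Equation (4): supply left for newswatchers once j generations hold positions.

    past_dP[i-1] is dP_{t-i} for i = 1..j, so the horizon j equals len(past_dP).
    Generation t+1-i placed its order using dP_{t-i}.
    """
    return Q - sum(order_flow(A, phi, dp) for dp in past_dP)
```

With `Q = 1.0`, `A = 0.1`, `phi = 0.5` and `past_dP = [0.2, -0.1, 0.4]`, both forms of equation (4) give the same supply, $1 - 3(0.1) - 0.5(0.5) = 0.45$.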
We continue to assume that, at any time $t$, the newswatchers act as if they buy and hold until the liquidating dividend at time $T$. This implies that prices are given exactly as in equation (1), except that the fixed supply $Q$ is replaced by the variable $S_t$, yielding

$$P_t = D_t + \{(z-1)\epsilon_{t+1} + (z-2)\epsilon_{t+2} + \cdots + \epsilon_{t+z-1}\}/z - Q + jA + \sum_{i=1}^{j} \phi \Delta P_{t-i}. \qquad (5)$$

¹⁵ In the NBER working paper version, we provide a detailed analysis of the comparative statics properties of the model with respect to $k$.
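Equation (5) can be iterated forward numerically for a given $\phi$. The sketch below is a rough illustration and not the authors' procedure. It takes $\phi$ as an arbitrary input instead of an equilibrium value, sets $\theta = 1$, and assumes that no price change is observed before the sample starts. The function names and the list-based storage of the $\epsilon$'s are hypothetical.

```python
def newswatcher_value(eps, t, z, d0=0.0):
    """D_t plus partially diffused news: the equation (1) price with Q = 0."""
    d_t = d0 + sum(eps[: t + 1])  # D_t = D_0 + eps_0 + ... + eps_t
    # News about eps_{t+k} has been incorporated with weight (z - k) / z.
    partial = sum((z - k) / z * eps[t + k] for k in range(1, z))
    return d_t + partial


def simulate_prices(eps, z, j, phi, Q=0.0, A=0.0, d0=0.0):
    """Iterate equation (5) for a given (not equilibrium) phi; theta is set to 1."""
    n = len(eps) - (z - 1)  # the last usable t needs eps[t + z - 1]
    prices, dP = [], []
    for t in range(n):
        # Demand from the j momentum generations currently in the market.
        momentum = phi * sum(dP[t - i] for i in range(1, j + 1) if t - i >= 0)
        p_t = newswatcher_value(eps, t, z, d0) - Q + j * A + momentum
        # Assumption: the first observed price change is zero.
        dP.append(p_t - prices[-1] if prices else 0.0)
        prices.append(p_t)
    return prices
```

As a usage example, feed in a single unit shock, `eps = [0, 0, 0, 1, 0, 0, 0, 0]`, with `z = 3`. With `phi = 0` the path is 0, 1/3, 2/3, 1, 1, 1, which is the linear adjustment of equation (1). With `phi = 0.5` and `j = 2` the simulated price climbs above the long-run value of 1 in this particular parameterization.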
In most of the analysis, the constants $Q$ and $A$ play no role, so we disregard them when it is convenient to do so.

As noted previously, newswatchers' behavior is time-inconsistent. Although at time $t$ they base their demands on the premise that they do not retrade, they violate this to the extent that they are active in later periods. We adopt this time-inconsistent shortcut because it dramatically simplifies the analysis. Otherwise, we face a complex dynamic programming problem, with newswatcher demands at time $t$ depending not only on their forecasts of the liquidating dividend $D_T$ but also on their predictions for the entire future path of prices.

Two points can be offered in defense of this time-inconsistent simplification. First, it fits with the basic spirit of our approach, which is to have the newswatchers behave in a simple, boundedly rational fashion. Second, we have no reason to believe that it colors any of our important qualitative conclusions. Loosely speaking, we are closing down a "frontrunning" effect, whereby newswatchers buy more aggressively at time $t$ in response to good news, since they know that the news will kick off a series of momentum trades and thereby drive prices up further over the next several periods.¹⁶ Such frontrunning by newswatchers may speed the response of prices to information, thereby mitigating underreaction, but in our setup it can never wholly eliminate either underreaction or overreaction.¹⁷

C. The Nature of Equilibrium

With all of the assumptions in place, we are now ready to solve the model. The only task is to calculate the equilibrium value of $\phi$. Disregarding constants, optimization on the part of the momentum traders implies

$$\phi \Delta P_{t-1} = \gamma E_M(P_{t+j} - P_t)/\mathrm{var}_M(P_{t+j} - P_t), \qquad (6)$$

where $\gamma$ is the aggregate risk tolerance of the momentum traders, and $E_M$ and $\mathrm{var}_M$ denote the mean and variance given their information, which is just $\Delta P_{t-1}$. We can rewrite equation (6) as

$$\phi = \gamma\, \mathrm{cov}(P_{t+j} - P_t, \Delta P_{t-1})/\{\mathrm{var}(\Delta P)\, \mathrm{var}_M(P_{t+j} - P_t)\}. \qquad (7)$$
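The right-hand side of equation (7) can be evaluated from sample moments of any price path. The sketch below makes several assumptions that are not in the text: the helper names are hypothetical, moments use the $n-1$ sample normalization, and the conditional variance $\mathrm{var}_M$ is computed with the linear-projection formula $\mathrm{var}(Y) - \mathrm{cov}(Y, X)^2/\mathrm{var}(X)$, which is exact only under joint normality. It is a numerical illustration, not the authors' solution method.

```python
def _cov(xs, ys):
    """Sample covariance with the n - 1 normalization."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def implied_phi(prices, j, gamma):
    """Right-hand side of equation (7), estimated from one price path."""
    xs, ys = [], []
    for t in range(2, len(prices) - j):
        xs.append(prices[t - 1] - prices[t - 2])  # dP_{t-1}, the forecasting variable
        ys.append(prices[t + j] - prices[t])      # P_{t+j} - P_t, the payoff
    if len(xs) < 2:
        raise ValueError("price path too short for the chosen horizon j")
    var_x, var_y, cov_xy = _cov(xs, xs), _cov(ys, ys), _cov(xs, ys)
    if var_x <= 0:
        raise ValueError("past price changes have no variation")
    # Assumed form of var_M: residual variance after projecting the payoff on dP_{t-1}.
    cond_var = var_y - cov_xy ** 2 / var_x
    if cond_var <= 0:
        raise ValueError("conditional variance is not positive")
    return gamma * cov_xy / (var_x * cond_var)
```

For the toy path `[0, 1, 1, 3, 2, 2]` with `j = 1` and `gamma = 1.0`, the sample moments are $\mathrm{var}(\Delta P) = 1$, covariance $0.5$, and conditional variance $25/12$, so the function returns $0.24$.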
The definition of equilibrium is a fixed point such that $\phi$ is given by equation (7), while at the same time price dynamics satisfy equation (5). We restrict ourselves to studying covariance-stationary equilibria. In Appendix A we prove that a necessary condition for a conjectured equilibrium process to be covariance stationary is that $|\phi| < 1$. Such an equilibrium may not exist for arbitrary parameter values, and we are also unable to generically rule out the possibility of multiple equilibria. However, we prove in the appendix that existence is guaranteed so long as the risk tolerance $\gamma$ of the

¹⁶ This sort of frontrunning effect is at the center of DeLong et al. (1990).

¹⁷ See the NBER working paper version for a fuller treatment of this frontrunning issue.