Ch. 21 Univariate Unit Root Process

1 Introduction

Consider OLS estimation of an AR(1) process,
$$Y_t = \rho Y_{t-1} + u_t,$$
where $u_t \sim \text{i.i.d.}(0, \sigma^2)$ and $Y_0 = 0$. The OLS estimator of $\rho$ is given by
$$\hat{\rho}_T = \frac{\sum_{t=1}^{T} Y_{t-1} Y_t}{\sum_{t=1}^{T} Y_{t-1}^2} = \left( \sum_{t=1}^{T} Y_{t-1}^2 \right)^{-1} \left( \sum_{t=1}^{T} Y_{t-1} Y_t \right),$$
and we also have
$$(\hat{\rho}_T - \rho) = \left( \sum_{t=1}^{T} Y_{t-1}^2 \right)^{-1} \left( \sum_{t=1}^{T} Y_{t-1} u_t \right). \tag{1}$$

When the true value of $\rho$ is less than 1 in absolute value, $Y_t$ is a covariance-stationary process (and, under suitable moment conditions, so is $Y_t^2$). Applying the LLN for a covariance-stationary process (see 9.19 of Ch. 4) we have
$$\left( \sum_{t=1}^{T} Y_{t-1}^2 \right)\Big/ T \ \xrightarrow{p}\ E\left[ \left( \sum_{t=1}^{T} Y_{t-1}^2 \right)\Big/ T \right] = \frac{T \cdot \frac{\sigma^2}{1-\rho^2}}{T} = \frac{\sigma^2}{1-\rho^2}. \tag{2}$$

Since $Y_{t-1} u_t$ is a martingale difference sequence with variance $E(Y_{t-1} u_t)^2 = \sigma^2 \frac{\sigma^2}{1-\rho^2}$ and
$$\frac{1}{T} \sum_{t=1}^{T} \sigma^2 \frac{\sigma^2}{1-\rho^2} \to \sigma^2 \frac{\sigma^2}{1-\rho^2},$$
applying the CLT for a martingale difference sequence to the second term on the right-hand side of (1) we have
$$\frac{1}{\sqrt{T}} \left( \sum_{t=1}^{T} Y_{t-1} u_t \right) \xrightarrow{L} N\!\left(0,\ \sigma^2 \frac{\sigma^2}{1-\rho^2}\right). \tag{3}$$
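As a quick illustration (not part of the original argument), the OLS formula for $\hat{\rho}_T$ can be computed directly from a simulated sample. The Python sketch below is a minimal example; the helper names (simulate_ar1, rho_hat), the seed, and the choices $\rho = 0.5$, $T = 1000$ are illustrative assumptions.

import numpy as np

def simulate_ar1(rho, T, sigma=1.0, rng=None):
    # Simulate Y_t = rho * Y_{t-1} + u_t with Y_0 = 0.
    rng = np.random.default_rng() if rng is None else rng
    u = rng.normal(0.0, sigma, size=T)
    Y = np.zeros(T + 1)                          # Y[0] is Y_0 = 0
    for t in range(1, T + 1):
        Y[t] = rho * Y[t - 1] + u[t - 1]
    return Y

def rho_hat(Y):
    # OLS estimator: (sum_t Y_{t-1}^2)^{-1} (sum_t Y_{t-1} Y_t)
    return (Y[:-1] @ Y[1:]) / (Y[:-1] @ Y[:-1])

rng = np.random.default_rng(0)
Y = simulate_ar1(rho=0.5, T=1000, rng=rng)       # rho = 0.5 and T = 1000 are illustrative
print(rho_hat(Y))                                # close to the true value 0.5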
Substituting (2) and (3) into (1) we have
$$\sqrt{T}(\hat{\rho}_T - \rho) = \left[ \left( \sum_{t=1}^{T} Y_{t-1}^2 \right)\Big/ T \right]^{-1} \cdot \sqrt{T}\left[ \left( \sum_{t=1}^{T} Y_{t-1} u_t \right)\Big/ T \right] \tag{4}$$
$$\xrightarrow{L} \left( \frac{\sigma^2}{1-\rho^2} \right)^{-1} N\!\left(0,\ \sigma^2 \frac{\sigma^2}{1-\rho^2}\right) \tag{5}$$
$$\equiv N(0,\ 1-\rho^2). \tag{6}$$

Result (6) is not valid for the case when $\rho = 1$, however. To see this, recall that the variance of $Y_t$ when $\rho = 1$ is $t\sigma^2$; the LLN as in (2) is then no longer valid, since applying the same argument would give
$$\left( \sum_{t=1}^{T} Y_{t-1}^2 \right)\Big/ T \ \xrightarrow{p}\ E\left[ \left( \sum_{t=1}^{T} Y_{t-1}^2 \right)\Big/ T \right] = \frac{\sigma^2 \sum_{t=1}^{T} t}{T} \to \infty. \tag{7}$$
A similar argument shows that the CLT does not apply to $\frac{1}{\sqrt{T}} \left( \sum_{t=1}^{T} Y_{t-1} u_t \right)$. (Instead, $T^{-1} \left( \sum_{t=1}^{T} Y_{t-1} u_t \right)$ converges.) To obtain the limiting distribution of $(\hat{\rho}_T - \rho)$ in the unit root case, it turns out, as we shall prove in the following, that we have to multiply $(\hat{\rho}_T - \rho)$ by $T$ rather than by $\sqrt{T}$:
$$T(\hat{\rho}_T - \rho) = \left[ \left( \sum_{t=1}^{T} Y_{t-1}^2 \right)\Big/ T^2 \right]^{-1} \left[ T^{-1} \left( \sum_{t=1}^{T} Y_{t-1} u_t \right) \right]. \tag{8}$$
Thus, the unit root coefficient converges at a faster rate ($T$) than a coefficient in a stationary regression (which converges at rate $\sqrt{T}$).
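The two convergence rates can be seen in a small Monte Carlo experiment. The sketch below is illustrative only; the helper rho_hat, the seed, the sample sizes, and the number of replications are arbitrary choices, not taken from the text. For $\rho = 0.5$ the spread of $\sqrt{T}(\hat{\rho}_T - \rho)$ settles near $\sqrt{1-\rho^2} \approx 0.87$, while for $\rho = 1$ it is $T(\hat{\rho}_T - 1)$ whose spread settles down.

import numpy as np

def rho_hat(rho, T, rng):
    # Simulate Y_t = rho * Y_{t-1} + u_t (Y_0 = 0) and return the OLS estimate.
    Y = np.zeros(T + 1)
    u = rng.normal(size=T)
    for t in range(1, T + 1):
        Y[t] = rho * Y[t - 1] + u[t - 1]
    return (Y[:-1] @ Y[1:]) / (Y[:-1] @ Y[:-1])

rng = np.random.default_rng(1)
reps = 2000                                      # illustrative number of replications
for T in (100, 400, 1600):
    stat = [np.sqrt(T) * (rho_hat(0.5, T, rng) - 0.5) for _ in range(reps)]
    unit = [T * (rho_hat(1.0, T, rng) - 1.0) for _ in range(reps)]
    # the first spread settles near sqrt(1 - 0.5^2) ~ 0.87; the second also settles,
    # but only because the unit root estimate was scaled by T rather than sqrt(T)
    print(T, np.std(stat), np.std(unit))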
2 Unit Root Asymptotic Theories

In this section, we develop tools to handle the asymptotics of unit root processes.

2.1 Random Walks and Wiener Process

Consider a random walk,
$$Y_t = Y_{t-1} + \varepsilon_t,$$
where $Y_0 = 0$ and $\varepsilon_t$ is i.i.d. with mean zero and $Var(\varepsilon_t) = \sigma^2 < \infty$. By repeated substitution we have
$$Y_t = Y_{t-1} + \varepsilon_t = Y_{t-2} + \varepsilon_{t-1} + \varepsilon_t = Y_0 + \sum_{s=1}^{t} \varepsilon_s = \sum_{s=1}^{t} \varepsilon_s.$$

Before we can study the behavior of estimators based on random walks, we must understand in more detail the behavior of the random walk process itself. Thus, considering the random walk $\{Y_t\}$, we can write
$$Y_T = \sum_{t=1}^{T} \varepsilon_t.$$
Rescaling, we have
$$T^{-1/2} Y_T/\sigma = T^{-1/2} \sum_{t=1}^{T} \varepsilon_t/\sigma.$$
(It is important to note that here $\sigma^2$ should be read as $Var\!\left(T^{-1/2} \sum_{t=1}^{T} \varepsilon_t\right) = E\!\left[T^{-1}\left(\sum \varepsilon_t\right)^2\right] = \frac{T \cdot \sigma^2}{T} = \sigma^2$.) According to the Lindeberg-Lévy CLT, we have
$$T^{-1/2} Y_T/\sigma \xrightarrow{L} N(0, 1).$$
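A brief simulation (an illustrative sketch; the values of $T$, the number of replications, and the seed are arbitrary) shows the rescaled endpoint $T^{-1/2} Y_T/\sigma$ behaving like a standard normal draw across replications, as the Lindeberg-Lévy CLT predicts.

import numpy as np

rng = np.random.default_rng(2)
T, reps, sigma = 1000, 5000, 1.0           # illustrative values
# Each replication: Y_T = sum of T i.i.d. shocks, rescaled by T^{-1/2} and sigma.
eps = rng.normal(0.0, sigma, size=(reps, T))
Z = eps.sum(axis=1) / (np.sqrt(T) * sigma)
print(Z.mean(), Z.var())                   # roughly 0 and 1, as the CLT predicts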
More generally, we can construct a variable $Y_T(r)$ from the partial sums of $\varepsilon_t$,
$$Y_T(r) = \sum_{t=1}^{[Tr]^*} \varepsilon_t,$$
where $0 \le r \le 1$ and $[Tr]^*$ denotes the largest integer that is less than or equal to $Tr$. Applying the same rescaling, we define
$$W_T(r) \equiv T^{-1/2} Y_T(r)/\sigma \tag{9}$$
$$= T^{-1/2} \sum_{t=1}^{[Tr]^*} \varepsilon_t/\sigma. \tag{10}$$
Now
$$W_T(r) = T^{-1/2} ([Tr]^*)^{1/2} \left\{ ([Tr]^*)^{-1/2} \sum_{t=1}^{[Tr]^*} \varepsilon_t/\sigma \right\},$$
and for a given $r$, the term in braces $\{\cdot\}$ again obeys the CLT and converges in distribution to $N(0, 1)$, whereas $T^{-1/2}([Tr]^*)^{1/2}$ converges to $r^{1/2}$. It follows from standard arguments that $W_T(r)$ converges in distribution to $N(0, r)$.

We have written $W_T(r)$ so that it is clear that $W_T$ can be considered to be a function of $r$. Also, because $W_T(r)$ depends on the $\varepsilon_t$'s, it is random. Therefore, we can think of $W_T(r)$ as defining a random function of $r$, which we write $W_T(\cdot)$. Just as the CLT provides conditions ensuring that the rescaled random walk $T^{-1/2} Y_T/\sigma$ (which we can now write as $W_T(1)$) converges, as $T$ becomes large, to a well-defined limiting random variable (the standard normal), the functional central limit theorem (FCLT) provides conditions ensuring that the random function $W_T(\cdot)$ converges, as $T$ becomes large, to a well-defined limiting random function, say $W(\cdot)$. The word "Functional" in Functional Central Limit Theorem appears because this limit is a function of $r$.
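The construction of $W_T(\cdot)$ can be written out directly from one draw of the shocks. The sketch below is illustrative (the helper name W_T, the grid of $r$ values, $T$, and the seed are arbitrary choices); it evaluates $W_T(r) = T^{-1/2} \sum_{t=1}^{[Tr]^*} \varepsilon_t/\sigma$ on a grid, giving one realization of the random step function whose limit, by the FCLT, is the Wiener process $W(\cdot)$.

import numpy as np

def W_T(eps, r, sigma=1.0):
    # W_T(r) = T^{-1/2} * sum_{t=1}^{[Tr]*} eps_t / sigma
    T = len(eps)
    k = int(np.floor(T * r))               # [Tr]*: largest integer <= T*r
    return eps[:k].sum() / (np.sqrt(T) * sigma)

rng = np.random.default_rng(3)
T = 1000                                   # illustrative sample size
eps = rng.normal(size=T)
grid = np.linspace(0.0, 1.0, 11)
path = [W_T(eps, r) for r in grid]         # one realization of the random function W_T(.)
print(np.round(path, 3))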
Some further properties of the random walk, suitably rescaled, are given in the following.

Proposition:
If $Y_t$ is a random walk, then $Y_{t_4} - Y_{t_3}$ is independent of $Y_{t_2} - Y_{t_1}$ for all $t_1 < t_2 < t_3 < t_4$. Consequently, $W_T(r_4) - W_T(r_3)$ is independent of $W_T(r_2) - W_T(r_1)$ for all $r_i$ such that $[T \cdot r_i]^* = t_i$, $i = 1, \ldots, 4$.

Proof:
Note that
$$Y_{t_4} - Y_{t_3} = \varepsilon_{t_4} + \varepsilon_{t_4-1} + \cdots + \varepsilon_{t_3+1},$$
$$Y_{t_2} - Y_{t_1} = \varepsilon_{t_2} + \varepsilon_{t_2-1} + \cdots + \varepsilon_{t_1+1}.$$
Since $(\varepsilon_{t_2}, \varepsilon_{t_2-1}, \ldots, \varepsilon_{t_1+1})$ is independent of $(\varepsilon_{t_4}, \varepsilon_{t_4-1}, \ldots, \varepsilon_{t_3+1})$, it follows that $Y_{t_4} - Y_{t_3}$ and $Y_{t_2} - Y_{t_1}$ are independent. Consequently,
$$W_T(r_4) - W_T(r_3) = T^{-1/2}(\varepsilon_{t_4} + \varepsilon_{t_4-1} + \cdots + \varepsilon_{t_3+1})/\sigma$$
is independent of
$$W_T(r_2) - W_T(r_1) = T^{-1/2}(\varepsilon_{t_2} + \varepsilon_{t_2-1} + \cdots + \varepsilon_{t_1+1})/\sigma.$$

Proposition:
For given $0 \le a < b \le 1$, $W_T(b) - W_T(a) \xrightarrow{L} N(0, b-a)$ as $T \to \infty$.

Proof:
By definition,
$$W_T(b) - W_T(a) = T^{-1/2} \sum_{t=[Ta]^*+1}^{[Tb]^*} \varepsilon_t/\sigma = T^{-1/2}([Tb]^* - [Ta]^*)^{1/2} \times ([Tb]^* - [Ta]^*)^{-1/2} \sum_{t=[Ta]^*+1}^{[Tb]^*} \varepsilon_t/\sigma.$$
The last term $([Tb]^* - [Ta]^*)^{-1/2} \sum_{t=[Ta]^*+1}^{[Tb]^*} \varepsilon_t/\sigma \xrightarrow{L} N(0, 1)$ by the CLT, and $T^{-1/2}([Tb]^* - [Ta]^*)^{1/2} = \left(([Tb]^* - [Ta]^*)/T\right)^{1/2} \to (b-a)^{1/2}$ as $T \to \infty$. Hence $W_T(b) - W_T(a) \xrightarrow{L} N(0, b-a)$.
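Both propositions can be checked numerically. In the sketch below (an illustration; the endpoints $a$, $b$, the second increment, $T$, the replication count, and the seed are arbitrary), across replications the increment $W_T(b) - W_T(a)$ has variance close to $b - a$, and a non-overlapping increment is nearly uncorrelated with it.

import numpy as np

rng = np.random.default_rng(4)
T, reps = 1000, 5000                       # illustrative values
a, b = 0.2, 0.7                            # endpoints of one increment (illustrative)
eps = rng.normal(size=(reps, T))
S = eps.cumsum(axis=1) / np.sqrt(T)        # S[:, k-1] = W_T(k/T) with sigma = 1

def W(r):
    k = int(np.floor(T * r))
    return S[:, k - 1] if k > 0 else np.zeros(reps)

inc1 = W(b) - W(a)                         # W_T(b) - W_T(a)
inc2 = W(0.15) - W(0.05)                   # a non-overlapping increment
print(inc1.var())                          # close to b - a = 0.5
print(np.corrcoef(inc1, inc2)[0, 1])       # close to 0: independent increments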