Section 8.2. Asymptotic normality

We assume that Xn = (X1, ..., Xn), where the Xi's are i.i.d. with common density p(x; θ0) ∈ P = {p(x; θ) : θ ∈ Θ}. We assume that θ0 is identified in the sense that if θ ≠ θ0 and θ ∈ Θ, then p(x; θ) ≠ p(x; θ0) with respect to the dominating measure µ. In order to prove asymptotic normality, we will need certain regularity conditions. Some of these were encountered in the proof of consistency, but we will need some additional assumptions.
Regularity Conditions

i. θ0 lies in the interior of Θ, which is assumed to be a compact subset of R^k.
ii. log p(x; θ) is continuous at each θ ∈ Θ for all x ∈ X (a.e. will suffice).
iii. |log p(x; θ)| ≤ d(x) for all θ ∈ Θ and E_θ0[d(X)] < ∞.
iv. p(x; θ) is twice continuously differentiable and p(x; θ) > 0 in a neighborhood, N, of θ0.
v. ‖∂p(x; θ)/∂θ‖ ≤ e(x) for all θ ∈ N and ∫ e(x) dµ(x) < ∞.
vi. Defining the score vector ψ(x; θ) = (∂ log p(x; θ)/∂θ1, ..., ∂ log p(x; θ)/∂θk)ᵀ, we assume that I(θ0) = E_θ0[ψ(X; θ0) ψ(X; θ0)ᵀ] exists and is non-singular.
vii. ‖∂² log p(x; θ)/∂θ∂θᵀ‖ ≤ f(x) for all θ ∈ N and E_θ0[f(X)] < ∞.
viii. ‖∂² p(x; θ)/∂θ∂θᵀ‖ ≤ g(x) for all θ ∈ N and ∫ g(x) dµ(x) < ∞.

Theorem 8.6: If these 8 regularity conditions hold, then √n(θ̂(Xn) − θ0) →_D(θ0) N(0, I⁻¹(θ0)).
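As a quick numerical illustration (not part of the proof), the theorem can be checked by simulation. The sketch below uses an illustrative Bernoulli(θ0) model, where the MLE is the sample mean and the Fisher information is I(θ0) = 1/(θ0(1 − θ0)); it compares the empirical standard deviation of √n(θ̂(Xn) − θ0) across many replications with the theoretical value √(I⁻¹(θ0)):

```python
import numpy as np

rng = np.random.default_rng(0)
theta0 = 0.3          # true Bernoulli parameter (illustrative choice)
n, reps = 2000, 5000  # sample size per replication, number of replications

# For Bernoulli(theta) the MLE is the sample mean; simulate its sampling distribution
theta_hat = rng.binomial(n, theta0, size=reps) / n
z = np.sqrt(n) * (theta_hat - theta0)

# Fisher information for Bernoulli: I(theta0) = 1 / (theta0 * (1 - theta0))
fisher = 1.0 / (theta0 * (1 - theta0))
print(z.std())                # empirical sd of sqrt(n)(theta_hat - theta0)
print(np.sqrt(1.0 / fisher))  # theoretical sd, sqrt(I(theta0)^{-1})
```

With these settings the two printed values agree to roughly two decimal places, consistent with the theorem.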
Proof: Note that conditions i.-iii. guarantee that the MLE is consistent. Since θ0 is assumed to lie in the interior of Θ, we know that with sufficiently large probability the MLE will lie in N and cannot be on the boundary. This implies that the maximum is also a local maximum, which implies that ∂Q(θ̂(Xn); Xn)/∂θ = 0, or (1/n) Σ_{i=1}^n ψ(Xi; θ̂(Xn)) = 0. That is, the MLE is the solution to the score equations.

By the mean value theorem, applied to each element of the score vector, we have that

0 = (1/√n) Σ_{i=1}^n ψ(Xi; θ̂(Xn)) = (1/√n) Σ_{i=1}^n ψ(Xi; θ0) + {−J*_n(Xn)} √n(θ̂(Xn) − θ0).

Note that J*_n(Xn) is a k × k random matrix where the jth row of the matrix is the jth row of Jn evaluated at θ*_jn(Xn), where θ*_jn(Xn) is an intermediate value between θ̂(Xn) and θ0. θ*_jn(Xn) may be different from row to row, but it will be consistent for θ0.
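To make the score equations concrete, here is a small sketch (an illustrative example of my choosing, not one from the notes) using the Exp(θ) density p(x; θ) = θ e^{−θx}. Its score is ψ(x; θ) = 1/θ − x, and the MLE has the closed form θ̂ = 1/x̄; the sum of scores vanishes at θ̂, exactly as the score equations require:

```python
import numpy as np

rng = np.random.default_rng(1)
theta0 = 2.0
x = rng.exponential(scale=1.0 / theta0, size=1000)  # NumPy parameterizes by scale = 1/theta

# Score of the Exp(theta) density theta * exp(-theta * x): psi(x; theta) = 1/theta - x
def total_score(theta, x):
    return np.sum(1.0 / theta - x)

theta_hat = 1.0 / x.mean()        # closed-form MLE; solves sum_i psi(X_i; theta) = 0
print(total_score(theta_hat, x))  # ~0 up to floating-point error
```

For models without a closed-form MLE, one would instead solve the score equations numerically, but the defining property is the same.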
We will establish two facts:

F1: (1/√n) Σ_{i=1}^n ψ(Xi; θ0) →_D(θ0) N(0, I(θ0))
F2: J*_n(Xn) →_P(θ0) I(θ0)

By assumption vi., we know that I(θ0) is non-singular. The inversion of a matrix is a continuous function on the set of non-singular matrices. Since J*_n(Xn) →_P I(θ0), we know that {J*_n(Xn)}⁻¹ →_P I(θ0)⁻¹. This also means that, with sufficiently large probability as n gets large, J*_n(Xn) is invertible. Therefore, we know that

√n(θ̂(Xn) − θ0) = {J*_n(Xn)}⁻¹ (1/√n) Σ_{i=1}^n ψ(Xi; θ0).
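This identity says that √n(θ̂(Xn) − θ0) is, up to the consistency of J*_n, the inverse information times the normalized score at θ0. The sketch below sanity-checks that numerically, again using the illustrative Exp(θ) model (where ψ(x; θ) = 1/θ − x and I(θ0) = 1/θ0², so I(θ0)⁻¹ = θ0²); for large n the exact quantity and the linearized one nearly coincide:

```python
import numpy as np

rng = np.random.default_rng(3)
theta0, n = 2.0, 100_000
x = rng.exponential(scale=1.0 / theta0, size=n)

# For Exp(theta): psi(x; theta) = 1/theta - x and I(theta0) = 1/theta0**2
lhs = np.sqrt(n) * (1.0 / x.mean() - theta0)      # sqrt(n)(theta_hat - theta0) exactly
score_n = np.sqrt(n) * np.mean(1.0 / theta0 - x)  # (1/sqrt(n)) sum_i psi(X_i; theta0)
rhs = theta0**2 * score_n                         # I(theta0)^{-1} times the score term
print(lhs, rhs)                                   # nearly equal for large n
```

The small gap between the two printed numbers is exactly the error absorbed by replacing J*_n(Xn) with I(θ0), which fact F2 shows is asymptotically negligible.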