4 Semiparametric regression

We next consider the empirical likelihood method in the context of semiparametric regression.

4.1 Partial linear regression model

Let us first consider the partial linear model, defined as follows:

\[
Y_i = \beta^T X_i + g(Z_i) + \varepsilon_i, \quad \text{for } i = 1, \ldots, n, \tag{31}
\]

where the response $Y_i$ and the explanatory variable $Z_i$ are one-dimensional, $\beta$ and $X_i$ are $p$-dimensional ($p \ge 1$), and $g(\cdot)$ is a continuous but unknown nuisance function. It is assumed that the error $\varepsilon_i$ satisfies $E(\varepsilon_i \mid X_i, Z_i) = 0$ and $\mathrm{Var}(\varepsilon_i \mid X_i, Z_i) = \sigma^2$. Our goal is to construct confidence regions or test hypotheses concerning the vector $\beta_0$ of true regression coefficients. For this, we first need to estimate the unknown function $g$. Define for fixed $\beta$,

\[
\hat{g}_\beta(z) = \sum_{i=1}^n \frac{K_h(z - Z_i)}{\sum_{j=1}^n K_h(z - Z_j)} \, (Y_i - \beta^T X_i),
\]

where $K_h(u) = K(u/h)/h$, $h = h_n$ is a smoothing parameter and $K$ is a (one-dimensional) kernel function (probability density function). Instead of the above local constant estimator, we could also use e.g. local polynomial estimators.

The idea is now to mimic the empirical likelihood method developed for parametric regression, but considering $Y - \hat{g}_\beta(Z)$ as the new (artificial) response. This leads to the following likelihood ratio function:

\[
R_n(\beta) = \max \prod_{i=1}^n (n p_i),
\]

where the maximum is taken over all $n$-tuples $(p_1, \ldots, p_n)$ that satisfy

\[
p_i \ge 0 \ (i = 1, \ldots, n), \qquad \sum_{i=1}^n p_i = 1, \qquad
\sum_{i=1}^n p_i \Big\{ X_i + \frac{\partial \hat{g}_\beta(Z_i)}{\partial \beta} \Big\} \big( Y_i - \beta^T X_i - \hat{g}_\beta(Z_i) \big) = 0.
\]

Note that the latter constraint is equivalent to

\[
\sum_{i=1}^n p_i \widetilde{X}_i \big( \widetilde{Y}_i - \beta^T \widetilde{X}_i \big) = 0,
\]

where

\[
\widetilde{X}_i = X_i - \sum_{j=1}^n \frac{K_h(Z_i - Z_j)}{\sum_{k=1}^n K_h(Z_i - Z_k)} \, X_j
\quad \text{and} \quad
\widetilde{Y}_i = Y_i - \sum_{j=1}^n \frac{K_h(Z_i - Z_j)}{\sum_{k=1}^n K_h(Z_i - Z_k)} \, Y_j
\]

are estimators of $X_i - E(X \mid Z = Z_i)$ and $Y_i - E(Y \mid Z = Z_i)$ respectively. Wang and Jing (2003) showed that under certain regularity conditions, the following result holds true:

\[
r_n(\beta_0) = -2 \log R_n(\beta_0) \xrightarrow{d} \chi^2_p.
\]
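To make the construction concrete, the following Python sketch evaluates $r_n(\beta) = -2 \log R_n(\beta)$ from the centred quantities $\widetilde{X}_i$ and $\widetilde{Y}_i$. It is only an illustration under assumptions that are not part of the text: a Gaussian kernel, a fixed bandwidth `h`, simulated data, and a damped Newton iteration on the Lagrange multiplier $\lambda$ of the usual dual form $p_i = 1/\{n(1 + \lambda^T U_i)\}$ with $U_i = \widetilde{X}_i(\widetilde{Y}_i - \beta^T \widetilde{X}_i)$. The function names `nw_weights`, `el_log_ratio` and `partial_linear_el` are hypothetical.

```python
import numpy as np


def nw_weights(Z, h):
    """n x n matrix with entry (i, j) = K_h(Z_i - Z_j) / sum_k K_h(Z_i - Z_k).

    Assumption: K is the standard normal density (the text leaves K generic).
    """
    u = (Z[:, None] - Z[None, :]) / h
    K = np.exp(-0.5 * u ** 2) / (np.sqrt(2.0 * np.pi) * h)
    return K / K.sum(axis=1, keepdims=True)


def el_log_ratio(U, n_iter=100, tol=1e-10):
    """Return -2 log R_n for the constraint sum_i p_i U_i = 0 (U is n x p).

    Uses the dual problem: maximise sum_i log(1 + lam' U_i) over lam,
    with a step-halving Newton iteration (an illustrative solver choice).
    """
    def objective(lam):
        d = 1.0 + U @ lam
        return -np.inf if np.any(d <= 0.0) else np.sum(np.log(d))

    lam = np.zeros(U.shape[1])
    for _ in range(n_iter):
        d = 1.0 + U @ lam
        grad = (U / d[:, None]).sum(axis=0)
        if np.linalg.norm(grad) < tol:
            break
        hess = -(U / d[:, None] ** 2).T @ U      # negative definite
        step = np.linalg.solve(hess, -grad)      # Newton ascent direction
        t = 1.0
        # Halve the step until the weights stay positive and the objective
        # does not decrease.
        while objective(lam + t * step) < objective(lam) and t > 1e-12:
            t *= 0.5
        lam = lam + t * step
    return 2.0 * objective(lam)


def partial_linear_el(beta, X, Y, Z, h):
    """r_n(beta) for the partial linear model, local constant smoothing."""
    W = nw_weights(Z, h)
    X_tilde = X - W @ X                  # estimates X_i - E(X | Z = Z_i)
    Y_tilde = Y - W @ Y                  # estimates Y_i - E(Y | Z = Z_i)
    resid = Y_tilde - X_tilde @ beta
    return el_log_ratio(X_tilde * resid[:, None])


# Simulated example (all settings below are illustrative assumptions).
rng = np.random.default_rng(0)
n, beta0 = 200, np.array([1.0, -0.5])
X = rng.normal(size=(n, 2))
Z = rng.uniform(size=n)
Y = X @ beta0 + np.sin(2.0 * np.pi * Z) + 0.5 * rng.normal(size=n)

r_true = partial_linear_el(beta0, X, Y, Z, h=0.1)
r_off = partial_linear_el(beta0 + 1.0, X, Y, Z, h=0.1)
print(r_true, r_off)
```

In this sketch a confidence region for $\beta_0$ would collect all $\beta$ for which `partial_linear_el` stays below the relevant $\chi^2_p$ quantile. The bandwidth is fixed by hand here, whereas the regularity conditions of the limit result restrict how $h_n$ may depend on $n$.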
This result shows that the estimation of the unknown function $g$ has no influence on the asymptotic limit, as we get exactly the same limit as in the parametric case, i.e. as in the case where the function $g$ is known. This result is important, as it shows that we can obtain empirical likelihood confidence regions for $\beta_0$ without estimating any variance.

When the interest lies in testing the validity of the whole partial linear model by means of an EL approach (instead of testing only the value of the parameter vector $\beta_0$), one can use the methods developed by Chen and Van Keilegom (2009) and Van Keilegom, Sánchez Sellero and González Manteiga (2008). In the former paper the authors developed a general smoothing-based EL approach to test the validity of any semiparametric model, whereas the latter paper considers the same testing problem using an EL approach based on marked empirical processes, which is quite different in nature from the former approach. See also Section 7 for more details.

4.2 Single-index regression model

Let us now consider the case of single-index models. Suppose that the relation between the (one-dimensional) response $Y_i$ and the $p$-dimensional vector $X_i$ of explanatory variables is given by

\[
Y_i = g(\beta^T X_i) + \varepsilon_i, \tag{32}
\]

where $g$ is an unknown but smooth nuisance function, and the error $\varepsilon_i$ satisfies $E(\varepsilon_i \mid X_i) = 0$ and $\mathrm{Var}(\varepsilon_i \mid X_i) = \sigma^2$. Let $\beta_0$ be the true parameter vector. In order to identify the model, we suppose that $\|\beta\| = 1$, where $\|\cdot\|$ denotes the Euclidean norm. For any $\beta = (\beta_1, \ldots, \beta_p)^T$ satisfying $\|\beta\| = 1$ and any $1 \le r \le p$, let

\[
\beta^{(r)} = (\beta_1, \ldots, \beta_{r-1}, \beta_{r+1}, \ldots, \beta_p)^T,
\]

and supposing that $\beta_r > 0$, we can write

\[
\beta = \big( \beta_1, \ldots, \beta_{r-1}, (1 - \|\beta^{(r)}\|^2)^{1/2}, \beta_{r+1}, \ldots, \beta_p \big)^T.
\]

Now, let $J_{\beta^{(r)}}$ be the $p \times (p-1)$ Jacobian matrix, given by

\[
J_{\beta^{(r)}} = \frac{\partial \beta}{\partial \beta^{(r)}} = (\gamma_1, \ldots, \gamma_p)^T,
\]

with $\gamma_s$ ($s \ne r$) a unit vector with $s$th component equal to one, and $\gamma_r = -(1 - \|\beta^{(r)}\|^2)^{-1/2} \beta^{(r)}$.
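As a small illustration of this reparametrisation, the Python sketch below rebuilds $\beta$ from $\beta^{(r)}$, forms $J_{\beta^{(r)}}$ row by row, and compares it with a central finite-difference approximation. The function names and the example values are hypothetical. The sketch assumes the natural indexing in which, for $s > r$, the unit vector $\gamma_s$ has its one in position $s - 1$ of the $(p-1)$-dimensional vector.

```python
import numpy as np


def full_beta(beta_r, r):
    """Rebuild the unit vector beta from beta^(r); r is 1-based as in the text.

    The r-th component is taken to be the positive root (1 - ||beta^(r)||^2)^(1/2).
    """
    beta_r = np.asarray(beta_r, dtype=float)
    return np.insert(beta_r, r - 1, np.sqrt(1.0 - beta_r @ beta_r))


def jacobian(beta_r, r):
    """p x (p-1) matrix d beta / d beta^(r) with rows gamma_1, ..., gamma_p."""
    beta_r = np.asarray(beta_r, dtype=float)
    gamma_r = -beta_r / np.sqrt(1.0 - beta_r @ beta_r)
    # Rows s != r are unit vectors; row r is gamma_r.
    return np.insert(np.eye(beta_r.size), r - 1, gamma_r, axis=0)


def numerical_jacobian(beta_r, r, eps=1e-6):
    """Central finite-difference approximation, used only as a check."""
    beta_r = np.asarray(beta_r, dtype=float)
    cols = []
    for k in range(beta_r.size):
        e = np.zeros_like(beta_r)
        e[k] = eps
        cols.append((full_beta(beta_r + e, r) - full_beta(beta_r - e, r)) / (2.0 * eps))
    return np.column_stack(cols)


# Illustrative values: p = 3, r = 2.
beta_r = np.array([0.3, -0.4])
beta = full_beta(beta_r, r=2)
J = jacobian(beta_r, r=2)
J_fd = numerical_jacobian(beta_r, r=2)
print(beta)
print(J)
```

Since $\|\beta\| = 1$ for every admissible $\beta^{(r)}$, differentiating this identity gives $\beta^T J_{\beta^{(r)}} = 0$, so the columns of the Jacobian are tangent to the unit sphere at $\beta$. This can be checked numerically with `beta @ J`.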
Now, it can be easily seen that $E[\xi_i(\beta_0^{(r)})] = 0$ ($i = 1, \ldots, n$), where

\[
\xi_i(\beta^{(r)}) = \big[ Y_i - g(\beta^T X_i) \big] \, g'(\beta^T X_i) \, J_{\beta^{(r)}}^T X_i.
\]

Hence, it seems natural to use the $\xi_i(\beta^{(r)})$'s as building blocks of the empirical likelihood ratio. However, since they depend on the unknown functions $g$ and $g'$, we first replace them by suitable estimators. Let

\[
\hat{g}(t; \beta) = \sum_{i=1}^n \frac{W_{ni}(t; \beta, h) \, Y_i}{\sum_{j=1}^n W_{nj}(t; \beta, h)},
\qquad
\hat{g}'(t; \beta) = \sum_{i=1}^n \frac{\widetilde{W}_{ni}(t; \beta, h) \, Y_i}{\sum_{j=1}^n W_{nj}(t; \beta, h)},
\]

be local linear estimators of $g(t)$ and $g'(t)$, where

\[
W_{ni}(t; \beta, h) = K_h(\beta^T X_i - t) \big[ S_{n2}(t; \beta, h) - (\beta^T X_i - t) S_{n1}(t; \beta, h) \big],
\]
\[
\widetilde{W}_{ni}(t; \beta, h) = K_h(\beta^T X_i - t) \big[ (\beta^T X_i - t) S_{n0}(t; \beta, h) - S_{n1}(t; \beta, h) \big]
\]