but
\[
\operatorname{trace} M_X = \operatorname{trace}(I_T) - \operatorname{trace}\!\big(X(X'X)^{-1}X'\big) = \operatorname{trace}(I_T) - \operatorname{trace}\!\big((X'X)^{-1}X'X\big) = T - k.
\]

Corollary: An unbiased estimator of $\sigma^2$ is
\[
s^2 = \frac{e'e}{T - k}.
\]

Exercise: Reproduce the estimation results in Table 4.2, p. 52, for $\hat{\beta}$, $s^2(X'X)^{-1}$, and $e'e$.
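The data behind Table 4.2 are not reproduced in these notes, so the following Python sketch uses simulated data instead; the data-generating process and all variable names are illustrative assumptions. It simply evaluates the matrix formulas above to obtain $\hat{\beta}$, $s^2(X'X)^{-1}$, and $e'e$.

```python
import numpy as np

# Minimal sketch: compute beta_hat, s^2 (X'X)^{-1} and e'e from the formulas
# above. Simulated data stand in for the Table 4.2 data set (an assumption).
rng = np.random.default_rng(0)
T, k = 50, 3                                  # sample size, number of regressors
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.5, -0.2]) + rng.normal(scale=0.3, size=T)

XtX_inv = np.linalg.inv(X.T @ X)
b_hat = XtX_inv @ X.T @ y                     # OLS estimator beta_hat
e = y - X @ b_hat                             # residual vector
ete = e @ e                                   # e'e
s2 = ete / (T - k)                            # unbiased estimator of sigma^2
cov_b = s2 * XtX_inv                          # s^2 (X'X)^{-1}
print(b_hat, s2, ete)
```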
2.3 Partitioned Regression Estimation

It is common to specify a multiple regression model when, in fact, interest centers on only one or a subset of the full set of variables. Let $k_1 + k_2 = k$; we can write the OLS result as
\[
y = X\hat{\beta} + e = \begin{bmatrix} X_1 & X_2 \end{bmatrix} \begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix} + e = X_1\hat{\beta}_1 + X_2\hat{\beta}_2 + e,
\]
where $X_1$ and $X_2$ are $T \times k_1$ and $T \times k_2$, respectively, and $\hat{\beta}_1$ and $\hat{\beta}_2$ are $k_1 \times 1$ and $k_2 \times 1$, respectively.

What is the algebraic solution for $\hat{\beta}_2$? Denote $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$; then
\[
M_1 y = M_1 X_1 \hat{\beta}_1 + M_1 X_2 \hat{\beta}_2 + M_1 e = M_1 X_2 \hat{\beta}_2 + e,
\]
using the facts that $M_1 X_1 = 0$ and $M_1 e = e$. Premultiplying this equation by $X_2'$ and using the fact that
\[
X'e = \begin{bmatrix} X_1' \\ X_2' \end{bmatrix} e = \begin{bmatrix} X_1'e \\ X_2'e \end{bmatrix} = 0,
\]
we have
\[
X_2' M_1 y = X_2' M_1 X_2 \hat{\beta}_2 + X_2' e = X_2' M_1 X_2 \hat{\beta}_2 .
\]
Therefore $\hat{\beta}_2$ can be expressed in isolation as
\[
\hat{\beta}_2 = (X_2' M_1 X_2)^{-1} X_2' M_1 y = (X_2^{*\prime} X_2^{*})^{-1} X_2^{*\prime} y^{*},
\]
where $X_2^{*} = M_1 X_2$ and $y^{*} = M_1 y$ are the residuals from the regressions of $X_2$ and $y$ on $X_1$, respectively.

Theorem 2 (Frisch–Waugh): The subvector $\hat{\beta}_2$ is the set of coefficients obtained when the residuals from a regression of $y$ on $X_1$ alone are regressed on the set of residuals obtained when each column of $X_2$ is regressed on $X_1$.

Example: Consider a simple regression with a constant; the slope estimator can then also be obtained from a regression on demeaned data without a constant.
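The theorem is easy to verify numerically. The Python sketch below, on simulated data (all names and the data-generating process are illustrative assumptions), compares $\hat{\beta}_2$ from the full regression with the coefficients from regressing $M_1 y$ on $M_1 X_2$.

```python
import numpy as np

# Numerical check of Theorem 2 (Frisch-Waugh) on simulated data (illustrative).
rng = np.random.default_rng(1)
T, k1, k2 = 100, 2, 2
X1 = np.column_stack([np.ones(T), rng.normal(size=(T, k1 - 1))])
X2 = rng.normal(size=(T, k2))
y = X1 @ np.array([1.0, 2.0]) + X2 @ np.array([0.5, -1.0]) + rng.normal(size=T)

X = np.hstack([X1, X2])
b_full = np.linalg.lstsq(X, y, rcond=None)[0]     # full OLS; last k2 entries are beta2_hat

M1 = np.eye(T) - X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
X2_star, y_star = M1 @ X2, M1 @ y                 # residuals from regressions on X1
b2_partial = np.linalg.lstsq(X2_star, y_star, rcond=None)[0]

print(np.allclose(b_full[k1:], b2_partial))       # True: both routes give beta2_hat
```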
2.4 The Restricted Least Squares Estimators

Suppose that we explicitly impose the restrictions of the hypothesis on the regression (as in, for example, the LM test). The restricted least squares estimator is obtained as the solution to
\[
\min_{\beta}\ SSE(\beta) = (y - X\beta)'(y - X\beta) \quad \text{subject to} \quad R\beta = q,
\]
where $R$ is a known $J \times k$ matrix and $q$ is the $J \times 1$ vector of values of the linear restrictions.

A Lagrangean function for this problem can be written as
\[
L^{*}(\beta, \lambda) = (y - X\beta)'(y - X\beta) + 2\lambda'(R\beta - q),
\]
where $\lambda$ is $J \times 1$. The solutions $\hat{\beta}^{*}$ and $\hat{\lambda}$ will satisfy the necessary conditions
\[
\frac{\partial L^{*}}{\partial \hat{\beta}^{*}} = -2X'(y - X\hat{\beta}^{*}) + 2R'\hat{\lambda} = 0,
\]
\[
\frac{\partial L^{*}}{\partial \hat{\lambda}} = 2(R\hat{\beta}^{*} - q) = 0 \quad \left(\text{remember } \frac{\partial a'x}{\partial x} = a\right).
\]
Dividing through by 2 and expanding terms produces the partitioned matrix equation
\[
\begin{bmatrix} X'X & R' \\ R & 0 \end{bmatrix} \begin{bmatrix} \hat{\beta}^{*} \\ \hat{\lambda} \end{bmatrix} = \begin{bmatrix} X'y \\ q \end{bmatrix}, \quad \text{or} \quad W \hat{d}^{*} = v.
\]
Assuming that the partitioned matrix in brackets is nonsingular, $\hat{d}^{*} = W^{-1} v$. Using the partitioned inverse rule
\[
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}^{-1}
= \begin{bmatrix} A_{11}^{-1}(I + A_{12} F_2 A_{21} A_{11}^{-1}) & -A_{11}^{-1} A_{12} F_2 \\ -F_2 A_{21} A_{11}^{-1} & F_2 \end{bmatrix},
\quad \text{where } F_2 = (A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1},
\]
we have the restricted least squares estimator
\[
\hat{\beta}^{*} = \hat{\beta} - (X'X)^{-1} R' \big[R (X'X)^{-1} R'\big]^{-1} (R\hat{\beta} - q),
\]
and
\[
\hat{\lambda} = \big[R (X'X)^{-1} R'\big]^{-1} (R\hat{\beta} - q).
\]
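As a minimal numerical sketch of these formulas, the Python snippet below imposes an illustrative restriction that the two slope coefficients sum to one, on simulated data (all names and values are assumptions). It confirms that $R\hat{\beta}^{*} = q$ holds exactly and previews, numerically, the fact established in equation (3) below that imposing the restriction cannot decrease the sum of squared residuals.

```python
import numpy as np

# Sketch of restricted least squares under an illustrative restriction
# (the two slope coefficients sum to 1), on simulated data.
rng = np.random.default_rng(2)
T, k = 80, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([0.3, 0.6, 0.4]) + rng.normal(size=T)

R = np.array([[0.0, 1.0, 1.0]])                # J x k restriction matrix
q = np.array([1.0])                            # restriction value(s)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                          # unrestricted OLS estimator
lam = np.linalg.solve(R @ XtX_inv @ R.T, R @ b - q)   # Lagrange multiplier estimate
b_star = b - XtX_inv @ R.T @ lam               # restricted estimator

print(R @ b_star - q)                          # ~0: restriction satisfied exactly
e, e_star = y - X @ b, y - X @ b_star
print(e_star @ e_star >= e @ e)                # True: restricted SSE never smaller
```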
Exercise: Show that $\mathrm{Var}(\hat{\beta}^{*}) - \mathrm{Var}(\hat{\beta})$ is a nonpositive definite matrix.

The above result holds whether or not the restrictions are true. One way to interpret this reduction in variance is as the value of the information contained in the restrictions. See Table 6.2, p. 103.

Let $e^{*}$ equal $y - X\hat{\beta}^{*}$, i.e., the residual vector from the restricted least squares estimator. Then, using the familiar device,
\[
e^{*} = y - X\hat{\beta} - X(\hat{\beta}^{*} - \hat{\beta}) = e - X(\hat{\beta}^{*} - \hat{\beta}).
\]
The 'restricted' sum of squared residuals is
\[
e^{*\prime} e^{*} = e'e + (\hat{\beta}^{*} - \hat{\beta})' X'X (\hat{\beta}^{*} - \hat{\beta}) \geq e'e \qquad (3)
\]
since $X'X$ is a positive definite matrix.

2.5 Measurement of Goodness of Fit

Denote the dependent variable's 'fitted value' from the regressors and the OLS estimator as $\hat{y} = X\hat{\beta}$, so that $y = \hat{y} + e$.

Lemma: $e'e = y'y - \hat{y}'\hat{y}$.

Proof: Using the fact that $X'y = X'X\hat{\beta}$, we have
\[
e'e = y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta} = y'y - \hat{y}'\hat{y}.
\]
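A quick numerical check of the lemma, on simulated data (all names illustrative), is sketched below.

```python
import numpy as np

# Check the lemma e'e = y'y - yhat'yhat on simulated data (illustrative).
rng = np.random.default_rng(3)
T, k = 60, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, -0.5, 0.8]) + rng.normal(size=T)

b = np.linalg.lstsq(X, y, rcond=None)[0]       # OLS estimator
y_hat = X @ b                                  # fitted values
e = y - y_hat                                  # residuals
print(np.allclose(e @ e, y @ y - y_hat @ y_hat))   # True
```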
Three measurements of variation are defined as follows:

(a). SST (sum of squared total variation) $= \sum_{t=1}^{T} (Y_t - \bar{Y})^2 = y' M^0 y$,

(b). SSR (sum of squared regression variation) $= \sum_{t=1}^{T} (\hat{Y}_t - \bar{\hat{Y}})^2 = \hat{y}' M^0 \hat{y}$,