Chapter 3

Inference

Up till now, we haven't found it necessary to assume any distributional form for the errors ε. However, if we want to make any confidence intervals or perform any hypothesis tests, we will need to do this. The usual assumption is that the errors are normally distributed, and in practice this is often, although not always, a reasonable assumption. We'll assume that the errors are independent and identically normally distributed with mean 0 and variance σ², i.e.

\[ \varepsilon \sim N(0, \sigma^2 I). \]

We can handle non-identity variance matrices provided we know the form; see the section on generalized least squares later. Now since y = Xβ + ε,

\[ y \sim N(X\beta, \sigma^2 I) \]

is a compact description of the regression model, and from this we find (using the fact that linear combinations of normally distributed values are also normal) that

\[ \hat\beta = (X^T X)^{-1} X^T y \sim N\left(\beta, (X^T X)^{-1}\sigma^2\right). \]
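As a quick numerical check of this result (a sketch that is not part of the original text; the design matrix, coefficients and error standard deviation below are all made up), we can simulate repeated samples in R and compare the empirical covariance of the estimates with (X^T X)^{-1}σ²:

set.seed(1)
n <- 50
X <- cbind(1, runif(n), runif(n))        # made-up design matrix with an intercept
beta <- c(2, -1, 0.5)                    # made-up true coefficients
sigma <- 1.5                             # made-up error standard deviation
betahat <- replicate(5000, {
  y <- X %*% beta + rnorm(n, sd=sigma)   # generate y = X beta + eps
  drop(solve(t(X) %*% X, t(X) %*% y))    # least squares estimate
})
cov(t(betahat))                          # empirical covariance of the 5000 estimates
solve(t(X) %*% X) * sigma^2              # theoretical covariance (X'X)^{-1} sigma^2

The two matrices should agree closely, illustrating the stated sampling distribution of the least squares estimate.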
3.1 Hypothesis tests to compare models

Given several predictors for a response, we might wonder whether all are needed. Consider a large model, Ω, and a smaller model, ω, which consists of a subset of the predictors that are in Ω. By the principle of Occam's Razor (also known as the law of parsimony), we'd prefer to use ω if the data will support it. So we'll take ω to represent the null hypothesis and Ω to represent the alternative. A geometric view of the problem may be seen in Figure 3.1.

Figure 3.1: Geometric view of the comparison between the big model, Ω, and the small model, ω. The squared length of the residual vector for the big model is RSS_Ω, while that for the small model is RSS_ω. By Pythagoras' theorem, the squared length of the vector connecting the two fits is RSS_ω − RSS_Ω. A small value for this indicates that the small model fits almost as well as the large model and thus might be preferred because of its simplicity.

If RSS_ω − RSS_Ω is small, then ω is an adequate model relative to Ω. This suggests that something like

\[ \frac{RSS_\omega - RSS_\Omega}{RSS_\Omega} \]

would be a potentially good test statistic, where the denominator is used for scaling purposes.

As it happens, the same test statistic arises from the likelihood-ratio testing approach. We give an outline of the development: if L(β, σ | y) is the likelihood function, then the likelihood ratio statistic is

\[ \frac{\max_{\beta,\sigma \in \Omega} L(\beta, \sigma \mid y)}{\max_{\beta,\sigma \in \omega} L(\beta, \sigma \mid y)}. \]

The test should reject if this ratio is too large. Working through the details, we find that

\[ L(\hat\beta, \hat\sigma \mid y) \propto \hat\sigma^{-n}, \]

which gives us a test that rejects if

\[ \frac{\hat\sigma^2_\omega}{\hat\sigma^2_\Omega} > \text{a constant}, \]

which is equivalent to

\[ \frac{RSS_\omega}{RSS_\Omega} > \text{a constant} \]

(the constants are not the same), or

\[ \frac{RSS_\omega}{RSS_\Omega} - 1 > \text{a constant} - 1, \]

which is

\[ \frac{RSS_\omega - RSS_\Omega}{RSS_\Omega} > \text{a constant}, \]

which is the same statistic suggested by the geometric view. It remains for us to discover the null distribution of this statistic.

Now suppose that the dimension (number of parameters) of Ω is q and the dimension of ω is p. By Cochran's theorem, if the null (ω) is true, then

\[ RSS_\omega - RSS_\Omega \sim \sigma^2 \chi^2_{q-p}, \qquad RSS_\Omega \sim \sigma^2 \chi^2_{n-q}, \]

and these two quantities are independent. So we find that

\[ F = \frac{(RSS_\omega - RSS_\Omega)/(q-p)}{RSS_\Omega/(n-q)} \sim F_{q-p,\, n-q}. \]
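In R, this F statistic can be computed directly from two nested fits. The helper below is a minimal sketch (it is not from the original text and the function name is ours); it uses deviance(), which returns the residual sum of squares for an lm fit, and df.residual():

nested.F <- function(small, big) {
  rss.small <- deviance(small)      # RSS for the null model (omega)
  rss.big <- deviance(big)          # RSS for the larger model (Omega)
  df.small <- df.residual(small)    # n - p
  df.big <- df.residual(big)        # n - q
  f <- ((rss.small - rss.big)/(df.small - df.big)) / (rss.big/df.big)
  c(F=f, p.value=1 - pf(f, df.small - df.big, df.big))
}

The built-in anova() function applied to two nested lm fits performs the same comparison.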
Thus we would reject the null hypothesis if $F > F^{(\alpha)}_{q-p,\,n-q}$. The degrees of freedom of a model is (usually) the number of observations minus the number of parameters, so this test statistic can be written

\[ F = \frac{(RSS_\omega - RSS_\Omega)/(df_\omega - df_\Omega)}{RSS_\Omega / df_\Omega}, \]

where df_Ω = n − q and df_ω = n − p. The same test statistic applies not just when ω is a subset of Ω but also to a subspace. This test is very widely used in regression and analysis of variance. When it is applied in different situations, the form of the test statistic may be re-expressed in various different ways. The beauty of this approach is that you only need to know the general form. In any particular case, you just need to figure out which models represent the null and alternative hypotheses, fit them and compute the test statistic. It is very versatile.

3.2 Some Examples

3.2.1 Test of all predictors

Are any of the predictors useful in predicting the response?

• Full model (Ω): y = Xβ + ε, where X is a full-rank n × p matrix.
• Reduced model (ω): y = µ + ε, i.e. predict y by its mean.

We could write the null hypothesis in this case as

\[ H_0: \beta_1 = \cdots = \beta_{p-1} = 0. \]

Now

\[ RSS_\Omega = (y - X\hat\beta)^T (y - X\hat\beta) = \hat\varepsilon^T \hat\varepsilon = RSS, \]
\[ RSS_\omega = (y - \bar{y})^T (y - \bar{y}) = SYY, \]

where SYY is sometimes known as the sum of squares corrected for the mean. So in this case

\[ F = \frac{(SYY - RSS)/(p-1)}{RSS/(n-p)}. \]

We'd now refer to $F_{p-1,\,n-p}$ for a critical value or a p-value. Large values of F would indicate rejection of the null. Traditionally, the information in the above test is presented in an analysis of variance table. Most computer packages produce a variant on this; see Table 3.1. It is not really necessary to specifically compute all the elements of the table. As Fisher, the originator of the table, said in 1931, it is "nothing but a convenient way of arranging the arithmetic". Since he had to do his calculations by hand, the table served some purpose, but it is less useful now.

Source        Deg. of Freedom   Sum of Squares   Mean Square      F
Regression    p − 1             SSreg            SSreg/(p − 1)    F
Residual      n − p             RSS              RSS/(n − p)
Total         n − 1             SYY

Table 3.1: Analysis of Variance table

A failure to reject the null hypothesis is not the end of the game; you must still investigate the possibility of non-linear transformations of the variables and of outliers which may obscure the relationship. Even then, you may just have insufficient data to demonstrate a real effect, which is why we must be careful to say "fail to reject" the null rather than "accept" the null. It would be a mistake to conclude that no real relationship exists. This issue arises when a pharmaceutical company wishes to show that a proposed generic replacement for a brand-named drug is equivalent. It would not be enough in this instance just to fail to reject the null; a higher standard would be required.
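In R, this overall test amounts to comparing the full fit against the intercept-only fit. The sketch below is not from the original text and uses hypothetical data frame and variable names; the savings example that follows carries out the same test on real data.

# Sketch of the test of all predictors (names here are hypothetical placeholders)
nullfit <- lm(y ~ 1, data=mydata)                # intercept-only model; its RSS is SYY
fullfit <- lm(y ~ x1 + x2 + x3, data=mydata)     # model with all the predictors; its RSS is RSS
anova(nullfit, fullfit)                          # F = ((SYY-RSS)/(p-1)) / (RSS/(n-p))

The F-statistic printed at the foot of summary(fullfit) is this same quantity.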
When the null is rejected, this does not imply that the alternative model is the best model. We don't know whether all the predictors are required to predict the response or just some of them. Other predictors might also be added, for example quadratic terms in the existing predictors. Either way, the overall F-test is just the beginning of an analysis and not the end.

Let's illustrate this test and others using an old economic dataset on 50 different countries. These data are averages over 1960–1970 (to remove business cycle or other short-term fluctuations). dpi is per-capita disposable income in U.S. dollars; ddpi is the percent rate of change in per-capita disposable income; sr is aggregate personal saving divided by disposable income. The percentages of the population under 15 (pop15) and over 75 (pop75) are also recorded. The data come from Belsley, Kuh, and Welsch (1980). Take a look at the data:

> data(savings)
> savings
                sr pop15 pop75     dpi ddpi
Australia    11.43 29.35  2.87 2329.68 2.87
Austria      12.07 23.32  4.41 1507.99 3.93
  --- cases deleted ---
Malaysia      4.71 47.20  0.66  242.69 5.08

First consider a model with all the predictors:

> g <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data=savings)
> summary(g)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 28.566087   7.354516    3.88  0.00033
pop15       -0.461193   0.144642   -3.19  0.00260
pop75       -1.691498   1.083599   -1.56  0.12553
dpi         -0.000337   0.000931   -0.36  0.71917
ddpi         0.409695   0.196197    2.09  0.04247

Residual standard error: 3.8 on 45 degrees of freedom
Multiple R-Squared: 0.338,     Adjusted R-squared: 0.28
F-statistic: 5.76 on 4 and 45 degrees of freedom,   p-value: 0.00079

We can see directly the result of the test of whether any of the predictors have significance in the model. In other words, whether β1 = β2 = β3 = β4 = 0. Since the p-value is so small, this null hypothesis is rejected. We can also do it directly using the F-testing formula:
> sum((savings$sr-mean(savings$sr))^2)
[1] 983.63
> sum(g$res^2)
[1] 650.71
> ((983.63-650.71)/4)/(650.706/45)
[1] 5.7558
> 1-pf(5.7558,4,45)
[1] 0.00079026

Do you know where all the numbers come from? Check that they match the regression summary above.

3.2.2 Testing just one predictor

Can one particular predictor be dropped from the model? The null hypothesis would be H_0: β_i = 0. Set it up like this:

• RSS_Ω is the RSS for the model with all the predictors of interest (p parameters).
• RSS_ω is the RSS for the model with all the above predictors except predictor i.

The F-statistic may be computed using the formula from above. An alternative approach is to use a t-statistic for testing the hypothesis:

\[ t_i = \frac{\hat\beta_i}{se(\hat\beta_i)} \]

and check for significance using a t distribution with n − p degrees of freedom. However, squaring the t-statistic here, i.e. t_i², gives you the F-statistic, so the two approaches are identical.

For example, to test the null hypothesis that β1 = 0, i.e. that pop15 is not significant in the full model, we can simply observe that the p-value is 0.0026 from the table and conclude that the null should be rejected.

Let's do the same test using the general F-testing approach. We'll need the RSS and df for the full model; these are 650.71 and 45 respectively. We then fit the model that represents the null:

> g2 <- lm(sr ~ pop75 + dpi + ddpi, data=savings)

and compute the RSS and the F-statistic:

> sum(g2$res^2)
[1] 797.72
> (797.72-650.71)/(650.71/45)
[1] 10.167

The p-value is then

> 1-pf(10.167,1,45)
[1] 0.0026026

We can relate this to the t-based test and p-value by

> sqrt(10.167)
[1] 3.1886
> 2*(1-pt(3.1886,45))
[1] 0.0026024
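As a quick cross-check (not part of the original computation), R's anova() function applied to the two nested fits reproduces the same F test in one step:

anova(g2, g)   # reports F = 10.167 on 1 and 45 df with p-value 0.0026, matching the hand calculation above

In practice this is usually the most convenient way to carry out such nested-model comparisons.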