Lecture 8: Simple Linear Regression

Simple Linear Regression (Bivariate Regression)
We already looked at measuring relationships between two interval variables using correlation. Now we continue to look at the bivariate analysis of the two variables using regression analysis. However, the purpose of doing regression rather than correlation is that we can predict results in one variable based on another variable. So, rather than simply seeing whether the variables are related, we can interpret their effect.

Simple Linear Regression
Like correlation, there are two major assumptions:
• The relationship should be linear; and
• The level of data must be continuous.

The regression equation
The purpose of simple linear regression is to fit a line to the two variables. This line is called the line of best fit, or the regression line. When we do a scatterplot of two variables, it is possible to fit a line which best represents the data.
The regression equation
A regression equation is used to define the relationship between two variables. It takes the form:

Y = a + bX    or    Y = β0 + β1X1 + ε

They are essentially the same, except that the second includes an error term (ε) at the end. This error term indicates that what we have is in fact a model, and hence it won't fit the data perfectly.

The regression equation
The terms represent the following:
• a (or β0) = the constant. It is the value at which the line intersects the Y axis (the intercept).
• b (or β1) = the slope (or gradient) of the line. It represents the change in Y for each unit increase or decrease in X.
• X = the value of the X variable for each case.

Scatterplot and regression line
[Figure: scatterplot with a fitted regression line. The intercept is 20; for a change in X of 1, the change in Y is 10.]
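To make the role of the error term concrete, the following Python sketch simulates observations from Y = β0 + β1X + ε, borrowing the intercept (20) and slope (10) from the scatterplot example above; the number of points and the spread of the error are arbitrary assumptions, not values from the lecture.

```python
# Illustrative sketch: data generated from Y = b0 + b1*X + error scatter around the exact line.
import numpy as np

rng = np.random.default_rng(seed=1)

b0, b1 = 20.0, 10.0                     # intercept and slope taken from the scatterplot example
x = np.arange(1, 6)                     # five hypothetical X values (assumption)
epsilon = rng.normal(0.0, 3.0, x.size)  # random error term; a spread of 3 is an assumption

y = b0 + b1 * x + epsilon               # observed Y values do not sit exactly on the line

for xi, yi in zip(x, y):
    print(f"X={xi}  value on line={b0 + b1 * xi:5.1f}  observed Y={yi:5.1f}")
```

The gap between "value on line" and "observed Y" in the output is exactly the error term the second equation makes explicit.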
How do we fit a line to data?
In order to fit a line of best fit we use a method called the Method of Least Squares. This method allows us to determine which line, out of all the lines that could be drawn, best represents the least amount of difference between the actual values (the data points) and the predicted line.

In the figure above, three data points fall on the line, while the remaining six are slightly above or below the line. The differences between these points and the line are called residuals. Some of these differences will be positive (above the line), while others will be negative (below the line). If we add up all these differences, some of the positive and negative values will cancel each other out, which has the effect of overestimating how well the line represents the data. Instead, if we square the differences and then add them up, we can work out which line has the smallest sum of squares (that is, the one with the least error).

The method of least squares
We do not have to test every possible line to see which fits the data best. The method of least squares provides the optimal values of a (or β0) and b (or β1). Once we have established them, we can use them in the regression equation. The formulas for calculating b and a are:

b = [n(ΣXY) - (ΣX)(ΣY)] / [n(ΣX²) - (ΣX)²]
a = (ΣY - bΣX) / n

Example 1
Ages (X) of five women and their number of children ever born, CEB (Y). The column totals are: ΣX = 149, ΣY = 9, ΣXY = 299, ΣX² = 4803, ΣY² = 19. So n = 5 and (ΣX)² = 149 × 149 = 22201.
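The formulas can be checked with a short Python sketch using the Example 1 totals quoted above; only the column totals are used, since the individual ages and CEB values are not reproduced in these notes.

```python
# Least-squares slope (b) and intercept (a) from summary totals, as in Example 1.
n = 5
sum_x, sum_y = 149, 9        # ΣX (ages), ΣY (children ever born)
sum_xy, sum_x2 = 299, 4803   # ΣXY, ΣX²

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
a = (sum_y - b * sum_x) / n                                   # intercept

print(round(b, 3), round(a, 3))  # prints 0.085 and -0.73, matching the worked example below
```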
Calculating the coefficients
b = [5(299) - (149)(9)] / [5(4803) - 22201] = 154 / 1814 ≈ 0.085
a = [9 - 0.085(149)] / 5 ≈ -0.73

So the fitted regression equation is: Ŷ = -0.73 + 0.085X

Prediction
We can now predict the number of children ever born based on age. So, if some women in the community are aged 27, we could predict that their CEB number is:
Ŷ = -0.73 + 0.085 × 27 = 1.56

We could now draw a line of best fit through the observed data points.
[Figure: scatterplot of age (15-50) against number of children (0.0-3.5) with the fitted regression line.]

[SPSS output for Example 1: Variables Entered/Removed and Model Summary tables. Predictors: (Constant), AGE.]
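A small Python sketch of the prediction step, reusing the unrounded least-squares coefficients from Example 1 (the helper name predict_ceb is introduced here for illustration, not taken from the lecture):

```python
# Predict CEB from age using the fitted line from Example 1.
b = 154 / 1814            # slope, before rounding to 0.085
a = (9 - b * 149) / 5     # intercept, before rounding to -0.73

def predict_ceb(age):
    """Predicted number of children ever born for a woman of the given age."""
    return a + b * age

print(round(predict_ceb(27), 2))  # 1.56, matching the value quoted in the lecture
```

Using the unrounded coefficients reproduces the slide's 1.56; rounding to two decimal places first gives a slightly different answer (about 1.57).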
[SPSS output for Example 1: ANOVA and Coefficients tables. Dependent variable: CEB.]

Inference for Regression
• When a scatterplot shows a linear relationship between a quantitative explanatory variable x and a quantitative response variable y, we can use the least-squares line fitted to the data to predict y for a given value of x. Now we want to do tests and confidence intervals in this setting.

Example 2: Crying and IQ
• Crying easily in infancy may be a sign of higher IQ. Crying intensity and IQ (intelligence quotient) data were recorded for 38 infants.
[Data table of crying intensity and IQ for the 38 infants; the individual values are not recoverable here.]
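As a sketch of how such a test and confidence interval for the slope might be computed in Python with scipy.stats.linregress: the crying and IQ arrays below are placeholder numbers for illustration only, not the 38-infant dataset from the lecture.

```python
# Test of H0: slope = 0 and a 95% confidence interval for the slope.
import numpy as np
from scipy import stats

crying = np.array([10, 12, 9, 16, 18, 15, 12, 20, 17, 13])      # illustrative values only
iq = np.array([87, 97, 103, 106, 109, 112, 100, 121, 115, 102])  # illustrative values only

res = stats.linregress(crying, iq)          # least-squares fit with slope, p-value, std. error
n = len(crying)
t_crit = stats.t.ppf(0.975, df=n - 2)       # two-sided 95% critical value, n - 2 df

print(f"slope = {res.slope:.2f}, p-value for H0: slope = 0 is {res.pvalue:.4f}")
print(f"95% CI for slope: ({res.slope - t_crit * res.stderr:.2f}, "
      f"{res.slope + t_crit * res.stderr:.2f})")
```

The same quantities (slope, standard error, t statistic, p-value) appear in the Coefficients table of SPSS regression output.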