3-6 Simple Regression: An Example
• Suppose that we have the following data on the excess returns on a fund manager's portfolio ("fund XXX") together with the excess returns on a market index:

Year, t    Excess return on fund XXX    Excess return on market index
           = r_XXX,t - r_ft             = r_mt - r_ft
1          17.8                         13.7
2          39.0                         23.2
3          12.8                          6.9
4          24.2                         16.8
5          17.2                         12.3

• We therefore want to find whether there appears to be a relationship between x (the excess return on the market index) and y (the excess return on fund XXX) given the data that we have. The first stage would be to form a scatter plot of the two variables.
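As a rough numerical companion to the scatter plot, the sketch below computes the sample correlation between the two series in the table. The correlation coefficient is not part of the slide; it is used here only as an assumed quick check on whether x and y move together, and the variable names are illustrative.

```python
import math

# Data from the table: x = excess return on the market index,
# y = excess return on fund XXX, for years t = 1..5.
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and squares of deviations from the means.
s_xy = sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
s_xx = sum((xt - x_bar) ** 2 for xt in x)
s_yy = sum((yt - y_bar) ** 2 for yt in y)

# Sample correlation coefficient (assumption: used as a simple summary only).
r = s_xy / math.sqrt(s_xx * s_yy)
print(f"sample correlation between x and y: {r:.3f}")
```

A value close to 1 would suggest a strong positive linear association, which is what the scatter plot is meant to reveal visually.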
3-7 Graph (Scatter Diagram)
[Figure: scatter plot of the excess return on fund XXX (axis range 0 to 45) against the excess return on the market portfolio (axis range 0 to 25)]
3-8 Finding a Line of Best Fit
• We can use the general equation for a straight line, y = a + bx, to get the line that best "fits" the data.
• However, this equation (y = a + bx) is completely deterministic.
• Is this realistic? No. So what we do is add a random disturbance term, u, into the equation:
  $y_t = \alpha + \beta x_t + u_t$, where t = 1, 2, 3, 4, 5
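To see why a purely deterministic line is unrealistic for these data, the sketch below draws the straight line through observations 1 and 2 and reports what is left over at every observation. The choice of those two points is an arbitrary assumption made for illustration; any other straight line would also miss some of the points.

```python
# Data from the example: x = excess return on the market index,
# y = excess return on fund XXX, for years t = 1..5.
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]

# Arbitrary illustrative choice: the straight line through observations 1 and 2.
b = (y[1] - y[0]) / (x[1] - x[0])
a = y[0] - b * x[0]

# Whatever y = a + b*x leaves unexplained plays the role of the disturbance u_t.
u = [yt - (a + b * xt) for xt, yt in zip(x, y)]

for t, ut in enumerate(u, start=1):
    print(f"t={t}: u_t = {ut:+.2f}")
```

The first two leftovers are zero by construction, while the remaining ones are not, so no single line y = a + bx reproduces all five observations exactly.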
3-9 Why Do We Include a Disturbance Term?
• The disturbance term can capture a number of features:
  - We always leave out some determinants of $y_t$.
  - There may be errors in the measurement of $y_t$ that cannot be modelled.
  - There may be random outside influences on $y_t$ which we cannot model.
3-10 Determining the Regression Coefficients
• So how do we determine what $\alpha$ and $\beta$ are?
• Choose $\alpha$ and $\beta$ so that the (vertical) distances from the data points to the fitted line are minimised (so that the line fits the data as closely as possible).
[Figure: data points and fitted line, with axes y and x]
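One common way to make "minimising the vertical distances" concrete is to minimise the sum of their squares (ordinary least squares). The slide does not yet say how the distances are aggregated, so the squared-distance criterion and the closed-form formulas below are an assumption of this sketch, and the names `alpha_hat` and `beta_hat` are illustrative.

```python
# Data from the example: x = excess return on the market index,
# y = excess return on fund XXX, for years t = 1..5.
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]


def sum_squared_distances(alpha, beta):
    """Sum of squared vertical distances from the points to y = alpha + beta*x."""
    return sum((yt - (alpha + beta * xt)) ** 2 for xt, yt in zip(x, y))


n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form least-squares estimates (assumed criterion: squared distances).
beta_hat = (
    sum((xt - x_bar) * (yt - y_bar) for xt, yt in zip(x, y))
    / sum((xt - x_bar) ** 2 for xt in x)
)
alpha_hat = y_bar - beta_hat * x_bar

best = sum_squared_distances(alpha_hat, beta_hat)

# Sanity check: nudging either coefficient should only make the fit worse.
nearby = [
    sum_squared_distances(alpha_hat + da, beta_hat + db)
    for da in (-0.5, 0.0, 0.5)
    for db in (-0.1, 0.0, 0.1)
    if (da, db) != (0.0, 0.0)
]

print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
print(f"sum of squared distances at the estimates: {best:.2f}")
print(f"smallest value among nearby lines: {min(nearby):.2f}")
```

Under this criterion the fitted line for the five observations has a slope of roughly 1.64 and an intercept of roughly -1.74, and every perturbed line in the check gives a larger sum of squared distances.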