obtained for a given sample of data, a new sample will generate a different slope and intercept in equation (2.23). In most cases the slope estimate, which we can write as

$\hat{\beta}_1 = \Delta\hat{y}/\Delta x$,    (2.24)

is of primary interest. It tells us the amount by which $\hat{y}$ changes when $x$ increases by one unit. Equivalently,

$\Delta\hat{y} = \hat{\beta}_1 \Delta x$,    (2.25)

so that given any change in x (whether positive or negative), we can compute the predicted change in y.

We now present several examples of simple regression obtained by using real data. In other words, we find the intercept and slope estimates with equations (2.17) and (2.19). Because these examples involve many observations, the calculations were done using an econometric software package. At this point, you should be careful not to read too much into these regressions; they are not necessarily uncovering a causal relationship. We have said nothing so far about the statistical properties of OLS. In Section 2.5, we consider statistical properties after we explicitly impose assumptions on the population model, equation (2.1).

EXAMPLE 2.3
(CEO Salary and Return on Equity)

For the population of chief executive officers, let y be annual salary (salary) in thousands of dollars. Thus, y = 856.3 indicates an annual salary of $856,300, and y = 1452.6 indicates a salary of $1,452,600. Let x be the average return on equity (roe) for the CEO's firm for the previous three years. (Return on equity is defined in terms of net income as a percentage of common equity.) For example, if roe = 10, then average return on equity is 10 percent.

To study the relationship between this measure of firm performance and CEO compensation, we postulate the simple model

$salary = \beta_0 + \beta_1 roe + u$.

The slope parameter $\beta_1$ measures the change in annual salary, in thousands of dollars, when return on equity increases by one percentage point. Because a higher roe is good for the company, we think $\beta_1 > 0$.

The data set CEOSAL1.RAW contains information on 209 CEOs for the year 1990; these data were obtained from Business Week (5/6/91). In this sample, the average annual salary is $1,281,120, with the smallest and largest being $223,000 and $14,822,000, respectively. The average return on equity for the years 1988, 1989, and 1990 is 17.18 percent, with the smallest and largest values being 0.5 and 56.3 percent, respectively.

Using the data in CEOSAL1.RAW, the OLS regression line relating salary to roe is

$\widehat{salary} = 963.191 + 18.501\,roe$,    (2.26)
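To make the mechanics concrete, here is a minimal sketch in Python of the slope and intercept computations that equations (2.17) and (2.19) describe (the standard OLS formulas for simple regression). The function name and the small data arrays are ours for illustration; they are not the actual CEOSAL1.RAW values.

```python
import numpy as np

def ols_simple(x, y):
    """Simple-regression OLS estimates.

    Slope:     sum((x - xbar) * (y - ybar)) / sum((x - xbar)**2)
    Intercept: ybar - slope * xbar
    """
    xbar, ybar = x.mean(), y.mean()
    slope = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    intercept = ybar - slope * xbar
    return intercept, slope

# Illustrative numbers only -- not the actual CEOSAL1.RAW values.
roe = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
salary = np.array([1100.0, 1200.0, 1350.0, 1400.0, 1600.0])
b0, b1 = ols_simple(roe, salary)
print(f"salaryhat = {b0:.3f} + {b1:.3f} roe")
```

Applied to the actual 209-observation data set, these same formulas yield the estimates reported in (2.26).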
where the intercept and slope estimates have been rounded to three decimal places; we use "salary hat" to indicate that this is an estimated equation. How do we interpret the equation? First, if the return on equity is zero, roe = 0, then the predicted salary is the intercept, 963.191, which equals $963,191 since salary is measured in thousands. Next, we can write the predicted change in salary as a function of the change in roe: $\Delta\widehat{salary} = 18.501\,(\Delta roe)$. This means that if the return on equity increases by one percentage point, $\Delta roe = 1$, then salary is predicted to change by about 18.5, or $18,500. Because (2.26) is a linear equation, this is the estimated change regardless of the initial salary.

We can easily use (2.26) to compare predicted salaries at different values of roe. Suppose roe = 30. Then $\widehat{salary} = 963.191 + 18.501(30) = 1518.221$, which is just over $1.5 million. However, this does not mean that a particular CEO whose firm had an roe = 30 earns $1,518,221. There are many other factors that affect salary. This is just our prediction from the OLS regression line (2.26). The estimated line is graphed in Figure 2.5, along with the population regression function E(salary|roe). We will never know the PRF, so we cannot tell how close the SRF is to the PRF. Another sample of data will give a different regression line, which may or may not be closer to the population regression line.

[Figure 2.5: The OLS regression line $\widehat{salary} = 963.191 + 18.501\,roe$ and the (unknown) population regression function $E(salary|roe) = \beta_0 + \beta_1 roe$.]
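As a quick check of this arithmetic, a short sketch that evaluates the fitted line (2.26) at a few values of roe (the helper function is ours):

```python
def predict_salary(roe):
    # Fitted line (2.26): predicted salary in thousands of dollars
    return 963.191 + 18.501 * roe

for roe in (0, 10, 30):
    print(f"roe = {roe:2d}: predicted salary = ${1000 * predict_salary(roe):,.0f}")
# roe = 30 reproduces the $1,518,221 prediction discussed above
```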
EXAMPLE 2.4
(Wage and Education)

For the population of people in the work force in 1976, let y = wage, where wage is measured in dollars per hour. Thus, for a particular person, if wage = 6.75, the hourly wage is $6.75. Let x = educ denote years of schooling; for example, educ = 12 corresponds to a complete high school education. Since the average wage in the sample is $5.90, the consumer price index indicates that this amount is equivalent to $16.64 in 1997 dollars.

Using the data in WAGE1.RAW, where n = 526 individuals, we obtain the following OLS regression line (or sample regression function):

$\widehat{wage} = -0.90 + 0.54\,educ$.    (2.27)

We must interpret this equation with caution. The intercept of -0.90 literally means that a person with no education has a predicted hourly wage of -90 cents an hour. This, of course, is silly. It turns out that no one in the sample has less than eight years of education, which helps to explain the crazy prediction for a zero education value. For a person with eight years of education, the predicted wage is $\widehat{wage} = -0.90 + 0.54(8) = 3.42$, or $3.42 per hour (in 1976 dollars).

QUESTION 2.2: The estimated wage from (2.27), when educ = 8, is $3.42 in 1976 dollars. What is this value in 1997 dollars? (Hint: You have enough information in Example 2.4 to answer this question.)

The slope estimate in (2.27) implies that one more year of education increases hourly wage by 54 cents an hour. Therefore, four more years of education increase the predicted wage by 4(0.54) = 2.16, or $2.16 per hour. These are fairly large effects. Because of the linear nature of (2.27), another year of education increases the wage by the same amount, regardless of the initial level of education. In Section 2.4, we discuss some methods that allow for nonconstant marginal effects of our explanatory variables.

EXAMPLE 2.5
(Voting Outcomes and Campaign Expenditures)

The file VOTE1.RAW contains data on election outcomes and campaign expenditures for 173 two-party races for the U.S. House of Representatives in 1988. There are two candidates in each race, A and B. Let voteA be the percentage of the vote received by Candidate A and shareA be the percentage of total campaign expenditures accounted for by Candidate A. Many factors other than shareA affect the election outcome (including the quality of the candidates and possibly the dollar amounts spent by A and B). Nevertheless, we can estimate a simple regression model to find out whether spending more relative to one's challenger implies a higher percentage of the vote.

The estimated equation using the 173 observations is

$\widehat{voteA} = 40.90 + 0.306\,shareA$.    (2.28)

This means that, if the share of Candidate A's expenditures increases by one percentage point, Candidate A receives almost one-third of a percentage point more of the
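In the same spirit, a sketch evaluating (2.27) at a few schooling levels (the function name and the educ values are ours for illustration):

```python
def predict_wage(educ):
    # Fitted line (2.27): predicted hourly wage in 1976 dollars
    return -0.90 + 0.54 * educ

for educ in (8, 12, 16):
    print(f"educ = {educ}: predicted wage = ${predict_wage(educ):.2f} per hour")
# educ = 8 reproduces the $3.42 prediction in the text
```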
total vote. Whether or not this is a causal effect is unclear, but the result is what we might expect.

QUESTION 2.3: In Example 2.5, what is the predicted vote for Candidate A if shareA = 60 (which means 60 percent)? Does this answer seem reasonable?

In some cases, regression analysis is not used to determine causality but simply to look at whether two variables are positively or negatively related, much like a standard correlation analysis. An example of this occurs in Problem 2.12, where you are asked to use data from Biddle and Hamermesh (1990) on time spent sleeping and working to investigate the tradeoff between these two factors.

A Note on Terminology

In most cases, we will indicate the estimation of a relationship through OLS by writing an equation such as (2.26), (2.27), or (2.28). Sometimes, for the sake of brevity, it is useful to indicate that an OLS regression has been run without actually writing out the equation. We will often indicate that equation (2.23) has been obtained by OLS in saying that we

run the regression of y on x,    (2.29)

or simply that we regress y on x. The positions of y and x in (2.29) indicate which is the dependent variable and which is the independent variable: we always regress the dependent variable on the independent variable. For specific applications, we replace y and x with their names. Thus, to obtain (2.26), we regress salary on roe, or to obtain (2.28), we regress voteA on shareA.

When we use such terminology in (2.29), we will always mean that we plan to estimate the intercept, $\hat{\beta}_0$, along with the slope, $\hat{\beta}_1$. This case is appropriate for the vast majority of applications. Occasionally, we may want to estimate the relationship between y and x assuming that the intercept is zero (so that x = 0 implies $\hat{y} = 0$); we cover this case briefly in Section 2.6. Unless explicitly stated otherwise, we always estimate an intercept along with a slope.

2.3 MECHANICS OF OLS

In this section, we cover some algebraic properties of the fitted OLS regression line. Perhaps the best way to think about these properties is to realize that they are features of OLS for a particular sample of data. They can be contrasted with the statistical properties of OLS, which require deriving features of the sampling distributions of the estimators. We will discuss statistical properties in Section 2.5.

Several of the algebraic properties we are going to derive will appear mundane. Nevertheless, having a grasp of these properties helps us to figure out what happens to the OLS estimates and related statistics when the data are manipulated in certain ways, such as when the measurement units of the dependent and independent variables change.
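To illustrate the distinction drawn above, here is a sketch comparing the usual OLS fit with regression through the origin. The simulated data are our own, and the no-intercept slope formula $\sum x_i y_i / \sum x_i^2$ is the standard one for that case (the text treats it in Section 2.6):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0.0, 10.0, size=100)
y = 2.0 + 1.5 * x + rng.normal(0.0, 1.0, size=100)  # simulated data

# Usual OLS: estimate an intercept along with the slope.
xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar

# Regression through the origin: forces yhat = 0 when x = 0.
b1_origin = np.sum(x * y) / np.sum(x ** 2)

print(f"with intercept : yhat = {b0:.3f} + {b1:.3f} x")
print(f"through origin : yhat = {b1_origin:.3f} x")
```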
Fitted Values and Residuals

We assume that the intercept and slope estimates, $\hat{\beta}_0$ and $\hat{\beta}_1$, have been obtained for the given sample of data. Given $\hat{\beta}_0$ and $\hat{\beta}_1$, we can obtain the fitted value $\hat{y}_i$ for each observation. [This is given by equation (2.20).] By definition, each fitted value $\hat{y}_i$ is on the OLS regression line. The OLS residual associated with observation i, $\hat{u}_i$, is the difference between $y_i$ and its fitted value, as given in equation (2.21). If $\hat{u}_i$ is positive, the line underpredicts $y_i$; if $\hat{u}_i$ is negative, the line overpredicts $y_i$. The ideal case for observation i is $\hat{u}_i = 0$, but in most cases every residual is nonzero. In other words, none of the data points need actually lie on the OLS line.

EXAMPLE 2.6
(CEO Salary and Return on Equity)

Table 2.2 contains a listing of the first 15 observations in the CEO data set, along with the fitted values, called salaryhat, and the residuals, called uhat.

Table 2.2: Fitted Values and Residuals for the First 15 CEOs

obsno    roe    salary    salaryhat     uhat
    1   14.1      1095     1224.058    -129.0581
    2   10.9      1001     1164.854    -163.8542
    3   23.5      1122     1397.969    -275.9692
    4    5.9       578     1072.348    -494.3484
    5   13.8      1368     1218.508     149.4923
    6   20.0      1145     1333.215    -188.2151
    7   16.4      1078     1266.611    -188.6108
    8   16.3      1094     1264.761    -170.7606
    9   10.5      1237     1157.454      79.54626
   10   26.3       833     1449.773    -616.7726
   11   25.9       567     1442.372    -875.3721
   12   26.8       933     1459.023    -526.0231

(continued)
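A sketch of how the fitted values and residuals in Table 2.2 are constructed from the estimated line (2.26). The roe and salary values below are the first five rows of the table; the printed digits differ slightly from the table's because we use the rounded coefficient estimates rather than the unrounded ones:

```python
import numpy as np

# First five observations from Table 2.2 (roe in percent, salary in $1000s).
roe = np.array([14.1, 10.9, 23.5, 5.9, 13.8])
salary = np.array([1095.0, 1001.0, 1122.0, 578.0, 1368.0])

b0, b1 = 963.191, 18.501   # rounded OLS estimates from equation (2.26)

salaryhat = b0 + b1 * roe  # fitted values, equation (2.20)
uhat = salary - salaryhat  # residuals, equation (2.21)

for i, (fit, res) in enumerate(zip(salaryhat, uhat), start=1):
    print(f"obs {i}: salaryhat = {fit:9.3f}, uhat = {res:9.3f}")
```

Note that the signs of uhat follow directly from uhat = salary - salaryhat: a negative residual means the line overpredicts that CEO's salary, as it does for most of the first twelve observations.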