obtained for a given sample of data, a new sample will generate a different slope and intercept in equation (2.23). In most cases the slope estimate, which we can write as

$\hat{\beta}_1 = \Delta\hat{y}/\Delta x$,    (2.24)

is of primary interest. It tells us the amount by which $\hat{y}$ changes when $x$ increases by one unit. Equivalently,

$\Delta\hat{y} = \hat{\beta}_1 \Delta x$,    (2.25)

so that given any change in x (whether positive or negative), we can compute the predicted change in y.

We now present several examples of simple regression obtained by using real data. In other words, we find the intercept and slope estimates with equations (2.17) and (2.19). Because these examples involve many observations, the calculations were done using an econometric software package. At this point, you should be careful not to read too much into these regressions; they are not necessarily uncovering a causal relationship. We have said nothing so far about the statistical properties of OLS. In Section 2.5, we consider statistical properties after we explicitly impose assumptions on the population model, equation (2.1).

EXAMPLE 2.3
(CEO Salary and Return on Equity)

For the population of chief executive officers, let y be annual salary (salary) in thousands of dollars. Thus, y = 856.3 indicates an annual salary of $856,300, and y = 1452.6 indicates a salary of $1,452,600. Let x be the average return on equity (roe) for the CEO's firm for the previous three years. (Return on equity is defined in terms of net income as a percentage of common equity.) For example, if roe = 10, then average return on equity is 10 percent.

To study the relationship between this measure of firm performance and CEO compensation, we postulate the simple model

$salary = \beta_0 + \beta_1 roe + u$.

The slope parameter $\beta_1$ measures the change in annual salary, in thousands of dollars, when return on equity increases by one percentage point. Because a higher roe is good for the company, we think $\beta_1 > 0$.

The data set CEOSAL1.RAW contains information on 209 CEOs for the year 1990; these data were obtained from Business Week (5/6/91). In this sample, the average annual salary is $1,281,120, with the smallest and largest being $223,000 and $14,822,000, respectively. The average return on equity for the years 1988, 1989, and 1990 is 17.18 percent, with the smallest and largest values being 0.5 and 56.3 percent, respectively.

Using the data in CEOSAL1.RAW, the OLS regression line relating salary to roe is

$\widehat{salary} = 963.191 + 18.501\,roe$,    (2.26)
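To make the mechanics concrete, here is a minimal sketch in Python of the slope and intercept computations that equations (2.17) and (2.19) describe (the standard OLS formulas for simple regression). The function name and the small data arrays are ours for illustration; they are not the actual CEOSAL1.RAW values.

```python
import numpy as np

def ols_simple(x, y):
    """Simple-regression OLS estimates.

    Slope:     sum((x - xbar) * (y - ybar)) / sum((x - xbar)**2)
    Intercept: ybar - slope * xbar
    """
    xbar, ybar = x.mean(), y.mean()
    slope = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    intercept = ybar - slope * xbar
    return intercept, slope

# Illustrative numbers only -- not the actual CEOSAL1.RAW values.
roe = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
salary = np.array([1100.0, 1200.0, 1350.0, 1400.0, 1600.0])
b0, b1 = ols_simple(roe, salary)
print(f"salaryhat = {b0:.3f} + {b1:.3f} roe")
```

Applied to the actual 209-observation data set, these same formulas yield the estimates reported in (2.26).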
where the intercept and slope estimates have been rounded to three decimal places; we use "salary hat" to indicate that this is an estimated equation. How do we interpret the equation? First, if the return on equity is zero, roe = 0, then the predicted salary is the intercept, 963.191, which equals $963,191 since salary is measured in thousands. Next, we can write the predicted change in salary as a function of the change in roe: $\Delta\widehat{salary} = 18.501\,(\Delta roe)$. This means that if the return on equity increases by one percentage point, $\Delta roe = 1$, then salary is predicted to change by about 18.5, or $18,500. Because (2.26) is a linear equation, this is the estimated change regardless of the initial salary.

We can easily use (2.26) to compare predicted salaries at different values of roe. Suppose roe = 30. Then $\widehat{salary} = 963.191 + 18.501(30) = 1518.221$, which is just over $1.5 million. However, this does not mean that a particular CEO whose firm had an roe = 30 earns $1,518,221. There are many other factors that affect salary. This is just our prediction from the OLS regression line (2.26). The estimated line is graphed in Figure 2.5, along with the population regression function E(salary|roe). We will never know the PRF, so we cannot tell how close the SRF is to the PRF. Another sample of data will give a different regression line, which may or may not be closer to the population regression line.

[Figure 2.5: The OLS regression line $\widehat{salary} = 963.191 + 18.501\,roe$ and the (unknown) population regression function $E(salary|roe) = \beta_0 + \beta_1 roe$.]
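As a quick check of this arithmetic, a short sketch that evaluates the fitted line (2.26) at a few values of roe (the helper function is ours):

```python
def predict_salary(roe):
    # Fitted line (2.26): predicted salary in thousands of dollars
    return 963.191 + 18.501 * roe

for roe in (0, 10, 30):
    print(f"roe = {roe:2d}: predicted salary = ${1000 * predict_salary(roe):,.0f}")
# roe = 30 reproduces the $1,518,221 prediction discussed above
```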
EXAMPLE 2.4
(Wage and Education)

For the population of people in the work force in 1976, let y = wage, where wage is measured in dollars per hour. Thus, for a particular person, if wage = 6.75, the hourly wage is $6.75. Let x = educ denote years of schooling; for example, educ = 12 corresponds to a complete high school education. Since the average wage in the sample is $5.90, the consumer price index indicates that this amount is equivalent to $16.64 in 1997 dollars.

Using the data in WAGE1.RAW, where n = 526 individuals, we obtain the following OLS regression line (or sample regression function):

$\widehat{wage} = -0.90 + 0.54\,educ$.    (2.27)

We must interpret this equation with caution. The intercept of -0.90 literally means that a person with no education has a predicted hourly wage of -90 cents an hour. This, of course, is silly. It turns out that no one in the sample has less than eight years of education, which helps to explain the crazy prediction for a zero education value. For a person with eight years of education, the predicted wage is $\widehat{wage} = -0.90 + 0.54(8) = 3.42$, or $3.42 per hour (in 1976 dollars).

QUESTION 2.2: The estimated wage from (2.27), when educ = 8, is $3.42 in 1976 dollars. What is this value in 1997 dollars? (Hint: You have enough information in Example 2.4 to answer this question.)

The slope estimate in (2.27) implies that one more year of education increases hourly wage by 54 cents an hour. Therefore, four more years of education increase the predicted wage by 4(0.54) = 2.16, or $2.16 per hour. These are fairly large effects. Because of the linear nature of (2.27), another year of education increases the wage by the same amount, regardless of the initial level of education. In Section 2.4, we discuss some methods that allow for nonconstant marginal effects of our explanatory variables.

EXAMPLE 2.5
(Voting Outcomes and Campaign Expenditures)

The file VOTE1.RAW contains data on election outcomes and campaign expenditures for 173 two-party races for the U.S. House of Representatives in 1988. There are two candidates in each race, A and B. Let voteA be the percentage of the vote received by Candidate A and shareA be the percentage of total campaign expenditures accounted for by Candidate A. Many factors other than shareA affect the election outcome (including the quality of the candidates and possibly the dollar amounts spent by A and B). Nevertheless, we can estimate a simple regression model to find out whether spending more relative to one's challenger implies a higher percentage of the vote.

The estimated equation using the 173 observations is

$\widehat{voteA} = 40.90 + 0.306\,shareA$.    (2.28)

This means that, if the share of Candidate A's expenditures increases by one percentage point, Candidate A receives almost one-third of a percentage point more of the
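In the same spirit, a sketch evaluating (2.27) at a few schooling levels (the function name and the educ values are ours for illustration):

```python
def predict_wage(educ):
    # Fitted line (2.27): predicted hourly wage in 1976 dollars
    return -0.90 + 0.54 * educ

for educ in (8, 12, 16):
    print(f"educ = {educ}: predicted wage = ${predict_wage(educ):.2f} per hour")
# educ = 8 reproduces the $3.42 prediction in the text
```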
total vote. Whether or not this is a causal effect is unclear, but the result is what we might expect.

QUESTION 2.3: In Example 2.5, what is the predicted vote for Candidate A if shareA = 60 (which means 60 percent)? Does this answer seem reasonable?

In some cases, regression analysis is not used to determine causality but simply to look at whether two variables are positively or negatively related, much like a standard correlation analysis. An example of this occurs in Problem 2.12, where you are asked to use data from Biddle and Hamermesh (1990) on time spent sleeping and working to investigate the tradeoff between these two factors.

A Note on Terminology

In most cases, we will indicate the estimation of a relationship through OLS by writing an equation such as (2.26), (2.27), or (2.28). Sometimes, for the sake of brevity, it is useful to indicate that an OLS regression has been run without actually writing out the equation. We will often indicate that equation (2.23) has been obtained by OLS in saying that we

run the regression of y on x,    (2.29)

or simply that we regress y on x. The positions of y and x in (2.29) indicate which is the dependent variable and which is the independent variable: we always regress the dependent variable on the independent variable. For specific applications, we replace y and x with their names. Thus, to obtain (2.26), we regress salary on roe, or to obtain (2.28), we regress voteA on shareA.

When we use such terminology in (2.29), we will always mean that we plan to estimate the intercept, $\hat{\beta}_0$, along with the slope, $\hat{\beta}_1$. This case is appropriate for the vast majority of applications. Occasionally, we may want to estimate the relationship between y and x assuming that the intercept is zero (so that x = 0 implies $\hat{y} = 0$); we cover this case briefly in Section 2.6. Unless explicitly stated otherwise, we always estimate an intercept along with a slope.

2.3 MECHANICS OF OLS

In this section, we cover some algebraic properties of the fitted OLS regression line. Perhaps the best way to think about these properties is to realize that they are features of OLS for a particular sample of data. They can be contrasted with the statistical properties of OLS, which require deriving features of the sampling distributions of the estimators. We will discuss statistical properties in Section 2.5.

Several of the algebraic properties we are going to derive will appear mundane. Nevertheless, having a grasp of these properties helps us to figure out what happens to the OLS estimates and related statistics when the data are manipulated in certain ways, such as when the measurement units of the dependent and independent variables change.
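To illustrate the distinction drawn above, here is a sketch comparing the usual OLS fit with regression through the origin. The simulated data are our own, and the no-intercept slope formula $\sum x_i y_i / \sum x_i^2$ is the standard one for that case (the text treats it in Section 2.6):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0.0, 10.0, size=100)
y = 2.0 + 1.5 * x + rng.normal(0.0, 1.0, size=100)  # simulated data

# Usual OLS: estimate an intercept along with the slope.
xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar

# Regression through the origin: forces yhat = 0 when x = 0.
b1_origin = np.sum(x * y) / np.sum(x ** 2)

print(f"with intercept : yhat = {b0:.3f} + {b1:.3f} x")
print(f"through origin : yhat = {b1_origin:.3f} x")
```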
Fitted Values and Residuals

We assume that the intercept and slope estimates, $\hat{\beta}_0$ and $\hat{\beta}_1$, have been obtained for the given sample of data. Given $\hat{\beta}_0$ and $\hat{\beta}_1$, we can obtain the fitted value $\hat{y}_i$ for each observation. [This is given by equation (2.20).] By definition, each fitted value $\hat{y}_i$ is on the OLS regression line. The OLS residual associated with observation i, $\hat{u}_i$, is the difference between $y_i$ and its fitted value, as given in equation (2.21). If $\hat{u}_i$ is positive, the line underpredicts $y_i$; if $\hat{u}_i$ is negative, the line overpredicts $y_i$. The ideal case for observation i is $\hat{u}_i = 0$, but in most cases every residual is nonzero. In other words, none of the data points need actually lie on the OLS line.

EXAMPLE 2.6
(CEO Salary and Return on Equity)

Table 2.2 contains a listing of the first 15 observations in the CEO data set, along with the fitted values, called salaryhat, and the residuals, called uhat.

Table 2.2: Fitted Values and Residuals for the First 15 CEOs

obsno    roe    salary    salaryhat     uhat
    1   14.1      1095     1224.058    -129.0581
    2   10.9      1001     1164.854    -163.8542
    3   23.5      1122     1397.969    -275.9692
    4    5.9       578     1072.348    -494.3484
    5   13.8      1368     1218.508     149.4923
    6   20.0      1145     1333.215    -188.2151
    7   16.4      1078     1266.611    -188.6108
    8   16.3      1094     1264.761    -170.7606
    9   10.5      1237     1157.454      79.54626
   10   26.3       833     1449.773    -616.7726
   11   25.9       567     1442.372    -875.3721
   12   26.8       933     1459.023    -526.0231

(continued)
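A sketch of how the fitted values and residuals in Table 2.2 are constructed from the estimated line (2.26). The roe and salary values below are the first five rows of the table; the printed digits differ slightly from the table's because we use the rounded coefficient estimates rather than the unrounded ones:

```python
import numpy as np

# First five observations from Table 2.2 (roe in percent, salary in $1000s).
roe = np.array([14.1, 10.9, 23.5, 5.9, 13.8])
salary = np.array([1095.0, 1001.0, 1122.0, 578.0, 1368.0])

b0, b1 = 963.191, 18.501   # rounded OLS estimates from equation (2.26)

salaryhat = b0 + b1 * roe  # fitted values, equation (2.20)
uhat = salary - salaryhat  # residuals, equation (2.21)

for i, (fit, res) in enumerate(zip(salaryhat, uhat), start=1):
    print(f"obs {i}: salaryhat = {fit:9.3f}, uhat = {res:9.3f}")
```

Note that the signs of uhat follow directly from uhat = salary - salaryhat: a negative residual means the line overpredicts that CEO's salary, as it does for most of the first twelve observations.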