N.H. Stern, Optimum income taxation

per effective hour is one (where the model presumes a linear tax schedule). This is merely choosing a linear scale for $n$. A tax change occurs which increases $A$ to $A_0$ and decreases $w$ from 1 to $w_0$, as in the Penn-New Jersey NIT experiment. We observe only $nwl$ and $A$ in both cases but we know that $nw$ has decreased from $n$ to $nw_0$, where $w_0$ is known, since $1 - w_0$ is the increase in the marginal tax rate. For a given individual we then have two equations in two unknowns $(\alpha, n)$ which we can solve for $\alpha$ and $n$. Given a population subject to the experiment we could find the distribution of $n$ (and of $\alpha$) in the population.

The kind of experimental information we have just been discussing is rather rare.$^{18}$ Usually we have a given tax structure and a distribution of labour income, and we may be uneasy about measuring effort by hours worked. We should like to know if the income distribution is a good proxy for the skill distribution. We saw at the beginning of this subsection a special case where the distributions were identical. This case is unusual, however, and the income distribution may be very misleading as an estimate of the distribution of skills. For example, one can imagine a utility function where an individual has a target level of consumption or income ($c_0$) upon which he insists, but he is not prepared to work to raise his consumption beyond this level,

$$u(c, l) = \begin{cases} 1 - l & \text{if } c \ge c_0, \\ -\infty & \text{otherwise.} \end{cases}$$

If we have a community of individuals of different skills, all of whom have this utility function, we should observe a completely equal distribution of incomes. However, some individuals would have to exert a great deal of (unobservable) effort to achieve $c_0$ and consequently would have low utility. Others would achieve $c_0$ with comparative ease.
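The target-income example can be checked with a few numbers. The following is a minimal sketch, assuming that gross income is $nl$ (the wage per effective hour is normalised to one and there is no tax) and that each person supplies exactly the effort needed to reach $c_0$. The skill levels, the value of $c_0$ and the name `target_income_outcomes` are illustrative assumptions, not taken from the text.

```python
def target_income_outcomes(skills, c0):
    """For each skill n, return (income, effort, utility) when the person
    works just enough to reach the target c0 and utility is 1 - l."""
    outcomes = []
    for n in skills:
        effort = c0 / n          # hours needed so that n * l = c0
        income = n * effort      # equals c0 for everyone
        utility = 1.0 - effort   # u(c, l) = 1 - l once c >= c0
        outcomes.append((income, effort, utility))
    return outcomes


skills = [0.5, 1.0, 2.0, 4.0]    # hypothetical skill levels
results = target_income_outcomes(skills, c0=0.4)
for n, (income, effort, utility) in zip(skills, results):
    print(f"n={n:.1f} income={income:.2f} effort={effort:.2f} utility={utility:.2f}")
```

Under these assumptions every income equals $c_0$, while effort falls and utility rises with skill, so the observed income distribution reveals nothing about the spread of skills.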
This is an extreme example but it illustrates the point that where the unobserved supply curve of effort $l$ with respect to $nw$ is backward sloping, the distribution of skills is more unequal than the distribution of incomes. We shall see in section 3.3 that supply curves of hours are usually found to be backward sloping for much of their range.

On the other hand there may be many factors in actual situations which affect wage rates but should not be described as innate ability: for example, age, education, luck or power. In ideal circumstances one would examine a population which had constant values of these complicating factors. While this may be possible for age or education it is difficult in the case of luck or power. Note that if the acquisition of education is sensitive to earnings, education should be included in supply functions and not 'skill' distributions. We return to this briefly below.

$^{18}$ Hall (1974) makes use of the experimental Penn-New Jersey NIT data for his actual estimation (see section 3.1).
Suppose the wage rate $m$ is equal to an (additive) combination of innate ability $n$ and some other factor $x$. Then

$$\operatorname{var}(m) = \operatorname{var}(n) + \operatorname{var}(x) + 2\operatorname{cov}(n, x).$$

If the covariance is zero or positive the distribution of wage rates is more unequal than the distribution of abilities. It seems more likely that the factors mentioned are positively correlated with ability.

We have seen that the relevant evidence for skill distribution must, therefore, be based on labour earnings and, where possible, wage rates, and be corrected for age and education. It is clear that a casual examination of the distribution of (earned plus unearned) income is insufficient. This is an important area for further research.

One of the main problems for this research will be the specification of the functional form of the distribution to be fitted. Pareto (1897) found that $N_X = AX^{-\alpha}$, where $N_X$ is the number of incomes above $X$, gave a remarkably good fit for several countries. He estimated $\alpha$ and found that it was around 1.5. On the other hand, Lydall examines the upper tail of the distribution of employment incomes for different countries and finds $\alpha$ ranging from 2.27 for France in 1964 to 3.4 for Germany in 1964 (1968, p. 133). Lydall does examine employment incomes and, in some cases (1968, p. 33), tries to work with populations with given numbers of hours per week. He suggests further that, for precisely defined occupational characteristics (1968, p. 33), the log-normal distribution fits rather well.

An interesting approach to the problem is the recent study by Schwartz (1975). He finds, disaggregating populations by race and years of education, that the power transformation of income which gives the closest approximation to normality is the cube root of income.

We can come to no firm conclusions as to whether the current distribution of income gives an accurate picture of the distribution of skills.
We saw that there were two powerful influences, backward-bending supply curves and non-skill factors, pulling in opposite directions. It must be emphasised that the non-skill factors include a multitude of variables which depend on the institutions and organisation of society, and that the relative productivity of different skills depends on the capital stock. Further, it is clear that the one-dimensional model of the skill distribution is a very crude representation of reality. But the problem is deeper than this. If skills are acquired, the motivation may be the potential reward, as for example in human capital models. We should then include acquired skills in the supply function rather than the skill distribution. This forces us to think of $n$ as innate ability, a notion which is both slippery and controversial. And what if skills (and effort) are not acquired (or supplied) for monetary reward? We have to reexamine our concepts of supply. Theoretical and empirical research on these problems is still in its infancy.
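The variance decomposition used earlier in this subsection, var(m) = var(n) + var(x) + 2 cov(n, x), can be illustrated on simulated data. This is a minimal sketch under assumed distributions: the normal draws, the coefficient 0.5 linking the non-skill factor to ability, and the helper names `mean`, `cov` and `var` are all hypothetical choices made for illustration only.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(xs, ys):
    """Population covariance (divides by N, not N - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def var(xs):
    return cov(xs, xs)


rng = random.Random(0)                                     # fixed seed for repeatability
ability = [rng.gauss(1.0, 0.3) for _ in range(10_000)]     # innate ability n
# Hypothetical non-skill factor x, positively correlated with ability.
other = [0.5 * a + rng.gauss(0.0, 0.2) for a in ability]
wage = [a + x for a, x in zip(ability, other)]             # m = n + x

lhs = var(wage)
rhs = var(ability) + var(other) + 2 * cov(ability, other)
print(f"var(m) = {lhs:.4f}, decomposition = {rhs:.4f}, var(n) = {var(ability):.4f}")
```

With a positive covariance the simulated wage rates are more dispersed than abilities, which is the direction of bias the text attributes to non-skill factors.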
3.3. Supply curves as estimated

Empirical estimates of the response of labour supply to changes in wages and income are usually expressed in terms of a supply function. For models of optimum income taxation we usually wish to work with explicit utility functions. The purpose of this section is to describe the calculation of the parameters of a CES utility function from the estimates of income and wage responses which have been found by others.

We showed in section 3.1 that for the Mirrlees case we can estimate the labour supply function directly by assuming that everyone has the same supply function and differences in skills result in differences in wage rates (see fig. 1). We suppose that such an estimation has been performed and we have estimates of the (uncompensated) wage and income elasticities at some level of wages and for some lump-sum incomes. We want to infer a CES utility function.

We suppose the individual problem is to maximise

$$\left[(1-\alpha)c^{-\mu} + \alpha(L-l)^{-\mu}\right]^{-1/\mu},$$

where $c$ is consumption of goods, $l$ is labour supply and $L$ the maximum possible level of work. We assume that consumption is $c = A + wl$. We are, therefore, assuming a linear tax schedule where $w$ is the post-tax wage. We are not concerned here with the reason for the level of $w$ so we suppress the '$n$' factor.

The first-order condition for the above problem is obtained by putting $h = n = 1$ in eq. (3); we then have

$$\left[\frac{L-l}{A+wl}\right]^{\mu+1} = \frac{\alpha}{(1-\alpha)w}. \tag{6}$$

It is obvious from (6) that

$$-\frac{\partial \log\left[\dfrac{L-l}{A+wl}\right]}{\partial \log w} = \frac{1}{1+\mu} = \varepsilon,$$

the elasticity of substitution. We differentiate eq. (6) logarithmically with respect to $w$ and $A$ in turn, and after a little manipulation obtain

$$\frac{\partial l}{\partial w} = \frac{(A-\mu wl)(L-l)}{w(\mu+1)(A+wL)}, \tag{7}$$

$$\frac{\partial l}{\partial A} = -\frac{L-l}{A+wL}. \tag{8}$$

Given $w$, $l$, $(w/l)(\partial l/\partial w)$, $(A/l)(\partial l/\partial A)$, we can solve (6), (7) and (8) for $L$, $\alpha$, $\mu$.
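The 'little manipulation' leading from (6) to (7) and (8) can be spelled out as follows. This is a sketch of one possible route; the intermediate steps are a reconstruction and are not quoted from the original.

```latex
\begin{align*}
% Take logarithms of eq. (6)
(\mu+1)\bigl[\log(L-l) - \log(A+wl)\bigr]
  &= \log\frac{\alpha}{1-\alpha} - \log w . \\[4pt]
% Differentiate with respect to w, treating l as a function of w
(\mu+1)\left[\frac{1}{L-l} + \frac{w}{A+wl}\right]\frac{\partial l}{\partial w}
  &= \frac{1}{w} - \frac{(\mu+1)\,l}{A+wl}
   = \frac{A-\mu wl}{w(A+wl)} . \\[4pt]
% Common denominator for the bracket on the left
\frac{1}{L-l} + \frac{w}{A+wl}
  &= \frac{A+wL}{(L-l)(A+wl)} , \\[4pt]
\text{so that}\qquad
\frac{\partial l}{\partial w}
  &= \frac{(A-\mu wl)(L-l)}{w(\mu+1)(A+wL)} . \\[4pt]
% Differentiate the log form with respect to A
(\mu+1)\left[\frac{A+wL}{(L-l)(A+wl)}\frac{\partial l}{\partial A}
  + \frac{1}{A+wl}\right] &= 0
\quad\Longrightarrow\quad
\frac{\partial l}{\partial A} = -\frac{L-l}{A+wL} .
\end{align*}
```

Note that (8) does not involve $\mu$ or $\alpha$, so it can be solved for $L$ on its own; (7) then gives $\mu$, and (6) gives $\alpha$.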
Ashenfelter and Heckman (1973) estimate income and substitution effects from a cross-section of 3,203 male heads of families from the national probability sample component of the 1967 U.S. Survey of Economic Opportunity. They restricted their sample to men not receiving welfare payments and whose wives were present but not working. They write (my notation),

$$\Delta l = S\Delta w + B[l^*\Delta w + \Delta A]. \tag{9}$$

$\Delta$ represents differences from sample means, $S$ is the substitution term, and $l^*$ is the average of the mean labour supply of the sample and $l$, so that $l^*\Delta w$ represents an approximation to the income compensation and thus $B$ an approximation to $\partial l/\partial A$. Eq. (9) is then estimated.$^{19}$ Hours were calculated using annual earnings divided by hourly wage rates. Dummy variables for race, region and size of town were included as well as age and age squared. The age terms give an increase in hours to age 44 and a decline thereafter.

They find, for the mean of their sample, that $w = 3.86$ dollars per hour, $l = 2272$ hours per year, $A = 800$ dollars per year,$^{20}$ $(w/l)(\partial l/\partial w) = -0.15$ and $\partial l/\partial A = -0.07$. These numbers give values of $L$, $\alpha$ and $\mu$ of 3190, 0.994 and 1.45 (to 3 significant figures), respectively. Note that the value of $\alpha$ depends on the units of measurement of labour and income. The value of the elasticity of substitution, $\varepsilon = 1/(1+\mu)$, is 0.408. This is a rather striking result since the income tax models discussed in section 2 concentrated attention on the addilog case where $\mu = 0$ and $\varepsilon = 1$.

We discuss the qualifications which must be attached to this estimate at the end of this subsection. For the moment, we examine its sensitivity to the values of $A$, $(w/l)(\partial l/\partial w)$ and $\partial l/\partial A$; presumably the wage and hours of work at the mean of the sample can be taken as given. It is rather hard to measure the lump-sum income $A$ available to an individual. One has to make many judgements as to how to treat social security benefits,$^{21}$ returns on durable assets and so on.
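The calculation of $L$, $\mu$ and $\varepsilon$ from the sample means can be sketched in a few lines. This is a minimal illustration, assuming eqs. (7) and (8) in the form $\partial l/\partial w = (A-\mu wl)(L-l)/[w(\mu+1)(A+wL)]$ and $\partial l/\partial A = -(L-l)/(A+wL)$; the function name `ces_parameters` and its argument names are hypothetical. The parameter $\alpha$ is left out because, as noted, its value depends on the units of measurement.

```python
def ces_parameters(w, l, A, wage_elasticity, dl_dA):
    """Recover L, mu and the elasticity of substitution from eqs. (7) and (8).

    wage_elasticity is (w/l)(dl/dw); dl_dA is the income response dl/dA.
    """
    # Eq. (8): dl/dA = -(L - l)/(A + w L) is linear in L.
    L = (l - dl_dA * A) / (1.0 + dl_dA * w)
    # By eq. (8) the ratio (L - l)/(A + w L) equals -dl/dA.
    k = -dl_dA
    # Eq. (7) then reads wage_elasticity * l * (mu + 1) = k * (A - mu * w * l),
    # which is linear in mu.
    mu = (k * A - wage_elasticity * l) / (wage_elasticity * l + k * w * l)
    eps = 1.0 / (1.0 + mu)
    return L, mu, eps


# Sample means reported by Ashenfelter and Heckman (1973), as quoted in the text.
L, mu, eps = ces_parameters(w=3.86, l=2272.0, A=800.0,
                            wage_elasticity=-0.15, dl_dA=-0.07)
print(f"L = {L:.0f}, mu = {mu:.2f}, eps = {eps:.3f}")
```

With these inputs the sketch gives $L \approx 3190$, $\mu \approx 1.45$ and $\varepsilon \approx 0.408$, in line with the values reported above. Calling the function with other values of `A` shows how the estimate responds to the assumed lump-sum income.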
We therefore allowed $A$ to vary across a large range, 0 to 2000 dollars. The results are shown in table 1. The estimates are rather insensitive to changes in $A$. For $A = 0$, we obtain $\varepsilon = 0.444$; and for $A = 2000$, $\varepsilon = 0.362$.

Ashenfelter and Heckman compare the figure of $-0.15$ for $(w/l)(\partial l/\partial w)$ with 'Sherwin Rosen's (1969) estimates of -0.07 to -0.30 from inter-industrial data, T. Aldrich Finegan's (1962) estimates of -0.25 to -0.35 from inter-occupational data, Gordon Winston's (1966) estimates of -0.07 to -0.10 from intercountry data, and John Owen's (1971) estimates of -0.11 to -0.24 from U.S.

$^{19}$ Instrumental variable techniques were used since $l^*$ is correlated with the disturbance term [see Ashenfelter and Heckman (1973)]. The income compensation term should really allow for any differences between marginal and average tax rates.

$^{20}$ I am grateful to Professor Ashenfelter for supplying me with this estimate of $A$.

$^{21}$ In fact, workers receiving social security benefits were excluded from the sample.