Journal of Public Economics 6 (1976) 123-162. © North-Holland Publishing Company

ON THE SPECIFICATION OF MODELS OF OPTIMUM INCOME TAXATION

N.H. STERN*
St. Catherine's College, Oxford, England
with programming by D. Deans

Revised version received September 1975

The main concerns of the paper are the problems of estimating labour supply functions for use in models of optimum income taxation, and the calculation of the effect on the optimum linear tax rate of varying the elasticity of substitution, ε, between leisure and goods from 0 to 1. Backward sloping supply curves are commonly observed and they imply ε < 1. Our calculation of ε from estimates of supply curves by Ashenfelter and Heckman gives ε = 0.4. Optimum marginal rates decrease with ε when taxation is purely redistributive but may be nonmonotonic if positive revenue is to be raised. It is proved that optimum (linear or nonlinear) taxation involves a marginal rate of 100 percent when ε = 0.

*This paper has benefitted greatly from discussions with A.B. Atkinson, D.L. Bevan, P.A. Diamond, J.S. Flemming, J.A. Mirrlees and K.W.S. Roberts. The comments of participants at a seminar in Cambridge were also helpful. Responsibility for all errors is mine. The paper was presented to the ISPE Conference on taxation in Paris, January 18-20, 1975. The comments of the discussants at that conference, E. Malinvaud and M. Bruno, were helpful. The support of the SSRC under grant HR 3733 is gratefully acknowledged.

1. Introduction

There are four main ingredients for a model of optimum income taxation: an objective function, a preference relation or supply function for individuals, a skill structure and distribution, and a production relation. They are closely intertwined. An individualistic social welfare function would take into account the preference structure of individuals. The supply of various kinds of skills will depend on individuals' wishes or ability to produce these skills. The production relation must state how skills of different kinds are combined to produce outputs.

The optimum income taxation problem as usually posed is to maximise a social welfare function, which depends on individual utilities, subject to two constraints. The first is that each individual should consume goods and supply factors in amounts which maximise his utility subject to the constraint of the tax function, which describes how much post-tax consumption can be acquired from pre-tax earnings. It is the optimum form of this tax function that we are searching for. The second is that the total labour supplied can produce the total quantity of goods
demanded. It is the former constraint which characterises the optimum income taxation problem and which makes it a problem of the second best. Without this constraint, that individuals are on their supply curves, we have a first-best problem.

When taxation is discussed it is often in terms of a trade-off between equality and efficiency, or the distribution of the cake and its size. The optimum income taxation problem is one way of formalising this trade-off and it is, perhaps, surprising that it was not until Mirrlees (1971) that a suitable model was developed. We are still at the stage of understanding the structure of these models and the importance of the various components. It should be clear at the outset that the purpose of this paper is not to make recommendations to the Treasury as to appropriate tax rates, but to contribute to the understanding of the discussion of equality versus efficiency through examination of a particular model.

The particular concern of this paper is the supply function, and attention is focussed on the special case of labour supply. We shall examine the problem of estimation, which preference structures obtain support from the empirical literature on labour supply, and then the influence such estimates should have on our view of the appropriate level of income taxation. It will be suggested that most previous calculations of optimum tax rates may have been biased low.

The next section presents the models of Mirrlees (1971) and Atkinson (1972) and contains a brief discussion of their numerical results.1 The problems of specifying and estimating skill distributions are discussed in section 3, together with calculations of the elasticity of substitution (ε) between leisure and goods, based on empirical estimates of labour supply functions. The calculations of section 3 suggest that elasticities of substitution around ½ are of interest, and in section 4 the optimum linear income tax, for values of ε between 0 and 1, is calculated in a model similar to that of Mirrlees (1971). The extreme case of ε = 0 is examined, in the Mirrlees model, in section 5 and we find that optimum income taxation (linear or nonlinear) involves marginal taxation at 100 percent. It is not surprising, therefore, that the calculations of section 4 show that, for small ε, the optimum linear tax rate increases to 100 percent as ε decreases to zero. However, where taxation is imposed to raise revenue, as well as to redistribute, the optimum marginal rate may increase as ε increases over a certain range. In section 6 the numerical discussion is evaluated.

1 I originally intended to do a survey of theoretical and empirical work in progress but became more involved with my own investigations.

The remainder of this section is devoted to a brief examination of those elements of the model, the objective function and the production relation, which receive no further attention in the later discussion.

Most previous writers have worked with a concave transformation of individual cardinal utilities. The transformation ranges from the linear utilitarian sum to the case where the 'degree of concavity' goes to infinity - the maximin,
or Rawlsian, solution. Some might wish to claim that one is merely specifying the value judgements of the decision-maker by using an arbitrary numbering of individual indifference curves together with a method by which individual utilities are aggregated. The specification of a particular cardinal numbering for individuals and the form of the social preference relation over utilities may well be difficult, if not impossible, to disentangle, but I find it hard to understand a quantitative comparison between different forms of social welfare function for the same indifference structure (for individuals) if some benchmark of cardinality is not involved.

The cardinality problem is much less severe when a one-argument utility function is used - see, for example, Atkinson (1973a). One can then suppose that the government defines its values over the vectors whose components are household incomes. However, when supply functions are central to the model a one-argument utility function seems out of place. It then becomes more difficult to wriggle out of the problem of numbering individual indifference curves. It is possible that part of the attraction of maximin objective functions is that the cardinality problem is less troublesome - maximising the lowest utility level will give the same policy whichever cardinalisation is used when the same monotonic increasing transformation of utilities is applied to all individuals.

The above discussion and most of the literature have supposed that the Bergson-Samuelson social welfare function (nondecreasing in each argument) is the appropriate tool for capturing social values in such analyses. Leaving aside the question of whether it should be used, it is possible that many people have some different underlying notion of welfare or distributional justice when they discuss income taxation.
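The remark above that maximin gives the same policy under any common increasing transformation of utilities can be checked on a small example. The following is a minimal sketch and not part of the original analysis: the two candidate policies, the utility numbers and the transformation `phi` are all hypothetical, chosen only so that the transformation reverses the utilitarian ranking while the maximin ranking is unchanged.

```python
import math

# Hypothetical utility profiles (one number per individual) under two policies.
policies = {
    "A": [1.0, 6.0, 6.0],   # higher total, lower minimum
    "B": [4.0, 4.0, 4.0],   # lower total, higher minimum
}

def utilitarian(utilities):
    """Linear utilitarian sum."""
    return sum(utilities)

def maximin(utilities):
    """Utility of the worst-off individual."""
    return min(utilities)

def best_policy(welfare, transform=lambda u: u):
    """Policy maximising `welfare` after `transform` is applied to every utility."""
    return max(
        policies,
        key=lambda name: welfare([transform(u) for u in policies[name]]),
    )

# The same monotonic increasing transformation for all individuals (an assumption).
phi = math.log

utilitarian_choices = (best_policy(utilitarian), best_policy(utilitarian, phi))
maximin_choices = (best_policy(maximin), best_policy(maximin, phi))

print("utilitarian:", utilitarian_choices)
print("maximin:    ", maximin_choices)
```

In this example the utilitarian sum prefers A under the original numbering and B after the transformation, whereas maximin selects B in both cases, since an increasing transformation cannot change which profile has the larger minimum.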
We illustrate the possible phenomenon with a few quotations and arguments which might be thought plausible and yet imply non-Paretian objectives. We begin with three quotations on inequality, each of which clearly involves a non-Paretian position.

Tawney:2

When the press assails them with the sparkling epigram that they desire not merely to make the poor richer but to make the rich poorer, instead of replying, as they should, that, being sensible men, they desire both, since the extremes of both of riches and poverty are degrading and anti-social, they are apt to take refuge in gestures of depreciation.

Simons:3

The case for drastic progression in taxation must be rested on the case against inequality - on the ethical or aesthetic judgement that the prevailing distribution of wealth and income reveals a degree (and/or kind) of inequality which is distinctly evil or unlovely.

Fair (1971) quotes Plato as follows:

Plato felt that no one in a society should be more than four times richer than the poorest member of society for 'in a society which is to be immune from the most fatal disorders which might more properly be called distraction than faction, there must be no place for penury in any section of the population, nor yet for opulence, as both breed either consequence.'

2 See Atkinson (1973b, p. 19). I am grateful to Kevin Roberts for drawing my attention to this quote.
3 See Simons (1938, p. 15). Kevin Roberts drew my attention to this quote too.

Certain arguments on tax proposals and structures might seem plausible to many and also involve non-Paretian judgements. For example, Sadka (1973) has shown that with a finite number of individuals or skill levels the optimum marginal tax rate at the very top is zero. One can express his argument verbally as follows. Suppose that a given tax structure is a candidate for the optimum and it results in the most skilled person earning £Y. Consider the announced marginal tax on the (Y+1)th pound and suppose it is positive. Reduce it to zero. The most skilled person may work more and if he does he is better off. Similarly, others of lower skill may also work more. If they do, then they are better off (exploiting opportunities that were not available to them before) and they pay more tax since they move through tax brackets with nonnegative marginal rates. Thus, our change has produced more tax revenue and has made everyone at least as well off as before.4 A Paretian should approve. Many, however, might regard a zero marginal rate at the top as offensive. It is conceivable that they may wish to retain this view even after they have understood the above argument.

We should note that one cannot deduce that, where the skill distribution has positive density for all positive skill levels, the optimum marginal tax rate tends to zero. Indeed, Mirrlees (1971) gives examples where it does not.
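The verbal version of Sadka's argument can be traced through a small numerical example. What follows is a rough sketch under stated assumptions, not Sadka's proof and not a calculation from this paper: the three skill levels, the Cobb-Douglas preferences, the linear candidate schedule and the grid of hours are all hypothetical. The sketch sets the marginal rate to zero on every pound above the highest income earned under the candidate schedule and checks that nobody is worse off and that revenue does not fall.

```python
# Hypothetical economy: skills, preferences and tax parameters are assumptions.
SKILLS = [1.0, 2.0, 4.0]                  # wage per hour of each person
HOURS = [i / 200 for i in range(200)]     # feasible hours on a grid in [0, 1)

def utility(consumption, hours):
    """Cobb-Douglas in goods and leisure (hypothetical preferences)."""
    return (consumption ** 0.5) * ((1.0 - hours) ** 0.5)

def linear_tax(income, rate=0.4, grant=0.3):
    """Candidate schedule: lump-sum grant plus a constant positive marginal rate."""
    return rate * income - grant

def choose(skill, tax):
    """Best (utility, income, tax paid) for one person over the hours grid."""
    options = []
    for h in HOURS:
        income = skill * h
        paid = tax(income)
        options.append((utility(income - paid, h), income, paid))
    return max(options, key=lambda o: o[0])

# Choices under the candidate schedule; top_income plays the role of Y.
before = [choose(n, linear_tax) for n in SKILLS]
top_income = max(income for _, income, _ in before)

def capped_tax(income):
    """Same schedule up to Y, zero marginal rate on every pound above Y."""
    return linear_tax(min(income, top_income))

after = [choose(n, capped_tax) for n in SKILLS]

nobody_worse_off = all(a[0] >= b[0] for a, b in zip(after, before))
revenue_not_lower = sum(a[2] for a in after) >= sum(b[2] for b in before)
top_person_earns_more = after[-1][1] > before[-1][1]

print(nobody_worse_off, revenue_not_lower, top_person_earns_more)
```

Because the new schedule never charges more than the old one at any income, each person's budget set can only expand, and anyone who moves above Y still pays the tax due at Y. In this particular example only the most skilled person changes behaviour, so revenue is unchanged rather than strictly higher.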
The structure of the model is similar to an optimum growth model, where we cannot infer, from the result that a finite horizon model should have zero capital stock at the end, the conclusion that the capital stock tends to zero on the infinite horizon path.

4 One can throw away the extra tax revenue if it is so desired. The argument is clearly rather general. 'Better off' has been used here in the weak sense of 'at least as well off.'

Some might propose a 100 percent tax on inheritance on the grounds of equality of opportunity for children. It is non-Paretian (if one rules out envy as the basis of the argument), since the ability to confer the inheritance makes the parent better off (the desire is to give rather than consume) and, presumably, the offspring as well.5

5 Mirrlees drew my attention to this argument.

Many have found6 the 'equal absolute sacrifice' proposal an attractive basis for optimum income taxation. This abstracts from incentive problems and states that to raise a given revenue everyone should give up that amount of his income which makes the sacrifice of utility equal. It turns out that one can choose a utility function which fits the U.K. income tax structure rather well.7 Although it does not violate the Paretian condition, the proposal cannot be based on any symmetric strictly concave Bergson-Samuelson welfare function since, abstracting from incentive effects, such welfare functions lead to equal post-tax incomes.

6 This principle is discussed and fitted to U.K. tax schedules in Stern (1973).

The above examples indicate that the standard welfare economics procedure based on the usual welfare functions would not be regarded as the obvious starting point by many who might be prepared to comment on income tax structures.

Most of this paper will use a production structure with one basic input - labour in efficiency units - with a fixed wage. This does not mean that we are assuming constant returns to scale. We can regard the wage as the marginal product at the level of optimum total production and any profits that accrue as lump sum income for the government. Nevertheless, the assumption of one basic input is worrying. It is often asserted that a particular skill is lacking (say, management in the U.K.) and this carries with it a strong notion of complementarity with other factors rather than the complete substitutability assumed in the case of labour in efficiency units. Feldstein8 has made a start in this direction and incorporates two different kinds of labour into his model.

The frequent assumption of public ownership seems less serious. If, for example, there are profits in the system, one can carry out an analysis of the optimum levels (presumably subject to some constraints). The constraints on income taxation would then take account of the presence of these other taxes. Further work is necessary, however, and Atkinson and Stiglitz (1976) have begun an examination of appropriate combinations of various taxes.
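The contrast drawn above between labour in efficiency units and complementary kinds of labour can be made concrete with a small sketch. It is illustrative only: the skill levels, hours, wage and the equal Cobb-Douglas shares are hypothetical and are not taken from Feldstein's model or from this paper. With efficiency units, the extra output from one more hour of skilled labour does not depend on how much other labour is supplied; with two kinds of labour combined through a Cobb-Douglas function, it does.

```python
def output_efficiency_units(labour, wage=1.0):
    """labour: list of (skill, hours) pairs; only total efficiency units matter."""
    return wage * sum(skill * hours for skill, hours in labour)

def output_cobb_douglas(skilled_hours, unskilled_hours, share=0.5):
    """Two kinds of labour as separate, complementary inputs (hypothetical shares)."""
    return skilled_hours ** share * unskilled_hours ** (1.0 - share)

def gain_from_extra_skilled_hour(unskilled_hours, skilled_hours=10.0, skill=3.0):
    """Extra output from one more skilled hour under each production structure."""
    efficiency_gain = (
        output_efficiency_units([(1.0, unskilled_hours), (skill, skilled_hours + 1)])
        - output_efficiency_units([(1.0, unskilled_hours), (skill, skilled_hours)])
    )
    cobb_douglas_gain = (
        output_cobb_douglas(skilled_hours + 1, unskilled_hours)
        - output_cobb_douglas(skilled_hours, unskilled_hours)
    )
    return efficiency_gain, cobb_douglas_gain

eff_scarce, cd_scarce = gain_from_extra_skilled_hour(unskilled_hours=5.0)
eff_plentiful, cd_plentiful = gain_from_extra_skilled_hour(unskilled_hours=50.0)

print("efficiency units:", eff_scarce, eff_plentiful)
print("Cobb-Douglas:    ", cd_scarce, cd_plentiful)
```

Under the efficiency-units assumption the gain is the same whether other labour is scarce or plentiful, which is the sense in which the two kinds of labour are complete substitutes; under the Cobb-Douglas assumption the gain rises with the supply of the other kind of labour.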
The absence of further discussion of the production assumptions should not be taken as a belief that they do not matter. The specification of the way different skills interact in the production process embodies an aspect of income taxation that many would regard as crucial. It is an important area for further research.

The models discussed here will all be static and will not, therefore, involve capital and the elasticity of its supply in any essential way. These models allow the discussion of the important questions of labour supply and raise sufficient significant and difficult questions to warrant study. Some progress has been made with dynamics but the components of the models have to be kept rather simple.9

7 See Stern (1973).
8 Feldstein (1973). The different types of labour combine through a Cobb-Douglas production function to produce output.
9 See, e.g., Feldstein (1973).

2. The model and numerical results of the studies of Mirrlees and Atkinson

This discussion is not intended as a comprehensive survey since Atkinson (1973a) has recently provided a thorough discussion of previous numerical