1 Introduction
Generations of economists have struggled with the question of the optimal degree of tax progressivity. In its modern form, this question was first posed by Vickrey (1945), who stated that a full characterization of the optimum “produces a completely unwieldy expression,” leading him to the conclusion that “the problem resists any facile solution.” Indeed, it took another quarter of a century until Mirrlees (1971, 1976) offered a first solution to the problem. The solution was obtained by way of an indirect approach: he first solved for the optimal allocation, subject to resource and incentive compatibility constraints, and only then determined the tax system that would implement this allocation. Ever since, this has been the dominant approach in the literature whenever it concerns nonlinear taxation (e.g., Stiglitz, 1982; Tuomala, 1990; Diamond, 1998).
The advantage of this indirect approach is its mathematical rigor. The problem of finding the optimal allocation conveniently lends itself to the toolbox of optimal control theory, yielding a mathematically well-defined procedure for solving it. But this solution procedure also harbors the main disadvantage of the indirect approach, namely the lack of intuition in the derivation of the optimal tax schedule. In reality, the government does not exercise any direct control over individuals’ allocations, that is, over how much they work and how much they consume of every good in the economy. Instead, it controls the tax system. Interpreting the problem of optimal taxation as choosing the most preferred incentive-compatible allocation may well alienate the applied world of tax policy, as well as students, from the academic discipline of tax design. In the worst case, it could lead policy makers to disregard academic insights, and academics to focus too much on technical issues that might be of limited practical relevance. In short, it could reduce the practical impact of an academic field whose raison d’être is its potential for practical impact.
A more intuitive way of solving for optimal taxes is by directly considering the social welfare effects of changes in taxes rather than allocations. For optimal linear taxes, this has always been the dominant solution procedure (e.g., Diamond and Mirrlees, 1971; Sheshinski, 1972; Diamond, 1975; Dixit and Sandmo, 1977). The likely reason for this is that a linear tax can be captured by a single parameter, which allows for straightforward optimization techniques. The same techniques cannot directly be applied to solve for the optimal nonlinear tax schedule, as the object to be optimized is a function rather than a parameter. Some recent contributions have circumvented this problem heuristically (e.g., Saez, 2001, 2002; Piketty and Saez, 2013; Jacquet et al., 2013). They consider a small perturbation of the tax schedule and heuristically (i.e., verbally) deduce the social-welfare effects of this perturbation. Equating these social-welfare effects to zero solves for the optimum. To prove that their heuristic is valid, they subsequently show that their results correspond to results obtained by solving for the optimal incentive-compatible allocation. This last step is necessary as it may be unclear whether the heuristic derivation picked up on all the relevant welfare effects.
In what follows, I use the term “primal approach” to refer to the indirect approach of first solving for the optimal allocation. I use the term “dual approach” to describe the method of directly solving for optimal taxes. I show how one can apply the dual approach to determine the optimal nonlinear income tax without relying on a verbal derivation of social-welfare effects. By doing so, I combine the intuitive appeal of the dual approach with the mathematical rigor of the primal approach. All that is needed is a minor adjustment to the definition of the tax schedule, which makes it amenable to simple optimization techniques.
The key to this adjustment is to recognize that a person’s tax burden can change for two different reasons: due to a change in his taxable income and due to a reform of the tax schedule. Thus, instead of defining a nonlinear tax as T(z), with z a person’s taxable income, I define it as \({T(z,\kappa )\equiv {\mathcal {T}}(z)+\kappa \tau (z)}\). Here, \(\kappa\) is an arbitrary parameter and \(\tau (z)\) is the schedule of any nonlinear tax reform one might want to consider. Writing social welfare in terms of \(T(z,\kappa )\), one can deduce the marginal welfare effects of a reform by simply taking the derivative with respect to the parameter \(\kappa\), and substituting for the specific reform of interest \(\tau (z)\). Expressions for the optimal nonlinear tax schedule are derived by optimizing over \(\kappa\) for any possible function \(\tau (z)\). In other words, at the optimum, social welfare is unaffected by any possible nonlinear reform of the tax schedule.
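The parameterization can be illustrated numerically. The following is a minimal sketch under assumed inputs: the two-bracket baseline schedule, the reform function, and the names `baseline_tax`, `reform`, `tax`, and `d_tax_d_kappa` are all hypothetical and not taken from the paper. It only checks that, holding taxable income fixed, the derivative of \(T(z,\kappa )\) with respect to \(\kappa\) equals the reform function \(\tau (z)\).

```python
# Illustrative sketch only: the baseline schedule, the reform function, and
# all numbers are hypothetical assumptions, not taken from the paper.

def baseline_tax(z):
    """Hypothetical baseline schedule: 20% up to 30,000 and 40% above."""
    return 0.2 * min(z, 30_000.0) + 0.4 * max(z - 30_000.0, 0.0)

def reform(z, z_star=50_000.0):
    """Hypothetical reform tau(z): one extra unit of tax per unit of income above z_star."""
    return max(z - z_star, 0.0)

def tax(z, kappa):
    """Parameterized schedule T(z, kappa) = baseline(z) + kappa * tau(z)."""
    return baseline_tax(z) + kappa * reform(z)

def d_tax_d_kappa(z, kappa=0.0, h=1e-6):
    """Central finite difference of T with respect to kappa, holding z fixed."""
    return (tax(z, kappa + h) - tax(z, kappa - h)) / (2.0 * h)

for z in (20_000.0, 60_000.0):
    # The two printed numbers should agree up to floating-point error.
    print(z, d_tax_d_kappa(z), reform(z))
```

The sketch covers only the mechanical change in the tax burden. Behavioral responses of taxable income to the reform are not modeled here.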
Beyond its intuitive appeal, a second advantage of the dual approach is that it allows for a large degree of flexibility regarding individual behavior. More specifically, I show that it is straightforward to account for heterogeneity not just in individuals’ income, but also in their responsiveness to tax reforms. Doing so, I replicate findings by Jacquet and Lehmann (2021) who apply the primal approach to show that standard optimal tax formulas are adjusted by using income-conditional average elasticities. Moreover, the dual approach can easily incorporate individual behavior that is not based on utility maximization. Utility maximization might not be an appropriate behavioral framework when individuals form mistaken beliefs about the shape of their budget curve or about the functional form of their own utility function. In that case, optimal tax formulas include a corrective term, prescribing higher marginal taxes for individuals who work “too much” and lower marginal taxes for individuals who work “too little.” The importance of such a corrective term crucially depends on misoptimizers’ responsiveness to tax reforms.
Finally, I show how the dual approach can be applied to determine the welfare effects of tax reforms outside the tax optimum. Contrary to the primal approach, which deals with variations in allocations rather than tax schedules, the dual approach is ideally suited to study small nonlinear reforms of a given tax schedule. This is likely to be of more relevance to actual tax policy than a characterization of the optimum. Moreover, determining the desirability of a reform may be empirically less demanding than determining the optimal tax schedule. The reason for this is that the former depends in part on the responsiveness of taxable income at the actual tax system, whereas the latter depends on the responsiveness at the optimal tax system. While we typically cannot be certain about either of the two, it is arguably less problematic to use available elasticity estimates as measures of the responsiveness of taxable income at the actual tax system than as measures of the responsiveness in the optimum.
The contribution of this paper is mostly methodological and pedagogical in nature. The optimal-tax results are themselves not novel. However, they are typically derived in ways that are either mathematically daunting or verbal and therefore mathematically imprecise. The aim of this paper is to show the reader how known results on optimal taxation can be derived in a fairly simple but precise way. The hope is that this will contribute to a deeper understanding of these results among a broader audience.
Beyond the above-mentioned references, this paper relates to a number of earlier studies. To the best of my knowledge, Christiansen (1981, 1984) was the first to parameterize the nonlinear tax schedule to make it amenable to the analysis of tax reforms. His focus is on the evaluation of public projects and commodity taxation, however, and he does not consider a full characterization of the optimal nonlinear income tax, which is the focus of this study. More recently, Golosov et al. (2014) formalize the dual approach to optimal nonlinear income taxation in a dynamic model by applying Gateaux differentials with respect to the tax schedule; Hendren (2020) uses the dual approach to derive implicit welfare weights; and Spiritus et al. (2022) employ the dual approach to derive optimal taxes when households earn multiple incomes and differ across multiple dimensions. Finally, this paper also relates to earlier contributions that identify desirable tax reforms within any given non-optimal tax system (e.g., Tirole and Guesnerie, 1981; Weymark, 1981; Guesnerie, 1995; Bierbrauer et al., 2022).
Section 2 introduces the parameterization of the tax schedule, and Sect. 3 shows how this helps in deriving the welfare effects of any nonlinear tax reform. Section 4 derives expressions for optimal tax rates using the dual approach, allowing for preference heterogeneity and individuals who do not maximize their utility. Section 5 illustrates how the dual approach can be usefully applied to obtain insights into more limited tax reforms outside the optimum. Section 6 discusses the broader applicability of the dual approach and I wrap up with some concluding remarks.
6 Broader applicability of the dual approach
The focus of this paper has been on illustrating how the dual approach can be applied to solve for optimal nonlinear income taxes. I have shown this within a standard context in which individuals make only one intensive-margin decision, on the size of their tax base, while allowing for heterogeneous preferences and individual utility misoptimization. However, the dual approach is versatile enough to be applied much more broadly. In what follows, I therefore illustrate how the above analysis can be adjusted to take into account various nonlinear reforms outside the optimum, multiple intensive decision margins, a participation margin, and multiple tax bases that are subject to separate nonlinear tax schedules.
Nonlinear reforms outside the optimum—The third reform in the previous section just looked at one specific tax reform that might be relevant for actual policy making. That reform was essentially linear—raising the proportional tax rate of a specific bracket—though evaluated within the context of an actual nonlinear schedule of effective marginal tax rates. However, the dual approach can be readily applied to more complicated nonlinear reforms that play a role in actual policy discussions. For example, one could analyze different types of phase-out schedules for the EITC or other welfare programs, or changes to a quadratic tax schedule.
Is it better to phase out the EITC at a linear rate (raising effective marginal tax rates by the same amount across the phase-out range) or at an increasing or decreasing rate? Introducing an increasing phase-out rate within the range \([z^{a},z^{b}]\) could be modeled with a specific reform function \(\tau (z)\) with \(\tau _z(z)>0\) and increasing over the phase-out range. Conversely, a decreasing phase-out rate could be modeled with a reform function that has \(\tau _z(z)>0\) and decreasing over the phase-out range. As before, substituting these reforms into Eq. (14) allows one to readily evaluate the welfare consequences of either phase-out function for any arbitrary initial tax schedule.
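To make the two phase-out shapes concrete, the sketch below constructs one possible pair of reform functions. The quadratic functional forms, the range endpoints, and the names `tau_increasing`, `tau_decreasing`, and `slope` are assumptions chosen purely for illustration. Both reforms are normalized to raise the tax burden by the same amount above the range, so they differ only in how that increase is phased in.

```python
# Hypothetical sketch: functional forms and the range [Z_A, Z_B] are
# assumptions chosen for illustration, not the paper's specification.

Z_A, Z_B = 20_000.0, 40_000.0  # assumed phase-out range

def _share(z):
    """Position of z within the phase-out range, clipped to [0, 1]."""
    return min(max((z - Z_A) / (Z_B - Z_A), 0.0), 1.0)

def tau_increasing(z):
    """Reform with tau_z > 0 and rising inside the range (convex): s**2."""
    return _share(z) ** 2

def tau_decreasing(z):
    """Reform with tau_z > 0 and falling inside the range (concave): 2s - s**2."""
    s = _share(z)
    return 2.0 * s - s ** 2

def slope(tau, z, h=1.0):
    """Central finite-difference approximation of tau_z(z)."""
    return (tau(z + h) - tau(z - h)) / (2.0 * h)

grid = [25_000.0, 30_000.0, 35_000.0]  # interior points of the range
inc = [slope(tau_increasing, z) for z in grid]
dec = [slope(tau_decreasing, z) for z in grid]
print(inc)  # slopes rise along the grid
print(dec)  # slopes fall along the grid
```

Either function could then take the place of \(\tau (z)\) when evaluating a reform, with \(\kappa\) scaling its size.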
Multiple intensive margins— It is straightforward to allow individuals to make more decisions than only the one that determines their tax base. As long as these decisions are unobservable to the tax authority, and therefore untaxed, the analysis remains unchanged in the case of utility-maximizing individuals. Then, even if a tax reform affects individual behavior on these additional decision margins, this does not affect their utility (because of individual utility maximization), nor does it affect government revenue (because the additional decisions are untaxed).
This convenient conclusion no longer holds if individuals do not perfectly maximize utility when making these additional decisions. To see this, notice that the term \(\omega ^{i}\) enters Eq. (14) as a welfare effect of the tax reform. With multiple decision margins, similar terms for every decision margin would enter Eq. (14), thereby yielding multiple corrective reasons for marginal taxes. As a simple example, imagine that individuals perfectly maximize utility when deciding on their (taxed) labor income, but mistakenly consume too much and save too little of their earned income. Then if future consumption is complementary with leisure, higher labor income taxes would be helpful in correcting individuals’ savings decision even though there is no need for a labor-supply correction.
Participation margin—The analysis can further be adapted to allow for a participation margin. For simplicity, I only consider the standard case in which individuals with the same income have the same intensive-margin elasticities, and in which individuals maximize their utility. The latter assumption ensures that a small tax reform only mechanically affects individuals’ utility due to changes in tax burdens, but not through behavioral changes. As a result, a reform of the marginal income tax affects individuals’ utility in essentially the same way as in the case without a participation margin. I can therefore focus attention on how adding a participation margin affects a reform’s effect on government revenue.
For this, I refine the definition of \(z^{i}\) as the “notional tax base,” i.e., the tax base individual i would choose if he decides to participate. His actual tax base when deciding not to participate equals 0. I furthermore introduce a parameter \(\pi ^{i}(\kappa )\) that indicates the share of labor market participants among individuals with notional income \(z^i\). The government budget can then be rewritten as:
$$\begin{aligned} {\mathcal {B}}=\int _{{\mathcal {I}}}\left( \pi ^{i}(\kappa )T(z^{i},\kappa )+(1-\pi ^{i}(\kappa ))T(0,\kappa )\right) \textrm{d}i, \end{aligned}$$
(24)
which gives the integral over participants’ and non-participants’ tax burdens. Taking derivatives, the effect of a marginal tax reform on government revenue can be seen to equal:
$$\begin{aligned} \frac{\textrm{d}{\mathcal {B}}}{\textrm{d}\kappa }=\int _{{\mathcal {I}}}\left( \pi ^{i}(\kappa )\left( \tau (z^{i})+T^i_{z}\frac{\textrm{d}z^{i}}{\textrm{d} \kappa }\right) +(1-\pi ^{i}(\kappa ))\tau (0)+\left( T^i-T^0\right) \frac{\textrm{d}\pi ^{i}}{\textrm{d}\kappa }\right) \textrm{d}i, \end{aligned}$$
(25)
with \(T^0\equiv T(0,\kappa )\). Thus, the reform yields mechanical revenue changes for both participants and non-participants, an intensive behavioral effect on the tax base (\(\textrm{d}z^{i}/\textrm{d}\kappa\)), and an extensive behavioral effect on the tax base (\(\textrm{d}\pi ^{i}/\textrm{d}\kappa\)). The latter behavioral response would typically be unaffected by changes in marginal taxes, but responsive to changes in average tax rates. As a result, the total welfare effect of an increase in the marginal tax rate at \(z^{*}\) now includes the reduced government revenue due to lower participation rates among individuals whose notional income exceeds \(z^{*}\). This additional cost of taxation should be taken into account in the optimum and tends to reduce optimal marginal tax rates.
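The decomposition in Eq. (25) can be checked numerically on a small discrete population. The sketch below is only an illustration under assumed inputs: the baseline schedule, the reform function, the reduced-form responses `z_of` and `pi_of`, and all numbers are hypothetical and do not come from the paper. It compares a finite-difference derivative of a discrete analogue of Eq. (24) with the term-by-term expression in Eq. (25).

```python
# Hypothetical sketch: every functional form and number below is an
# assumption made for illustration; none of it comes from the paper.

Z0 = [10.0, 30.0, 60.0]  # assumed notional incomes at kappa = 0 (thousands)

def baseline(z):          # assumed baseline schedule, with T(0) = -5 (a transfer)
    return -5.0 + 0.2 * z + 0.001 * z ** 2

def baseline_z(z):        # marginal tax rate of the baseline schedule
    return 0.2 + 0.002 * z

def tau(z):               # assumed reform function
    return 1.0 + 0.01 * z ** 2

def tau_z(z):             # derivative of the reform function
    return 0.02 * z

def T(z, kappa):          # parameterized schedule T(z, kappa)
    return baseline(z) + kappa * tau(z)

def z_of(z0, kappa):      # assumed intensive response: dz/dkappa = -0.5 * z0
    return z0 * (1.0 - 0.5 * kappa)

def pi_of(z0, kappa):     # assumed extensive response to the participation tax
    return 0.8 - 0.01 * kappa * (tau(z0) - tau(0.0))

def budget(kappa):
    """Discrete analogue of Eq. (24): a sum over types instead of an integral."""
    total = 0.0
    for z0 in Z0:
        p, z = pi_of(z0, kappa), z_of(z0, kappa)
        total += p * T(z, kappa) + (1.0 - p) * T(0.0, kappa)
    return total

def budget_derivative(kappa):
    """Discrete analogue of Eq. (25), term by term."""
    total = 0.0
    for z0 in Z0:
        p, z = pi_of(z0, kappa), z_of(z0, kappa)
        dz = -0.5 * z0                       # intensive response
        dpi = -0.01 * (tau(z0) - tau(0.0))   # extensive response
        marginal_rate = baseline_z(z) + kappa * tau_z(z)
        total += (p * (tau(z) + marginal_rate * dz)      # participants
                  + (1.0 - p) * tau(0.0)                 # non-participants
                  + (T(z, kappa) - T(0.0, kappa)) * dpi) # participation change
    return total

h = 1e-6
numeric = (budget(h) - budget(-h)) / (2.0 * h)
# The two printed numbers should agree up to finite-difference error.
print(numeric, budget_derivative(0.0))
```

In this assumed example, the last term inside the sum is negative for every type, which corresponds to the revenue loss from lower participation described in the text.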
Multiple tax bases—The dual approach can also be fruitfully employed to study the desirability of other types of government policy in combination with a nonlinear tax schedule. For linear commodity taxation and public good provision, this has previously been illustrated by Christiansen (1981, 1984). But one can also deal with multiple nonlinear tax schedules as in the case of labor-income and capital-income taxes (e.g., Gerritsen et al., 2022). For example, let \(T^{z}\) denote a nonlinear labor-income tax with tax base z, and \(T^{y}\) a nonlinear capital-income tax with tax base y. Similar to the analysis above, both nonlinear taxes can be parameterized as \(T^{z}(z,\kappa ^{z})\) and \(T^{y}(y,\kappa ^{y})\) to allow for straightforward welfare analysis of any nonlinear reform of either tax.
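As an illustrative sketch, suppose both schedules take the same additive form as in the single-schedule case. The baseline schedules \({\mathcal {T}}^{z}\) and \({\mathcal {T}}^{y}\) and the reform functions \(\tau ^{z}\) and \(\tau ^{y}\) are notation introduced here by analogy, not taken from the text. The two schedules could then be written as:
$$\begin{aligned} T^{z}(z,\kappa ^{z})\equiv {\mathcal {T}}^{z}(z)+\kappa ^{z}\tau ^{z}(z),\qquad T^{y}(y,\kappa ^{y})\equiv {\mathcal {T}}^{y}(y)+\kappa ^{y}\tau ^{y}(y). \end{aligned}$$
If an individual's total tax burden is the sum of the two, the chain rule implies that a reform of the labor-income tax alone changes that burden by
$$\begin{aligned} \frac{\textrm{d}\left( T^{z}+T^{y}\right) }{\textrm{d}\kappa ^{z}}=\tau ^{z}(z)+T^{z}_{z}\frac{\textrm{d}z}{\textrm{d}\kappa ^{z}}+T^{y}_{y}\frac{\textrm{d}y}{\textrm{d}\kappa ^{z}}, \end{aligned}$$
that is, by a mechanical effect, a behavioral effect on the reformed tax base, and a cross-base effect through any response of capital income.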