Statistical crop models have re-emerged as an alternative to traditional biophysical models for assessing the potential impacts of climate change on crop yields. A statistical crop yield model is essentially a regression of crop yields on weather variables. Early examples date back to the early twentieth century (Wallace 1920; Hodges 1931). In this chapter, I adopt the approach developed more recently by Schlenker and Roberts (2009), which separately estimates the effect on crop yield of cumulative exposure, over the growing season, to different temperature bins.
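As an illustration of what "cumulative exposure to temperature bins" means in practice, the sketch below counts the time spent in each one-degree bin from hourly temperature readings. It is a minimal sketch under stated assumptions: the function name `bin_exposure`, the use of hourly data, the 0 to 37 degree range, and the treatment of readings outside that range are all hypothetical choices, not details taken from Schlenker and Roberts (2009).

```python
import numpy as np

def bin_exposure(hourly_temps, lo=0, hi=37):
    """Return days spent in each 1-degree bin [h, h+1], h = lo, ..., hi-1."""
    temps = np.asarray(hourly_temps, dtype=float)
    # Assumption: readings outside [lo, hi] are assigned to the end bins.
    temps = np.clip(temps, lo, hi)
    edges = np.arange(lo, hi + 1)               # 38 edges -> 37 bins
    hours, _ = np.histogram(temps, bins=edges)  # hours observed in each bin
    return hours / 24.0                         # convert hours to days

# Two illustrative days of hourly readings (made-up values).
temps = [10.2] * 24 + [30.7] * 12 + [40.0] * 12
exposure = bin_exposure(temps)
print(exposure[10], exposure[30], exposure[36])
```

Summing such exposures over the growing season gives one exposure value per bin for each county and year, which is the quantity the regression uses.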
Mathematically, the nonlinear effect of temperature on yield may be represented by a function of temperature $h$, denoted $g(h)$. Logged maize yield $y_{it}$ in county $i$ and year $t$ can thus be represented as:
$$ {y}_{it}=\int_{\underline{h}}^{\overline{h}}g(h)\,{\phi}_{it}(h)\,dh+{p}_{it}{\delta}_1+{p}_{it}^2{\delta}_2+{z}_{it}\tau +{c}_i+{\epsilon}_{it} $$
(1)
where $\phi_{it}(h)$ is the time distribution of temperature over April–September, $p_{it}$ is precipitation, $z_{it}$ is a quadratic time trend, and the $c_i$ are county fixed effects that capture time-invariant factors explaining differences in yield levels across counties (e.g. soil quality). However, Eq. (1) cannot be estimated directly because of the integral. To make the model tractable, one approximates the integral with a summation over discrete temperature bins:
$$ {y}_{it}=\sum_{h=0}^{36}g\left(h+0.5\right)\left[{\varPhi}_{it}\left(h+1\right)-{\varPhi}_{it}(h)\right]+{p}_{it}{\delta}_1+{p}_{it}^2{\delta}_2+{z}_{it}\tau +{c}_i+{\epsilon}_{it} $$
where $\varPhi_{it}(h+1)-\varPhi_{it}(h)$ represents the time spent in the interval $[h, h+1]$, and $g(h+0.5)$ is a parameter to estimate. However, given the large number of temperature bins, collinearity between exposures to contiguous bins might produce noisy estimates. I therefore assume that $g(h)$ is a smooth function of temperature, which I approximate with a cubic B-spline with 8 degrees of freedom evaluated at each temperature bin. This can be written as:
$$ {\displaystyle \begin{array}{rcl}{y}_{it}& =& \sum \limits_{h=0}^{36}\sum \limits_{j=1}^8{\gamma}_j{B}_j\left(h+0.5\right)\left[{\varPhi}_{it}\left(h+1\right)-{\varPhi}_{it}(h)\right]+{p}_{it}{\delta}_1+{p}_{it}^2{\delta}_2+{z}_{it}\tau +{c}_i+{\epsilon}_{it}\\ {}& =& \sum \limits_{j=1}^8{\gamma}_j\underset{x_{it,j}}{\underbrace{\sum \limits_{h=0}^{36}{B}_j\left(h+0.5\right)\left[{\varPhi}_{it}\left(h+1\right)-{\varPhi}_{it}(h)\right]}}+{p}_{it}{\delta}_1+{p}_{it}^2{\delta}_2+{z}_{it}\tau +{c}_i+{\epsilon}_{it}\end{array}} $$
where $B_j$ is the $j$th column of the basis matrix of the natural cubic spline. The model effectively regresses yield on eight temperature variables, $x_{it,j}$. The model is estimated by least squares, and errors are clustered by county and by year to account for heteroscedasticity and contemporaneous error dependence. Once the parameters $\gamma_j$ are estimated, the marginal effect of exposure to each temperature bin on crop yield is obtained by pre-multiplying the vector of estimated coefficients by the basis matrix.
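The collapse from 37 bin exposures to eight regressors, and the recovery of bin-level marginal effects, can be sketched on simulated data as follows. This is only an illustration under several assumptions: it uses a plain clamped cubic B-spline basis with equally spaced knots (built by the Cox-de Boor recursion) rather than the natural cubic spline basis described in the text, the exposure data and coefficients are made up, and precipitation, the time trend, county fixed effects and clustered standard errors are all omitted. The helper `bspline_basis` and all variable names are hypothetical.

```python
import numpy as np

def bspline_basis(x, knots, degree=3):
    """Cox-de Boor recursion: one row per point in x, one column per basis function."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    n_int = len(knots) - 1
    # Degree 0: indicator of each knot interval [t_k, t_{k+1}).
    B = np.zeros((x.size, n_int))
    for k in range(n_int):
        B[:, k] = (knots[k] <= x) & (x < knots[k + 1])
    # Raise the degree one step at a time (terms with zero width are skipped).
    for d in range(1, degree + 1):
        B_new = np.zeros((x.size, n_int - d))
        for k in range(n_int - d):
            left = knots[k + d] - knots[k]
            right = knots[k + d + 1] - knots[k + 1]
            if left > 0:
                B_new[:, k] += (x - knots[k]) / left * B[:, k]
            if right > 0:
                B_new[:, k] += (knots[k + d + 1] - x) / right * B[:, k + 1]
        B = B_new
    return B

rng = np.random.default_rng(0)

# Bin midpoints h + 0.5 for h = 0, ..., 36.
mid = np.arange(37) + 0.5
# Assumed knots: clamped ends plus 4 equally spaced interior knots -> 8 columns.
knots = np.concatenate([[0.0] * 4, np.linspace(0, 37, 6)[1:-1], [37.0] * 4])
B = bspline_basis(mid, knots)                      # shape (37, 8)

# Simulated exposures: days per bin summing to a 183-day season (made up).
n = 500
E = rng.dirichlet(np.ones(37), size=n) * 183.0     # shape (n, 37)

# x_{it,j} = sum_h B_j(h + 0.5) * [Phi_it(h+1) - Phi_it(h)]
X = E @ B                                          # shape (n, 8)

# Made-up coefficients and simulated log yield.
gamma_true = np.array([0.0, 0.002, 0.004, 0.006, 0.007, 0.006, -0.01, -0.06])
y = X @ gamma_true + rng.normal(0.0, 0.01, size=n)

# Least-squares estimate of the eight spline coefficients.
gamma_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Marginal effect of one extra day in each bin: basis matrix times coefficients.
marginal = B @ gamma_hat                           # shape (37,)
```

The key design point is that the regression only ever sees the eight columns of `X`, while `B @ gamma_hat` maps the estimates back to one effect per temperature bin.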