We implement an empirical application from the U.S. airline industry with heteroskedastic and autocorrelated errors using a panel of 6 firms over 15 years.
For the data set, we set aside a portion of the data for training and the rest for testing. We estimate the model with four methods, GKRLS, KRLS, LP, and Generalized Least Squares (GLS), and compare their results in terms of mean squared error (MSE). To evaluate the out of sample performance of each method, the predicted out of sample MSEs are computed as follows:
$$\begin{aligned} MSE_e=\frac{1}{n^\prime T}\sum _{i=1}^{n^\prime }\sum _{t=1}^T \big (y_{0,it}-\widehat{m}_e(\textbf{x}_{0,it})\big )^2 \end{aligned}$$
(51)
where \(MSE_e\) is the mean squared error for the \(e^{th}\) estimator, \(n^\prime \) is the number of observations in the testing data set, and \(i=1,\ldots ,n^\prime \). In this empirical exercise, \(n^\prime =1\) and \(T=15\) since we leave out the first firm as a test set. To assess the estimated average derivatives, we use the bootstrap to calculate the MSEs for the average partial effects. We report the bootstrapped MSEs for the average derivative as follows:
$$\begin{aligned} MSE_{e,r}=\frac{1}{B} \sum _{b=1}^B \left( \widehat{m}^{(1)}_{avg,e,r,b} - \frac{1}{4}\sum _{e} \widehat{m}^{(1)}_{avg,e,r}\right) ^2 \end{aligned}$$
(52)
where B is the number of bootstrap replications with \(b=1,\ldots ,B\), \(\widehat{m}_{avg,e,r,b}^{(1)}\) is the \(b^{th}\) bootstrapped average partial first derivative with respect to the \(r^{th}\) variable for the \(e^{th}\) estimator, and \(\frac{1}{4}\sum _e\widehat{m}^{(1)}_{avg,e,r}\) is the simple average of the average partial first derivatives with respect to the \(r^{th}\) variable from the four estimators (GLS, GKRLS, KRLS, and LP):
$$\begin{aligned} \widehat{m}^{(1)}_{avg,e,r} =&\,\frac{1}{nT}\sum _{i=1}^{n}\sum _{t=1}^{T} \widehat{m}^{(1)}_{e,r}(\textbf{x}_{it}), \\ e =&\,\left\{ \text {GLS},\text {GKRLS},\text {KRLS},\text {LP}\right\} \end{aligned}$$
(53)
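As a minimal sketch of how Eqs. (51) and (52) can be evaluated, assuming the test outcomes, predictions, and bootstrapped derivatives are held in NumPy arrays (the function and variable names below are our own illustration, not from the paper):

```python
import numpy as np

def out_of_sample_mse(y_test, y_pred):
    """Eq. (51): average squared prediction error over the n' * T test observations."""
    y_test, y_pred = np.asarray(y_test), np.asarray(y_pred)
    return np.mean((y_test - y_pred) ** 2)

def bootstrap_mse(boot_derivs, point_derivs):
    """Eq. (52): spread of the B bootstrapped average partial derivatives for one
    estimator around the simple average of the four estimators' point estimates."""
    center = np.mean(point_derivs)  # (1/4) * sum_e m_hat^{(1)}_{avg,e,r}
    return np.mean((np.asarray(boot_derivs) - center) ** 2)
```

Here `boot_derivs` holds the B bootstrapped averages for one estimator and regressor, and `point_derivs` holds the four estimators' point estimates from Eq. (53).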
7.1 U.S. airline industry
We obtain the data on the efficiency in production of airline services from Greene (2018). Since the data are a panel of 6 firms over 15 years, we consider the one way random effects model:
$$\begin{aligned} \log C_{it}&=m(\log Q_{it},\log P_{it})+\alpha _i +\varepsilon _{it}, \end{aligned}$$
(54)
where the dependent variable \(Y_{it} = \log C_{it}\) is the logarithm of total cost, the independent variables \(X_{it} = (\log Q_{it}, \log P_{it})^{\top }\) are the logarithms of output and the price of fuel, respectively, \(\alpha _i\) is the firm specific effect, and \(\varepsilon _{it}\) is the idiosyncratic error term. In this empirical setting, we assume \(\mathbb {E}[\varepsilon _{it}|\textbf{X}]=0,\; \mathbb {E}[\varepsilon _{it}^2|\textbf{X}]=\sigma ^2_{\varepsilon _{i}},\; \mathbb {E}[\alpha _i|\textbf{X}]=0,\; \mathbb {E}[\alpha _i^2|\textbf{X}]=\sigma ^2_{\alpha _i},\; \mathbb {E}[\varepsilon _{it}\alpha _j|\textbf{X}]=0\) for all i, t, j, \(\mathbb {E}[\varepsilon _{it}\varepsilon _{js}|\textbf{X}]=0\) if \(t\ne s\) or \(i\ne j\), and \(\mathbb {E}[\alpha _i\alpha _j|\textbf{X}]=0\) if \(i\ne j\). Consider the composite error term \(U_{it}\equiv \alpha _i+\varepsilon _{it}\). Then, the model in Eq. (54) can be rewritten as
$$\begin{aligned} \log C_{it}=m(\log Q_{it},\log P_{it})+U_{it}. \end{aligned}$$
(55)
In Eq. (55), the independent variables are strictly exogenous with respect to the composite error term, \(\mathbb {E}[U_{it}|\textbf{X}]=0\). The variance of the composite error term is \(\mathbb {E}[U_{it}^2|\textbf{X}]=\sigma ^2_{\alpha _i}+\sigma ^2_{\varepsilon _{i}}\). Therefore, in this empirical example, we allow for firm specific heteroskedasticity. In other words, the variances of the error terms are not constant across firms but are constant over time for each firm. Since there is a time component, we allow an individual firm's errors to be correlated across time but not with those of other firms, that is, \(\mathbb {E}[U_{it}U_{is}|\textbf{X}]=\sigma ^2_{\alpha _i}, \; t\ne s\), and \(\mathbb {E}[U_{it}U_{js}|\textbf{X}]=0\) for all t and s if \(i\ne j\). Note that the correlation across time can differ across firms. Therefore, in this empirical framework, we allow the error terms to be heteroskedastic across firms and correlated across time.
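These moment conditions can be checked with a quick simulation. The sketch below is our own illustration with arbitrary variance values: it verifies numerically that the composite error \(U_{it}=\alpha _i+\varepsilon _{it}\) has variance \(\sigma ^2_{\alpha _i}+\sigma ^2_{\varepsilon _{i}}\) and within-firm covariance \(\sigma ^2_{\alpha _i}\) across periods.

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 200_000                           # independent draws of one firm's errors
sigma2_alpha, sigma2_eps = 0.25, 1.0     # illustrative values, not from the data

alpha = np.sqrt(sigma2_alpha) * rng.standard_normal(reps)   # firm effect alpha_i
eps = np.sqrt(sigma2_eps) * rng.standard_normal((reps, 2))  # eps_it, two periods
U = alpha[:, None] + eps                                    # composite error U_it

var_U = U[:, 0].var()                   # ~ sigma2_alpha + sigma2_eps = 1.25
cov_U = np.cov(U[:, 0], U[:, 1])[0, 1]  # ~ sigma2_alpha = 0.25
```

The simulated variance and within-firm covariance match the two moment conditions up to sampling noise.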
To estimate Eq. (55) by GKRLS and KRLS in the framework set up in this paper, we can write the model in matrix notation. Consider
$$\begin{aligned} \textbf{y} = \textbf{m}+\textbf{U}, \end{aligned}$$
(56)
where \(\textbf{y}\) is the \(nT\times 1\) vector of \(\log C_{it}\), \(\textbf{m}\) is the \(nT\times 1\) vector of the regression function \(m(X_{it})\), and \(\textbf{U}\) is the \(nT\times 1\) vector of \(U_{it}\), \(i=1,\ldots ,n\) and \(t=1,\ldots ,T\). Then, the \(nT\times nT\) error covariance matrix \(\Omega \) is
$$\begin{aligned} \Omega ={\text {Var}}[\textbf{U}|\textbf{X}] = {\text {diag}}(\Sigma _1, \ldots ,\Sigma _n), \end{aligned}$$
(57)
where \(\Sigma _i=\sigma ^2_{\varepsilon _{i}}\textbf{I}_T +\sigma ^2_{\alpha _i} \varvec{\iota }_T\varvec{\iota }^\top _T\), \(i=1,\ldots ,n\), has dimension \(T\times T\), \(\textbf{I}_T\) is a \(T\times T\) identity matrix, and \(\varvec{\iota }_T\) is a \(T\times 1\) vector of ones. To use the GKRLS estimator in this empirical framework, we first estimate Eq. (55) or (56) by KRLS and obtain the residuals, denoted by \(\widehat{u}_{it}\). To estimate the error covariance matrix \(\Omega \), the variances of the firm specific error and the idiosyncratic error, \(\sigma ^2_{\alpha _i}\) and \(\sigma ^2_{\varepsilon _{i}}\), need to be estimated. Consider the following consistent estimators based on time averages:
$$\begin{aligned} \widehat{\sigma }^2_{U_i}= & {} \frac{1}{T} \widehat{\textbf{u}}_i^\top \widehat{\textbf{u}}_i \end{aligned}$$
(58)
$$\begin{aligned} \widehat{\sigma }^2_{\alpha _i}= & {} \frac{1}{T(T-1)/2} \sum _{t=1}^{T-1} \sum _{s=t+1}^{T} \widehat{u}_{it}\widehat{u}_{is} \end{aligned}$$
(59)
$$\begin{aligned} \widehat{\sigma }^2_{\varepsilon _{i}}= & {} \widehat{\sigma }^2_{U_i} - \widehat{\sigma }^2_{\alpha _i}, \end{aligned}$$
(60)
where \(\widehat{\textbf{u}}_i\) is the \(T\times 1\) vector of residuals for the ith firm. Plugging these estimates in for \(\Omega \), the GKRLS estimator can be computed as in the previous sections. For further details, see Appendix H.
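The variance estimators in Eqs. (58)-(60) and the block diagonal structure in Eq. (57) translate directly into code. The sketch below is our own illustration (the function name and the firm-by-firm stacking of the residual vector are assumptions); it uses the identity that the sum of the T(T-1)/2 distinct cross products equals \(((\sum _t u_t)^2 - \sum _t u_t^2)/2\).

```python
import numpy as np

def estimate_omega(resid, n, T):
    """Build the estimated nT x nT block-diagonal error covariance matrix
    from first-stage KRLS residuals, following Eqs. (57)-(60).
    `resid` is the nT-vector of residuals stacked firm by firm."""
    Omega = np.zeros((n * T, n * T))
    for i in range(n):
        u_i = resid[i * T:(i + 1) * T]           # residuals for firm i
        sigma2_U = u_i @ u_i / T                 # Eq. (58)
        # Eq. (59): average of the T(T-1)/2 distinct cross products u_it * u_is, t < s
        cross = (u_i.sum() ** 2 - u_i @ u_i) / 2.0
        sigma2_alpha = cross / (T * (T - 1) / 2)
        sigma2_eps = sigma2_U - sigma2_alpha     # Eq. (60)
        # Sigma_i = sigma2_eps * I_T + sigma2_alpha * iota iota'
        Omega[i * T:(i + 1) * T, i * T:(i + 1) * T] = (
            sigma2_eps * np.eye(T) + sigma2_alpha * np.ones((T, T))
        )
    return Omega
```

For a single firm with T = 2 and residuals (1, 3), Eq. (58) gives 5, Eq. (59) gives 3, and the block is \(2\,\textbf{I}_2 + 3\,\varvec{\iota }_2\varvec{\iota }_2^\top \).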
With regard to the other comparable estimators, the KRLS and LP estimators are used to estimate Eq. (55) or (56) while ignoring the heteroskedasticity and correlation in the composite error \(\textbf{U}\). Note that the KRLS estimator uses the error covariance matrix in computing the variances and standard errors but not in estimating the regression function. Lastly, the GLS estimator serves as a parametric benchmark corresponding to the standard random effects panel data model.
The data contain 90 observations on 6 firms over 15 years, 1970–1984. We split the data into two parts: the first 15 observations, which correspond to the first firm, are used as testing data, and the remaining 75 observations, which correspond to the last five firms, are used as training data to evaluate out of sample performance. Thus, the training data, \(i=1,\ldots ,5\) and \(t=1,\ldots ,15\), contain a total of 75 observations. For the GKRLS and KRLS estimators, all hyperparameters are chosen via LOOCV.
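Concretely, assuming the observations are stacked firm by firm as in the matrix notation above, the split amounts to the following index bookkeeping (our own sketch):

```python
import numpy as np

n, T = 6, 15
idx = np.arange(n * T)   # rows 0..14 are firm 1, rows 15..29 are firm 2, etc.

test_idx = idx[:T]       # firm 1: the 15 out of sample observations
train_idx = idx[T:]      # firms 2-6: the 75 training observations
```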
Table 4
Bias corrected average partial derivatives and their standard errors (in parentheses) for the GLS, GKRLS, KRLS, and LP estimators. The columns report the estimates of the average partial derivative with respect to each regressor

Average partial derivatives for airline data

| Estimator | Output (\(\log Q\)) | Fuel price (\(\log P\)) |
| GLS | 0.8436 (0.0311) | 0.4188 (0.0181) |
| GKRLS | 0.8130 (0.0034) | 0.4247 (0.0082) |
| KRLS | 0.8248 (0.0160) | 0.4581 (0.0457) |
| LP | 0.5885 (0.0276) | 0.2260 (0.0138) |
The bias corrected average partial derivatives and corresponding standard errors are reported in Table 4. These averages are calculated by training each estimator on the five firms with 75 observations in the training data set. The estimates are bias corrected using the results from Sect. 5. All estimators display positive and significant relationships between cost and each of the regressors, output and price, with their average partial derivatives being positive. The elasticity with respect to output ranges from 0.5885 to 0.8436, and the elasticity with respect to price ranges from 0.2260 to 0.4581. More specifically, for the GKRLS estimator, a 10% increase in output would increase total cost by an average of 8.13%, and a 10% increase in fuel price would increase total cost by an average of 4.25%, holding all else fixed. Comparing the GKRLS and KRLS methods, the estimates of the average partial derivatives are similar, but the standard errors are substantially smaller for GKRLS for both output and fuel price, implying a gain in efficiency. Therefore, using the information and the structure of the error covariance in Eq. (57) in estimating the regression function allows GKRLS to provide more precise estimates of the average partial effects of each independent variable compared to KRLS.
Table 4 shows that the GLS estimator slightly overestimates the elasticity with respect to output and underestimates the elasticity with respect to fuel price compared to GKRLS. The LP estimator provides noticeably different average partial effect estimates compared to the rest of the estimators. One possible explanation is that the bandwidths may not be optimal, since data-driven bandwidth selection methods (e.g., cross validation) fail when there is correlation in the errors (De Brabanter et al. 2018). Since the data are a panel, there is correlation across time, making bandwidth selection for LP estimators difficult. The LP estimates are from the local constant estimator; however, the local linear estimator provides similar estimates of the average partial effects. Nevertheless, the LP average partial effects of each variable are positive and significant, which is consistent with the other methods. Furthermore, GKRLS provides similar average partial effects with respect to output and price but is more efficient, with smaller standard errors, relative to the other considered estimators.
Table 5
The MSEs for the GLS, GKRLS, KRLS, and LP estimators. The first column reports the out of sample MSEs calculated by Eq. (51); the second and third columns report the bootstrapped MSEs for the average partial derivatives calculated by Eq. (52). The GKRLS and KRLS estimates are bias corrected

MSEs for airline data

| Estimator | Out of sample | Output (\(\log Q\)) | Fuel price (\(\log P\)) |
| GLS | 0.0106 | 0.0042 | 0.0018 |
| GKRLS | 0.0091 | 0.0030 | 0.00001 |
| KRLS | 0.0306 | 0.0031 | 0.0024 |
| LP | 0.0191 | 0.2900 | 0.0867 |
To assess the estimators in terms of out of sample performance, we calculate the MSEs using the 15 observations in the testing data set. Table 5 reports the MSEs for the four considered estimators. The first column reports the out of sample MSEs using the 15 observations from the first firm. Of all the considered estimators, the GKRLS estimator has the lowest MSE; in other words, GKRLS performs best at estimating the regression function in this empirical example. The bootstrapped MSEs for the average partial derivatives, calculated by Eq. (52), are reported in the second and third columns of Table 5. For both the average partial derivative with respect to output and the average partial derivative with respect to price, GKRLS produces the lowest MSE, outperforming the other estimators. In addition, since GKRLS incorporates the error covariance structure, efficiency is gained, and therefore the MSEs are reduced relative to KRLS. Overall, GKRLS is the best method in terms of MSE for estimating both the airline cost function and the average partial effects with respect to output and price.