
2022 | OriginalPaper | Book chapter

Modal Regression for Skewed, Truncated, or Contaminated Data with Outliers

Authors: Sijia Xiang, Weixin Yao

Published in: Advances and Innovations in Statistics and Data Science

Publisher: Springer International Publishing

Abstract

Mean regression and quantile regression, built on the ideas of the mean and the quantile, have been studied extensively and are widely used to model the relationship between a dependent variable Y and covariates x. Research on regression models built on the mode, however, is rather limited. In this article, we introduce a new regression tool, named modal regression, that aims to find the most probable conditional value (the mode) of a dependent variable Y given covariates x, rather than the conditional mean used by traditional mean regression. Modal regression can reveal interesting structure in the data that the conditional mean or quantiles may miss. In addition, modal regression is resistant to outliers and heavy-tailed data and can provide shorter prediction intervals when the data are skewed. Furthermore, unlike traditional mean regression, modal regression can be applied directly to truncated data. Modal regression is therefore a potentially very useful tool that can complement traditional mean and quantile regression.
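
The abstract does not specify an estimator, so the following is only a minimal sketch of one kernel-based formulation of linear modal regression, under stated assumptions: it maximizes the sum of Gaussian kernel values of the residuals by alternating kernel weighting with weighted least squares. The function name `modal_linear_regression`, the fixed bandwidth `h`, and the simulated contaminated data are all illustrative choices, not taken from the chapter.

```python
import numpy as np


def modal_linear_regression(x, y, h, n_iter=200, tol=1e-8):
    """Hypothetical helper: fit y ~ b0 + b1*x by maximizing
    sum_i exp(-0.5 * ((y_i - b0 - b1*x_i) / h)**2) over (b0, b1)."""
    X = np.column_stack([np.ones(len(y)), x])    # design matrix with intercept
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from least squares
    for _ in range(n_iter):
        resid = y - X @ beta
        # Weighting step: Gaussian kernel weights, on the log scale for stability
        log_w = -0.5 * (resid / h) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Update step: weighted least squares with the current weights
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Simulated data (assumption): true line 1 + 2x, with about 20% of the
# responses shifted upward by 10 to act as outliers.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.2, n)
outlier = rng.random(n) < 0.2
y[outlier] += 10.0

beta_ols = np.linalg.lstsq(np.column_stack([np.ones(n), x]), y, rcond=None)[0]
beta_mode = modal_linear_regression(x, y, h=0.5)
print("least squares (intercept, slope):", beta_ols)
print("modal fit     (intercept, slope):", beta_mode)
```

In this simulated setting the least-squares line is pulled toward the shifted observations, while the kernel-weighted fit stays close to the line followed by the bulk of the data, which illustrates the outlier resistance the abstract describes. The bandwidth is fixed by hand here; in practice its choice matters and would need a data-driven rule.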

Metadata
Title
Modal Regression for Skewed, Truncated, or Contaminated Data with Outliers
Authors
Sijia Xiang
Weixin Yao
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-031-08329-7_12
