Published in: Advances in Data Analysis and Classification 3/2023

05.09.2022 | Regular Article

Robust regression for interval-valued data based on midpoints and log-ranges

Authors: Qing Zhao, Huiwen Wang, Shanshan Wang

Abstract

Flexible modelling of interval-valued data is of great practical importance as advanced technologies reshape current data collection processes. This paper proposes a new robust regression model for interval-valued data based on the midpoints and log-ranges of the dependent intervals, and obtains the parameter estimators using the Huber loss function to deal with possible outliers in a data set. In addition, the logarithmic transformation removes the non-negativity constraints required in the traditional modelling of ranges, which allows various regression methods to be used flexibly. We conduct extensive Monte Carlo simulation experiments to compare the finite-sample performance of our model with that of existing regression methods for interval-valued data. The results indicate that the proposed method performs competitively, especially for data sets containing outliers and in scenarios where both the midpoints and the ranges of the independent variables are related to those of the dependent variable. Moreover, two empirical interval-valued data sets are used to illustrate the effectiveness of our method.
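The idea summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact model specification, so the sketch assumes two separate linear models (one for the response midpoint, one for the response log-range), each using the predictor's midpoint and log-range as covariates. The Huber fit is done by iteratively reweighted least squares with a MAD scale estimate and the common tuning constant 1.345; the function names, the simulated data and the true coefficients are all hypothetical.

```python
import numpy as np

def to_mid_logrange(lower, upper):
    """Map interval bounds to (midpoint, log-range); requires upper > lower."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    return (lower + upper) / 2.0, np.log(upper - lower)

def from_mid_logrange(mid, log_range):
    """Inverse map. exp() keeps the range positive, so lower < upper holds
    without any non-negativity constraint on the regression coefficients."""
    half = np.exp(log_range) / 2.0
    return mid - half, mid + half

def huber_irls(X, y, delta=1.345, n_iter=100, tol=1e-10):
    """Huber regression via iteratively reweighted least squares (assumed
    fitting scheme; an intercept column is prepended to X)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]      # OLS starting value
    for _ in range(n_iter):
        r = y - X1 @ beta
        # Robust residual scale from the median absolute deviation.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, delta/u for large ones.
        w = np.minimum(1.0, delta / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Hypothetical simulated data: one interval-valued predictor, one response.
rng = np.random.default_rng(0)
n = 200
x_lo = rng.uniform(0, 10, n)
x_hi = x_lo + rng.uniform(0.5, 3, n)
x_mid, x_lr = to_mid_logrange(x_lo, x_hi)
y_mid = 1.0 + 2.0 * x_mid + 0.1 * rng.standard_normal(n)
y_lr = 0.5 + 0.8 * x_lr + 0.1 * rng.standard_normal(n)
y_mid[:10] += 20.0                                    # contaminate 5% of midpoints

X = np.column_stack([x_mid, x_lr])
beta_mid = huber_irls(X, y_mid)                       # midpoint model
beta_lr = huber_irls(X, y_lr)                         # log-range model

# Predicted intervals are reconstructed from the two fitted models.
X1 = np.column_stack([np.ones(n), X])
pred_lo, pred_hi = from_mid_logrange(X1 @ beta_mid, X1 @ beta_lr)
```

Because the range is modelled on the log scale, any unconstrained estimator can be plugged in for the second model; the back-transformation alone ensures that every predicted lower bound lies below its upper bound.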

Metadata
Title
Robust regression for interval-valued data based on midpoints and log-ranges
Authors
Qing Zhao
Huiwen Wang
Shanshan Wang
Publication date
05.09.2022
Publisher
Springer Berlin Heidelberg
Published in
Advances in Data Analysis and Classification / Issue 3/2023
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI
https://doi.org/10.1007/s11634-022-00518-2
