2014 | OriginalPaper | Chapter

A Box-Plot and Outliers Detection Proposal for Histogram Data: New Tools for Data Stream Analysis

Authors: Rosanna Verde, Antonio Irpino, Lidia Rivoli

Published in: Analysis and Modeling of Complex Data in Behavioral and Social Sciences

Publisher: Springer International Publishing

Abstract

In this paper, we propose a method for monitoring the evolution of data described by histograms of values. Our proposal consists of defining new order statistics on the quantile functions associated with the empirical distributions represented by the histogram data. We introduce the Median, First Quartile, and Third Quartile quantile functions, as well as a generalized representation of the box-and-whiskers plot. The proposed representations and indices are useful, for example, for identifying and classifying outliers that arrive over time in a data stream environment.
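
The abstract does not spell out how the quartile quantile functions are computed, so the following is only a minimal sketch of one plausible reading, not the authors' definition. It assumes that each histogram is turned into a piecewise-linear quantile function (values spread uniformly within each bin) and that the order statistics are taken level by level across the set of histograms. The function names `histogram_quantile` and `quartile_quantile_functions`, and the representation of a histogram as an `(edges, weights)` pair, are hypothetical choices made for illustration.

```python
from bisect import bisect_left
from statistics import quantiles


def histogram_quantile(edges, weights, t):
    """Quantile function of one histogram at level t in [0, 1].

    Assumption: values are uniformly spread inside each bin, so the
    quantile function is piecewise linear between cumulative masses.
    """
    total = sum(weights)
    cum = [0.0]
    for w in weights:
        cum.append(cum[-1] + w / total)
    cum[-1] = 1.0  # guard against floating-point drift
    if t <= 0.0:
        return edges[0]
    if t >= 1.0:
        return edges[-1]
    # Bin i satisfies cum[i] < t <= cum[i + 1], so its mass is positive.
    i = bisect_left(cum, t) - 1
    frac = (t - cum[i]) / (cum[i + 1] - cum[i])
    return edges[i] + frac * (edges[i + 1] - edges[i])


def quartile_quantile_functions(histograms, levels):
    """Level-wise first quartile, median and third quartile.

    Assumption: the order statistics are computed separately at each
    level t across all histograms (needs at least two histograms).
    """
    q1, med, q3 = [], [], []
    for t in levels:
        values = [histogram_quantile(e, w, t) for e, w in histograms]
        a, b, c = quantiles(values, n=4, method="inclusive")
        q1.append(a)
        med.append(b)
        q3.append(c)
    return q1, med, q3


# Three single-bin histograms, i.e. uniform distributions on each interval.
histograms = [([0, 10], [1]), ([0, 20], [1]), ([10, 30], [1])]
levels = [0.0, 0.25, 0.5, 0.75, 1.0]
q1, med, q3 = quartile_quantile_functions(histograms, levels)
```

Under these assumptions, the three returned lists trace the lower edge, the central line, and the upper edge of a box drawn in quantile-function space. Whiskers and outlier rules are left out, since the abstract does not describe them.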

Metadata
Title: A Box-Plot and Outliers Detection Proposal for Histogram Data: New Tools for Data Stream Analysis
Authors: Rosanna Verde, Antonio Irpino, Lidia Rivoli
Copyright Year: 2014
DOI: https://doi.org/10.1007/978-3-319-06692-9_30
