Multivariate functional outlier detection

Statistical Methods & Applications

Abstract

Functional data are occurring more and more often in practice, and various statistical techniques have been developed to analyze them. In this paper we consider multivariate functional data, where for each curve and each time point a \(p\)-dimensional vector of measurements is observed. For functional data the study of outlier detection has started only recently, and was mostly limited to univariate curves \((p=1)\). In this paper we set up a taxonomy of functional outliers, and construct new numerical and graphical techniques for the detection of outliers in multivariate functional data, with univariate curves included as a special case. Our tools include statistical depth functions and distance measures derived from them. The methods we study are affine invariant in \(p\)-dimensional space, and do not assume elliptical or any other symmetry.
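To make the data layout concrete, the following is a minimal sketch, not the method developed in the paper. It assumes the multivariate functional data are stored as `curves[i][t]`, a \(p\)-vector for curve \(i\) at time point \(t\), and scores each curve by a projection-type (Stahel-Donoho style) outlyingness averaged over time. The function names, the use of random directions, and the median/MAD scaling are all illustrative assumptions. In particular, the random-direction search is only approximately affine invariant, and the symmetric MAD scale does not address skewed distributions.

```python
import math
import random
from statistics import median


def random_directions(p, n_dirs, rng):
    """Draw n_dirs unit vectors in R^p (hypothetical helper for this sketch)."""
    dirs = []
    while len(dirs) < n_dirs:
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            dirs.append([c / norm for c in v])
    return dirs


def cross_section_outlyingness(cloud, dirs):
    """Projection outlyingness of every p-vector in `cloud` at one time point.

    For each direction u, points are projected onto u and standardized by the
    median and the MAD of the projections; the score is the maximum over u.
    """
    scores = [0.0] * len(cloud)
    for u in dirs:
        proj = [sum(a * b for a, b in zip(u, x)) for x in cloud]
        med = median(proj)
        mad = median(abs(z - med) for z in proj)
        if mad == 0.0:
            continue  # degenerate direction; simply skipped in this sketch
        for i, z in enumerate(proj):
            scores[i] = max(scores[i], abs(z - med) / mad)
    return scores


def functional_outlyingness(curves, n_dirs=200, seed=0):
    """Average the cross-sectional outlyingness of each curve over time.

    `curves[i][t]` is the p-vector observed for curve i at time point t.
    All curves are assumed to share the same time grid.
    """
    rng = random.Random(seed)
    n = len(curves)
    n_times = len(curves[0])
    p = len(curves[0][0])
    dirs = random_directions(p, n_dirs, rng)
    totals = [0.0] * n
    for t in range(n_times):
        cloud = [curve[t] for curve in curves]
        for i, s in enumerate(cross_section_outlyingness(cloud, dirs)):
            totals[i] += s
    return [s / n_times for s in totals]
```

Curves with a large average score are candidates for closer inspection. Averaging over time favors persistent deviations, so a curve that is outlying only on a short interval may be diluted; inspecting the per-time-point scores would be needed to catch that case.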

Author information

Corresponding author

Correspondence to Mia Hubert.

About this article

Cite this article

Hubert, M., Rousseeuw, P.J. & Segaert, P. Multivariate functional outlier detection. Stat Methods Appl 24, 177–202 (2015). https://doi.org/10.1007/s10260-015-0297-8

  • DOI: https://doi.org/10.1007/s10260-015-0297-8
