Skip to main content
Log in

Benchmarking local classification methods

  • Original Paper
  • Published:
Computational Statistics Aims and scope Submit manuscript

Abstract

In recent years in the fields of statistics and machine learning an increasing amount of so called local classification methods has been developed. Local approaches to classification are not new, but have lately become popular. Well-known examples are the \(k\) nearest neighbors method and classification trees. However, in most publications on this topic the term “local” is used without further explanation of its particular meaning. Only little is known about the properties of local methods and the types of classification problems for which they may be beneficial. We explain the basic principles and introduce the most important variants of local methods. To our knowledge there are very few extensive studies in the literature that compare several types of local methods and global methods across many data sets. In order to assess their performance we conduct a benchmark study on real-world and synthetic tasks. We cluster data sets and considered learning algorithms with regard to the obtained performance structures and try to relate our theoretical considerations and intuitions to these results. We also address some general issues of benchmark studies and cover some pitfalls, extensions and improvements.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

References

  • Alpaydin E, Jordan M (1996) Local linear perceptrons for classification. IEEE Trans Neural Netw 7(3): 788–792

    Google Scholar 

  • Atkeson C, Moore A, Schaal S (1997) Locally weighted learning. Artif Intell Rev 11:11–73

    Article  Google Scholar 

  • Binder H, Schumacher M (2008) Adapting prediction error estimates for biased complexity selection in high-dimensional bootstrap samples. Stat Appl Genet Mol Biol 7(1):Article 12

  • Bischl B (2010) mlr: Machine learning in R. http://mlr.r-forge.r-project.org/

  • Bishop CM (2006) Pattern recognition and machine learning. Information Science and Statistics, Springer, New York

    MATH  Google Scholar 

  • Blanzieri E, Melgani F (2006) An adaptive SVM nearest neighbor classifier for remotely sensed imagery In: Proceedings of the IEEE international conference on geoscience and remote sensing, symposium (IGARSS-2006), pp 3931–3934

  • Brailovsky V, Barzilay O, Shahave R (1999) On global, local, mixed and neighborhood kernels for support vector machines. Pattern Recognit Lett 20:1183–1190

    Google Scholar 

  • Breiman L (2001) Random forests. Mach Learn 45(1):5–32

    Article  MATH  Google Scholar 

  • Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, Belmont

  • Chen D, Burrell P (2001) On decision-boundary-based approaches for structure selection of multilayered neural nets. In: Proceedings of the international conferences on info-tech and info-net 2001, ICII 2001, vol 3, pp 486–490

  • Cheng H, Tan PN, Jin R (2010) Efficient algorithm for localized support vector machine. IEEE Trans Knowl Data Eng 22(4):537–549

    Article  Google Scholar 

  • Clarke B, Fokoué E, Zhang H (2009) principles and theory for data mining and machine learning. Springer Series in Statistics, Springer, New York

    Book  MATH  Google Scholar 

  • Czogiel I, Luebke K, Zentgraf M, Weihs C (2007) Localized linear discriminant analysis. In: Decker R, Lenz H (eds) Advances in data analysis, Studies in Classification, data analysis, and knowledge organization, vol 34. Springer, Berlin, pp 133–140

    Google Scholar 

  • Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Applications of mathematics, Springer, New York

    Book  MATH  Google Scholar 

  • Dimitriadou E, Hornik K, Leisch F, Meyer D, Weingessel A (2010) e1071: Misc functions of the Department of Statistics (e1071), TU Wien

  • Eugster M, Hothorn T, Leisch F (2008) Exploratory and inferential analysis of benchmark experiments. Tech. Rep. 30, Department of Statistics, LMU München

  • Fix E, Hodges J (1951) Discriminatory analysis-nonparametric discrimination: consistency properties. Report 4, U.S. Airforce School of Aviation Medicine, Randolph Field, Texas

  • Foody G (1999) The significance of border training patterns in classification by a feedforward neural network using back propagation learning. Int J Remote Sens 20(18):3549–3562

    Article  Google Scholar 

  • Frank A, Asuncion A, University of California, Irvine, School of Information and Computer Sciences (2010) UCI machine learning repository

  • Gönen M, Alpaydin E (2008) Localized multiple kernel learning. In: Proceedings of the 25th international conference on machine learning (ICML ’08), pp 352–359

  • Hand DJ, Vinciotti V (2003) Local versus global models for classification problems: fitting models where it matters. Am Stat 57(2):124–131

    Article  MathSciNet  Google Scholar 

  • Hastie T, Tibshirani R (1996) Discriminant adaptive nearest neighbor classification. IEEE Trans Pattern Anal Mach Intell 18(6):607–616

    Article  Google Scholar 

  • Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning. Data mining, inference, and prediction. Springer Series in Statistics, Springer, New York

  • S original by Hastie T, Tibshirani R, Original R port by Leisch F, Hornik K, Ripley B (2009) mda: Mixture and flexible discriminant analysis

  • Hothorn T, Leisch F, Zeileis A, Hornik K (2005) The design and analysis of benchmark experiments. J Comput Graph Stat 14:675–699

    Google Scholar 

  • Kim H, Loh WY (2003) Classification trees with bivariate linear discriminant node models. J Comput Graph Stat 12:512–530

    Article  MathSciNet  Google Scholar 

  • Kohonen T (1989) Self-organization and associative memory. Springer, Berlin

    Book  Google Scholar 

  • Kotsiantis S, Pintelas P (2004) Local boosting of weak classifiers. In: Proceedings of the IEEE 4th international conference on intelligent systems design and applications (ISDA 2004), pp 175–180

  • Landwehr N, Hall M, Frank E (2005) Logistic model trees. Mach Learn 59:161–205

    Google Scholar 

  • Leisch F, Dimitriadou E (2010) mlbench: Machine learning benchmark problems. R package version 2.0-0

  • Liaw A, Wiener M (2002) Classification and regression by randomForest. R News 2(3):18–22

    Google Scholar 

  • Lyhyaoui A, Martinez M, Mora I, Vaquez M, Sancho J, Figueiras-Vidal A (1999) Sample selection via clustering to construct support vector-like classifiers. IEEE Trans Neural Netw 10(6):1474–1481

    Google Scholar 

  • Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281

    Article  MATH  Google Scholar 

  • R Development Core Team (2010) R: A language and environment for statistical computing

  • Saitta L, Neri F (1998) Learning in the “real world”. Mach Learn 30(2–3):133–163

    Article  Google Scholar 

  • Schiffner J, Bischl B, Weihs C (2012) Bias-variance analysis of local classification methods. In: Gaul WA, Geyer-Schulz A, Schmidt-Thieme L, Kunze J (eds) Challenges at the interface of data analysis, computer science, and optimization, Studies in Classification, data analysis, and knowledge organization, vol 43. Springer, Berlin, pp 49–57

    Chapter  Google Scholar 

  • Schliep K, Hechenbichler K (2010) kknn: Weighted k-nearest neighbors

  • Seewald AK, Petrak J, Widmer G (2001) Hybrid decision tree learners with alternative leaf classifiers: an empirical study. Proceedings of the fourteenth international Florida artificial intelligence research society conference. AAAI Press, pp 407–411

  • Segata N, Blanzieri E (2010a) Fast and scalable local kernel machines. J Mach Learn Res 11:1883–1926

    MathSciNet  MATH  Google Scholar 

  • Segata N, Blanzieri E (2010b) Operators for transforming kernels into quasi-local kernels that improve SVM accuracy. J Intell Inform Syst 1–32

  • Shen Z (2006) Classification: distance-based algorithms. In: Berry M, Browne M (eds) Lecture notes in data mining. World Scientific Publishing, Singapore

    Google Scholar 

  • Shin H, Cho S (2002) Pattern selection for support vector classifiers. In: Yin H, Allinson N, Freeman R, Keane J, Hubbard S (eds) Intelligent data engineering and automated learning-IDEAL 2002, Lecture Notes in Computer Science, vol 2412. Springer, Berlin, pp 97–103

    Google Scholar 

  • Simon R (2007) Fundamentals of data mining in genomics and proteomics, chap resampling strategies for model assessment and selection, Springer, USA pp 173–186

  • Smits G, Jordaan E (2002) Improved SVM regression using mixtures of kernels. In: Proceedings of the international joint conference on, neural networks, vol 3, pp 2785–2790

  • Soares C (2003) Is the UCI repository useful for data mining? Prog Artif Intell 2902:209–223

    Article  Google Scholar 

  • Sugiyama M (2007) Dimensionality reduction of multimodal labeled data by local Fisher discriminant analysis. J Mach Learn Res 8:1027–1061

    MATH  Google Scholar 

  • Szepannek G, Schiffner J, Wilson J, Weihs C (2008) Local modelling in classification. In: Perner P (ed) Advances in data mining. Medical applications, e-commerce, marketing, and theoretical aspects, Lecture Notes in Computer Science, vol 5077. Springer, Berlin, pp 153–164

  • Therneau T, Atkinson B, R port by Ripley B (2010) rpart: Recursive partitioning

  • Tutz G, Binder H (2005) Localized classification. Stat Comput 15:155–166

    Article  MathSciNet  Google Scholar 

  • Venables WN, Ripley BD (2002) Modern applied statistics with S. Springer, New York

    Book  MATH  Google Scholar 

  • Weihs C, Ligges U, Luebke K, Raabe N (2005) klaR analyzing german business cycles. In: Baier D, Decker R, Schmidt-Thieme L (eds) Data analysis and decision support, Studies in Classification, data analysis, and knowledge organization, vol 30. Springer, Berlin, pp 335–343

    Chapter  Google Scholar 

  • Zeileis A, Hothorn T, Hornik K (2008) Model-based recursive partitioning. J Comput Graph Stat 17(2): 492–514

    Google Scholar 

  • Zhang CX, Zhang JS (2008) A local boosting algorithm for solving classification problems. Comput Stat Data Anal 52:1928–1941

    Article  MATH  Google Scholar 

  • Zhang H, Berg A, Maire M, Malik J (2006) SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR 2006), vol 2

  • Zhu Yf, Tian Lf, Mao Zy, Wei LfT (2005) Mixtures of kernels for SVM modeling. In: Wang L, Chen K, Ong Y (eds) Advances in natural computation, Lecture Notes in Computer Science, vol 3610. Springer, Berlin, pp 601–607

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Bernd Bischl.

Additional information

Bernd Bischl and Julia Schiffner contributed equally to this work.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Bischl, B., Schiffner, J. & Weihs, C. Benchmarking local classification methods. Comput Stat 28, 2599–2619 (2013). https://doi.org/10.1007/s00180-013-0420-y

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00180-013-0420-y

Keywords

Navigation