Abstract
In recent years in the fields of statistics and machine learning an increasing amount of so called local classification methods has been developed. Local approaches to classification are not new, but have lately become popular. Well-known examples are the \(k\) nearest neighbors method and classification trees. However, in most publications on this topic the term “local” is used without further explanation of its particular meaning. Only little is known about the properties of local methods and the types of classification problems for which they may be beneficial. We explain the basic principles and introduce the most important variants of local methods. To our knowledge there are very few extensive studies in the literature that compare several types of local methods and global methods across many data sets. In order to assess their performance we conduct a benchmark study on real-world and synthetic tasks. We cluster data sets and considered learning algorithms with regard to the obtained performance structures and try to relate our theoretical considerations and intuitions to these results. We also address some general issues of benchmark studies and cover some pitfalls, extensions and improvements.
Similar content being viewed by others
References
Alpaydin E, Jordan M (1996) Local linear perceptrons for classification. IEEE Trans Neural Netw 7(3): 788–792
Atkeson C, Moore A, Schaal S (1997) Locally weighted learning. Artif Intell Rev 11:11–73
Binder H, Schumacher M (2008) Adapting prediction error estimates for biased complexity selection in high-dimensional bootstrap samples. Stat Appl Genet Mol Biol 7(1):Article 12
Bischl B (2010) mlr: Machine learning in R. http://mlr.r-forge.r-project.org/
Bishop CM (2006) Pattern recognition and machine learning. Information Science and Statistics, Springer, New York
Blanzieri E, Melgani F (2006) An adaptive SVM nearest neighbor classifier for remotely sensed imagery In: Proceedings of the IEEE international conference on geoscience and remote sensing, symposium (IGARSS-2006), pp 3931–3934
Brailovsky V, Barzilay O, Shahave R (1999) On global, local, mixed and neighborhood kernels for support vector machines. Pattern Recognit Lett 20:1183–1190
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, Belmont
Chen D, Burrell P (2001) On decision-boundary-based approaches for structure selection of multilayered neural nets. In: Proceedings of the international conferences on info-tech and info-net 2001, ICII 2001, vol 3, pp 486–490
Cheng H, Tan PN, Jin R (2010) Efficient algorithm for localized support vector machine. IEEE Trans Knowl Data Eng 22(4):537–549
Clarke B, Fokoué E, Zhang H (2009) principles and theory for data mining and machine learning. Springer Series in Statistics, Springer, New York
Czogiel I, Luebke K, Zentgraf M, Weihs C (2007) Localized linear discriminant analysis. In: Decker R, Lenz H (eds) Advances in data analysis, Studies in Classification, data analysis, and knowledge organization, vol 34. Springer, Berlin, pp 133–140
Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Applications of mathematics, Springer, New York
Dimitriadou E, Hornik K, Leisch F, Meyer D, Weingessel A (2010) e1071: Misc functions of the Department of Statistics (e1071), TU Wien
Eugster M, Hothorn T, Leisch F (2008) Exploratory and inferential analysis of benchmark experiments. Tech. Rep. 30, Department of Statistics, LMU München
Fix E, Hodges J (1951) Discriminatory analysis-nonparametric discrimination: consistency properties. Report 4, U.S. Airforce School of Aviation Medicine, Randolph Field, Texas
Foody G (1999) The significance of border training patterns in classification by a feedforward neural network using back propagation learning. Int J Remote Sens 20(18):3549–3562
Frank A, Asuncion A, University of California, Irvine, School of Information and Computer Sciences (2010) UCI machine learning repository
Gönen M, Alpaydin E (2008) Localized multiple kernel learning. In: Proceedings of the 25th international conference on machine learning (ICML ’08), pp 352–359
Hand DJ, Vinciotti V (2003) Local versus global models for classification problems: fitting models where it matters. Am Stat 57(2):124–131
Hastie T, Tibshirani R (1996) Discriminant adaptive nearest neighbor classification. IEEE Trans Pattern Anal Mach Intell 18(6):607–616
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning. Data mining, inference, and prediction. Springer Series in Statistics, Springer, New York
S original by Hastie T, Tibshirani R, Original R port by Leisch F, Hornik K, Ripley B (2009) mda: Mixture and flexible discriminant analysis
Hothorn T, Leisch F, Zeileis A, Hornik K (2005) The design and analysis of benchmark experiments. J Comput Graph Stat 14:675–699
Kim H, Loh WY (2003) Classification trees with bivariate linear discriminant node models. J Comput Graph Stat 12:512–530
Kohonen T (1989) Self-organization and associative memory. Springer, Berlin
Kotsiantis S, Pintelas P (2004) Local boosting of weak classifiers. In: Proceedings of the IEEE 4th international conference on intelligent systems design and applications (ISDA 2004), pp 175–180
Landwehr N, Hall M, Frank E (2005) Logistic model trees. Mach Learn 59:161–205
Leisch F, Dimitriadou E (2010) mlbench: Machine learning benchmark problems. R package version 2.0-0
Liaw A, Wiener M (2002) Classification and regression by randomForest. R News 2(3):18–22
Lyhyaoui A, Martinez M, Mora I, Vaquez M, Sancho J, Figueiras-Vidal A (1999) Sample selection via clustering to construct support vector-like classifiers. IEEE Trans Neural Netw 10(6):1474–1481
Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281
R Development Core Team (2010) R: A language and environment for statistical computing
Saitta L, Neri F (1998) Learning in the “real world”. Mach Learn 30(2–3):133–163
Schiffner J, Bischl B, Weihs C (2012) Bias-variance analysis of local classification methods. In: Gaul WA, Geyer-Schulz A, Schmidt-Thieme L, Kunze J (eds) Challenges at the interface of data analysis, computer science, and optimization, Studies in Classification, data analysis, and knowledge organization, vol 43. Springer, Berlin, pp 49–57
Schliep K, Hechenbichler K (2010) kknn: Weighted k-nearest neighbors
Seewald AK, Petrak J, Widmer G (2001) Hybrid decision tree learners with alternative leaf classifiers: an empirical study. Proceedings of the fourteenth international Florida artificial intelligence research society conference. AAAI Press, pp 407–411
Segata N, Blanzieri E (2010a) Fast and scalable local kernel machines. J Mach Learn Res 11:1883–1926
Segata N, Blanzieri E (2010b) Operators for transforming kernels into quasi-local kernels that improve SVM accuracy. J Intell Inform Syst 1–32
Shen Z (2006) Classification: distance-based algorithms. In: Berry M, Browne M (eds) Lecture notes in data mining. World Scientific Publishing, Singapore
Shin H, Cho S (2002) Pattern selection for support vector classifiers. In: Yin H, Allinson N, Freeman R, Keane J, Hubbard S (eds) Intelligent data engineering and automated learning-IDEAL 2002, Lecture Notes in Computer Science, vol 2412. Springer, Berlin, pp 97–103
Simon R (2007) Fundamentals of data mining in genomics and proteomics, chap resampling strategies for model assessment and selection, Springer, USA pp 173–186
Smits G, Jordaan E (2002) Improved SVM regression using mixtures of kernels. In: Proceedings of the international joint conference on, neural networks, vol 3, pp 2785–2790
Soares C (2003) Is the UCI repository useful for data mining? Prog Artif Intell 2902:209–223
Sugiyama M (2007) Dimensionality reduction of multimodal labeled data by local Fisher discriminant analysis. J Mach Learn Res 8:1027–1061
Szepannek G, Schiffner J, Wilson J, Weihs C (2008) Local modelling in classification. In: Perner P (ed) Advances in data mining. Medical applications, e-commerce, marketing, and theoretical aspects, Lecture Notes in Computer Science, vol 5077. Springer, Berlin, pp 153–164
Therneau T, Atkinson B, R port by Ripley B (2010) rpart: Recursive partitioning
Tutz G, Binder H (2005) Localized classification. Stat Comput 15:155–166
Venables WN, Ripley BD (2002) Modern applied statistics with S. Springer, New York
Weihs C, Ligges U, Luebke K, Raabe N (2005) klaR analyzing german business cycles. In: Baier D, Decker R, Schmidt-Thieme L (eds) Data analysis and decision support, Studies in Classification, data analysis, and knowledge organization, vol 30. Springer, Berlin, pp 335–343
Zeileis A, Hothorn T, Hornik K (2008) Model-based recursive partitioning. J Comput Graph Stat 17(2): 492–514
Zhang CX, Zhang JS (2008) A local boosting algorithm for solving classification problems. Comput Stat Data Anal 52:1928–1941
Zhang H, Berg A, Maire M, Malik J (2006) SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR 2006), vol 2
Zhu Yf, Tian Lf, Mao Zy, Wei LfT (2005) Mixtures of kernels for SVM modeling. In: Wang L, Chen K, Ong Y (eds) Advances in natural computation, Lecture Notes in Computer Science, vol 3610. Springer, Berlin, pp 601–607
Author information
Authors and Affiliations
Corresponding author
Additional information
Bernd Bischl and Julia Schiffner contributed equally to this work.
Rights and permissions
About this article
Cite this article
Bischl, B., Schiffner, J. & Weihs, C. Benchmarking local classification methods. Comput Stat 28, 2599–2619 (2013). https://doi.org/10.1007/s00180-013-0420-y
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00180-013-0420-y