Published in: Soft Computing 19/2017

13.04.2016 | Methodologies and Application

An experimental study on rank methods for prototype selection

Authors: Jose J. Valero-Mas, Jorge Calvo-Zaragoza, Juan R. Rico-Juan, José M. Iñesta

Abstract

Prototype selection is one of the most popular approaches for addressing the low efficiency typically found in the well-known k-Nearest Neighbour classification rule. These techniques select a representative subset from an original collection of prototypes with the aim of maintaining the same classification accuracy. More recently, rank methods have been proposed as an alternative for developing new selection strategies. Following a certain heuristic, these methods sort the elements of the initial collection according to their relevance and then select the best possible subset by means of a parameter representing the amount of data to maintain. Due to the relative novelty of these methods, their performance and competitiveness against other strategies are still unclear. This work performs an exhaustive experimental study of such methods for prototype selection. A representative collection of both classic and sophisticated algorithms is compared to the aforementioned techniques on a number of datasets, including different levels of induced noise. Results show that these rank methods are remarkably competitive and offer an excellent trade-off between prototype reduction and achieved accuracy.
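The rank-then-select idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the heuristic, so everything here is an assumption made for illustration: `vote_scores` is a hypothetical stand-in in which each prototype votes for its nearest same-class neighbour, and `keep` is a hypothetical name for the parameter that controls how much data is retained. It is not the authors' method.

```python
from collections import defaultdict
from math import dist


def vote_scores(points, labels):
    """Hypothetical relevance heuristic (illustration only): every prototype
    gives one vote to its nearest neighbour of the same class."""
    votes = [0] * len(points)
    for i, (p, y) in enumerate(zip(points, labels)):
        same = [j for j in range(len(points)) if j != i and labels[j] == y]
        if same:
            nearest = min(same, key=lambda j: dist(p, points[j]))
            votes[nearest] += 1
    return votes


def rank_select(points, labels, keep=0.5):
    """Rank the prototypes of each class by score, then keep the top-ranked
    ones until they hold at least a fraction `keep` of the class's votes."""
    scores = vote_scores(points, labels)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    selected = []
    for idxs in by_class.values():
        # Sort step: most relevant prototypes first.
        ranked = sorted(idxs, key=lambda i: scores[i], reverse=True)
        total = sum(scores[i] for i in ranked)
        acc = 0
        # Select step: accumulate votes until the requested share is reached.
        for i in ranked:
            selected.append(i)
            acc += scores[i]
            if total == 0 or acc >= keep * total:
                break
    return sorted(selected)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (0, 3), (5, 5), (5, 6)]
    labs = ["a", "a", "a", "b", "b"]
    print(rank_select(pts, labs, keep=0.5))
```

A smaller `keep` retains fewer prototypes per class, which is the reduction-versus-accuracy trade-off the abstract refers to. Selecting per class is a design choice of this sketch so that no class is emptied.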

Footnotes
1. Given that this number of elements is highly dependent on the memory and computation capabilities of the system considered, we shall restrict ourselves to the definition by Garcia et al. (2012), in which this threshold is set to 2000 prototypes.
 
References
Angiulli F (2007) Fast nearest neighbor condensation for large data sets classification. IEEE Trans Knowl Data Eng 19(11):1450–1464
Brighton H, Mellish C (1999) On the consistency of information filters for lazy learning algorithms. In: Zytkow J, Rauch J (eds) Principles of data mining and knowledge discovery. Lecture notes in computer science, vol 1704. Springer, Berlin, pp 283–288
Calvo-Zaragoza J, Oncina J (2014) Recognition of pen-based music notation: the HOMUS dataset. In: Proceedings of the 22nd international conference on pattern recognition. Stockholm, Sweden, pp 3038–3043
Calvo-Zaragoza J, Valero-Mas JJ, Rico-Juan JR (2016) Prototype generation on structural data using dissimilarity space representation. Neural Comput Appl. doi:10.1007/s00521-016-2278-8
Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans Inf Theory 13(1):21–27
Dasarathy BV, Sánchez JS, Townsend S (2000) Nearest neighbour editing and condensing tools-synergy exploitation. Pattern Anal Appl 19–30
Devijver PA, Kittler J (1982) Pattern recognition: a statistical approach. Prentice Hall, Upper Saddle River
Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, New York
Eshelman LJ (1990) The CHC adaptive search algorithm: how to have safe search when engaging in nontraditional genetic recombination. In: Proceedings of the first workshop on foundations of genetic algorithms. Bloomington Campus, Indiana, pp 265–283
Garcia S, Derrac J, Cano J, Herrera F (2012) Prototype selection for nearest neighbor classification: taxonomy and empirical study. IEEE Trans Pattern Anal Mach Intell 34(3):417–435. doi:10.1109/TPAMI.2011.142
Hart P (1968) The condensed nearest neighbor rule (corresp.). IEEE Trans Inf Theory 14(3):515–516
Natarajan N, Dhillon I, Ravikumar P, Tewari A (2013) Learning with noisy labels. In: Advances in neural information processing systems, pp 1196–1204
Rico-Juan JR, Iñesta JM (2012) New rank methods for reducing the size of the training set using the nearest neighbor rule. Pattern Recognit Lett 33(5):654–660
Sakoe H, Chiba S (1990) Dynamic programming algorithm optimization for spoken word recognition. In: Waibel A, Lee KF (eds) Readings in speech recognition. Morgan Kaufmann Publishers Inc., San Francisco, pp 159–165
Wilson DL (1972) Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans Syst Man Cybern SMC-2(3):408–421. doi:10.1109/TSMC.1972.4309137
Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34
Metadata
Title
An experimental study on rank methods for prototype selection
Authors
Jose J. Valero-Mas
Jorge Calvo-Zaragoza
Juan R. Rico-Juan
José M. Iñesta
Publication date
13.04.2016
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 19/2017
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-016-2148-4
