Skip to main content
Erschienen in: Pattern Analysis and Applications 3-4/2008

01.09.2008 | Theoretical Advances

Converting non-parametric distance-based classification to anytime algorithms

verfasst von: Xiaopeng Xi, Ken Ueno, Eamonn Keogh, Dah-Jye Lee

Erschienen in: Pattern Analysis and Applications | Ausgabe 3-4/2008

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

For many real world problems we must perform classification under widely varying amounts of computational resources. For example, if asked to classify an instance taken from a bursty stream, we may have anywhere from several milliseconds to several minutes to return a class prediction. For such problems an anytime algorithm may be especially useful. In this work we show how we convert the ubiquitous nearest neighbor classifier into an anytime algorithm that can produce an instant classification, or if given the luxury of additional time, can continue computations to increase classification accuracy. We demonstrate the utility of our approach with a comprehensive set of experiments on data from diverse domains. We further show the utility of our work with two deployed applications, in classifying and counting fish, and in classifying insects.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
Note that one of the current authors is also an author of this study. However, it is more natural to refer to this work in the third person.
 
Literatur
1.
Zurück zum Zitat Aggarwal C, Han J, Wang J, Yu PS (2004) On demand classification of data streams. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining (KDD’04), pp 503–508 Aggarwal C, Han J, Wang J, Yu PS (2004) On demand classification of data streams. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining (KDD’04), pp 503–508
2.
Zurück zum Zitat Shah R, Krishnaswamy S, Gaber MM (2005) Resource-aware very fast K-means for ubiquitous data stream mining. In: Proceedings of 2nd international workshop on knowledge discovery in data streams Shah R, Krishnaswamy S, Gaber MM (2005) Resource-aware very fast K-means for ubiquitous data stream mining. In: Proceedings of 2nd international workshop on knowledge discovery in data streams
3.
Zurück zum Zitat Grass J, Zilberstein S (1996) Anytime algorithm development tools. SIGART artificial intelligence, vol 7(2). ACM Press, New York Grass J, Zilberstein S (1996) Anytime algorithm development tools. SIGART artificial intelligence, vol 7(2). ACM Press, New York
4.
Zurück zum Zitat Bradley P, Fayyad U, Reina C (1998) Scaling clustering algorithms to large databases. In: Proceedings of the 4th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’98), pp 9–15 Bradley P, Fayyad U, Reina C (1998) Scaling clustering algorithms to large databases. In: Proceedings of the 4th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’98), pp 9–15
5.
Zurück zum Zitat Esmeir S, Markovitch S (2005) Interruptible anytime algorithms for iterative improvement of decision trees. In: Proceedings of workshop on the utility-based data mining, held with the 11th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’05) Esmeir S, Markovitch S (2005) Interruptible anytime algorithms for iterative improvement of decision trees. In: Proceedings of workshop on the utility-based data mining, held with the 11th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’05)
6.
Zurück zum Zitat Roy N, McCallum A (2001) Toward optimal active learning through sampling estimation of error reduction. In: Proceedings of the 18th international conference on machine learning (ICML’01), pp 441–448 Roy N, McCallum A (2001) Toward optimal active learning through sampling estimation of error reduction. In: Proceedings of the 18th international conference on machine learning (ICML’01), pp 441–448
7.
Zurück zum Zitat Wei L, Keogh E, Van Herle H, Mafra-Neto A (2005) Atomic Wedgie: efficient query filtering for streaming time series. In: Proceedings of the 5th IEEE international conference on data mining (ICDM’05), pp 490–497 Wei L, Keogh E, Van Herle H, Mafra-Neto A (2005) Atomic Wedgie: efficient query filtering for streaming time series. In: Proceedings of the 5th IEEE international conference on data mining (ICDM’05), pp 490–497
8.
Zurück zum Zitat Bozma HI, Yalcin H (2002) Visual processing and classification of items on a moving conveyor: a selective perception approach. Rob Comput Integr Manuf 18(2):125–133CrossRef Bozma HI, Yalcin H (2002) Visual processing and classification of items on a moving conveyor: a selective perception approach. Rob Comput Integr Manuf 18(2):125–133CrossRef
9.
Zurück zum Zitat Adamek T, Connor NE (2004) A multiscale representation method for nonrigid shapes with a single closed contour. IEEE Circuits Syst Video Technol 14:742–753CrossRef Adamek T, Connor NE (2004) A multiscale representation method for nonrigid shapes with a single closed contour. IEEE Circuits Syst Video Technol 14:742–753CrossRef
10.
Zurück zum Zitat Wen Z, Tao Y (2000) Dual-camera NIR/MIR imaging for stem-end/calyx identification in apple defect sorting. Trans ASAE 43(2):446–452 Wen Z, Tao Y (2000) Dual-camera NIR/MIR imaging for stem-end/calyx identification in apple defect sorting. Trans ASAE 43(2):446–452
11.
Zurück zum Zitat Sánchez JS, Mollineda RA, Sotoca JM (2007) An analysis of how training data complexity affects the nearest neighbor classifiers. Pattern Anal Appl 10(3):189–201CrossRefMathSciNet Sánchez JS, Mollineda RA, Sotoca JM (2007) An analysis of how training data complexity affects the nearest neighbor classifiers. Pattern Anal Appl 10(3):189–201CrossRefMathSciNet
12.
Zurück zum Zitat Zilberstein S, Russell S (1995) Approximate reasoning using anytime algorithms. Imprecise and approximate computation. Kluwer, Dordrecht Zilberstein S, Russell S (1995) Approximate reasoning using anytime algorithms. Imprecise and approximate computation. Kluwer, Dordrecht
13.
Zurück zum Zitat Myers K, Kearns MJ, Singh SP, Walker MA (2000) A boosting approach to topic spotting on subdialogues. In: Proceedings of the international conference on machine learning (ICML’00), pp 655–662 Myers K, Kearns MJ, Singh SP, Walker MA (2000) A boosting approach to topic spotting on subdialogues. In: Proceedings of the international conference on machine learning (ICML’00), pp 655–662
14.
Zurück zum Zitat Heidemann G, Bekel H, Bax II, Ritter H (2005) Interactive online learning. Pattern Recogn Image Anal 15(1):55–58 Heidemann G, Bekel H, Bax II, Ritter H (2005) Interactive online learning. Pattern Recogn Image Anal 15(1):55–58
15.
Zurück zum Zitat Yamada S, Nagino N (2002) Constructing a personal web map with anytime-control of web robots. Int J Coop Inform Syst 11(1–2):1–19CrossRef Yamada S, Nagino N (2002) Constructing a personal web map with anytime-control of web robots. Int J Coop Inform Syst 11(1–2):1–19CrossRef
16.
Zurück zum Zitat Kotenko I, Stankevitch L (2002) The control of teams of autonomous objects in the time-constrained environments. Proc IEEE Int Conf Artif Intell Syst 158–163 Kotenko I, Stankevitch L (2002) The control of teams of autonomous objects in the time-constrained environments. Proc IEEE Int Conf Artif Intell Syst 158–163
17.
Zurück zum Zitat Lindgren T (2000) Anytime inductive logic programming. In: Proceedings of the 15th international conference on computers and their applications, pp 439–442 Lindgren T (2000) Anytime inductive logic programming. In: Proceedings of the 15th international conference on computers and their applications, pp 439–442
18.
Zurück zum Zitat Hulten G, Domingos P (2002) Mining complex models from arbitrarily large databases in constant time. In: Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’02), pp 525–531 Hulten G, Domingos P (2002) Mining complex models from arbitrarily large databases in constant time. In: Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’02), pp 525–531
19.
Zurück zum Zitat Grumberg O, Livne S, Markovitch S (2003) Learning to order BDD variables in verification. J Artif Intell Res 18:83–116MATHMathSciNet Grumberg O, Livne S, Markovitch S (2003) Learning to order BDD variables in verification. J Artif Intell Res 18:83–116MATHMathSciNet
20.
Zurück zum Zitat Webb GI, Yang Y, Boughton J, Korb K, Ting K-M (2005) Classifying under computational resource constraints: anytime classification using probabilistic estimators. Technical Report 2005/185, Clayton School of Information Technology, Monash University Webb GI, Yang Y, Boughton J, Korb K, Ting K-M (2005) Classifying under computational resource constraints: anytime classification using probabilistic estimators. Technical Report 2005/185, Clayton School of Information Technology, Monash University
21.
Zurück zum Zitat Barandela R, Ferri FJ, Sánchez JS (2005) Decision boundary preserving prototype selection for nearest neighbor classification. Int J Pattern Recogn Artif Intell 19(6):787–806CrossRef Barandela R, Ferri FJ, Sánchez JS (2005) Decision boundary preserving prototype selection for nearest neighbor classification. Int J Pattern Recogn Artif Intell 19(6):787–806CrossRef
22.
Zurück zum Zitat Pekalska E, Duin R, Paclik P (2006) Prototype selection for dissimilarity-based classifiers. Pattern Recogn 39(2):189–208MATHCrossRef Pekalska E, Duin R, Paclik P (2006) Prototype selection for dissimilarity-based classifiers. Pattern Recogn 39(2):189–208MATHCrossRef
23.
Zurück zum Zitat Wilson DR, Martinez TR (2000) Reduction techniques for instance-based learning algorithms. Mach Learn 38:257–286. (Kluwer Acadamic Publishers)MATHCrossRef Wilson DR, Martinez TR (2000) Reduction techniques for instance-based learning algorithms. Mach Learn 38:257–286. (Kluwer Acadamic Publishers)MATHCrossRef
24.
Zurück zum Zitat Herrero JR, Juan J Navarro (2007) Exploiting computer resources for fast Nearest Neighbor Classification. Pattern Anal Appl 10(4):265–275CrossRef Herrero JR, Juan J Navarro (2007) Exploiting computer resources for fast Nearest Neighbor Classification. Pattern Anal Appl 10(4):265–275CrossRef
25.
Zurück zum Zitat Ueno K, Xi X, Keogh E, Lee D-J (2006) Anytime classification using the nearest neighbor algorithm with applications to stream mining. In: Proceedings of the 6th IEEE international conference on data mining (ICDM’06), pp 623–632 Ueno K, Xi X, Keogh E, Lee D-J (2006) Anytime classification using the nearest neighbor algorithm with applications to stream mining. In: Proceedings of the 6th IEEE international conference on data mining (ICDM’06), pp 623–632
26.
Zurück zum Zitat Xi X, Keogh E, Shelton C, Wei L, Ratanamahatana CA (2006) Fast time series classification using numerosity reduction. In: Proceedings of the 23nd international conference on machine learning (ICDM’06), Pittsburgh, PA Xi X, Keogh E, Shelton C, Wei L, Ratanamahatana CA (2006) Fast time series classification using numerosity reduction. In: Proceedings of the 23nd international conference on machine learning (ICDM’06), Pittsburgh, PA
27.
Zurück zum Zitat Keogh E, Kasetty S (2002) On the need for time series data mining benchmarks: a survey and empirical demonstration. In: Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’02), Edmonton, Canada, pp 102–111 Keogh E, Kasetty S (2002) On the need for time series data mining benchmarks: a survey and empirical demonstration. In: Proceedings of the 8th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’02), Edmonton, Canada, pp 102–111
28.
Zurück zum Zitat Keogh E, Wei L, Xi X, Lee SH, Vlachos M (2006) LB_Keogh supports exact indexing of shapes under rotation invariance with arbitrary representations and distance measures. In: Proceedings of the 32nd International Conference on Very Large Data Bases, pp 882–893 Keogh E, Wei L, Xi X, Lee SH, Vlachos M (2006) LB_Keogh supports exact indexing of shapes under rotation invariance with arbitrary representations and distance measures. In: Proceedings of the 32nd International Conference on Very Large Data Bases, pp 882–893
31.
Zurück zum Zitat Geurts P (2002) Contributions to decision tree induction: bias/variance tradeoff and time series classification. Ph.D. thesis, Department of Electrical Engineering and Computer Science, University of Liege, Belgium Geurts P (2002) Contributions to decision tree induction: bias/variance tradeoff and time series classification. Ph.D. thesis, Department of Electrical Engineering and Computer Science, University of Liege, Belgium
32.
Zurück zum Zitat Ratanamahatana CA, Keogh E (2005) Three myths about dynamic time warping. In: Proceedings of the SIAM international conference on data mining (SDM’05), pp 506–510 Ratanamahatana CA, Keogh E (2005) Three myths about dynamic time warping. In: Proceedings of the SIAM international conference on data mining (SDM’05), pp 506–510
33.
Zurück zum Zitat Lee D-J, Schoenberger R, Shiozawa D, Xu X, Zhan P (2004) Contour matching for a fish recognition and migration monitoring system. In: Proceedings of the SPIE optics east, two and three-dimensional vision systems for inspection, control, and metrology II, vol 5606-05, pp 25–28 Lee D-J, Schoenberger R, Shiozawa D, Xu X, Zhan P (2004) Contour matching for a fish recognition and migration monitoring system. In: Proceedings of the SPIE optics east, two and three-dimensional vision systems for inspection, control, and metrology II, vol 5606-05, pp 25–28
34.
Zurück zum Zitat Hardin RW (2006) Vision system monitors fish populations. Vis Syst Des (January) Hardin RW (2006) Vision system monitors fish populations. Vis Syst Des (January)
35.
Zurück zum Zitat Jolliffe IT (2002) Principal component analysis. Springer, HeidelbergMATH Jolliffe IT (2002) Principal component analysis. Springer, HeidelbergMATH
36.
Zurück zum Zitat Chung K-C, Kee SC, Kim SR (1999) Face recognition using principal component analysis of Gabor filter responses. In: Proceedings of the international workshop on recognition, analysis, and tracking of faces and gestures in real-time systems, pp 53–57 Chung K-C, Kee SC, Kim SR (1999) Face recognition using principal component analysis of Gabor filter responses. In: Proceedings of the international workshop on recognition, analysis, and tracking of faces and gestures in real-time systems, pp 53–57
37.
Zurück zum Zitat Bimbo AD (1999) Visual information retrieval. Morgan Khaufman, San Franscico Bimbo AD (1999) Visual information retrieval. Morgan Khaufman, San Franscico
38.
Zurück zum Zitat Guttman A (1984) R-trees: a dynamic index structure for spatial searching. In: Proceedings of international conference on management of data, Boston, MA, pp 47–57 Guttman A (1984) R-trees: a dynamic index structure for spatial searching. In: Proceedings of international conference on management of data, Boston, MA, pp 47–57
39.
Zurück zum Zitat Li M, Chen X, Li X, Ma B, Vitányi P (2003) The similarity metric. In: Proceedings of the 14th annual ACM-SIAM symposium on discrete algorithms, pp 863–872 Li M, Chen X, Li X, Ma B, Vitányi P (2003) The similarity metric. In: Proceedings of the 14th annual ACM-SIAM symposium on discrete algorithms, pp 863–872
40.
Zurück zum Zitat Rodríguez JJ, Alonso CJ (2004) Interval and dynamic time warping-based decision trees. In: Proceedings of the 2004 ACM symposium on applied computing, pp 548–552 Rodríguez JJ, Alonso CJ (2004) Interval and dynamic time warping-based decision trees. In: Proceedings of the 2004 ACM symposium on applied computing, pp 548–552
41.
Zurück zum Zitat Guarino M, Costa A, van Hirtum A, Jans P, Ghesquiere K, Aerts JM, Navarotto P, Berckmans D (2004) Automatic detection of infective pig coughing from continuous recording in field situations. Riv Ingegneria Agraria 35(4):69–73 Guarino M, Costa A, van Hirtum A, Jans P, Ghesquiere K, Aerts JM, Navarotto P, Berckmans D (2004) Automatic detection of infective pig coughing from continuous recording in field situations. Riv Ingegneria Agraria 35(4):69–73
Metadaten
Titel
Converting non-parametric distance-based classification to anytime algorithms
verfasst von
Xiaopeng Xi
Ken Ueno
Eamonn Keogh
Dah-Jye Lee
Publikationsdatum
01.09.2008
Verlag
Springer-Verlag
Erschienen in
Pattern Analysis and Applications / Ausgabe 3-4/2008
Print ISSN: 1433-7541
Elektronische ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-007-0098-2

Weitere Artikel der Ausgabe 3-4/2008

Pattern Analysis and Applications 3-4/2008 Zur Ausgabe