Published in: Data Mining and Knowledge Discovery 6/2015

1 November 2015

A general framework for never-ending learning from time series streams

Authors: Yanping Chen, Yuan Hao, Thanawin Rakthanmanon, Jesin Zakaria, Bing Hu, Eamonn Keogh

Abstract

Time series classification has been an active area of research in the data mining community for over a decade, and significant progress has been made in the tractability and accuracy of learning. However, virtually all work assumes a one-time training session in which labeled examples of all the concepts to be learned are provided. This assumption may be valid in a handful of situations, but it does not hold in most medical and scientific applications where we initially may have only the vaguest understanding of what concepts can be learned. Based on this observation, we propose a never-ending learning framework for time series in which an agent examines an unbounded stream of data and occasionally asks a teacher (which may be a human or an algorithm) for a label. We demonstrate the utility of our ideas with experiments that consider real-world problems in domains as diverse as medicine, entomology, wildlife monitoring, and human behavior analyses.

Footnotes
1. For our purposes, a “never-ending” stream may only last for days or hours. The salient point is the contrast with the batch learning algorithms that the vast majority of time series papers consider (Ding et al. 2008).
2. Recall from Sect. 3.3 that the Subsequence Processing Module may choose to discard a subsequence rather than pass it to Frequent Pattern Maintenance.
3. And for some sexually dimorphic species such as mosquitoes, the sex.
4. Dr. John Michael Criley, MD, FACC, MACP is Professor Emeritus at the David Geffen School of Medicine at UCLA.
5. Usually the top thirteen coefficients are used for audio analysis. The first coefficient is a normalized energy parameter, which is not used for speech recognition (Mermelstein 1976).
References
Achtert E, Bohm C, Kriegel H-P, Kröger P (2005) Online hierarchical clustering in a data warehouse environment data mining. ICDM, pp 10–17
Ambert JD, Hodgman TP, Laurent EJ, Brewer GL, Iliff MJ, Dettmers R (2009) The northeast bird monitoring handbook. American Bird Conservancy, The Plains, VA
Bardeli R, Wolff D, Kurth F, Koch M, Frommolt KH (2010) Detecting bird sounds in a complex acoustic environment and application to bioacoustic monitoring. Pattern Recognit Lett 31:1524–1534
Barrenetxea G et al (2008) Sensorscope: out-of-the-box environmental monitoring. In: IPSN, San Francisco
Batista G, Keogh E, Mafra-Neto A, Rowton E (2011) Sensors and software to allow computational entomology, an emerging application of data mining. In: KDD, pp 761–764
Berges M, Goldman E, Matthews HS, Soibelman L (2010) Enhancing electricity audits in residential buildings with non-intrusive load monitoring. J Ind Ecol 14(5):844–858
Berlin E, Laerhoven K (2012) Detecting leisure activities with dense motif discovery. In: Proceedings of the 2012 intl conference on ubiquitous computing, pp 250–259
Borazio M, Laerhoven K (2012) Combining wearable and environmental sensing into an unobtrusive tool for long-term sleep studies. In: 2nd ACM SIGHIT
Campanharo ASLO, Sirer MI, Malgren RD, Ramos FM, Nunes LAN (2011) Duality between time series and networks. Plos One 6:e23378
Carlson A, Betteridge J, Kisiel B, Settles B, Hruschka Jr ER, Mitchell TM (2010) Toward an architecture for never-ending language learning. In: Proc’ AAAI
Charikar M, Chen K, Farach-Colton M (2002) Finding frequent items in data streams. In: Proceedings of the 29th ICALP international colloquium on automata, languages and programming, pp 693–703
Chen Y, Why A, Batista G, Mafra-Neto A, Keogh E (2014) Flying insect classification with inexpensive sensors. J Insect Behav 27(5):657–677
Chiu B, Keogh E, Lonardi S (2003) Probabilistic discovery of time series motifs. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, pp 493–498
Cormode G, Hadjieleftheriou M (2010) Methods for finding frequent items in data streams. VLDB J 19(1):3–20
Dagan I, Engelson SP (1995) Committee-based sampling for training probabilistic classifiers. In: ICML, vol 95, pp 150–157
Dawson DK, Efford MG (2009) Bird population density estimated from acoustic signals. J Appl Ecol 46(6):1201–1209
Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh EJ (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. PVLDB 1(2):1542–1552
Dodge Y (2003) Oxford dictionary of statistical terms. OUP, Oxford. ISBN 0-19-850994-4
Elkan C, Noto K (2008) Learning classifiers from only positive and unlabeled data. KDD, pp 213–220
Estan C, Varghese G (2003) New directions in traffic measurement and accounting: focusing on the elephants, ignoring the mice. ACM Trans Comput Syst 21(3):270–313
Ferreira F, Bota D, Bross A, Mélot C, Vincent J (2001) Serial evaluation of the SOFA score to predict outcome in critically ill patients. JAMA 286(14):1754–1758
Fujii A, Tokunaga T, Inui K, Tanaka H (1998) Selective sampling for example-based word sense disambiguation. Comput Linguist 24(4):573–597
Goldberger A et al (2000) PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101(23):215–220
Gupta S, Reynolds S, Patel SN (2010) ElectriSense: single-point sensing using EMI for electrical event detection and classification in the home. In: Proceedings of the conference on ubiquitous computing
Hinman J, Hickey E (2009) Modeling and forecasting short-term electricity load using regression analysis. Working paper, Illinois State University, Normal (US), Fall
Holyoak DT (2001) Nightjars and their allies: the caprimulgiformes. Oxford University Press, Oxford, New York. ISBN 0-19-854987-3
Hu B, Chen Y, Keogh EJ (2013) Time series classification under more realistic assumptions. In: SDM
Jin C, Qian W, Sha C, Yu J, Zhou A (2003) Dynamically maintaining frequent items over a data stream. In: Proceedings of the 12th ACM CIKM international conference on information and knowledge management, pp 287–294
Jin S, Chen Z, Backus E, Sun X, Xiao B (2012) Characterization of EPG waveforms for the tea green leafhopper on tea plants and their correlation with stylet activities. J Insect Physiol 58:1235–1244
Karp R, Papadimitriou C, Shenker S (2003) A simple algorithm for finding frequent elements in sets and bags. ACM Trans Database Syst 28:51–55
Kolter J, Jaakkola T (2012) Approximate inference in additive factorial HMMs with application to energy disaggregation. J Mach Learn Res 22:1472–1482
Krishnamurthy A, Balakrishnan S, Xu M, Singh A (2012) Efficient active algorithms for hierarchical clustering. arXiv:1206.4672
Lines J, Bagnall A, Caiger-Smith P, Anderson S (2011) Classification of household devices by electricity usage profiles. In: IDEAL, pp 403–412
MacLeod J, Greene T, MacKenzie DI, Allen RB (2012) Monitoring widespread and common bird species on New Zealand’s conservation lands: a pilot study. N Z J Ecol 36(3):300–311
Manku G, Motwani R (2002) Approximate frequency counts over data streams. In: International conference on very large databases, pp 346–357
Mermelstein P (1976) Distance measures for speech recognition, psychological and instrumental. In: Chen CH (ed) Pattern recognition and artificial intelligence. Academic Press, New York
Metwally A, Agrawal D, Abbadi AE (2005) Efficient computation of frequent and top-k elements in data streams. In: International conference on database theory
Mitchell L (1981) Time segregated mosquito collections with a CDC miniature light trap. Mosquito News 42:12
Morales L, Arbetman MP, Cameron SA, Aizen MA (2013) Rapid ecological replacement of a native bumble bee by invasive species. Frontiers Ecol Environ 11:529–534
Mueen A, Keogh EJ, Young N (2011) Logical-shapelets: an expressive primitive for time series classification. In: KDD, pp 1154–1162
Mueen A, Keogh EJ (2010) Online discovery and maintenance of time series motifs. In: KDD, pp 1089–1098
Nassar S, Sander J, Cheng C (2004) Incremental and effective data summarization for dynamic hierarchical clustering. In: SIGMOD Conference, pp 467–478
Norris JR (1998) Markov chains. Cambridge University Press, Cambridge
Robbins CS (1981) Effect of time of day on bird activity. Stud Avian Biol 6:275–286
Roggen D et al (2012) Collecting complex activity data sets in highly rich networked sensor environments. In: Proc’ 7th IEEE INSS, pp 233–240
Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q et al (2012) Searching and mining trillions of time series subsequences under dynamic time warping. Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, pp 262–270
Rowling JK (1997) Harry Potter and the chamber of secrets. Levine Books (scholastics), New York, Read by Stephan Fry, Arthur A
Salton G, McGill MJ (1986) Introduction to modern information retrieval. McGraw-Hill, New York. ISBN 0-07-054484-0
Settles B (2012) Active learning. Morgan & Claypool, San Rafael
Settles B, Craven M, Friedland L (2008) Active learning with real annotation costs. In: Proceedings of the NIPS workshop on cost-sensitive learning
Shrivastava N, Buragohain C, Agrawal D, Suri S (2004) Medians and beyond: new aggregation techniques for sensor networks. In: ACM SenSys
Stikic M, Huynh T, Laerhoven KV, Schiele B (2008) ADL recognition based on the combination of RFID and accelerometer sensing. In: PervasiveHealth, pp 258–263
Tur G, Hakkani-Tür D, Schapire RE (2005) Combining active and semi-supervised learning for spoken language understanding. Speech Commun 45(2):171–186
Van Rijsbergen J (1979) Information retrieval, 2nd edn. Butterworths, London
Veeraraghavan A, Chellappa R, Roy-Chowdhury AK (2006) The function space of an activity. In: Computer vision and pattern recognition
Wu Y, Zhou C, Xiao J, Kurths J, Schellnhuber HJ (2010) Evidence for a bimodal distribution in human communication. Proc Natl Acad Sci USA 107:18803–18808
Yu H (2005) SVM selective sampling for ranking with application to data retrieval. In: Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining. ACM, New York
Zhai S, Kristensson PO, Appert C, Anderson TH, Cao X (2012) Foundational issues in touch-surface stroke gesture design-an integrative review. Found Trends Hum Comput Interact 5(2):97–205
Metadata
Title
A general framework for never-ending learning from time series streams
Authors
Yanping Chen
Yuan Hao
Thanawin Rakthanmanon
Jesin Zakaria
Bing Hu
Eamonn Keogh
Publication date
1 November 2015
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 6/2015
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-014-0388-4
