2021 | Original Paper | Book Chapter

Machine Learning for Streaming Data: Overview, Applications and Challenges

Written by: Shikha Verma

Published in: Applied Advanced Analytics

Publisher: Springer Singapore

Abstract

This chapter gives a brief overview of machine learning for streaming data. It establishes the need for special algorithms suited to prediction tasks on data streams, explains why conventional batch learning methods are not adequate, and then surveys applications in various business domains.

Metadata
Title
Machine Learning for Streaming Data: Overview, Applications and Challenges
Written by
Shikha Verma
Copyright Year
2021
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-33-6656-5_1