Stream-based Machine Learning for Network Security and Anomaly Detection

Authors:
Pavol Mulinka

CTU Czech Technical University in Prague, AIT Austrian Institute of Technology

Pedro Casas

AIT Austrian Institute of Technology


Big-DAMA '18: Proceedings of the 2018 Workshop on Big Data Analytics and Machine Learning for Data Communication Networks, August 2018, Pages 1–7
https://doi.org/10.1145/3229607.3229612

Published: 07 August 2018

ABSTRACT

Data stream machine learning is rapidly gaining popularity within the network monitoring community, as the big data produced by network devices and end-user terminals exceeds the memory constraints of standard monitoring equipment. Critical network monitoring applications, such as the detection of anomalies, network attacks and intrusions, require fast and continuous mechanisms for on-line analysis of data streams. In this paper we consider a stream-based machine learning approach for network security and anomaly detection, applying and evaluating multiple machine learning algorithms in the analysis of continuously evolving network data streams. The continuous evolution of the analysis algorithms coming from the data stream mining domain, together with the multiple evaluation approaches conceived for benchmarking such algorithms, makes it difficult to choose the appropriate machine learning model. Results of the different approaches may differ significantly, and it is crucial to determine which approach best reflects algorithm performance. We therefore compare and analyze the results of the most recent evaluation approaches for sequential data on commonly used batch-based machine learning algorithms and their corresponding stream-based extensions, for the specific problem of on-line network security and anomaly detection. Similar to our previous findings with off-line machine learning approaches for network security and anomaly detection, our results suggest that adaptive random forests and stochastic gradient descent models are able to keep up with important concept drifts in the underlying network data streams, maintaining high accuracy through continuous re-training at concept drift detection times.
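
To make the idea of evaluating a stream learner under concept drift more concrete, the following is a minimal sketch, not the authors' implementation. It assumes prequential (test-then-train) evaluation as the scheme for sequential data, a plain logistic regression trained by stochastic gradient descent as the model, and a deliberately naive drift reaction that resets the model when the windowed error rate becomes too high. The synthetic stream, the class and function names, the window size and the threshold are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import math
import random
from collections import deque


class OnlineLogisticSGD:
    """Logistic regression updated one sample at a time with SGD (illustrative)."""

    def __init__(self, n_features, lr=0.05):
        self.n_features = n_features
        self.lr = lr
        self.reset()

    def reset(self):
        # Forget everything learned so far (used when drift is suspected).
        self.w = [0.0] * self.n_features
        self.b = 0.0

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x):
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def learn_one(self, x, y):
        err = self.predict_proba(x) - y  # gradient of the log-loss w.r.t. z
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def synthetic_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs; the labelling rule flips at index drift_at.

    Hypothetical stand-in for a labelled network data stream with one
    abrupt concept drift.
    """
    rng = random.Random(seed)
    for i in range(n):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        y = 1 if x[0] + x[1] > 0 else 0
        if i >= drift_at:
            y = 1 - y
        yield x, y


def prequential(stream, model, window=100, drift_threshold=0.5):
    """Test-then-train evaluation with a naive windowed-error drift reaction.

    Each sample is first used for testing and only then for training.
    Returns the overall prequential accuracy and the indices at which the
    model was reset.
    """
    recent_errors = deque(maxlen=window)
    correct = 0
    total = 0
    resets = []
    for i, (x, y) in enumerate(stream):
        hit = model.predict(x) == y          # test first ...
        correct += int(hit)
        total += 1
        recent_errors.append(0 if hit else 1)
        window_full = len(recent_errors) == window
        if window_full and sum(recent_errors) / window > drift_threshold:
            model.reset()                    # re-train from scratch on drift
            recent_errors.clear()
            resets.append(i)
        model.learn_one(x, y)                # ... then train
    return correct / total, resets


if __name__ == "__main__":
    stream = synthetic_stream(4000, drift_at=2000, seed=0)
    accuracy, resets = prequential(stream, OnlineLogisticSGD(n_features=2))
    print(f"prequential accuracy: {accuracy:.3f}")
    print(f"model resets at stream positions: {resets}")
```

Because every sample is scored before the model sees its label, the reported accuracy reflects performance on unseen data at each point of the stream. A production setup would more likely rely on an established drift detector and an adaptive ensemble than on this fixed-threshold reset.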

Index Terms

Stream-based Machine Learning for Network Security and Anomaly Detection
1. Computing methodologies
  1. Machine learning
2. Security and privacy
  1. Network security

Published in

Big-DAMA '18: Proceedings of the 2018 Workshop on Big Data Analytics and Machine Learning for Data Communication Networks
August 2018
58 pages
ISBN:9781450359047
DOI:10.1145/3229607

Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Data Stream mining
High-Dimensional Data
Machine Learning
Network Attacks
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 7 of 11 submissions, 64%