Skip to main content
Top
Published in: Mobile Networks and Applications 5/2019

19-05-2018

Efficient Identification of TOP-K Heavy Hitters over Sliding Windows

Authors: Haina Tang, Yulei Wu, Tong Li, Chunjing Han, Jingguo Ge, Xiangpeng Zhao

Published in: Mobile Networks and Applications | Issue 5/2019

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Due to the increasing volume of network traffic and growing complexity of network environment, rapid identification of heavy hitters is quite challenging. To deal with the massive data streams in real-time, accurate and scalable solution is required. The traditional method to keep an individual counter for each host in the whole data streams is very resource-consuming. This paper presents a new data structure called FCM and its associated algorithms. FCM combines the count-min sketch with the stream-summary structure simultaneously for efficient TOP-K heavy hitter identification in one pass. The key point of this algorithm is that it introduces a novel filter-and-jump mechanism. Given that the Internet traffic has the property of being heavy-tailed and hosts of low frequencies account for the majority of the IP addresses, FCM periodically filters the mice from input streams to efficiently improve the accuracy of TOP-K heavy hitter identification. On the other hand, considering that abnormal events are always time sensitive, our algorithm works by adjusting its measurement window to the newly arrived elements in the data streams automatically. Our experimental results demonstrate that the performance of FCM is superior to the previous related algorithm. Additionally this solution has a good prospect of application in advanced network environment.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Show more products
Literature
1.
go back to reference Zhao Q, Kumar A, Xu J (2005) Joint data streaming and sampling techniques for detection of super sources and destinations. Proc 5th ACM SIGCOMM Conf Internet Measure: 7–7 Zhao Q, Kumar A, Xu J (2005) Joint data streaming and sampling techniques for detection of super sources and destinations. Proc 5th ACM SIGCOMM Conf Internet Measure: 7–7
2.
go back to reference Kompella R, Singh S, Varghese G (2004) On scalable attack detection in the network. Proc 4th ACM SIGCOMM Conf Internet Measure: 187–200 Kompella R, Singh S, Varghese G (2004) On scalable attack detection in the network. Proc 4th ACM SIGCOMM Conf Internet Measure: 187–200
4.
go back to reference Shapsough S, Qatan F, Aburukba R, Aloul F, Ali A (2015) Smart grid cyber security: challenges and solutions. Int Conf Smart Grid Clean Energy Technol (ICSGCE): 170–175 Shapsough S, Qatan F, Aburukba R, Aloul F, Ali A (2015) Smart grid cyber security: challenges and solutions. Int Conf Smart Grid Clean Energy Technol (ICSGCE): 170–175
5.
go back to reference Yao Y, Xiong S, Qi H, Liu Y, Tolbert L, Cao Q (2014) Efficient histogram estimation for smart grid data processing with the Loglog-bloom-filter. IEEE Trans Smart Grid 6(1):199–208CrossRef Yao Y, Xiong S, Qi H, Liu Y, Tolbert L, Cao Q (2014) Efficient histogram estimation for smart grid data processing with the Loglog-bloom-filter. IEEE Trans Smart Grid 6(1):199–208CrossRef
6.
go back to reference Procopiou A, Komninos N (2015) Current and future threats framework in smart grid domain. IEEE Int Conf Cyber Technol Auto Contrl Intell Syst(CYBER): 1852–1857 Procopiou A, Komninos N (2015) Current and future threats framework in smart grid domain. IEEE Int Conf Cyber Technol Auto Contrl Intell Syst(CYBER): 1852–1857
7.
go back to reference Homem N, Carvalho J (December 2010) Finding top- k elements in data streams. Inf Sci 180(24):4958–4974CrossRef Homem N, Carvalho J (December 2010) Finding top- k elements in data streams. Inf Sci 180(24):4958–4974CrossRef
8.
go back to reference Roesch M (1999) Snort–lightweight intrusion detection for networks. Proc USENIX LISA 1999:229–238 Roesch M (1999) Snort–lightweight intrusion detection for networks. Proc USENIX LISA 1999:229–238
9.
go back to reference Plonka D (2000) FlowScan: a network traffic flow reporting and visualization tool. Proc USENIX LISA 2000:305–317 Plonka D (2000) FlowScan: a network traffic flow reporting and visualization tool. Proc USENIX LISA 2000:305–317
10.
go back to reference Estan C, Varghese G, Fiskin M (2003) Bitmap algorithms for counting active flows on high speed links. Proc 3rd ACM SIGCOMM Conf Internet Measure: 153–166 Estan C, Varghese G, Fiskin M (2003) Bitmap algorithms for counting active flows on high speed links. Proc 3rd ACM SIGCOMM Conf Internet Measure: 153–166
11.
go back to reference Wang P, Guan X, Gong W, Towsley D (2011) A new virtual indexing method for measuring host connection degrees. INFOCOM 2011:156–160 Wang P, Guan X, Gong W, Towsley D (2011) A new virtual indexing method for measuring host connection degrees. INFOCOM 2011:156–160
12.
go back to reference S. Venkataraman, D. Song, P. Gibbons, and A. Blum, (2005) New streaming algorithms for fast detection of Superspreaders. Proc Netwk Distributed Syst Security Sym (NDSS): 149–166 S. Venkataraman, D. Song, P. Gibbons, and A. Blum, (2005) New streaming algorithms for fast detection of Superspreaders. Proc Netwk Distributed Syst Security Sym (NDSS): 149–166
13.
go back to reference Bandi N, Agrawal D, El A (2007) Fast Algorithms for heavy distinct hitters using associative memories. Int Conf Distrib Comput Syst: 6–6 Bandi N, Agrawal D, El A (2007) Fast Algorithms for heavy distinct hitters using associative memories. Int Conf Distrib Comput Syst: 6–6
14.
go back to reference Dimitropoulos X, Hurley P, Kind A (January 2008) Probabilistic lossy counting: an efficient algorithm for finding heavy hitters. ACM SIGCOMM Comput Commun Rev 38(1):5–5CrossRef Dimitropoulos X, Hurley P, Kind A (January 2008) Probabilistic lossy counting: an efficient algorithm for finding heavy hitters. ACM SIGCOMM Comput Commun Rev 38(1):5–5CrossRef
15.
go back to reference Karp R, Shenker S, Papadimitriou C (March 2003) A simple algorithm for finding frequent elements in streams and bags. ACM Transactions on Database Systems (TODS) 28(1):51–55CrossRef Karp R, Shenker S, Papadimitriou C (March 2003) A simple algorithm for finding frequent elements in streams and bags. ACM Transactions on Database Systems (TODS) 28(1):51–55CrossRef
16.
go back to reference Metwally A, Agrawal D, El A (2005) Efficient computation of frequent and top-k elements in data streams. Int Conf Database Theory. Springer Berlin Heidelberg: 398–412CrossRef Metwally A, Agrawal D, El A (2005) Efficient computation of frequent and top-k elements in data streams. Int Conf Database Theory. Springer Berlin Heidelberg: 398–412CrossRef
17.
go back to reference M. Charikar, K. Chen, and M. Farach-Colton, (2002) Finding frequent items in data streams. International colloquium on automata, languages, and programming. Springer berlin Heidelberg, pp. 693–703 M. Charikar, K. Chen, and M. Farach-Colton, (2002) Finding frequent items in data streams. International colloquium on automata, languages, and programming. Springer berlin Heidelberg, pp. 693–703
18.
go back to reference Cormode G (November 2014) Count-min sketch. Encyclopedia Algorithms Springer US 29(1):1–6 Cormode G (November 2014) Count-min sketch. Encyclopedia Algorithms Springer US 29(1):1–6
19.
go back to reference Huang Q, Lee P (2014) LD-sketch: a distributed sketching design for accurate and scalable anomaly detection in network data streams. Int Conf Comput Commun: 1420–1428 Huang Q, Lee P (2014) LD-sketch: a distributed sketching design for accurate and scalable anomaly detection in network data streams. Int Conf Comput Commun: 1420–1428
20.
go back to reference Anceaume E, Busnel Y, Rivetti N (2015) Estimating the frequency of data items in massive distributed streams. IEEE Sym Netwrk Cloud Comput Appl (NCCA): 59–66 Anceaume E, Busnel Y, Rivetti N (2015) Estimating the frequency of data items in massive distributed streams. IEEE Sym Netwrk Cloud Comput Appl (NCCA): 59–66
21.
go back to reference Roy P, Khan A, Alonso G (2016) Augmented sketch: faster and more accurate stream processing. Proc 2016 Int Conf Manag Data: 1449–1463 Roy P, Khan A, Alonso G (2016) Augmented sketch: faster and more accurate stream processing. Proc 2016 Int Conf Manag Data: 1449–1463
22.
go back to reference Pitel G, Fouquier G (2015) Count-min-log sketch: approximately counting with approximate counters. 1st Int Sym Web Algorithms Pitel G, Fouquier G (2015) Count-min-log sketch: approximately counting with approximate counters. 1st Int Sym Web Algorithms
23.
go back to reference Ben-Basat R, Einziger G, Friedman R, Kassner Y (2016) Heavy hitters in streams and sliding windows. IEEE INFOCOM 2016:1–9MATH Ben-Basat R, Einziger G, Friedman R, Kassner Y (2016) Heavy hitters in streams and sliding windows. IEEE INFOCOM 2016:1–9MATH
24.
go back to reference Roy P, Teubner J, Alonso G (2012) Efficient frequent item counting in multi-core hardware. Proc 18th ACM SIGKDD Int Conf Knowledge Discov Data Mining: 1451–1459 Roy P, Teubner J, Alonso G (2012) Efficient frequent item counting in multi-core hardware. Proc 18th ACM SIGKDD Int Conf Knowledge Discov Data Mining: 1451–1459
25.
go back to reference Das S, Antony S, Agrawal D, El A (2009) Thread cooperation in multicore architectures for frequency counting over multiple data streams. Proc VLDB Endowment 2(1):217–228CrossRef Das S, Antony S, Agrawal D, El A (2009) Thread cooperation in multicore architectures for frequency counting over multiple data streams. Proc VLDB Endowment 2(1):217–228CrossRef
26.
go back to reference Einziger G, Friedman R (2016) Counting with TinyTable: every bit counts!. Proc Int Conf Distrib Comput Network (ICDCN 2016), Article No 27 Einziger G, Friedman R (2016) Counting with TinyTable: every bit counts!. Proc Int Conf Distrib Comput Network (ICDCN 2016), Article No 27
27.
go back to reference Homem N, Carvalho J (2011) Finding top-k elements in a time-sliding window. Evol Syst 2(1):51–70CrossRef Homem N, Carvalho J (2011) Finding top-k elements in a time-sliding window. Evol Syst 2(1):51–70CrossRef
28.
go back to reference Zhang Z, Wang B, Lan J (2015) Identifying elephant flows in internet backbone traffic with bloom filters and LRU. Comput Commun 61:70–78CrossRef Zhang Z, Wang B, Lan J (2015) Identifying elephant flows in internet backbone traffic with bloom filters and LRU. Comput Commun 61:70–78CrossRef
29.
go back to reference Cafaro M, Pulimeno M, Epicoco I, Aloisio G (2016) Mining frequent items in the time fading model. Inf Sci 370:221–238CrossRef Cafaro M, Pulimeno M, Epicoco I, Aloisio G (2016) Mining frequent items in the time fading model. Inf Sci 370:221–238CrossRef
30.
go back to reference Cormode G, Hadjieleftheriou M (2010) Methods for finding frequent items in data streams. VLDB J 19(1):3–20CrossRef Cormode G, Hadjieleftheriou M (2010) Methods for finding frequent items in data streams. VLDB J 19(1):3–20CrossRef
31.
go back to reference Cormode G, Hadjieleftheriou M (2008) Finding frequent items in data streams. Proc VLDB Endowment 1(2):1530–1541CrossRef Cormode G, Hadjieleftheriou M (2008) Finding frequent items in data streams. Proc VLDB Endowment 1(2):1530–1541CrossRef
Metadata
Title
Efficient Identification of TOP-K Heavy Hitters over Sliding Windows
Authors
Haina Tang
Yulei Wu
Tong Li
Chunjing Han
Jingguo Ge
Xiangpeng Zhao
Publication date
19-05-2018
Publisher
Springer US
Published in
Mobile Networks and Applications / Issue 5/2019
Print ISSN: 1383-469X
Electronic ISSN: 1572-8153
DOI
https://doi.org/10.1007/s11036-018-1051-x

Other articles of this Issue 5/2019

Mobile Networks and Applications 5/2019 Go to the issue