2018 | OriginalPaper | Book chapter

FROD: Fast and Robust Distance-Based Outlier Detection with Active-Inliers-Patterns in Data Streams

Written by: Zongren Li, Yijie Wang, Guohong Zhao, Li Cheng, Xingkong Ma

Published in: Artificial Neural Networks and Machine Learning – ICANN 2018

Publisher: Springer International Publishing

Abstract

Detecting distance-based outliers in streaming data is critical for modern applications ranging from telecommunications to cybersecurity. However, existing work concentrates mainly on improving response speed, and none of these proposals performs well on streams whose data distribution varies. In this paper, we propose a Fast and Robust Outlier Detection method (FROD for short) that addresses this problem and improves both detection performance and processing throughput. Specifically, to adapt to the changing distribution in data streams, we employ the Active-Inliers-Pattern, which dynamically selects reserved objects for further outlier analysis. In addition, we propose an effective micro-cluster-based data storage structure to improve detection efficiency, supported by our theoretical analysis of the complexity bounds. We also present a potential background-updating optimization that hides the updating time. Experiments on real-world and synthetic datasets verify our theoretical study and demonstrate that our algorithm is not only faster than state-of-the-art methods but also achieves better detection performance when the outlier rate fluctuates.
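
For readers unfamiliar with the problem setting, the following is a minimal sketch of the conventional distance-based outlier definition over a count-based sliding window: a point is an outlier if fewer than a given number of other points in the current window lie within a given radius. It is a naive quadratic baseline and not the FROD algorithm; the Active-Inliers-Pattern and the micro-cluster structure are not reproduced here. The function names and the parameters `window_size`, `radius`, and `min_neighbors` are assumptions chosen for illustration.

```python
from collections import deque
from math import dist


def outliers_in_window(window, radius, min_neighbors):
    """Return the points in `window` with fewer than `min_neighbors`
    other points within `radius` (naive O(n^2) check)."""
    points = list(window)
    flagged = []
    for i, p in enumerate(points):
        # Count neighbors of p, excluding p itself.
        neighbors = sum(
            1 for j, q in enumerate(points)
            if j != i and dist(p, q) <= radius
        )
        if neighbors < min_neighbors:
            flagged.append(p)
    return flagged


def detect(stream, window_size, radius, min_neighbors):
    """After each arrival, yield the outliers of the current window.

    The window is count-based: once it holds `window_size` points,
    the oldest point is evicted when a new one arrives.
    """
    window = deque(maxlen=window_size)
    for point in stream:
        window.append(point)
        yield outliers_in_window(window, radius, min_neighbors)
```

Called as `list(detect([(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)], window_size=4, radius=0.5, min_neighbors=2))`, the last window reports only `(5.0, 5.0)` as an outlier. Re-checking every point on every slide is what makes this baseline slow, which is the cost that streaming methods aim to reduce.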


Metadata
Title
FROD: Fast and Robust Distance-Based Outlier Detection with Active-Inliers-Patterns in Data Streams
Written by
Zongren Li
Yijie Wang
Guohong Zhao
Li Cheng
Xingkong Ma
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01418-6_62