2023 | OriginalPaper | Book chapter

Online Influence Forest for Streaming Anomaly Detection

Authors: Inês Martins, João S. Resende, João Gama

Published in: Advances in Intelligent Data Analysis XXI

Publisher: Springer Nature Switzerland

Abstract

As the digital world grows, data is collected continuously, at high speed, and in real time. Learning from such streaming data remains a challenge because the data are imbalanced and evolve over time. Since the field still lacks consistent strategies for handling continuous and evolving data, this paper proposes an unsupervised, online, and incremental anomaly detection ensemble of influence trees with adaptive mechanisms for inactive or saturated leaves. The method uses the fourth standardized moment, also known as kurtosis, as the splitting criterion, and uses the isolation score, Shannon’s information content, and the influence function of an instance as the anomaly score. The proposal aims to improve interpretability and is evaluated on publicly available datasets, with a detailed discussion of the results.
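The abstract names two quantities that can be illustrated in isolation: kurtosis (the fourth standardized moment) and Shannon's information content. The Python sketch below is not the authors' implementation. It assumes a standard single-pass update of the central moments so that kurtosis can be tracked one instance at a time, and it defines the information content of an event with probability `p` as `-log2(p)`. The names `StreamingKurtosis` and `information_content` are hypothetical, and how the paper combines these quantities with the influence function is not stated in the abstract.

```python
import math


class StreamingKurtosis:
    """Tracks the fourth standardized moment of a stream in one pass.

    Hypothetical helper for illustration only; it keeps the running mean
    and the sums of 2nd, 3rd and 4th powers of deviations from the mean.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0
        self.m3 = 0.0
        self.m4 = 0.0

    def update(self, x):
        n1 = self.n
        self.n += 1
        n = self.n
        delta = x - self.mean
        delta_n = delta / n
        delta_n2 = delta_n * delta_n
        term1 = delta * delta_n * n1
        self.mean += delta_n
        # Higher moments must be updated before the lower ones they depend on.
        self.m4 += (term1 * delta_n2 * (n * n - 3 * n + 3)
                    + 6 * delta_n2 * self.m2
                    - 4 * delta_n * self.m3)
        self.m3 += term1 * delta_n * (n - 2) - 3 * delta_n * self.m2
        self.m2 += term1

    def kurtosis(self):
        """Population kurtosis (not excess); NaN if the variance is zero."""
        if self.n < 2 or self.m2 == 0.0:
            return float("nan")
        return self.n * self.m4 / (self.m2 * self.m2)


def information_content(p):
    """Shannon information content, in bits, of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


if __name__ == "__main__":
    tracker = StreamingKurtosis()
    for value in [1.0, 2.0, 3.0, 4.0, 5.0]:
        tracker.update(value)
    print("kurtosis:", tracker.kurtosis())
    print("bits for p=0.25:", information_content(0.25))
```

A heavy-tailed feature yields a larger kurtosis than a flat one, and a rarer event yields more bits of information, which is the intuition both quantities bring to anomaly scoring.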

Metadata
Title
Online Influence Forest for Streaming Anomaly Detection
Authors
Inês Martins
João S. Resende
João Gama
Copyright year
2023
DOI
https://doi.org/10.1007/978-3-031-30047-9_22