2017 | OriginalPaper | Book chapter

Convolutional Neural Networks for Unsupervised Anomaly Detection in Text Data

Authors: Oleg Gorokhov, Mikhail Petrovskiy, Igor Mashechkin

Published in: Intelligent Data Engineering and Automated Learning – IDEAL 2017

Publisher: Springer International Publishing

Abstract

In this paper, we discuss the problem of anomaly detection in text data using a convolutional neural network (CNN). CNNs have recently become one of the most popular and powerful tools for a variety of machine learning tasks. Their main advantage is the ability to extract complicated hidden features from high-dimensional data with complex structure. CNNs are usually applied in supervised learning mode. Unsupervised anomaly detection, on the other hand, is an important problem in many applications, including computer security and behavioral analytics. Since there is no specified target in unsupervised mode, the traditional CNN objective functions cannot be used. In this paper, we develop a specific CNN architecture that consists of one convolutional layer and one subsampling layer, with an RBF activation function and a logarithmic loss function on the final layer. Minimizing the corresponding objective function lets us calculate the location parameter of the features' weights discovered on the last network layer. We use \(l_2\)-regularization to avoid a degenerate solution. The proposed CNN has been tested on discovering anomalies in a stream of text documents modeled with the well-known Enron dataset, where it demonstrates better results than traditional outlier detection methods based on one-class SVM and NMF.
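
The abstract names the building blocks but gives no formulas, so the following is only a minimal sketch of how such a pipeline could fit together, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the function names, the linear convolution without an intermediate nonlinearity, max-over-time pooling as the subsampling step, a fixed RBF width `sigma`, a single RBF center `center` standing in for the location parameter, and the weighting `lam` of the \(l_2\) penalty.

```python
import math

def conv1d(embeddings, filters, bias):
    """Valid 1-D convolution over token positions (assumed linear).

    embeddings: list of T token vectors, each of length d
    filters:    list of K filters, each a list of `width` vectors of length d
    bias:       list of K bias values
    Returns a list of (T - width + 1) rows with K feature values each.
    """
    width = len(filters[0])
    feature_map = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        row = []
        for filt, b in zip(filters, bias):
            s = b + sum(w * x
                        for fv, xv in zip(filt, window)
                        for w, x in zip(fv, xv))
            row.append(s)
        feature_map.append(row)
    return feature_map

def max_over_time(feature_map):
    """Subsampling step (assumed): keep the maximum of each feature."""
    return [max(column) for column in zip(*feature_map)]

def rbf(features, center, sigma):
    """RBF activation exp(-||f - c||^2 / (2 sigma^2)) on the final layer."""
    sq = sum((f - c) ** 2 for f, c in zip(features, center))
    return math.exp(-sq / (2.0 * sigma ** 2))

def anomaly_score(embeddings, filters, bias, center, sigma):
    """Logarithmic loss -log(rbf(...)), written in closed form so that
    a vanishing RBF value does not cause a math domain error."""
    pooled = max_over_time(conv1d(embeddings, filters, bias))
    sq = sum((f - c) ** 2 for f, c in zip(pooled, center))
    return sq / (2.0 * sigma ** 2)

def objective(docs, filters, bias, center, sigma, lam):
    """Mean logarithmic loss over documents plus an l2 penalty on filters."""
    data_term = sum(anomaly_score(d, filters, bias, center, sigma)
                    for d in docs) / len(docs)
    l2 = sum(w * w for filt in filters for fv in filt for w in fv)
    return data_term + lam * l2
```

Under these assumptions, a document whose pooled features lie far from `center` receives a large `anomaly_score`, so the same quantity could serve both as the per-document training loss and as the outlier score at test time. Training would minimize `objective` over `filters`, `bias`, and `center` with any gradient-based optimizer, which is omitted here.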

Title: Convolutional Neural Networks for Unsupervised Anomaly Detection in Text Data
Authors: Oleg Gorokhov, Mikhail Petrovskiy, Igor Mashechkin
Publisher: Springer International Publishing
Book: Intelligent Data Engineering and Automated Learning – IDEAL 2017
Print ISBN: 978-3-319-68934-0
Electronic ISBN: 978-3-319-68935-7
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-68935-7_54
