2020 | OriginalPaper | Book chapter

Topic Reconstruction: A Novel Method Based on LDA Oriented to Intrusion Detection

Authors: Shengwei Lei, Chunhe Xia, Tianbo Wang, Shizhao Wang

Published in: Algorithms and Architectures for Parallel Processing

Publisher: Springer International Publishing

Abstract

Traditional intrusion detection methods struggle to distinguish different types of intrusion that are highly similar. These methods characterize each attribute with a single value and mine the relationships among attributes at the feature extraction stage. However, this granularity of feature extraction is not sufficient to distinguish intrusions whose network flow characteristics are similar. To address this problem, we establish an intrusion detection model based on Latent Dirichlet Allocation (ID-LDA) and propose a novel topic reconstruction method to extract distinctive features. We mine the value distribution of each attribute and the associations among multiple attributes to extract more implicit semantic features, which are more useful for identifying slight differences between kinds of intrusion. However, with current LDA models it is difficult to determine the optimal number of topics, and recent methods ignore the selection of multiple topics. These problems make it difficult to generate an ideal Document-Topic Distribution (DTD) and lower the detection accuracy. We therefore propose a topic overlap degree and a dispersion degree to quantitatively assess the quality of the DTD, and use them to obtain the optimal number of topics and select the best topics. Experiments on the public NSL-KDD dataset verify the validity of ID-LDA, and the results outperform many state-of-the-art intrusion detection methods in terms of accuracy.
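
As a rough illustration of the ideas in the abstract, the sketch below shows one way a network flow record could be turned into a bag of attribute=value tokens for a topic model, and how two simple scores could be computed over topic distributions. The abstract does not define the topic overlap degree or the dispersion degree, so the formulas used here (mean pairwise cosine similarity between topic-word distributions, and mean normalised entropy of each document's topic distribution) are stand-in assumptions, not the authors' definitions. The function names and the sample record are likewise hypothetical.

```python
import math
from itertools import combinations


def record_to_document(record):
    """Turn one flow record (attribute -> value) into 'attribute=value' tokens.

    Sorting by attribute name only makes the output deterministic.
    """
    return [f"{name}={value}" for name, value in sorted(record.items())]


def _cosine(p, q):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def overlap_degree(topic_word):
    """ASSUMED definition: mean pairwise cosine similarity between topics.

    topic_word is a list of topic-word probability vectors.
    Values near 1 mean the topics are close to redundant.
    """
    pairs = list(combinations(topic_word, 2))
    if not pairs:
        return 0.0
    return sum(_cosine(p, q) for p, q in pairs) / len(pairs)


def dispersion_degree(doc_topic):
    """ASSUMED definition: mean normalised entropy of document-topic rows.

    doc_topic is a list of per-document topic probability vectors.
    Values near 0 mean each document concentrates on a single topic.
    """
    scores = []
    for row in doc_topic:
        k = len(row)
        if k < 2:
            scores.append(0.0)  # a single topic cannot be dispersed
            continue
        entropy = -sum(p * math.log(p) for p in row if p > 0)
        scores.append(entropy / math.log(k))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical record using three NSL-KDD attribute names.
    tokens = record_to_document(
        {"protocol_type": "tcp", "service": "http", "flag": "SF"}
    )
    print(tokens)
    print(overlap_degree([[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]))
    print(dispersion_degree([[0.9, 0.1], [0.5, 0.5]]))
```

Under these assumed definitions, a candidate topic number could be compared against others by fitting a model for each candidate and preferring low overlap between topics; how the chapter itself combines the two degrees is not stated in the abstract.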

Metadata
Title
Topic Reconstruction: A Novel Method Based on LDA Oriented to Intrusion Detection
Authors
Shengwei Lei
Chunhe Xia
Tianbo Wang
Shizhao Wang
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-38991-8_38