2020 | Original Paper | Book Chapter

Role-Based Access Classification: Evaluating the Performance of Machine Learning Algorithms

Written by: Randy Julian, Edward Guyot, Shaowen Zhou, Geong Sen Poh, Stéphane Bressan

Published in: Transactions on Large-Scale Data- and Knowledge-Centered Systems XLIII

Publisher: Springer Berlin Heidelberg

Abstract

The analysis of relational database access for audit and anomaly detection can be based on the classification of queries according to user roles. One such approach is DBSAFE, a database anomaly detection system that uses a Naïve Bayes classifier to detect anomalous queries in Role-based Access Control (RBAC) environments. We consider standard machine learning algorithms for classification tasks, namely K-Nearest Neighbours, Random Forest, Support Vector Machine and Convolutional Neural Network, as alternatives to DBSAFE’s Naïve Bayes classifier. We identify the need for an effective representation of the input to the classifiers, and we propose using a query embedding mechanism with them. We empirically compare the performance of the different algorithms and variants on two benchmarks: the comprehensive off-the-shelf OLTP-Bench benchmark and a variant of the CH-benCHmark that we extended with hand-crafted user roles for database access classification. The evaluation shows clear benefits from using these machine learning tools.
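
To make the idea of classifying queries by role concrete, here is a minimal sketch of a multinomial Naïve Bayes classifier over SQL tokens. It is an illustration only, not the authors' or DBSAFE's implementation. The roles ("clerk", "analyst"), the training queries, the token-count features (a crude stand-in for the query embedding the abstract proposes) and the rule that a query is anomalous when its predicted role differs from the issuing role are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(query):
    """Lower-case a SQL string and keep alphabetic tokens only (assumed feature extraction)."""
    return re.findall(r"[a-z_]+", query.lower())


class NaiveBayesRoleClassifier:
    """Multinomial Naive Bayes with Laplace smoothing over query tokens."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing constant
        self.role_counts = Counter()             # number of training queries per role
        self.token_counts = defaultdict(Counter) # token frequencies per role
        self.vocab = set()

    def fit(self, queries, roles):
        for query, role in zip(queries, roles):
            self.role_counts[role] += 1
            for token in tokenize(query):
                self.token_counts[role][token] += 1
                self.vocab.add(token)
        return self

    def log_scores(self, query):
        """Return the unnormalised log-posterior of each role for the query."""
        total = sum(self.role_counts.values())
        vocab_size = len(self.vocab)
        tokens = [t for t in tokenize(query) if t in self.vocab]  # skip unseen tokens
        scores = {}
        for role, n in self.role_counts.items():
            denom = sum(self.token_counts[role].values()) + self.alpha * vocab_size
            score = math.log(n / total)          # log prior
            for token in tokens:
                score += math.log((self.token_counts[role][token] + self.alpha) / denom)
            scores[role] = score
        return scores

    def predict(self, query):
        scores = self.log_scores(query)
        return max(scores, key=scores.get)


def is_anomalous(clf, query, issuing_role):
    """Assumed detection rule: flag a query whose predicted role differs from the issuing role."""
    return clf.predict(query) != issuing_role


# Hypothetical training log of (query, role) pairs.
QUERIES = [
    "SELECT name, address FROM customer WHERE id = 1",
    "UPDATE orders SET status = 'shipped' WHERE id = 7",
    "SELECT region, SUM(amount) FROM orders GROUP BY region",
    "SELECT AVG(amount) FROM orders",
]
ROLES = ["clerk", "clerk", "analyst", "analyst"]

clf = NaiveBayesRoleClassifier().fit(QUERIES, ROLES)
suspicious = is_anomalous(clf, "SELECT SUM(amount) FROM orders GROUP BY status", "clerk")
```

In this toy setting an aggregate query issued under the clerk role is classified as analyst-like and therefore flagged. Swapping the token counts for a learned query embedding, or the classifier for one of the alternatives named in the abstract, would leave the surrounding detection logic unchanged.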

Metadata
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-662-62199-8_1