2020 | OriginalPaper | Book chapter

Role-Based Access Classification: Evaluating the Performance of Machine Learning Algorithms

Authors: Randy Julian, Edward Guyot, Shaowen Zhou, Geong Sen Poh, Stéphane Bressan

Published in: Transactions on Large-Scale Data- and Knowledge-Centered Systems XLIII

Publisher: Springer Berlin Heidelberg

Abstract

The analysis of relational database access for the purpose of audit and anomaly detection can be based on the classification of queries according to user roles. One such approach is DBSAFE, a database anomaly detection system, which uses a Naïve Bayes classifier to detect anomalous queries in Role-based Access Control (RBAC) environments. We propose to consider the usual machine learning algorithms for classification tasks: K-Nearest Neighbours, Random Forest, Support Vector Machine and Convolutional Neural Network, as alternatives to DBSAFE’s Naïve Bayes classifier. We identify the need for an effective representation of the input to the classifiers. We propose the utilisation of a query embedding mechanism with the classifiers. We comparatively and empirically evaluate the performance of different algorithms and variants with two benchmarks: the comprehensive off-the-shelf OLTP-Bench benchmark and a variant of the CH-benCHmark that we extended with hand-crafted user roles for database access classification. The empirical comparative evaluation shows clear benefits in the utilisation of the machine learning tools.
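
The role-classification idea described in the abstract can be illustrated with a minimal sketch. It trains a multinomial Naïve Bayes classifier with Laplace smoothing on bag-of-token features, then flags a query as anomalous when the predicted role differs from the role of the user who issued it. This is not DBSAFE's actual feature set and not the query embedding the authors propose: the tokenizer, the role names (`teller`, `analyst`), the example queries, and the class name `NaiveBayesRoleClassifier` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(query):
    # Lowercased alphabetic tokens; a crude, assumed stand-in for real
    # feature extraction or a learned query embedding.
    return re.findall(r"[a-z_]+", query.lower())


class NaiveBayesRoleClassifier:
    """Multinomial Naive Bayes over query tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.role_counts = Counter()             # number of queries per role
        self.token_counts = defaultdict(Counter) # token frequencies per role
        self.vocab = set()

    def fit(self, queries, roles):
        for query, role in zip(queries, roles):
            self.role_counts[role] += 1
            for tok in tokenize(query):
                self.token_counts[role][tok] += 1
                self.vocab.add(tok)
        return self

    def log_scores(self, query):
        total = sum(self.role_counts.values())
        scores = {}
        for role, n in self.role_counts.items():
            denom = sum(self.token_counts[role].values()) + self.alpha * len(self.vocab)
            score = math.log(n / total)          # log prior of the role
            for tok in tokenize(query):
                if tok in self.vocab:            # ignore tokens never seen in training
                    count = self.token_counts[role][tok]
                    score += math.log((count + self.alpha) / denom)
            scores[role] = score
        return scores

    def predict(self, query):
        scores = self.log_scores(query)
        return max(scores, key=scores.get)

    def is_anomalous(self, query, claimed_role):
        # A query is flagged when its most likely role is not the issuer's role.
        return self.predict(query) != claimed_role


# Hypothetical training log of (query, role) pairs.
training_log = [
    ("SELECT name, balance FROM accounts WHERE id = 7", "teller"),
    ("UPDATE accounts SET balance = balance - 50 WHERE id = 7", "teller"),
    ("SELECT region, SUM(amount) FROM orders GROUP BY region", "analyst"),
    ("SELECT AVG(amount) FROM orders WHERE region = 'EU'", "analyst"),
]
clf = NaiveBayesRoleClassifier().fit(*zip(*training_log))

aggregate_query = "SELECT SUM(amount) FROM orders GROUP BY region"
print(clf.predict(aggregate_query))                 # analyst
print(clf.is_anomalous(aggregate_query, "teller"))  # True
```

Replacing the token counts with a dense query embedding, and the Naïve Bayes model with K-Nearest Neighbours, Random Forest, a Support Vector Machine or a Convolutional Neural Network, gives the kind of alternative the chapter evaluates. The sketch makes no claim about which of these performs better.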

Title: Role-Based Access Classification: Evaluating the Performance of Machine Learning Algorithms
Authors: Randy Julian, Edward Guyot, Shaowen Zhou, Geong Sen Poh, Stéphane Bressan
Publisher: Springer Berlin Heidelberg
Book: Transactions on Large-Scale Data- and Knowledge-Centered Systems XLIII
Print ISBN: 978-3-662-62198-1

Electronic ISBN: 978-3-662-62199-8

Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-662-62199-8_1
