Published in: Cluster Computing 1/2016

1 March 2016

Data security rules/regulations based classification of file data using TsF-kNN algorithm

Authors: Munwar Ali Zardari, Low Tang Jung


Abstract

Personal and organisational data keep growing in volume over time. Because data are so important to organisations, their effective and efficient management and categorisation need special attention. Understanding data security policies and applying them to the appropriate data types is therefore a core concern in large organisations such as cloud service providers. With data classification, the security requirements of the data can be identified without manual intervention, and encryption is applied only to the confidential data, which saves encryption time, decryption time, storage and processing power. The proposed data classification approach aims to reduce network traffic, additional data movement and overload, and it allows the storage location for confidential data to be chosen so that the security requirements of those data are fulfilled. In this paper, an intelligent data classification approach is presented for predicting the confidentiality/sensitivity level of the data in a file based on corporate objectives and government policies/rules. An enhanced version of the k-NN algorithm is also proposed to reduce the computational complexity of the traditional k-NN algorithm in the data classification phase. The proposed algorithm is called Training dataset Filtration-kNN (TsF-kNN). The experimental results show that the data in a file can be classified into confidential and non-confidential classes and that the TsF-kNN algorithm performs better than the traditional k-NN and Naïve Bayes algorithms.
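The abstract does not say how TsF-kNN filters the training dataset, so the following is only a minimal sketch of the general idea of shrinking the training set before running k-NN, and not the authors' algorithm. The filtration rule (keep only training vectors that share at least one non-zero feature with the query), the keyword-count features, the toy data and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """Traditional k-NN: majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def filter_training_set(train, query):
    """Hypothetical filtration rule (an assumption, not from the paper):
    keep training points that share at least one non-zero feature with
    the query. Falls back to the full set if nothing survives."""
    kept = [(vec, label) for vec, label in train
            if any(x and q for x, q in zip(vec, query))]
    return kept or train


def filtered_knn_predict(train, query, k=3):
    """k-NN run on the filtered training set only, so fewer distances
    are computed at classification time."""
    candidates = filter_training_set(train, query)
    return knn_predict(candidates, query, k=min(k, len(candidates)))


# Toy data: counts of four hypothetical keywords per file,
# e.g. ("salary", "passport", "menu", "weather").
train = [
    ((3, 1, 0, 0), "confidential"),
    ((2, 2, 0, 0), "confidential"),
    ((1, 3, 0, 0), "confidential"),
    ((0, 0, 2, 1), "non-confidential"),
    ((0, 0, 1, 3), "non-confidential"),
    ((0, 0, 3, 2), "non-confidential"),
]
query = (2, 1, 0, 0)

print(filtered_knn_predict(train, query, k=3))
```

In this toy case the filter discards half of the training points before any distance is computed. A real system would need a filtration rule that is shown not to remove the true nearest neighbours.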


Metadata
Title
Data security rules/regulations based classification of file data using TsF-kNN algorithm
Authors
Munwar Ali Zardari
Low Tang Jung
Publication date
1 March 2016
Publisher
Springer US
Published in
Cluster Computing / Issue 1/2016
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-016-0539-z
