2021 | OriginalPaper | Chapter

3. Knowledge Hiding in Decision Trees for Learning Analytics Applications

Authors: Georgios Feretzakis, Dimitris Kalles, Vassilios S. Verykios

Published in: Advances in Core Computer Science-Based Technologies

Publisher: Springer International Publishing

Abstract

Educational institutions now have access to a wide range of digital information about learners, including performance records, educational resources, student attendance, feedback on course material, course evaluations, and social network data. Although collecting, using, and sharing educational data offers substantial potential, the sensitivity of the data raises legitimate privacy concerns. Sharing data among educational organizations has become increasingly common. However, an organization that must share its datasets with others will most likely try to keep some patterns hidden. This chapter focuses on preserving the privacy of sensitive patterns when inducing decision trees and demonstrates the application of a heuristic to an educational data set. The heuristic hiding method employed allows the sanitized raw data to be readily available for public use and is thus preferable to other heuristic solutions, such as output perturbation or cryptographic techniques, which limit the usability of the data.

Metadata
Title
Knowledge Hiding in Decision Trees for Learning Analytics Applications
Authors
Georgios Feretzakis
Dimitris Kalles
Vassilios S. Verykios
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-41196-1_3
