2019 | OriginalPaper | Chapter

Classification, Clustering and Association Rule Mining in Educational Datasets Using Data Mining Tools: A Case Study

Authors: Sadiq Hussain, Rasha Atallah, Amirrudin Kamsin, Jiten Hazarika

Published in: Cybernetics and Algorithms in Intelligent Systems

Publisher: Springer International Publishing

Abstract

Educational data mining is an emerging field within data mining. In a competitive world, the quality of education needs to improve. Unfortunately, most student data end up as data tombs because the knowledge hidden in them is never analyzed. Educational data mining tries to uncover this hidden knowledge by discovering relationships between students' learning characteristics and behavior. With such models, educators can plan future pedagogy to support each student's learning style, and academic planners can apply the findings to improve the quality of education and reduce the failure rate. In this paper, we collected a real dataset containing 666 instances with 11 attributes. The data come from the Common Entrance Examination (CEE) of a particular year for admission to the medical colleges of Assam, India, conducted by Dibrugarh University. We mined association rules from the data and compared various clustering and classification methods to identify those best suited to the dataset. The data mining tools applied to the educational data were Orange, Weka and RStudio.
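
To make the association-rule step concrete, the following is a minimal sketch of how support and confidence are computed for simple one-to-one rules. It is not the authors' procedure or the implementation used by Orange, Weka or R. The records, the attribute names (`board`, `area`, `result`), the helper names and the thresholds are all hypothetical and are not taken from the CEE dataset.

```python
from itertools import combinations

# Hypothetical toy records; attribute names and values are invented for
# illustration and are NOT taken from the CEE dataset.
records = [
    {"board=state", "area=urban", "result=pass"},
    {"board=central", "area=urban", "result=pass"},
    {"board=state", "area=rural", "result=fail"},
    {"board=central", "area=urban", "result=pass"},
    {"board=state", "area=urban", "result=fail"},
]


def support(itemset, rows):
    """Fraction of rows that contain every item in itemset."""
    return sum(itemset <= row for row in rows) / len(rows)


def confidence(antecedent, consequent, rows):
    """Support of the whole rule divided by support of its antecedent."""
    return support(antecedent | consequent, rows) / support(antecedent, rows)


def mine_rules(rows, min_support=0.4, min_confidence=0.7):
    """Enumerate single-item -> single-item rules passing both thresholds."""
    items = sorted(set().union(*rows))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, rows)
            if s < min_support:
                continue  # the item pair is too rare to be interesting
            c = confidence({lhs}, {rhs}, rows)
            if c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


rules = mine_rules(records)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}  support={s:.2f}  confidence={c:.2f}")
```

Production tools prune the search with the Apriori property (every subset of a frequent itemset is itself frequent) rather than enumerating all pairs, and they handle rules with more than one item on either side; the brute-force loop here is only meant to show what the two measures mean.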

Metadata
Title
Classification, Clustering and Association Rule Mining in Educational Datasets Using Data Mining Tools: A Case Study
Authors
Sadiq Hussain
Rasha Atallah
Amirrudin Kamsin
Jiten Hazarika
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-319-91192-2_21