Published in: Arabian Journal for Science and Engineering 8/2022

17-01-2022 | Research Article-Computer Engineering and Computer Science

Predicting Student Performance from Online Engagement Activities Using Novel Statistical Features

Author: Ghassen Ben Brahim


Abstract

Predicting students’ performance during their years of academic study has been extensively investigated. Such predictions offer insights that help institutions make timely decisions and changes that lead to better student outcomes. In the post-COVID-19 pandemic era, the adoption of e-learning has gained momentum and has increased the availability of online learning data. This has encouraged researchers to develop machine learning (ML)-based models to predict students’ performance during online classes. The study presented in this paper focuses on predicting student performance during a series of online interactive sessions, using a dataset collected with the Digital Electronics Education and Design Suite (DEEDS). The dataset tracks the interaction of students during online lab work in terms of text editing, number of keystrokes, time spent in each activity, etc., along with the exam score achieved per session. Our proposed prediction model extracts a total of 86 novel statistical features, which are grouped semantically into three broad categories: (1) activity type, (2) timing statistics, and (3) peripheral activity count. This feature set was further reduced during the feature selection phase, and only influential features were retained for training. Our proposed ML model aims to predict whether a student’s performance will be low or high. Five popular classifiers were used in our study: random forest (RF), support vector machine, Naïve Bayes, logistic regression, and multilayer perceptron. We evaluated our model under three different scenarios: (1) an 80:20 random data split for training and testing, (2) fivefold cross-validation, and (3) training the model on all sessions but one, with the held-out session used for testing. The model achieved its best classification accuracy, 97.4%, with the RF classifier. We demonstrated that, under a similar experimental setup, our model outperformed existing studies.
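The three evaluation scenarios named in the abstract can be sketched as index-splitting routines. This is a minimal illustration using only the Python standard library, not the author's implementation: the function names, the fixed random seed, and the example session labels are all hypothetical, and the abstract does not say how samples were shuffled or grouped.

```python
import random
from typing import Dict, List, Sequence, Tuple

Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def random_split(n: int, test_fraction: float = 0.2, seed: int = 0) -> Split:
    """Scenario 1: shuffle the sample indices and hold out a fraction for testing."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def k_fold_splits(n: int, k: int = 5, seed: int = 0) -> List[Split]:
    """Scenario 2: partition the samples into k folds; each fold is the test set once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        splits.append((train, test))
    return splits


def leave_one_session_out(sessions: Sequence[int]) -> Dict[int, Split]:
    """Scenario 3: for each session id, test on that session and train on all others."""
    out: Dict[int, Split] = {}
    for held_out in sorted(set(sessions)):
        train = [i for i, s in enumerate(sessions) if s != held_out]
        test = [i for i, s in enumerate(sessions) if s == held_out]
        out[held_out] = (train, test)
    return out


if __name__ == "__main__":
    # Hypothetical session label for each sample (one row per student-session).
    sessions = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
    train, test = random_split(len(sessions))
    print("80:20 split sizes:", len(train), len(test))
    print("number of folds:", len(k_fold_splits(len(sessions), k=5)))
    for session_id, (tr, te) in leave_one_session_out(sessions).items():
        print("held-out session", session_id, "train:", tr, "test:", te)
```

Each routine returns index lists rather than data, so the same splits can be reused across all five classifiers. The third scenario is the strictest of the three because no sample from the test session is ever seen during training.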


Metadata
Title
Predicting Student Performance from Online Engagement Activities Using Novel Statistical Features
Author
Ghassen Ben Brahim
Publication date
17-01-2022
Publisher
Springer Berlin Heidelberg
Published in
Arabian Journal for Science and Engineering / Issue 8/2022
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI
https://doi.org/10.1007/s13369-021-06548-w
