Published in: Knowledge and Information Systems 5/2022

07-04-2022 | Regular Paper

Modeling and predicting students’ engagement behaviors using mixture Markov models

Authors: Rabia Maqsood, Paolo Ceravolo, Cristóbal Romero, Sebastián Ventura


Abstract

Students’ engagement reflects their level of involvement in an ongoing learning process, which can be estimated through their interactions with a computer-based learning or assessment system. A prerequisite for stimulating student engagement is an approximate representation model for comprehending students’ varied (dis)engagement behaviors. In this paper, we use model-based clustering for this purpose, which generates \(K\) mixture Markov models to group students’ traces containing their (dis)engagement behavioral patterns. To prevent the Expectation–Maximization (EM) algorithm from getting stuck in a local maximum, we also introduce a K-means-based initialization method named K-EM. We performed experiments on two real datasets using three variants of the EM algorithm (the original EM, emEM, and K-EM) and non-mixture baseline models for both datasets. The proposed K-EM showed very promising results and achieved a significant performance difference in comparison with the other approaches, particularly on Dataset1. Hence, we suggest performing further experiments using large dataset(s) to validate our method. Additionally, visualization of the resultant clusters through first-order Markov chains reveals very useful insights about the (dis)engagement behaviors exhibited by the students. We conclude the paper with a discussion of the usefulness of our approach, its limitations, and potential extensions of this work.
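To make the clustering idea in the abstract concrete, the following is a minimal sketch of EM for a mixture of first-order Markov chains. It is not the authors' K-EM or their R implementation. The function name `fit_mixture_markov`, the smoothing constant `alpha`, and the choice to start from a hard partition supplied by the caller (for example, one produced by a separate K-means step) are all assumptions made for illustration.

```python
import math

def fit_mixture_markov(seqs, n_states, labels, k, n_iter=20, alpha=1e-3):
    """EM for a mixture of k first-order Markov chains (illustrative sketch).

    seqs   : list of non-empty state sequences with states in 0..n_states-1
    labels : initial hard cluster label per sequence (hypothetical input,
             e.g. the output of a separate K-means step)
    alpha  : small smoothing count that keeps every probability non-zero
    """
    n = len(seqs)
    # Start from one-hot responsibilities given by the initial partition.
    resp = [[1.0 if labels[i] == c else 0.0 for c in range(k)] for i in range(n)]
    loglik = -math.inf
    for _ in range(n_iter):
        # M-step: re-estimate mixing weights, initial-state and transition
        # probabilities from responsibility-weighted counts.
        mix = [(sum(r[c] for r in resp) + alpha) / (n + k * alpha) for c in range(k)]
        init = [[alpha] * n_states for _ in range(k)]
        trans = [[[alpha] * n_states for _ in range(n_states)] for _ in range(k)]
        for seq, r in zip(seqs, resp):
            for c in range(k):
                init[c][seq[0]] += r[c]
                for a, b in zip(seq, seq[1:]):
                    trans[c][a][b] += r[c]
        for c in range(k):
            total = sum(init[c])
            init[c] = [v / total for v in init[c]]
            for a in range(n_states):
                total = sum(trans[c][a])
                trans[c][a] = [v / total for v in trans[c][a]]
        # E-step: posterior probability of each component for each sequence,
        # computed in log space to avoid underflow on long sequences.
        loglik = 0.0
        for i, seq in enumerate(seqs):
            logp = []
            for c in range(k):
                lp = math.log(mix[c]) + math.log(init[c][seq[0]])
                for a, b in zip(seq, seq[1:]):
                    lp += math.log(trans[c][a][b])
                logp.append(lp)
            m = max(logp)
            norm = m + math.log(sum(math.exp(lp - m) for lp in logp))
            resp[i] = [math.exp(lp - norm) for lp in logp]
            loglik += norm
    return mix, init, trans, resp, loglik

# Toy usage: three "sticky" traces and two alternating traces over 2 states.
seqs = [[0] * 5 + [1] * 5, [1] * 5 + [0] * 5, [0] * 10, [0, 1] * 5, [1, 0] * 5]
mix, init, trans, resp, loglik = fit_mixture_markov(seqs, 2, [0, 0, 0, 1, 1], 2)
```

Each fitted component is itself a first-order Markov chain (`init[c]`, `trans[c]`), so a cluster can be inspected or drawn as a transition graph. Because EM only finds a local optimum, the result depends on the initial partition, which is the motivation the abstract gives for a dedicated initialization step.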


Footnotes
1
We used a binary scale for confidence measurement instead of a more complex rating (e.g., a percentage rating between 0 and 100 or a Likert-scale response), which may confuse students when estimating their confidence in a solution’s correctness [46, 53].
 
2
We have provided the R code implementation of this step on GitHub: https://github.com/r-maqsood/Mixture-Markov-Models-R.
 
3
We refer to the value of \(K\) used in the K-means algorithm as \(K'\) to differentiate it from the optimal value of \(K\) used for the EM and emEM algorithms.
 
4
These results are not reported in Table 8 for conciseness.
 
5
All plots were drawn using r-igraph: https://igraph.org/r/.
 
6
Furthermore, states are filled with different colors to highlight their meanings. For example, engagement behavior reflected with either confidence level is represented by two states, FG and LE, which are given the same color (yellow) in the images. Similarly, the states representing disengagement behaviors, KG and NI, are shaded with the same color (blue). High knowledge (HK) and less knowledge (LK) states are differentiated with gray and white colors, respectively; see the colored pictures in the online PDF version.
 
Literature
1.
Akaike H (1998) Information theory and an extension of the maximum likelihood principle. Selected papers of Hirotugu Akaike. Springer, Berlin, pp 199–213
2.
Anderson E (2017) Measurement of online student engagement: Utilization of continuous online student behaviors as items in a partial credit Rasch model. PhD thesis, Morgridge College of Education, University of Denver, USA, Electronic Theses and Dissertations. 1248
3.
Beal CR, Qu L, Lee H (2006) Classifying learner engagement through integration of multiple data sources. In: AAAI, pp 151–156
4.
Beal C, Mitra S, Cohen P (2007) Modeling learning patterns of students with a tutoring system using hidden Markov model. In: Luckin R et al (eds) Proceedings of the 13th international conference on Artificial intelligence in education (AIED). Marina del Rey
5.
Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate gaussian mixture models. Comput Stat Data Anal 41(3–4):561–575
6.
Boroujeni MS, Dillenbourg P (2018) Discovery and temporal analysis of latent study patterns in MOOC interaction sequences. In: Proceedings of the 8th international conference on learning analytics and knowledge. ACM, pp 206–215
7.
Botelho AF, Baker RS, Heffernan NT (2019) Machine-learned or expert-engineered features? Exploring feature engineering methods in detectors of student behavior and affect. In: The twelfth international conference on educational data mining
8.
Bouchet F, Harley JM, Trevors GJ, Azevedo R (2013) Clustering and profiling students according to their interactions with an intelligent tutoring system fostering self-regulated learning. JEDM J Educ Data Min 5(1):104–146
9.
Bouvier P, Sehaba K, Lavoué É (2014) A trace-based approach to identifying users’ engagement and qualifying their engaged-behaviours in interactive systems: application to a social game. User Model User Adapt Interact 24(5):413–451
10.
Brown LN, Howard AM (2014) A real-time model to assess student engagement during interaction with intelligent educational agents. In: 2014 ASEE annual conference & exposition, pp 24–95
11.
Cadez I, Heckerman D, Meek C et al (2003) Model-based clustering and visualization of navigation patterns on a web site. Data Min Knowl Discov 7(4):399–424
12.
Chapman E (2003) Alternative approaches to assessing student engagement rates. Pract Assess 8(13):1–7
13.
Charrad M, Ghazzali N, Boiteau V, Niknafs A (2012) Nbclust package: finding the relevant number of clusters in a dataset. UseR! 2012
14.
Cocea M, Weibelzahl S (2007) Cross-system validation of engagement prediction from log files. European conference on technology enhanced learning. Springer, Berlin, pp 14–25
15.
Cocea M, Weibelzahl S (2009) Log file analysis for disengagement detection in e-learning environments. User Model User Adapt Interact 19(4):341–385
16.
Cocea M, Weibelzahl S (2011) Disengagement detection in online learning: Validation studies and perspectives. IEEE Trans Learn Technol 4(2):114–124
17.
Cohen PR, Beal CR (2009) Temporal data mining for educational applications. Int J Softw Inform 3(1):31–46
18.
Cumming G, Finch S (2005) Inference by eye: confidence intervals and how to read pictures of data. Am Psychol 60(2):170
19.
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–22
20.
Desmarais MC, Baker RS (2012) A review of recent advances in learner and skill modeling in intelligent learning environments. User Model User Adapt Interact 22(1–2):9–38
21.
Dziak JJ, Coffman DL, Lanza ST et al (2019) Sensitivity and specificity of information criteria. bioRxiv, p 449751
22.
Fok AW, Wong HS, Chen Y (2005) Hidden Markov model based characterization of content access patterns in an e-learning environment. In: 2005 IEEE international conference on multimedia and expo. IEEE, pp 201–204
23.
Fredricks JA, Blumenfeld PC, Paris AH (2004) School engagement: potential of the concept, state of the evidence. Rev Educ Res 74(1):59–109
24.
Gardner-Medwin AR, Gahan M (2003) Formative and summative confidence-based assessment. Loughborough University
25.
Gupta MR, Chen Y et al (2011) Theory and use of the EM algorithm. Found Trends® Signal Process 4(3):223–296
26.
Hansen C, Hansen C, Hjuler N et al. (2017) Sequence modelling for analysing student interaction with educational systems. In: Proceedings of the 10th international conference on educational data mining (2017), pp 232–237
27.
Hershkovitz A, Nachmias R (2009) Learning about online learning processes and students’ motivation through web usage mining. Interdiscip J E-Learning Learn Objects 5(1):197–214
28.
Hu Z (2015) Initializing the EM algorithm for data clustering and sub-population detection. PhD thesis, The Ohio State University
29.
Huang Z (1998) Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min Knowl Discov 2(3):283–304
30.
Hunt DP (2003) The concept of knowledge and how to measure it. J intellect Cap 4(1):100–113
31.
Joseph E (2005) Engagement tracing: using response times to model student disengagement. Artif Intell Educ Support Learn Intell Soc Inf Technol 125:88
32.
Khalil F, Wang H, Li J (2007) Integrating Markov model with clustering for predicting web page accesses. In: Proceeding of the 13th Australasian world wide web conference (AusWeb07), AusWeb, pp 63–74
33.
Köck M, Paramythis A (2011) Activity sequence modelling and dynamic clustering for personalized e-learning. User Model User Adapt Interact 21(1–2):51–97
34.
Lopez MI, Luna JM, Romero C, Ventura S (2012) Classification via clustering for predicting final marks based on student participation in forums. In: International educational data mining society
35.
Magidson J, Vermunt J (2002) Latent class models for clustering: a comparison with k-means. Can J Mark Res 20(1):36–43
36.
Maqsood R, Ceravolo P (2018) Modeling behavioral dynamics in confidence-based assessment. In: 2018 IEEE 18th international conference on advanced learning technologies (ICALT). IEEE, pp 452–454
37.
Maqsood R, Ceravolo P (2019) Corrective feedback and its implications on students’ confidence-based assessment. Technology enhanced assessment 2018–communications in computer and information science (CCIS). Springer, Berlin, pp 55–72
38.
Maqsood R, Ceravolo P, Ventura S (2019) Discovering students’ engagement behaviors in confidence-based assessment. In: 2019 IEEE global engineering education conference (EDUCON). IEEE, pp 841–846
39.
Melnykov V (2016) Clickclust: an R package for model-based clustering of categorical sequences. J Stat Softw 74(i09)
40.
41.
Michael S, Melnykov V (2016) An effective strategy for initializing the EM algorithm in finite mixture models. Adv Data Anal Classif 10(4):563–583
42.
Muldner K, Burleson W, Van de Sande B, VanLehn K (2011) An analysis of students’ gaming behaviors in an intelligent tutoring system: predictors and impacts. User Model User Adapt Interact 21(1–2):99–135
43.
Pardos ZA, Baker RS, San Pedro M et al (2014) Affective states and state tests: investigating how affect and engagement during the school year predict end-of-year learning outcomes. J Learn Anal 1(1):107–128
44.
Park J, Yu R, Rodriguez F, et al (2018) Understanding student procrastination via mixture models. In: Proceedings of the 11th international conference on educational data mining (2018)
45.
Pelánek R (2018) The details matter: methodological nuances in the evaluation of student models. User Model User Adapt Interact 28(3):207–235
46.
Petr DW (2000) Measuring (and enhancing?) student confidence with confidence scores. In: Frontiers in education conference, 2000. FIE 2000. 30th Annual, IEEE, vol 1, pp T4B–1
47.
Rabiner LR, Juang BH (1986) An introduction to hidden Markov models. IEEE ASSP Mag 3(1):4–16
48.
Romero C, Ventura S, De Bra P (2004) Knowledge discovery with genetic programming for providing feedback to courseware authors. User Model User Adapt Interact 14(5):425–464
49.
Salzberg SL (1997) On comparing classifiers: pitfalls to avoid and a recommended approach. Data Min Knowl Discov 1(3):317–328
51.
Tan L, Sun X, Khoo ST (2014) Can engagement be compared? Measuring academic engagement for comparison. In: EDM, pp 213–216
52.
Taraghi B, Saranti A, Ebner M et al (2015) Towards a learning-aware application guided by hierarchical classification of learner profiles. J UCS 21(1):93–109
53.
Vasilyeva E, Pechenizkiy M, De Bra P (2008) Tailoring of feedback in web-based learning: the role of response certitude in the assessment. Intelligent tutoring systems. Springer, Berlin, pp 771–773
54.
Vogt KL (2016) Measuring student engagement using learning management systems. PhD thesis, University of Toronto, Canada
Metadata
Title
Modeling and predicting students’ engagement behaviors using mixture Markov models
Authors
Rabia Maqsood
Paolo Ceravolo
Cristóbal Romero
Sebastián Ventura
Publication date
07-04-2022
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 5/2022
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-022-01674-9
