Introduction
Background
Educational early warning systems
Methodology
Participants and data collection
Event Type | Number of Logs |
---|---|
open | 2,121 |
next | 46,176 |
previous | 17,025 |
jump | 470 |
redmarker | 51 |
yellowmarker | 82 |
memo | 416 |
bookmark | 93 |
other | 1,491 |
total | 67,925 |
Data preprocessing
Features | Description |
---|---|
totalevent | Total number of events |
content | Number of different contents studied by the student |
session | Number of reading sessions by the student |
time | Total time spent on the eBook system, in minutes |
week | Number of distinct weeks in which the student used the system |
day | Number of distinct days on which the student used the system |
completionrate | Average completion rate of all books |
longevent | Number of events longer than 3 s |
shortevent | Number of events less than or equal to 3 s |
next | Number of Next events |
previous | Number of Previous events |
jump | Number of Jump events |
redmarker | Number of red markers added by the student |
yellowmarker | Number of yellow markers added by the student |
memo | Number of memos added by the student |
bookmark | Number of bookmarks added by the student |
score | Academic performance of students at the end of the semester |
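As an illustration of how several of the features above could be derived from a click-stream log, the following is a minimal sketch for a single student. The tuple layout `(timestamp, event_type, duration_seconds)`, the function name `extract_features`, and the assumption that each event already carries its duration are all hypothetical; only the feature names and the 3 s threshold come from the table.

```python
from collections import Counter
from datetime import datetime


def extract_features(events, threshold_s=3.0):
    """Aggregate one student's events into a feature dictionary.

    `events` is a list of (iso_timestamp, event_type, duration_seconds)
    tuples. This layout is an assumption made for illustration only.
    """
    counts = Counter(event_type for _, event_type, _ in events)
    stamps = [datetime.fromisoformat(ts) for ts, _, _ in events]

    # Distinct calendar days and distinct (ISO year, ISO week) pairs.
    days = {stamp.date() for stamp in stamps}
    weeks = {tuple(stamp.isocalendar()[:2]) for stamp in stamps}

    # Events strictly longer than the threshold are "long"; the rest are "short".
    long_events = sum(1 for _, _, dur in events if dur > threshold_s)

    return {
        "totalevent": len(events),
        "week": len(weeks),
        "day": len(days),
        "longevent": long_events,
        "shortevent": len(events) - long_events,
        "next": counts["next"],
        "previous": counts["previous"],
        "jump": counts["jump"],
        "redmarker": counts["redmarker"],
        "yellowmarker": counts["yellowmarker"],
        "memo": counts["memo"],
        "bookmark": counts["bookmark"],
    }


# Made-up sample log for one student.
sample = [
    ("2019-04-08T10:00:00", "open", 5.0),
    ("2019-04-08T10:00:05", "next", 2.0),
    ("2019-04-08T10:00:07", "next", 3.0),
    ("2019-04-16T09:00:00", "previous", 10.0),
    ("2019-04-16T09:00:10", "redmarker", 1.0),
]
features = extract_features(sample)
```

Features such as `content`, `session`, `time`, and `completionrate` are omitted here because they depend on session and book metadata that this sketch does not model.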
Data analysis
Results
Descriptive statistics and data preprocessing
Features | High Performers (n = 45), mean | High Performers (n = 45), SD | Low Performers (n = 45), mean | Low Performers (n = 45), SD | t Test |
---|---|---|---|---|---|
contentcount | 9.33 | 3.84 | 5.2 | 5.15 | t = 4.31 (p < 0.001) |
sessioncount | 11.62 | 5.82 | 4.73 | 4.83 | t = 6.11 (p < 0.001) |
totaltime | 320 | 288 | 93.44 | 109 | t = 4.92 (p < 0.001) |
totalevent | 1068 | 1093 | 441 | 607 | t = 3.36 (p < 0.01) |
uniqueweek | 6.24 | 2.47 | 2.58 | 2.58 | t = 6.88 (p < 0.001) |
uniqueday | 7.44 | 3.30 | 3.02 | 2.97 | t = 6.68 (p < 0.001) |
completionrate | 67.13 | 24.77 | 40.84 | 34.69 | t = 4.13 (p < 0.001) |
longevent | 328 | 330 | 130 | 172 | t = 3.57 (p < 0.001) |
shortevent | 739 | 789 | 311 | 456 | t = 3.15 (p < 0.01) |
next | 719 | 694 | 307 | 425 | t = 3.39 (p < 0.01) |
prev | 272 | 305 | 105 | 155 | t = 3.26 (p < 0.01) |
open | 32.67 | 26.38 | 14.47 | 17.03 | t = 3.88 (p < 0.001) |
jump* | 8.93 | 17.29 | 1.51 | 4.19 | t = 2.79 (p < 0.01) |
marker* | 2.36 | 4.28 | 0.6 | 2.09 | t = 2.47 (p < 0.05) |
memo* | 8.91 | 42.32 | 0.33 | 1.55 | t = 1.36 (p < 0.5) |
bookmark* | 0.44 | 0.72 | 3.42 | 15.26 | t = −1.3 (p < 0.5) |
score | 84.58 | 11.25 | 47.71 | 19.72 | t = 10.8 (p < 0.001) |
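The t-values in the table can be sanity-checked from the reported means and standard deviations alone. The sketch below uses the unpooled (Welch) form of the statistic; with equal group sizes, as here (n = 45 each), this gives the same t-value as the pooled-variance form. Whether the authors used the pooled or the Welch test is not stated in this section, and the function name `t_from_summary` is hypothetical.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic computed from group summary statistics.

    Uses the unpooled standard error sqrt(sd1^2/n1 + sd2^2/n2).
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# contentcount row: high performers 9.33 (SD 3.84), low performers 5.2 (SD 5.15).
t_content = t_from_summary(9.33, 3.84, 45, 5.2, 5.15, 45)
print(round(t_content, 2))  # 4.31, as reported in the table
```

Because the table entries are rounded, recomputed values for other rows may differ from the reported ones in the last digit.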
Prediction models
Prediction models with all data
Algorithm | Raw Data Accuracy | Raw Data Kappa | Transformed Data Accuracy | Transformed Data Kappa | Categorical Data Accuracy | Categorical Data Kappa |
---|---|---|---|---|---|---|
Adaboost | 0.790 | 0.580 | 0.783 | 0.567 | 0.748 | 0.494 |
bartMachine | 0.811 | 0.620 | 0.813 | 0.625 | 0.792 | 0.583 |
gbm | 0.795 | 0.589 | 0.793 | 0.586 | 0.782 | 0.565 |
glm | 0.753 | 0.504 | 0.728 | 0.454 | 0.680 | 0.359 |
J48 | 0.813 | 0.625 | 0.833 | 0.665 | 0.766 | 0.530 |
JRip | 0.795 | 0.587 | 0.782 | 0.564 | 0.755 | 0.509 |
knn | 0.813 | 0.627 | 0.798 | 0.595 | 0.805 | 0.611 |
naive_bayes | 0.823 | 0.646 | 0.801 | 0.601 | 0.811 | 0.621 |
nnet | 0.710 | 0.420 | 0.780 | 0.558 | 0.754 | 0.505 |
rf | 0.823 | 0.647 | 0.824 | 0.644 | 0.782 | 0.563 |
rpart | 0.752 | 0.501 | 0.759 | 0.516 | 0.727 | 0.454 |
svmLinear | 0.776 | 0.550 | 0.749 | 0.498 | 0.684 | 0.369 |
xgbLinear | 0.798 | 0.596 | 0.780 | 0.560 | 0.726 | 0.452 |
Prediction models with weekly data
Week | Raw Data (RF) Accuracy | Raw Data (RF) Kappa | Transformed Data (J48) Accuracy | Transformed Data (J48) Kappa | Categorical Data (NB) Accuracy | Categorical Data (NB) Kappa |
---|---|---|---|---|---|---|
W1 | 0.735 | 0.466 | 0.674 | 0.345 | 0.725 | 0.446 |
W2 | 0.737 | 0.475 | 0.705 | 0.412 | 0.720 | 0.443 |
W3 | 0.790 | 0.580 | 0.718 | 0.435 | 0.775 | 0.548 |
W6 | 0.803 | 0.606 | 0.778 | 0.554 | 0.809 | 0.617 |
W8 | 0.801 | 0.600 | 0.711 | 0.416 | 0.796 | 0.590 |
W9 | 0.798 | 0.596 | 0.739 | 0.478 | 0.819 | 0.637 |
W14 | 0.826 | 0.651 | 0.818 | 0.635 | 0.800 | 0.598 |
W15 | 0.840 | 0.679 | 0.803 | 0.605 | 0.826 | 0.651 |
W16 | 0.777 | 0.551 | 0.697 | 0.394 | 0.718 | 0.438 |
Week | Class | Raw Data (RF) HP | Raw Data (RF) LP | Raw Data (RF) Total | Categorical Data (NB) HP | Categorical Data (NB) LP | Categorical Data (NB) Total |
---|---|---|---|---|---|---|---|
W3 | HP | 40.0% | 10.9% | 50.9% | 36.0% | 8.4% | 44.4% |
W3 | LP | 10.0% | 39.1% | 49.1% | 14.0% | 41.6% | 55.6% |
W3 | Total | 50% | 50% | 100% | 50% | 50% | 100% |
W6 | HP | 39.9% | 9.6% | 49.5% | 33.2% | 2.3% | 35.5% |
W6 | LP | 10.1% | 40.4% | 50.5% | 16.8% | 47.7% | 64.5% |
W6 | Total | 50% | 50% | 100% | 50% | 50% | 100% |
W15 | HP | 43.9% | 7.9% | 51.8% | 38.3% | 5.8% | 44.1% |
W15 | LP | 8.1% | 42.1% | 50.2% | 11.7% | 44.2% | 55.9% |
W15 | Total | 50% | 50% | 100% | 50% | 50% | 100% |
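Accuracy and Cohen's kappa can be recovered from the cell percentages of a confusion matrix such as those above. The following is a minimal sketch using the W3 raw-data random forest cells; the function name `accuracy_and_kappa` is hypothetical, and the code is not the authors' evaluation pipeline. Both measures are unchanged if rows and columns are swapped, so no assumption about which axis holds the predicted class is needed.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    `matrix` may hold counts or percentages; values are normalised by
    their grand total before use.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)

    # Observed agreement: share of cases on the main diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total

    # Expected agreement under independence of the row and column marginals.
    row_share = [sum(row) / total for row in matrix]
    col_share = [sum(matrix[i][j] for i in range(size)) / total
                 for j in range(size)]
    expected = sum(r * c for r, c in zip(row_share, col_share))

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# W3, raw data, random forest: rows HP and LP, columns HP and LP (percent).
w3_rf = [[40.0, 10.9],
         [10.0, 39.1]]
acc, kappa = accuracy_and_kappa(w3_rf)
```

For these cells the sketch gives an accuracy of about 0.791 and a kappa of about 0.582, close to the 0.790 and 0.580 reported for W3 in the weekly results table; the small gap is consistent with rounding of the published percentages.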