Published in: Journal of Science Education and Technology 2/2021

04-01-2021

Testing the Impact of Novel Assessment Sources and Machine Learning Methods on Predictive Outcome Modeling in Undergraduate Biology

Authors: Roberto Bertolini, Stephen J. Finch, Ross H. Nehm

Abstract

High levels of attrition characterize undergraduate science courses in the USA. Predictive analytics research seeks to build models that identify at-risk students and suggest interventions that enhance student success. This study examines whether incorporating a novel assessment type (concept inventories [CI]) and using machine learning (ML) methods (1) improves prediction quality, (2) allows successful prediction at an earlier time point, and (3) suggests more actionable course-level interventions. A corpus of university and course-level assessment and non-assessment variables (53 variables in total) from 3225 students (over six semesters) was gathered. Five ML methods (two individual, three ensemble) were employed at three time points (pre-course, week 3, week 6) to quantify predictive efficacy. Inclusion of course-specific CI data along with university-specific corpora significantly improved prediction performance. Ensemble ML methods, in particular the generalized linear model with elastic net (GLMNET), yielded significantly higher area under the curve (AUC) values than non-ensemble techniques. Logistic regression consistently achieved the poorest prediction performance. Surprisingly, increasing corpus size (i.e., the amount of historical data) did not meaningfully impact prediction success. We discuss the roles that novel assessment types and ML techniques may play in advancing predictive learning analytics and addressing attrition in undergraduate science education.
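The abstract compares methods by their AUC values. As a minimal sketch of what that metric measures, the Python below computes AUC in its pairwise (Mann-Whitney) form: the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. This is not the authors' code. The function name `auc`, the toy labels, and the two score lists `model_a` and `model_b` are hypothetical and are not taken from the study's data.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case is
    scored higher than a randomly chosen negative case (ties = 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = at-risk student, 0 = not at risk.
labels = [1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]  # ranks every at-risk student on top
model_b = [0.6, 0.5, 0.3, 0.4, 0.7, 0.8]  # makes some ranking errors

auc_a = auc(labels, model_a)
auc_b = auc(labels, model_b)
print(f"model_a AUC = {auc_a:.3f}, model_b AUC = {auc_b:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a higher AUC for one method than another means it ranks at-risk students above other students more often.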

Literature
go back to reference Ade, R., & Deshmukh, P.R. (2014, October). Classification of students by using an incremental ensemble of classifiers. In Proceedings of the 3rd International Conference on Reliability, Infocom Technologies and Optimization (pp. 1-5). IEEE. Ade, R., & Deshmukh, P.R. (2014, October). Classification of students by using an incremental ensemble of classifiers. In Proceedings of the 3rd International Conference on Reliability, Infocom Technologies and Optimization (pp. 1-5). IEEE.
go back to reference Adekitan, A. I., & Noma-Osaghae, E. (2019). Data mining approach to predicting the performance of first year student in a university using the admissions requirement. Education and Information Technologies, 24(2), 1527–1543.CrossRef Adekitan, A. I., & Noma-Osaghae, E. (2019). Data mining approach to predicting the performance of first year student in a university using the admissions requirement. Education and Information Technologies, 24(2), 1527–1543.CrossRef
go back to reference Alexandro, D. (2018). Aiming for Success: Evaluating Statistical and Machine Learning Methods to Predict High School Student Performance and Improve Early Warning Systems. (Doctoral Dissertation). University of Connecticut, Storrs, Connecticut. Alexandro, D. (2018). Aiming for Success: Evaluating Statistical and Machine Learning Methods to Predict High School Student Performance and Improve Early Warning Systems. (Doctoral Dissertation). University of Connecticut, Storrs, Connecticut.
go back to reference Allensworth, E. M., & Easton, J. Q. (2005). The on-track indicator as a predictor of high school graduation. Chicago, Illinois: Consortium on Chicago School Research. Allensworth, E. M., & Easton, J. Q. (2005). The on-track indicator as a predictor of high school graduation. Chicago, Illinois: Consortium on Chicago School Research.
go back to reference Al-Shabandar, R., Hussain, A., Laws, A., Keight, R., Lunn, J., & Radi, N. (2017). Machine learning approaches to predict learning outcomes in Massive open online courses. 2017 International Joint Conference on Neural Networks (IJCNN) (pp. 713–720). Anchorage: IEEE. Al-Shabandar, R., Hussain, A., Laws, A., Keight, R., Lunn, J., & Radi, N. (2017). Machine learning approaches to predict learning outcomes in Massive open online courses. 2017 International Joint Conference on Neural Networks (IJCNN) (pp. 713–720). Anchorage: IEEE.
go back to reference Ambler, G., Omar, R. Z., & Royston, P. (2007). A comparison of imputation techniques for handling missing predictor values in a risk model with a binary outcome. Statistical methods in medical research, 16(3), 277–298.CrossRef Ambler, G., Omar, R. Z., & Royston, P. (2007). A comparison of imputation techniques for handling missing predictor values in a risk model with a binary outcome. Statistical methods in medical research, 16(3), 277–298.CrossRef
go back to reference American Association for the Advancement of Science (2011). Vision and change in undergraduate biology education. AAAS, Washington D.C. American Association for the Advancement of Science (2011). Vision and change in undergraduate biology education. AAAS, Washington D.C.
go back to reference Amrieh, E. A., Hamtini, T., & Alijarah, I. (2016). Mining educational data to predict student’s academic performance using ensemble methods. International Journal of Database Theory and Application, 9(8), 119–136.CrossRef Amrieh, E. A., Hamtini, T., & Alijarah, I. (2016). Mining educational data to predict student’s academic performance using ensemble methods. International Journal of Database Theory and Application, 9(8), 119–136.CrossRef
go back to reference Anderson, D. L., Fisher, K. M., & Norman, G. J. (2002). Development and evaluation of the conceptual inventory of natural selection. Journal of research in science teaching, 39(10), 952–978.CrossRef Anderson, D. L., Fisher, K. M., & Norman, G. J. (2002). Development and evaluation of the conceptual inventory of natural selection. Journal of research in science teaching, 39(10), 952–978.CrossRef
go back to reference Aulck, L., Aras, R., Li, L., L’Heureux, C., Lu, P., & West, J. (2017). STEM-ming the tide: Predicting STEM attrition using student transcript data. Knowledge Discovery and Data Mining (KDD): Halifax. Aulck, L., Aras, R., Li, L., L’Heureux, C., Lu, P., & West, J. (2017). STEM-ming the tide: Predicting STEM attrition using student transcript data. Knowledge Discovery and Data Mining (KDD): Halifax.
go back to reference Baker, M. (2016). Reproducibility crisis. Nature, 533(26), 353–366. Baker, M. (2016). Reproducibility crisis. Nature, 533(26), 353–366.
go back to reference Baker, R. (2010). Data mining for education. International Encyclopedia of Education, 7(3), 112–118.CrossRef Baker, R. (2010). Data mining for education. International Encyclopedia of Education, 7(3), 112–118.CrossRef
go back to reference Bayer, J., Bydzovská, H., Géryk, J., Obšıvac, T., & Popelinský, L. (2012). Predicting Drop-Out from Social Behaviour of Students. Proceedings of the 5th International Conference on Educational Data Mining - EDM 2012, (pp. 103–109). Chania, Greece. Bayer, J., Bydzovská, H., Géryk, J., Obšıvac, T., & Popelinský, L. (2012). Predicting Drop-Out from Social Behaviour of Students. Proceedings of the 5th International Conference on Educational Data Mining - EDM 2012, (pp. 103–109). Chania, Greece.
go back to reference Beck, H. P., & Davidson, W. D. (2001). Establishing an early warning system: Predicting low grades in college students from survey of academic orientations scores. Research in Higher Education, 42(6), 709–723.CrossRef Beck, H. P., & Davidson, W. D. (2001). Establishing an early warning system: Predicting low grades in college students from survey of academic orientations scores. Research in Higher Education, 42(6), 709–723.CrossRef
go back to reference Beemer, J., Spoon, K., He, L., Fan, J., & Levine, R. (2018). Ensemble learning for estimating individualized treatment effects in student success studies. International Journal of Artificial Intelligence in Education, 28(3), 315–335.CrossRef Beemer, J., Spoon, K., He, L., Fan, J., & Levine, R. (2018). Ensemble learning for estimating individualized treatment effects in student success studies. International Journal of Artificial Intelligence in Education, 28(3), 315–335.CrossRef
go back to reference Beggrow, E. P., Ha, M., Nehm, R. H., Pearl, D., & Boone, W. J. (2014). Assessing scientific practices using machine-learning methods: How closely do they match clinic interview performance? Journal of Science Education and Technology, 23(1), 160–182.CrossRef Beggrow, E. P., Ha, M., Nehm, R. H., Pearl, D., & Boone, W. J. (2014). Assessing scientific practices using machine-learning methods: How closely do they match clinic interview performance? Journal of Science Education and Technology, 23(1), 160–182.CrossRef
go back to reference Bekkar, M., Djemaa, H. K., & Alitouche, T. A. (2013). Evaluation measures for models assessment over imbalanced data sets. Journal of Information Engineering and Applications, 3(10), 27–38. Bekkar, M., Djemaa, H. K., & Alitouche, T. A. (2013). Evaluation measures for models assessment over imbalanced data sets. Journal of Information Engineering and Applications, 3(10), 27–38.
go back to reference Bennett, R. E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy, & Practice, 18(1), 5–25. Bennett, R. E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy, & Practice, 18(1), 5–25.
go back to reference Boyd, D., & Crawford, K. (2011). Six provocations for big data. A decade in internet time: Symposium on the dynamics of the internet and society (Volume 21). Oxford, UK: Oxford Internet Institute. Boyd, D., & Crawford, K. (2011). Six provocations for big data. A decade in internet time: Symposium on the dynamics of the internet and society (Volume 21). Oxford, UK: Oxford Internet Institute.
go back to reference Brooks, C., & Thompson, C. (2017). Predictive modelling in teaching and learning. In C. Lang, G. Siemens, A. Wise, & D. Gašević. Handbook of learning analytics (pp. 61–68). SOLAR, Society of Learning Analytics and Research. Brooks, C., & Thompson, C. (2017). Predictive modelling in teaching and learning. In C. Lang, G. Siemens, A. Wise, & D. Gašević. Handbook of learning analytics (pp. 61–68). SOLAR, Society of Learning Analytics and Research.
go back to reference Bucos, M., & Drăgulescu, B. (2018). Predicting student success using data generated in traditional educational environments. TEM Journal, 7(3), 617. Bucos, M., & Drăgulescu, B. (2018). Predicting student success using data generated in traditional educational environments. TEM Journal, 7(3), 617.
go back to reference Buuren, S. V., & Groothuis-Oudshoorn, K. (2010). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1–68. Buuren, S. V., & Groothuis-Oudshoorn, K. (2010). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1–68.
go back to reference Chang, M. J., Sharkness, J., Hurtado, S., & Newman, C. B. (2014). What matters in college for retaining aspiring scientists and engineers from underrepresented racial groups. Journal of Research in Science Teaching, 51(5), 555–580.CrossRef Chang, M. J., Sharkness, J., Hurtado, S., & Newman, C. B. (2014). What matters in college for retaining aspiring scientists and engineers from underrepresented racial groups. Journal of Research in Science Teaching, 51(5), 555–580.CrossRef
go back to reference Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of artificial intelligence research, 16, 321–357.CrossRef Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of artificial intelligence research, 16, 321–357.CrossRef
go back to reference Chung, J. Y., & Lee, S. (2019). Dropout early warning systems for high school students using machine learning. Children and Youth Services Review, 96, 346–353.CrossRef Chung, J. Y., & Lee, S. (2019). Dropout early warning systems for high school students using machine learning. Children and Youth Services Review, 96, 346–353.CrossRef
go back to reference Cohen, W. (1995). Fast effective rule induction. In Machine Learning Proceedings 1995 (pp. 115–123). Elsevier. Cohen, W. (1995). Fast effective rule induction. In Machine Learning Proceedings 1995 (pp. 115–123). Elsevier.
go back to reference Colton, J., Sbeglia, G., Finch, S. J., & Nehm, R. H. (2018). A quasi-experimental study of short-and long-term learning of evolution in misconception-focused classes. Paper presented at the American Educational Research Association International conference. New York: NY. Colton, J., Sbeglia, G., Finch, S. J., & Nehm, R. H. (2018). A quasi-experimental study of short-and long-term learning of evolution in misconception-focused classes. Paper presented at the American Educational Research Association International conference. New York: NY.
go back to reference Conijn, R., Snijders, C., Kleingeld, A., & Matzat, U. (2016). Predicting student performance from LMS data: A comparison of 17 blended courses using Moodle LMS. IEEE Transactions on Learning Technologies, 10(1), 17–29.CrossRef Conijn, R., Snijders, C., Kleingeld, A., & Matzat, U. (2016). Predicting student performance from LMS data: A comparison of 17 blended courses using Moodle LMS. IEEE Transactions on Learning Technologies, 10(1), 17–29.CrossRef
go back to reference Costa, E. B., Fonseca, B., Santana, M. A., de Araújo, F. F., & Rego, J. (2017). Evaluating the effectiveness of educational data mining techniques for early prediction of students’ academic failure in introductory programming courses. Computers in Human Behavior, 73, 247–256.CrossRef Costa, E. B., Fonseca, B., Santana, M. A., de Araújo, F. F., & Rego, J. (2017). Evaluating the effectiveness of educational data mining techniques for early prediction of students’ academic failure in introductory programming courses. Computers in Human Behavior, 73, 247–256.CrossRef
go back to reference Croninger, R. G., & Douglas, K. M. (2005). Missing data and institutional research. New directions for institutional research, 2005(127), 33–49.CrossRef Croninger, R. G., & Douglas, K. M. (2005). Missing data and institutional research. New directions for institutional research, 2005(127), 33–49.CrossRef
go back to reference Cox, B. E., McIntosh, K., Reason, R. D., & Terenzini, P. T. (2014). Working with missing data in higher education research: A primer and real-world example. The Review of Higher Education, 37(3), 377–402.CrossRef Cox, B. E., McIntosh, K., Reason, R. D., & Terenzini, P. T. (2014). Working with missing data in higher education research: A primer and real-world example. The Review of Higher Education, 37(3), 377–402.CrossRef
go back to reference Daniel, B.K. (2019). Improving the Pedagogy of Research Methodology through Learning Analytics. Electronics Journal of Business Research Methods, 17(1). Daniel, B.K. (2019). Improving the Pedagogy of Research Methodology through Learning Analytics. Electronics Journal of Business Research Methods, 17(1).
go back to reference Davidson, A.C. & Hinkley, D.V. (1997). Bootstrap Methods and their Application (Volume 1). Cambridge University Press. Davidson, A.C. & Hinkley, D.V. (1997). Bootstrap Methods and their Application (Volume 1). Cambridge University Press.
go back to reference Dobson, J. L. (2008). The use of formative online quizzes to enhance class preparation and scores on summative exams. Advances in Physiology Education, 32(4), 297–302.CrossRef Dobson, J. L. (2008). The use of formative online quizzes to enhance class preparation and scores on summative exams. Advances in Physiology Education, 32(4), 297–302.CrossRef
go back to reference Domingos, P. (1999, August). A general method for making classifiers cost-sensitive. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 155–164). Domingos, P. (1999, August). A general method for making classifiers cost-sensitive. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 155–164).
go back to reference Dong, Y., & Peng, C. Y. J. (2013). Principled missing data methods for researchers. SpringerPlus, 2(1), 222.CrossRef Dong, Y., & Peng, C. Y. J. (2013). Principled missing data methods for researchers. SpringerPlus, 2(1), 222.CrossRef
go back to reference Eddy, S. L., Brownell, S. E., & Wenderoth, M. P. (2014). Gender gaps in achievement and participation in multiple introductory biology classrooms. CBE - Life Sciences Education, 13(3), 478–492.CrossRef Eddy, S. L., Brownell, S. E., & Wenderoth, M. P. (2014). Gender gaps in achievement and participation in multiple introductory biology classrooms. CBE - Life Sciences Education, 13(3), 478–492.CrossRef
go back to reference Epling, M., Timmons, S., & Wharrad, H. (2003). An educational panopticon? New technology, nurse education and surveillance. Nurse Education Today, 23(6), 412–418.CrossRef Epling, M., Timmons, S., & Wharrad, H. (2003). An educational panopticon? New technology, nurse education and surveillance. Nurse Education Today, 23(6), 412–418.CrossRef
go back to reference Feng, M., Beck, J.E., & Heffernan, N.T. (2009). Using Learning Decomposition and Bootstrapping with Randomization to Compare the Impact of Different Educational Interventions on Learning. International Working Group on Educational Data Mining. Feng, M., Beck, J.E., & Heffernan, N.T. (2009). Using Learning Decomposition and Bootstrapping with Randomization to Compare the Impact of Different Educational Interventions on Learning. International Working Group on Educational Data Mining.
go back to reference Fox, J., & Weisberg, S. (2018). An R Companion to Applied Regression. Sage Publications. Fox, J., & Weisberg, S. (2018). An R Companion to Applied Regression. Sage Publications.
go back to reference Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Volume 1, No. 10). New York: Springer . Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning (Volume 1, No. 10). New York: Springer .
go back to reference Furrow, R.E., & Hsu, J.L. (2019). Concept inventories as a resource for teaching evolution. Evolution: Education and Outreach, 12(1), 2. Furrow, R.E., & Hsu, J.L. (2019). Concept inventories as a resource for teaching evolution. Evolution: Education and Outreach, 12(1), 2.
go back to reference Getachew, M. (2017). Students' Placement Prediction Model: A Data Mining Approach. (Doctoral Dissertation). Addis Ababa University, Arada, Ethiopia. Getachew, M. (2017). Students' Placement Prediction Model: A Data Mining Approach. (Doctoral Dissertation). Addis Ababa University, Arada, Ethiopia.
go back to reference Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60(1), 549–576.CrossRef Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60(1), 549–576.CrossRef
go back to reference Grimes, P. (2002). The overconfident principles of economics student: An examination of a metacognitive skill. Journal of Economic Education, 33(1), 15–30.CrossRef Grimes, P. (2002). The overconfident principles of economics student: An examination of a metacognitive skill. Journal of Economic Education, 33(1), 15–30.CrossRef
go back to reference Gundlach, E., Richards, K., Nelson, D., & Levesque-Bristol, C. (2015). A comparison of student attitudes, statistical reasoning, performance, and perceptions for web-augmented traditional, fully online, and flipped sections of a statistical literacy class. Journal of Statistics Education, 23(1), 1. Gundlach, E., Richards, K., Nelson, D., & Levesque-Bristol, C. (2015). A comparison of student attitudes, statistical reasoning, performance, and perceptions for web-augmented traditional, fully online, and flipped sections of a statistical literacy class. Journal of Statistics Education, 23(1), 1.
go back to reference Hake, R. R. (1998). Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses. American Journal of Physics, 66, 64–74.CrossRef Hake, R. R. (1998). Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses. American Journal of Physics, 66, 64–74.CrossRef
go back to reference Haudek, K. C., Kaplan, J. J., Knight, J., Long, T., Merrill, J., Munn, A., et al. (2011). Harnessing technology to improve formative assessment of student conceptions in STEM: Forging a national network. CBE - Life Science Education, 10(2), 149–155.CrossRef Haudek, K. C., Kaplan, J. J., Knight, J., Long, T., Merrill, J., Munn, A., et al. (2011). Harnessing technology to improve formative assessment of student conceptions in STEM: Forging a national network. CBE - Life Science Education, 10(2), 149–155.CrossRef
go back to reference Ioannidis, J. P. (2005). Why most published research findings are false. PLoS medicine, 2(8), e124.CrossRef Ioannidis, J. P. (2005). Why most published research findings are false. PLoS medicine, 2(8), e124.CrossRef
go back to reference Jago, R., Zakeri, I., Baranowski, T., & Watson, K. (2007). Decision boundaries and receiver operating characteristic curves: New methods for determining accelerometer cutpoints. Journal of sports sciences, 25(8), 937–944.CrossRef Jago, R., Zakeri, I., Baranowski, T., & Watson, K. (2007). Decision boundaries and receiver operating characteristic curves: New methods for determining accelerometer cutpoints. Journal of sports sciences, 25(8), 937–944.CrossRef
go back to reference Jakobsen, J. C., Gluud, C., Wetterslev, J., & Winkel, P. (2017). When and how should multiple imputation be used for handling missing data in randomised clinical trials – a practical guide with flowcharts. BMC Medical Research Methodology, 17(1), 162.CrossRef Jakobsen, J. C., Gluud, C., Wetterslev, J., & Winkel, P. (2017). When and how should multiple imputation be used for handling missing data in randomised clinical trials – a practical guide with flowcharts. BMC Medical Research Methodology, 17(1), 162.CrossRef
go back to reference James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). An Introduction to Statistical Learning (Vol. 112, p. 184). New York: Springer. James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). An Introduction to Statistical Learning (Vol. 112, p. 184). New York: Springer.
go back to reference Jiménez, S., Angeles-Valdez, D., Villicaña, V., Reyes-Zamorano, E., Alcala-Lozano, R., Gonzalez-Olvera, J.J., & Garza-Villarreal, E.A. (2019). Identifying cognitive deficits in cocaine dependence using standard tests and machine learning. Progress in Neuro-Psychopharmacology and Biological Psychiatry, 109709. Jiménez, S., Angeles-Valdez, D., Villicaña, V., Reyes-Zamorano, E., Alcala-Lozano, R., Gonzalez-Olvera, J.J., & Garza-Villarreal, E.A. (2019). Identifying cognitive deficits in cocaine dependence using standard tests and machine learning. Progress in Neuro-Psychopharmacology and Biological Psychiatry, 109709.
go back to reference Kalinowski, S. T., Leonard, M. J., & Taper, M. L. (2016). Development and validation of the conceptual assessment of natural selection (CANS). CBE - Life Sciences Education, 15(4), 64.CrossRef Kalinowski, S. T., Leonard, M. J., & Taper, M. L. (2016). Development and validation of the conceptual assessment of natural selection (CANS). CBE - Life Sciences Education, 15(4), 64.CrossRef
go back to reference Khobragade, L. P., & Mahadik, P. (2015). Students’ academic failure prediction using data mining. International Journal of Advanced Research in Computer and Communication Engineering, 4(11), 290–298. Khobragade, L. P., & Mahadik, P. (2015). Students’ academic failure prediction using data mining. International Journal of Advanced Research in Computer and Communication Engineering, 4(11), 290–298.
go back to reference Kirpich, A., Ainsworth, E. A., Wedow, J. M., Newman, J. R., Michailidis, G., & McIntyre, L. M. (2018). Variable selection in omics data: A practical evaluation of small sample sizes. PLoS, 13(6), e0197910.CrossRef Kirpich, A., Ainsworth, E. A., Wedow, J. M., Newman, J. R., Michailidis, G., & McIntyre, L. M. (2018). Variable selection in omics data: A practical evaluation of small sample sizes. PLoS, 13(6), e0197910.CrossRef
go back to reference Knowles, J. E. (2015). Of needles and haystacks: Building an accurate statewide dropout early warning system in Wisconsin. Journal of Educational Data Mining, 7(3), 18–67. Knowles, J. E. (2015). Of needles and haystacks: Building an accurate statewide dropout early warning system in Wisconsin. Journal of Educational Data Mining, 7(3), 18–67.
go back to reference Kotsiantis, S. (2009). Educational data mining: A case study for predicting dropout-prone students. International Journal of Knowledge Engineering and Soft Data Paradigms, 1(2), 101–111.CrossRef Kotsiantis, S. (2009). Educational data mining: A case study for predicting dropout-prone students. International Journal of Knowledge Engineering and Soft Data Paradigms, 1(2), 101–111.CrossRef
go back to reference Kotsiantis, S., Patriarcheas, K., & Xenos, M. (2010). A combinational incremental ensemble of classifiers as a technique for predicting students’ performance in distance education. Knowledge-Based Systems, 23(6), 529–535.CrossRef Kotsiantis, S., Patriarcheas, K., & Xenos, M. (2010). A combinational incremental ensemble of classifiers as a technique for predicting students’ performance in distance education. Knowledge-Based Systems, 23(6), 529–535.CrossRef
go back to reference Krstajic, D., Buturovic, L. J., Leahy, D. E., & Thomas, S. (2014). Cross-validation pitfalls when selecting and assessing regression and classification models. Journal of cheminformatics, 6(1), 1–15.CrossRef Krstajic, D., Buturovic, L. J., Leahy, D. E., & Thomas, S. (2014). Cross-validation pitfalls when selecting and assessing regression and classification models. Journal of cheminformatics, 6(1), 1–15.CrossRef
go back to reference Kuhn, M. (2015). Caret: classification and regression training. Astrophysics Source Code Library. Kuhn, M. (2015). Caret: classification and regression training. Astrophysics Source Code Library.
go back to reference Kumar, M., & Singh, A. (2017). Evaluation of data mining techniques for predicting student’s performance. International Journal of Modern Education and Computer Science, 9(8), 25–31. Kumar, M., & Singh, A. (2017). Evaluation of data mining techniques for predicting student’s performance. International Journal of Modern Education and Computer Science, 9(8), 25–31.
go back to reference Lavesson, N., & Davidsson, P. (2006, July). Quantifying the impact of learning algorithm parameter tuning. In AAAI (Vol. 6, pp. 395–400). Lavesson, N., & Davidsson, P. (2006, July). Quantifying the impact of learning algorithm parameter tuning. In AAAI (Vol. 6, pp. 395–400).
go back to reference Lee, U. J., Sbeglia, G. C., Ha, M., Finch, S. J., & Nehm, R. H. (2015). Clicker score trajectories and concept inventory scores as predictors for early warning systems for large STEM classes. Journal of Science Education and Technology, 24(6), 848–860.CrossRef Lee, U. J., Sbeglia, G. C., Ha, M., Finch, S. J., & Nehm, R. H. (2015). Clicker score trajectories and concept inventory scores as predictors for early warning systems for large STEM classes. Journal of Science Education and Technology, 24(6), 848–860.CrossRef
go back to reference Libarkin, J. C. (2008, October 13–14). Concept inventories in higher education science. Prepared for the national research council promising practices in undergraduate STEM education workshop 2. Washington D.C., United States. Libarkin, J. C. (2008, October 13–14). Concept inventories in higher education science. Prepared for the national research council promising practices in undergraduate STEM education workshop 2. Washington D.C., United States.
go back to reference Lisitsyna, L., & Oreshin, S. (2019). Machine Learning Approach of Predicting Learning Outcomes of MOOCs to Increase Its Performance. Smart Education and e-Learning 2019 (pp. 107–115). Springer. Lisitsyna, L., & Oreshin, S. (2019). Machine Learning Approach of Predicting Learning Outcomes of MOOCs to Increase Its Performance. Smart Education and e-Learning 2019 (pp. 107–115). Springer.
go back to reference Lu, F., & Petkova, E. (2014). A comparative study of variable selection methods in the context of developing psychiatric screening instruments. Statistics in Medicine, 33(3), 401–421.CrossRef Lu, F., & Petkova, E. (2014). A comparative study of variable selection methods in the context of developing psychiatric screening instruments. Statistics in Medicine, 33(3), 401–421.CrossRef
go back to reference Lu, W., Benson, R., Glaser, K., Platts, L., Corna, L., Worts, D., et al. (2017). Relationship between employment histories and frailty trajectories in later life: Evidence from the English Longitudinal Study of Ageing. Journal of Epidemiology Community Health, 71(5), 439–445.CrossRef Lu, W., Benson, R., Glaser, K., Platts, L., Corna, L., Worts, D., et al. (2017). Relationship between employment histories and frailty trajectories in later life: Evidence from the English Longitudinal Study of Ageing. Journal of Epidemiology Community Health, 71(5), 439–445.CrossRef
go back to reference Luengo, J., García, S., & Herrera, F. (2012). On the choice of the best imputation methods for missing values considering three groups of classification methods. Knowledge and information systems, 32(1), 77–108.CrossRef Luengo, J., García, S., & Herrera, F. (2012). On the choice of the best imputation methods for missing values considering three groups of classification methods. Knowledge and information systems, 32(1), 77–108.CrossRef
go back to reference Luo, Y., Li, Z., Guo, H., Cao, H., Song, C., Guo, X., & Zhang, Y. (2017). Predicting congenital heart defects: A comparison of three data mining methods. PLoS ONE, 12(5), e0177811–e0177811.CrossRef Luo, Y., Li, Z., Guo, H., Cao, H., Song, C., Guo, X., & Zhang, Y. (2017). Predicting congenital heart defects: A comparison of three data mining methods. PLoS ONE, 12(5), e0177811–e0177811.CrossRef
go back to reference Lykourentzou, I., Giannoukos, I., Mpardis, G., Nikolopoulos, V., & Loumos, V. (2009). Early and dynamic student achievement prediction in e-learning courses using neural networks. Journal of the American Society for Information Science and Technology, 60(2), 372–380.CrossRef Lykourentzou, I., Giannoukos, I., Mpardis, G., Nikolopoulos, V., & Loumos, V. (2009). Early and dynamic student achievement prediction in e-learning courses using neural networks. Journal of the American Society for Information Science and Technology, 60(2), 372–380.CrossRef
Macfadyen, L. P., & Dawson, S. (2010). Mining LMS data to develop an “early warning system” for educators: A proof of concept. Computers & Education, 54(2), 588–599.
Márquez-Vera, C., Morales, C. R., & Soto, S. V. (2013). Predicting school failure and dropout by using data mining techniques. IEEE Revista Iberoamericana de Tecnologias del Aprendizaje, 8(1), 7–14.
Márquez-Vera, C., Romero, C., & Ventura, S. (2010). Predicting School Failure Using Data Mining. 4th International Conference on Educational Data Mining, (p. 271). Eindhoven, Netherlands.
Marr, B. (2015). Big Data: Using SMART big data, analytics and metrics to make better decisions and improve performance. John Wiley & Sons.
Marshall, A., Altman, D. G., Royston, P., & Holder, R. L. (2010). Comparison of techniques for handling missing covariate data within prognostic modelling studies: A simulation study. BMC Medical Research Methodology, 10(1), 7.
Minaei-Bidgoli, B., Kashy, D. A., Kortemeyer, G., & Punch, W. F. (2003, November). Predicting student performance: An application of data mining methods with an education web-based system. 33rd Annual Frontiers in Education, 2003. FIE 2003. (Vol. 1, pp. T2A-13). IEEE.
Moharreri, K., Ha, M., & Nehm, R. H. (2014). EvoGrader: an online formative assessment tool for automatically evaluating written evolutionary explanations. Evolution: Education and Outreach, 7(1), 15.
Mwitondi, K. S., & Said, R. A. (2013). A data-based method for harmonising heterogeneous data modelling techniques across data mining applications. Journal of Statistics Applications and Probability, 2(3), 157–162.
National Research Council. (2012). Thinking evolutionarily: evolution education across the life sciences. Washington, D.C.: National Academies Press.
National Research Council and National Academy of Education. (2011). High school dropout, graduation, and completion rates: better data, better measures, better decisions. Washington, D.C.: The National Academies Press.
Nehm, R. H. (2019). Biology education research: Building integrative frameworks for teaching and learning about living systems. Disciplinary and Interdisciplinary Science Education Research, 1(1), 15.
Nehm, R. H., & Reilly, L. (2007). Biology majors’ knowledge and misconceptions of natural selection. BioScience, 57(3), 263–272.
Nehm, R. H., Beggrow, E. P., Opfer, J. E., & Ha, M. (2012). Reasoning about natural selection: diagnosing contextual competency using the ACORNS instrument. The American Biology Teacher, 74(2), 92–98.
Neild, R. C., Balfanz, R., & Herzog, L. (2007). An early warning system. Educational Leadership, 65(2), 28–33.
Opfer, J. E., Nehm, R. H., & Ha, M. (2012). Cognitive foundations for science assessment design: Knowing what students know about evolution. Journal of Research in Science Teaching, 49(6), 744–777.
Orr, R., & Foster, S. (2013). Increasing student success using online quizzing in introductory (majors) biology. CBE - Life Sciences Education, 12(3), 509–514.
Patel, J. A., & Sharma, P. (2014, August). Big data for better health planning. In 2014 International Conference on Advances in Engineering & Technology Research (ICAETR-2014) (pp. 1–5). IEEE.
President's Council of Advisors on Science and Technology (PCAST). (2012). Engage to excel: Producing one million additional college graduates with degrees in science, technology, engineering, and mathematics. Washington DC: Executive Office of the President.
Perkins, N. J., & Schisterman, E. F. (2006). The inconsistency of “optimal” cutpoints obtained using two criteria based on the receiver operating characteristic curve. American Journal of Epidemiology, 163(7), 670–675.
Peugh, J. L., & Enders, C. K. (2004). Missing data in educational research: A review of reporting practices and suggestions for improvement. Review of Educational Research, 74(4), 525–556.
Prinsloo, P., Archer, E., Barnes, G., Chetty, Y., & Van Zyl, D. (2015). Big(ger) data as better data in open distance learning. International Review of Research in Open and Distributed Learning, 16(1), 284–306.
Radwan, A., & Cataltepe, Z. (2017). Improving performance prediction on education data with noise and class imbalance. Intelligent Automation & Soft Computing, 1–8.
Ransom, C. J., Kitchen, N. R., Camberato, J. J., Carter, P. R., Ferguson, R. B., et al. (2019). Statistical and machine learning methods evaluated for incorporating soil and weather into corn nitrogen recommendations. Computers and Electronics in Agriculture, 164, 104872.
Rath, K., Peterfreund, A., Xenos, S., Bayliss, F., & Carnal, N. (2007). Supplemental instruction in introductory biology I: Enhancing the performance and retention of underrepresented minority students. CBE - Life Sciences Education, 6(3), 203–216.
Rebok, G. W., Ball, K., Guey, L. T., Jones, R. N., Kim, H. Y., King, J. W., et al. (2014). Ten-year effects of the advanced cognitive training for independent and vital elderly cognitive training trial on cognition and everyday functioning in older adults. Journal of the American Geriatrics Society, 62(1), 16–24.
Rokach, L. (2010). Ensemble-based classifiers. Artificial Intelligence Review, 33(1–2), 1–39.
Rovira, S., Puertas, E., & Igual, L. (2017). Data-driven system to predict academic grades and dropout. PLoS ONE, 12(2), e0171207.
Sayre, E. C., & Heckler, A. F. (2009). Peaks and decays of student knowledge in an introductory E&M course. Physical Review Special Topics-Physics Education Research, 5(1), 1–5.
Schisterman, E. F., Perkins, N. J., Liu, A., & Bondell, H. (2005). Optimal cut-points and its corresponding Youden index to discriminate individuals using pooled blood samples. Epidemiology, 16(1), 73–81.
Seymour, E., & Hunter, A. B. (Eds.). (2019). Talking about Leaving Revisited. Springer Nature: Switzerland.
Shepherd, D. L. (2016). The open door of learning - Access restricted: School effectiveness and efficiency across the South African education system. (Doctoral Dissertation). Stellenbosch University, Stellenbosch, South Africa.
Silva, C., & Fonseca, J. (2017). Educational Data Mining: A Literature Review. Europe and MENA Cooperation Advances in Information and Communication Technologies: Advances in Intelligent Systems and Computing, vol 520 (pp. 87–94). Springer, Cham.
Tekin, A. (2014). Early prediction of students’ grade point averages at graduation: A data mining approach. Eurasian Journal of Educational Research, 54, 207–226.
Thai-Nghe, N., Gantner, Z., & Schmidt-Thieme, L. (2010). Cost-sensitive learning methods for imbalanced data. In The 2010 International Joint Conference on Neural Networks (IJCNN) (pp. 1–8). Barcelona, Spain, 2010.
Tops, W., Callens, M., Lammertyn, J., Van Hees, V., & Brysbaert, M. (2012). Identifying students with dyslexia in higher education. Annals of Dyslexia, 62(3), 186–203.
Vovides, Y., Sanchez-Alonso, S., Mitropoulou, V., & Nickmans, G. (2007). The use of e-learning course management systems to support learning strategies and to improve self-regulated learning. Educational Research Review, 2(1), 64–74.
Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129–133.
Waterhouse, J. K., Carroll, M. C., & Beeman, P. B. (1993). National council licensure examination success: Accurate prediction of student performance on the post-1988 examination. Journal of Professional Nursing, 9(5), 278–283.
Watson, C., Li, F., & Godwin, J. (2013). Predicting performance in an introductory programming course by logging and analyzing student programming behavior. 2013 IEEE 13th International Conference on Advanced Learning Technologies (pp. 319–323). Beijing: IEEE.
Xue, Y. (2018, June). Testing the differential efficacy of Data Mining Techniques to predicting student outcomes in higher education. (Doctoral Dissertation). Stony Brook University, Stony Brook, New York.
Yang, Q., & Wu, X. (2006). 10 challenging problems in data mining research. International Journal of Information Technology & Decision Making, 5, 597–604.
Yukselturk, E., Ozekes, S., & Turel, Y. K. (2014). Predicting dropout student: An application of data mining methods in an online education program. European Journal of Open, Distance, and E-Learning, 17(1), 118–133.
Zhai, X., Yin, Y., Pellegrino, J. W., Haudek, K. C., & Shi, L. (2020). Applying machine learning in science assessments: A systematic review. Studies in Science Education, 56(1), 111–151.
Metadata
Title: Testing the Impact of Novel Assessment Sources and Machine Learning Methods on Predictive Outcome Modeling in Undergraduate Biology
Authors: Roberto Bertolini, Stephen J. Finch, Ross H. Nehm
Publication date: 04-01-2021
Publisher: Springer Netherlands
Published in: Journal of Science Education and Technology | Issue 2/2021
Print ISSN: 1059-0145
Electronic ISSN: 1573-1839
DOI: https://doi.org/10.1007/s10956-020-09888-8
