ABSTRACT
Many school districts have developed successful intervention programs to help students graduate high school on time. However, identifying and prioritizing the students who need those interventions most remains challenging. This paper describes a machine learning framework for identifying such students, discusses features that are useful for this task, applies several classification algorithms, and evaluates them using metrics important to school administrators. To test this framework and make it practically useful, we partnered with two U.S. school districts with a combined enrollment of approximately 200,000 students. Together with these districts, we designed several evaluation metrics to assess the quality of machine learning algorithms from an educator's perspective. This paper focuses on students at risk of not finishing high school on time, but our framework lays a strong foundation for future work on other adverse academic outcomes.
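The abstract does not spell out the evaluation metrics, so the following is only an illustrative sketch of one kind of metric that could matter to an administrator with limited intervention capacity: if only k students can be served, how many of the k highest-risk students truly needed help (precision), and what share of all students who needed help was reached (recall). The function names, the risk scores, and the label encoding are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def _rank_by_risk(risk_scores: Sequence[float]) -> List[int]:
    """Return student indices ordered from highest to lowest risk score."""
    return sorted(range(len(risk_scores)),
                  key=lambda i: risk_scores[i], reverse=True)


def precision_at_top_k(risk_scores: Sequence[float],
                       off_track: Sequence[bool], k: int) -> float:
    """Fraction of the k highest-scored students who did not graduate on time.

    `off_track[i]` is True when student i did not graduate on time
    (hypothetical label encoding chosen for this sketch).
    """
    if len(risk_scores) != len(off_track):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(risk_scores):
        raise ValueError("k must be between 1 and the number of students")
    top_k = _rank_by_risk(risk_scores)[:k]
    return sum(1 for i in top_k if off_track[i]) / k


def recall_at_top_k(risk_scores: Sequence[float],
                    off_track: Sequence[bool], k: int) -> float:
    """Fraction of all off-track students who appear among the top k."""
    if len(risk_scores) != len(off_track):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(risk_scores):
        raise ValueError("k must be between 1 and the number of students")
    total_off_track = sum(1 for flag in off_track if flag)
    if total_off_track == 0:
        raise ValueError("recall is undefined without any off-track students")
    top_k = _rank_by_risk(risk_scores)[:k]
    return sum(1 for i in top_k if off_track[i]) / total_off_track


# Made-up scores and outcomes for six students, for illustration only.
example_scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05]
example_off_track = [True, False, False, True, True, False]
print(precision_at_top_k(example_scores, example_off_track, k=3))
print(recall_at_top_k(example_scores, example_off_track, k=3))
```

Ranking-based measures like these depend only on the ordering of the scores, so any classifier that outputs a risk score can be compared on the same footing.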
Index Terms
- A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes