DOI: 10.1145/2783258.2788620

A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes

Published: 10 August 2015

ABSTRACT

Many school districts have developed successful intervention programs to help students graduate high school on time. However, identifying and prioritizing students who need those interventions the most remains challenging. This paper describes a machine learning framework to identify such students, discusses features that are useful for this task, applies several classification algorithms, and evaluates them using metrics important to school administrators. To help test this framework and make it practically useful, we partnered with two U.S. school districts with a combined enrollment of approximately 200,000 students. We together designed several evaluation metrics to assess the goodness of machine learning algorithms from an educator's perspective. This paper focuses on students at risk of not finishing high school on time, but our framework lays a strong foundation for future work on other adverse academic outcomes.
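The core task the abstract describes — ranking students by predicted risk so that limited intervention slots go to those most likely to be off-track, evaluated with metrics educators care about such as precision among the top-ranked students — can be illustrated with a minimal sketch. This is a toy, not the paper's implementation: the features (`gpa`, `absences`), the hand-set weights, and the sample records below are invented for demonstration only.

```python
# Illustrative sketch (not the authors' code): score students by risk,
# rank them, and measure precision-at-top-k -- i.e., of the k students
# flagged as highest risk, what fraction truly did not graduate on time.

def risk_score(student):
    """Toy risk score: lower GPA and more absences imply higher risk.
    The 0.6/0.4 weights are made up for illustration."""
    gpa_term = (4.0 - student["gpa"]) / 4.0
    absence_term = min(student["absences"] / 30.0, 1.0)
    return 0.6 * gpa_term + 0.4 * absence_term

def precision_at_k(students, k):
    """Fraction of the k highest-risk students whose label is off-track."""
    ranked = sorted(students, key=risk_score, reverse=True)
    return sum(s["off_track"] for s in ranked[:k]) / k

# Hypothetical student records with known outcomes.
students = [
    {"gpa": 1.8, "absences": 25, "off_track": True},
    {"gpa": 3.6, "absences": 2,  "off_track": False},
    {"gpa": 2.1, "absences": 18, "off_track": True},
    {"gpa": 3.9, "absences": 0,  "off_track": False},
    {"gpa": 2.9, "absences": 10, "off_track": False},
]

print(precision_at_k(students, 2))  # -> 1.0: both top-ranked students are off-track
```

In practice the paper applies several classification algorithms (rather than a hand-weighted score) and evaluates them jointly with district administrators; the point of the sketch is only the rank-then-evaluate-at-top-k shape of the problem.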


Supplemental Material

p1909.mp4 (MP4 video, 157 MB)


    • Published in

      KDD '15: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
      August 2015, 2378 pages
      ISBN: 9781450336642
      DOI: 10.1145/2783258

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article

      Acceptance Rates

      KDD '15 paper acceptance rate: 160 of 819 submissions (20%). Overall acceptance rate: 1,133 of 8,635 submissions (13%).

