research-article

A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes

Authors:
Himabindu Lakkaraju, Stanford University, Stanford, CA, USA
Everaldo Aguiar, University of Notre Dame, Notre Dame, IN, USA
Carl Shan, University of Chicago, Chicago, IL, USA
David Miller, Northwestern University, Chicago, IL, USA
Nasir Bhanpuri, University of Chicago, Chicago, IL, USA
Rayid Ghani, University of Chicago, Chicago, IL, USA
Kecia L. Addison, Montgomery County Public Schools, Gaithersburg, MD, USA

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2015, Pages 1909–1918
https://doi.org/10.1145/2783258.2788620

Published: 10 August 2015

ABSTRACT

Many school districts have developed successful intervention programs to help students graduate high school on time. However, identifying and prioritizing the students who need those interventions most remains challenging. This paper describes a machine learning framework to identify such students, discusses features that are useful for this task, applies several classification algorithms, and evaluates them using metrics important to school administrators. To test this framework and make it practically useful, we partnered with two U.S. school districts with a combined enrollment of approximately 200,000 students. Together with these districts, we designed several evaluation metrics to assess the quality of machine learning algorithms from an educator's perspective. This paper focuses on students at risk of not finishing high school on time, but our framework lays a strong foundation for future work on other adverse academic outcomes.
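
The abstract does not name the evaluation metrics, so the following is only a minimal sketch of one metric that could matter to an administrator with limited intervention capacity: of the k students a model ranks as highest risk, what fraction did not graduate on time? The function name `precision_at_top_k`, the scores, and the labels are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def precision_at_top_k(risk_scores: Sequence[float],
                       did_not_graduate: Sequence[int],
                       k: int) -> float:
    """Fraction of the k highest-scored students who had the adverse outcome.

    Illustrative metric only (an assumption, not the paper's definition).
    """
    if len(risk_scores) != len(did_not_graduate):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(risk_scores):
        raise ValueError("k must be between 1 and the number of students")
    # Rank student indices from highest to lowest predicted risk.
    ranked = sorted(range(len(risk_scores)),
                    key=lambda i: risk_scores[i],
                    reverse=True)
    top_k = ranked[:k]
    return sum(did_not_graduate[i] for i in top_k) / k


# Hypothetical scores from any classifier; label 1 = did not graduate on time.
scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05]
labels = [1, 0, 1, 0, 0, 0]

# With k=3, two of the three highest-scored students carry label 1.
print(precision_at_top_k(scores, labels, k=3))
```

A metric of this shape depends only on the ordering of the risk scores, so it can compare any classifiers that output a score, and k can be set to the number of students a school is able to serve.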

Index Terms

1. Information systems
  1. Information systems applications
    1. Decision support systems
      1. Expert systems

Published in
KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015, 2378 pages
ISBN: 9781450336642
DOI: 10.1145/2783258
General Chairs: Longbing Cao (University of Technology, Sydney), Chengqi Zhang (University of Technology, Sydney)
Program Chairs: Thorsten Joachims (Cornell University), Geoff Webb (Monash University), Dragos D. Margineantu (Boeing Research), Graham Williams (Australian Taxation Office)
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
applications
education
evaluation metrics
risk prediction
Qualifiers
- research-article
Acceptance Rates
KDD '15 paper acceptance rate: 160 of 819 submissions, 20%
Overall acceptance rate: 1,133 of 8,635 submissions, 13%