research-article

Exploring Machine Learning Methods to Automatically Identify Students in Need of Assistance

Authors:
Alireza Ahadi, University of Technology, Sydney, Sydney, Australia
Raymond Lister, University of Technology, Sydney, Sydney, Australia
Heikki Haapala, University of Helsinki, Helsinki, Finland
Arto Vihavainen, University of Helsinki, Helsinki, Finland

ICER '15: Proceedings of the eleventh annual International Conference on International Computing Education Research, July 2015, Pages 121–130
https://doi.org/10.1145/2787622.2787717

Published: 09 August 2015

ABSTRACT

Methods for automatically identifying students in need of assistance have been studied for decades. Early work relied on largely static factors such as students' educational background and questionnaire results; more recently, continuously accumulating data such as progress on course assignments and behavior in lectures have gained attention. We contribute to this work with results on the early detection of students in need of assistance, and provide a starting point for applying machine learning techniques to naturally accumulating programming process data.

By combining source code snapshot data recorded from students' programming process with machine learning methods, we are able to detect high- and low-performing students with high accuracy as early as the end of the first week of an introductory programming course. We also compare our results with prominent methods for predicting students' performance from source code snapshot data.
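As a rough illustration of the idea in this paragraph, the sketch below aggregates hypothetical first-week snapshot records into per-student features and fits a simple nearest-centroid classifier. The record layout (student id plus a compile flag), the two features, the outcome labels, and the choice of classifier are all assumptions made for illustration; the abstract does not state which features or algorithms were actually used.

```python
from collections import defaultdict
from math import dist

# Hypothetical first-week snapshot records: (student id, did the snapshot compile?).
# All data and field choices here are illustrative assumptions.
SNAPSHOTS = [
    ("s1", True), ("s1", True), ("s1", False), ("s1", True),
    ("s2", False), ("s2", False), ("s2", True), ("s2", False),
    ("s2", False), ("s2", False), ("s2", False), ("s2", False),
    ("s3", True), ("s3", True), ("s3", True),
    ("s4", False), ("s4", False), ("s4", False), ("s4", True),
    ("s4", False), ("s4", False),
]

def extract_features(snapshots):
    """Map each student to (snapshot count, fraction of compiling snapshots)."""
    counts = defaultdict(int)
    compiling = defaultdict(int)
    for student, compiles in snapshots:
        counts[student] += 1
        compiling[student] += int(compiles)
    return {s: (float(counts[s]), compiling[s] / counts[s]) for s in counts}

def fit_centroids(features, labels):
    """Compute the mean feature vector of each class (nearest-centroid training)."""
    grouped = defaultdict(list)
    for student, label in labels.items():
        grouped[label].append(features[student])
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))

features = extract_features(SNAPSHOTS)
# Hypothetical end-of-course outcomes for the students used for training.
train_labels = {"s1": "high", "s2": "low", "s3": "high"}
centroids = fit_centroids(features, train_labels)
prediction = predict(centroids, features["s4"])
print(prediction)
```

The features are left unscaled to keep the sketch short; with real data one would likely standardize them, use many more students, and estimate accuracy with held-out data or cross-validation before acting on any prediction.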

This early information on students' performance is beneficial from multiple viewpoints. Instructors can target their guidance to struggling students early on, and provide more challenging assignments for high-performing students. Moreover, students who perform poorly in the introductory programming course, but who nevertheless pass, can be monitored more closely in their future studies.


Index Terms

1. Information systems
  1. Information systems applications
    1. Data mining
2. Social and professional topics
  1. Professional topics
    1. Computing education
      1. Computing education programs
        Computer science education


Published in
ICER '15: Proceedings of the eleventh annual International Conference on International Computing Education Research
July 2015
300 pages
ISBN: 9781450336307
DOI: 10.1145/2787622
General Chair: Brian Dorn, University of Nebraska at Omaha, USA
Program Chairs: Judy Sheard, Monash University, Australia; Quintin Cutts, University of Glasgow, UK
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
detecting students in need of assistance
educational data mining
introductory programming
learning analytics
novice programmers
programming behavior
source code snapshot analysis
Qualifiers
- research-article
Acceptance Rates
ICER '15 Paper Acceptance Rate: 25 of 96 submissions, 26%
Overall Acceptance Rate: 189 of 803 submissions, 24%