ABSTRACT
Methods for automatically identifying students in need of assistance have been studied for decades. Initially, the work was based on somewhat static factors such as students' educational background and results from various questionnaires, while more recently, constantly accumulating data such as progress with course assignments and behavior in lectures has gained attention. We contribute to this work with results on early detection of students in need of assistance, and provide a starting point for using machine learning techniques on naturally accumulating programming process data.
By combining source code snapshot data recorded from students' programming process with machine learning methods, we are able to detect high- and low-performing students with high accuracy as early as the end of the first week of an introductory programming course. We also compare our results with prominent methods for predicting students' performance from source code snapshot data.
This early information on students' performance is beneficial from multiple viewpoints. Instructors can target their guidance at struggling students early on, and provide more challenging assignments for high-performing students. Moreover, students who perform poorly in the introductory programming course, but who nevertheless pass, can be monitored more closely in their future studies.