research-article

Exploring Machine Learning Methods to Automatically Identify Students in Need of Assistance

Published: 09 August 2015

ABSTRACT

Methods for automatically identifying students in need of assistance have been studied for decades. Initially, the work was based on somewhat static factors such as students' educational background and results from various questionnaires, while more recently, constantly accumulating data such as progress with course assignments and behavior in lectures has gained attention. We contribute to this work with results on early detection of students in need of assistance, and provide a starting point for using machine learning techniques on naturally accumulating programming process data.

By combining source code snapshot data recorded from students' programming process with machine learning methods, we are able to detect high- and low-performing students with high accuracy after only the first week of an introductory programming course. We also compare our results to prominent methods for predicting students' performance from source code snapshot data.

This early information on students' performance is beneficial from multiple viewpoints. Instructors can target their guidance to struggling students early on, and provide more challenging assignments for high-performing students. Moreover, students who perform poorly in the introductory programming course, but who nevertheless pass, can be monitored more closely in their future studies.
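To make the idea of classifying students from first-week programming data concrete, the following is a minimal sketch only. It is not the method used in the paper: the feature names (snapshot count, fraction of compiling snapshots, exercises completed), the data values, and the choice of a nearest-centroid classifier with leave-one-out evaluation are all assumptions made for illustration.

```python
from math import dist
from statistics import mean

# Hypothetical first-week features per student (all values are made up):
# (snapshots recorded, fraction of snapshots that compile, exercises completed)
# Label 1 = high-performing, 0 = low-performing.
STUDENTS = [
    ((140, 0.90, 12), 1),
    ((110, 0.80, 11), 1),
    ((95, 0.85, 12), 1),
    ((130, 0.75, 10), 1),
    ((40, 0.40, 3), 0),
    ((60, 0.50, 5), 0),
    ((25, 0.30, 2), 0),
    ((55, 0.45, 4), 0),
]


def fit_scaler(vectors):
    """Return per-feature (min, range) pairs computed from the training vectors."""
    columns = list(zip(*vectors))
    return [(min(c), (max(c) - min(c)) or 1.0) for c in columns]


def apply_scaler(scaler, vector):
    """Min-max scale one feature vector using the training statistics."""
    return tuple((v - low) / span for v, (low, span) in zip(vector, scaler))


def fit_centroids(samples):
    """Fit a nearest-centroid model: one mean vector per class, in scaled space."""
    scaler = fit_scaler([x for x, _ in samples])
    centroids = {}
    for label in {y for _, y in samples}:
        scaled = [apply_scaler(scaler, x) for x, y in samples if y == label]
        centroids[label] = tuple(mean(col) for col in zip(*scaled))
    return scaler, centroids


def predict(model, vector):
    """Assign the label whose centroid is closest in Euclidean distance."""
    scaler, centroids = model
    x = apply_scaler(scaler, vector)
    return min(centroids, key=lambda label: dist(x, centroids[label]))


def leave_one_out_accuracy(samples):
    """Hold out each student once, train on the rest, and score the prediction."""
    hits = 0
    for i, (x, y) in enumerate(samples):
        model = fit_centroids(samples[:i] + samples[i + 1:])
        hits += predict(model, x) == y
    return hits / len(samples)


if __name__ == "__main__":
    print("leave-one-out accuracy:", leave_one_out_accuracy(STUDENTS))
```

The scaler is fitted on the training split only, so the held-out student never influences the model that classifies them. On real course data, features would need to be engineered from the recorded snapshots and the classifier chosen and validated empirically.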

      • Published in

        ICER '15: Proceedings of the eleventh annual International Conference on International Computing Education Research
        July 2015
        300 pages
        ISBN:9781450336307
        DOI:10.1145/2787622

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        ICER '15 paper acceptance rate: 25 of 96 submissions, 26%. Overall acceptance rate: 189 of 803 submissions, 24%.
