skip to main content
research-article
Open Access

Program equivalence for assisted grading of functional programs

Published:13 November 2020Publication History
Skip Abstract Section

Abstract

In courses that involve programming assignments, giving meaningful feedback to students is an important challenge. Human beings can give useful feedback by manually grading the programs but this is a time-consuming, labor intensive, and usually boring process. Automatic graders can be fast and scale well but they usually provide poor feedback. Although there has been research on improving automatic graders, research on scaling and improving human grading is limited.

We propose to scale human grading by augmenting the manual grading process with an equivalence algorithm that can identify the equivalences between student submissions. This enables human graders to give targeted feedback for multiple student submissions at once. Our technique is conservative in two aspects. First, it identifies equivalence between submissions that are algorithmically similar, e.g., it cannot identify the equivalence between quicksort and mergesort. Second, it uses formal methods instead of clustering algorithms from the machine learning literature. This allows us to prove a soundness result that guarantees that submissions will never be clustered together in error. Despite only reporting equivalence when there is algorithmic similarity and the ability to formally prove equivalence, we show that our technique can significantly reduce grading time for thousands of programming submissions from an introductory functional programming course.

Skip Supplemental Material Section

Supplemental Material

oopsla20main-p188-p-video.mp4

mp4

28.1 MB

References

  1. Amal Ahmed, Derek Dreyer, and Andreas Rossberg. 2009. State-dependent representation independence. In Proc. Symposium on Principles of Programming Languages. ACM, 340-353. https://doi.org/10.1145/1480881.1480925 Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Amal J. Ahmed. 2006. Step-Indexed Syntactic Logical Relations for Recursive and Quantified Types. In Proc. European Symposium on Programming. Springer, 69-83. https://doi.org/10.1007/11693024_6 Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. John D. Backes, Suzette Person, Neha Rungta, and Oksana Tkachuk. 2013. Regression Verification Using Impact Summaries. In Proc. International Symposium Model Checking Software. Springer, 99-116. https://doi.org/10.1007/978-3-642-39176-7_7 Google ScholarGoogle ScholarCross RefCross Ref
  4. C Leonard Berman and Louise H Trevillyan. 1989. Functional comparison of logic designs for VLSI circuits. In Proc. International Conference on Computer-Aided Design. IEEE, 456-459. https://doi.org/10.1109/ICCAD. 1989.76990 Google ScholarGoogle ScholarCross RefCross Ref
  5. François Bobot, Jean-Christophe Filliâtre, Claude Marché, and Andrei Paskevich. 2015. Let's verify this with Why3. Int. J. Softw. Tools Technol. Transf. 17, 6 ( 2015 ), 709-727. https://doi.org/10.1007/s10009-014-0314-5 Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Edmund Clarke, Daniel Kroening, and Flavio Lerda. 2004. A tool for checking ANSI-C programs. In Proc. International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 168-176. https://doi.org/10. 1007/978-3-540-24730-2_15 Google ScholarGoogle ScholarCross RefCross Ref
  7. Edmund M. Clarke, Armin Biere, Richard Raimi, and Yunshan Zhu. 2001. Bounded Model Checking Using Satisfiability Solving. Formal Methods Syst. Des. 19, 1 ( 2001 ), 7-34. https://doi.org/10.1023/A:1011276507260 Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Joshua Clune, Vijay Ramamurthy, Ruben Martins, and Umut A. Acar. 2020. Program Equivalence for Assisted Grading of Functional Programs (Extended Version). CoRR abs/2010.08051 ( 2020 ). arXiv: 2010.08051 https://arxiv.org/abs/ 2010.08051Google ScholarGoogle Scholar
  9. Leonardo Mendonça de Moura and Nikolaj Bjørner. 2008. Z3: An Eficient SMT Solver. In Proc. International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 337-340. https://doi.org/10.1007/978-3-540-78800-3_24 Google ScholarGoogle ScholarCross RefCross Ref
  10. Derek Dreyer, Amal Ahmed, and Lars Birkedal. 2009. Logical Step-Indexed Logical Relations. In Proc. Annual Symposium on Logic in Computer Science. IEEE Computer Society, 71-80. https://doi.org/10.1109/LICS. 2009.34 Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Grigory Fedyukovich, Arie Gurfinkel, and Natasha Sharygina. 2016. Property Directed Equivalence via Abstract Simulation. In Proc. International Conference Computer-Aided Verification. Springer, 433-453. https://doi.org/10.1007/978-3-319-41540-6_24 Google ScholarGoogle ScholarCross RefCross Ref
  12. Dennis Felsing, Sarah Grebing, Vladimir Klebanov, Philipp Rümmer, and Mattias Ulbrich. 2014. Automating regression verification. In Proc. International Conference on Automated Software Engineering. ACM, 349-360. https://doi.org/10. 1145/2642937.2642987 Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Benny Godlin and Ofer Strichman. 2009. Regression verification. In Proc. Design Automation Conference. ACM, 466-471. https://doi.org/10.1145/1629911.1630034 Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Sumit Gulwani, Ivan Radicek, and Florian Zuleger. 2018. Automated clustering and program repair for introductory programming assignments. In Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 465-480. https://doi.org/10.1145/3192366.3192387 Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Chung-Kil Hur, Derek Dreyer, Georg Neis, and Viktor Vafeiadis. 2012. The marriage of bisimulations and Kripke logical relations. In Proc. Symposium on Principles of Programming Languages. ACM, 59-72. https://doi.org/10.1145/2103656. 2103666 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Guilhem Jaber. 2020. SyTeCi: automating contextual equivalence for higher-order programs with references. PACMPL 4, POPL ( 2020 ), 59 : 1-59 : 28. https://doi.org/10.1145/3371127 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Shalini Kaleeswaran, Anirudh Santhiar, Aditya Kanade, and Sumit Gulwani. 2016. Semi-supervised verified feedback generation. In Proc. International Symposium on Foundations of Software Engineering. ACM, 739-750. https://doi.org/10. 1145/2950290.2950363 Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Vasileios Koutavas and Mitchell Wand. 2006. Small bisimulations for reasoning about higher-order imperative programs. In Proc. Symposium on Principles of Programming Languages. ACM, 141-152. https://doi.org/10.1145/1111037.1111050 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Xiao Liu, Shuai Wang, Pei Wang, and Dinghao Wu. 2019. Automatic grading of programming assignments: an approach based on formal semantics. In Proc. International Conference on Software Engineering: Software Engineering Education and Training. IEEE / ACM, 126-137. https://doi.org/10.1109/ICSE-SEET. 2019.00022 Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. David Mitchel Perry, Dohyeong Kim, Roopsha Samanta, and Xiangyu Zhang. 2019. SemCluster: clustering of imperative programming assignments based on quantitative semantic features. In Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation. 860-873. https://doi.org/10.1145/3314221.3314629 Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Yewen Pu, Karthik Narasimhan, Armando Solar-Lezama, and Regina Barzilay. 2016. sk_p: a neural program corrector for MOOCs. In Proc. International Conference on Systems, Programming, Languages and Applications: Software for Humanity. ACM, 39-40. https://doi.org/10.1145/2984043.2989222 Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Rishabh Singh, Sumit Gulwani, and Armando Solar-Lezama. 2013. Automated feedback generation for introductory programming assignments. In Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 15-26. https://doi.org/10.1145/2491956.2462195 Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Eijiro Sumii and Benjamin C. Pierce. 2005. A bisimulation for type abstraction and recursion. In Proc. Symposium on Principles of Programming Languages. ACM, 63-74. https://doi.org/10.1145/1040305.1040311 Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Ke Wang, Rishabh Singh, and Zhendong Su. 2018. Search, align, and repair: data-driven feedback generation for introductory programming exercises. In Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 481-495. https://doi.org/10.1145/3192366.3192384 Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Lenore Zuck, Amir Pnueli, Yi Fang, and Benjamin Goldberg. 2002. VOC: A translation validator for optimizing compilers. Electronic notes in theoretical computer science 65, 2 ( 2002 ), 2-18. https://doi.org/10.1016/S1571-0661 ( 04 ) 80393-1 Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Program equivalence for assisted grading of functional programs

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image Proceedings of the ACM on Programming Languages
      Proceedings of the ACM on Programming Languages  Volume 4, Issue OOPSLA
      November 2020
      3108 pages
      EISSN:2475-1421
      DOI:10.1145/3436718
      Issue’s Table of Contents

      Copyright © 2020 Owner/Author

      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 13 November 2020
      Published in pacmpl Volume 4, Issue OOPSLA

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader