research-article

Open Access

Program equivalence for assisted grading of functional programs

Authors:
Joshua Clune, Carnegie Mellon University, USA
Vijay Ramamurthy, Carnegie Mellon University, USA
Ruben Martins, Carnegie Mellon University, USA
Umut A. Acar, Carnegie Mellon University, USA

Proceedings of the ACM on Programming Languages, Volume 4, Issue OOPSLA, Article No. 171, pp. 1–29. https://doi.org/10.1145/3428239

Published: 13 November 2020

Abstract

In courses that involve programming assignments, giving meaningful feedback to students is an important challenge. Human graders can give useful feedback by manually grading programs, but this is a time-consuming, labor-intensive, and usually tedious process. Automatic graders are fast and scale well, but they usually provide poor feedback. Although there has been research on improving automatic graders, research on scaling and improving human grading is limited.

We propose to scale human grading by augmenting the manual grading process with an equivalence algorithm that identifies equivalences between student submissions. This enables human graders to give targeted feedback for multiple student submissions at once. Our technique is conservative in two respects. First, it identifies equivalence only between submissions that are algorithmically similar; for example, it cannot identify the equivalence between quicksort and mergesort. Second, it uses formal methods instead of clustering algorithms from the machine learning literature. This allows us to prove a soundness result that guarantees that submissions will never be clustered together in error. Although our technique reports equivalence only when submissions are algorithmically similar and their equivalence can be formally proved, we show that it can significantly reduce grading time for thousands of programming submissions from an introductory functional programming course.
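The workflow described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no details of the equivalence procedure, so the check used here (equality of syntax trees after consistently renaming bound names) is a toy stand-in, and it is not claimed to be sound for all of Python. Python is used only for illustration, whereas the paper targets functional programs. The names `canonical_form`, `proves_equivalent`, and `cluster_submissions` are hypothetical.

```python
import ast
from typing import Callable, Dict, List


class _Canonicalize(ast.NodeTransformer):
    """Rename bound names to v0, v1, ... in order of first appearance."""

    def __init__(self, bound: set) -> None:
        self.bound = bound
        self.names: Dict[str, str] = {}

    def _canon(self, name: str) -> str:
        if name not in self.bound:  # free names such as builtins are kept
            return name
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.AST:
        node.name = self._canon(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node: ast.arg) -> ast.AST:
        node.arg = self._canon(node.arg)
        return node

    def visit_Name(self, node: ast.Name) -> ast.AST:
        node.id = self._canon(node.id)
        return node


def canonical_form(source: str) -> str:
    """Toy normal form: the syntax tree with bound names renamed."""
    tree = ast.parse(source)
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
    return ast.dump(_Canonicalize(bound).visit(tree))


def proves_equivalent(a: str, b: str) -> bool:
    """Conservative stand-in check: answers True only for matching normal forms."""
    return canonical_form(a) == canonical_form(b)


def cluster_submissions(
    submissions: Dict[str, str],
    equivalent: Callable[[str, str], bool] = proves_equivalent,
) -> List[List[str]]:
    """Group submission ids; each member is equivalent to the cluster's first member."""
    clusters: List[List[str]] = []
    for sid, source in submissions.items():
        for cluster in clusters:
            if equivalent(submissions[cluster[0]], source):
                cluster.append(sid)
                break
        else:
            clusters.append([sid])  # no existing cluster matched
    return clusters


submissions = {
    "alice": "def total(xs):\n    acc = 0\n    for x in xs:\n        acc = acc + x\n    return acc\n",
    "bob": "def add_all(items):\n    r = 0\n    for i in items:\n        r = r + i\n    return r\n",
    "carol": "def total(xs):\n    return sum(xs)\n",
}
clusters = cluster_submissions(submissions)
```

In this sketch, the submissions from `alice` and `bob` differ only in naming and land in one cluster, so a grader could write feedback once for both. The submission from `carol` computes the same result by a different route and stays in its own cluster, which mirrors the conservative behavior the abstract describes: a missed equivalence costs grading time, while a wrong one would misdirect feedback.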

Supplemental Material

oopsla20main-p188-p-video.mp4 (MP4, 28.1 MB)


Index Terms

1. Theory of computation
  1. Logic
    1. Automated reasoning


Published in

Proceedings of the ACM on Programming Languages Volume 4, Issue OOPSLA
November 2020
3108 pages
EISSN:2475-1421
DOI:10.1145/3436718
Issue’s Table of Contents

Copyright © 2020 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Assisted Grading
Formal Methods
Functional Programming
Program Equivalence
