Abstract
Inductive invariants can be robustly synthesized using a learning model where the teacher is a program verifier who instructs the learner through concrete program configurations, classified as positive, negative, and implications. We propose the first learning algorithms in this model with implication counter-examples that are based on machine learning techniques. In particular, we extend classical decision-tree learning algorithms in machine learning to handle implication samples, building new scalable ways to construct small decision trees using statistical measures. We also develop a decision-tree learning algorithm in this model that is guaranteed to converge to the right concept (invariant) if one exists. We implement the learners and an appropriate teacher, and show that the resulting invariant synthesis is efficient and convergent for a large suite of programs.
Supplemental Material
Available for Download
Please find additional material at http://madhu.cs.illinois.edu/popl16/
- Competition on Software Verification (SV-COMP) benchmarks. https://svn.sosy-lab.org/software/ sv-benchmarks/tags/svcomp14/loops/.Google Scholar
- Learning bayesian network parameters under equivalence constraints. Artificial Intelligence, (0):–, 2015.Google Scholar
- A. Albarghouthi and K. L. McMillan. Beautiful interpolants. In Proceedings of the 25th International Conference on Computer Aided Verification, CAV’13, pages 313–329, 2013. ISBN 978-3-642-39798-1. Google ScholarDigital Library
- R. Alur, P. Cerný, P. Madhusudan, and W. Nam. Synthesis of interface specifications for java classes. In POPL 2005, pages 98–109. ACM, 2005. Google ScholarDigital Library
- R. Alur, P. Madhusudan, and W. Nam. Symbolic compositional verification by learning assumptions. In CAV 2005, volume 3576 of LNCS, pages 548–562. Springer, 2005. Google ScholarDigital Library
- R. Alur, R. Bodík, G. Juniwal, M. M. K. Martin, M. Raghothaman, S. A. Seshia, R. Singh, A. Solar-Lezama, E. Torlak, and A. Udupa. Syntax-guided synthesis. In FMCAD 2013, pages 1–17. IEEE, 2013.Google ScholarCross Ref
- D. Angluin. Learning regular sets from queries and counterexamples. Inf. Comput., 75(2):87–106, 1987. Google ScholarDigital Library
- D. Angluin. Queries and concept learning. Mach. Learn., 2(4):319–342, Apr. 1988. ISSN 0885-6125.. URL http://dx.doi.org/10. 1023/A:1022821128753. Google ScholarDigital Library
- G. BakIr, T. Hofmann, B. Scholkopf, A. J. Smola, B. Taskar, and S. Vishwanathan. Predicting Structured Data. MIT Press, Cambridge, MA, USA, 2007. Google ScholarDigital Library
- M. Barnett, B. E. Chang, R. DeLine, B. Jacobs, and K. R. M. Leino. Boogie: A modular reusable verifier for object-oriented programs. In FMCO 2005, volume 4111 of LNCS, pages 364–387. Springer, 2005. Google ScholarDigital Library
- A. Betts, N. Chong, A. Donaldson, S. Qadeer, and P. Thomson. Gpuverify: A verifier for gpu kernels. SIGPLAN Not., 47(10):113– 132, Oct. 2012. Google ScholarDigital Library
- D. Beyer and M. E. Keremoglu. Cpachecker: A tool for configurable software verification. In CAV 2011, volume 6806 of LNCS, pages 184–190. Springer, 2011. Google ScholarDigital Library
- A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory, COLT’ 98, pages 92–100, 1998. Google ScholarDigital Library
- ISBN 1-58113-057-0.Google Scholar
- A. R. Bradley. Sat-based model checking without unrolling. In VMCAI 2011, volume 6538 of LNCS, pages 70–87. Springer, 2011. Google ScholarDigital Library
- Y.-F. Chen and B.-Y. Wang. Learning boolean functions incrementally. In CAV, volume 7358 of LNCS, pages 55–70. Springer, 2012. Google ScholarDigital Library
- J. M. Cobleigh, D. Giannakopoulou, and C. S. Pasareanu. Learning assumptions for compositional verification. In TACAS 2003, volume 2619 of LNCS, pages 331–346. Springer, 2003. Google ScholarDigital Library
- E. Cohen, M. Dahlweid, M. A. Hillebrand, D. Leinenbach, M. Moskal, T. Santen, W. Schulte, and S. Tobies. Vcc: A practical system for verifying concurrent c. In TPHOLs, pages 23–42, 2009. Google ScholarDigital Library
- M. Colón, S. Sankaranarayanan, and H. Sipma. Linear invariant generation using non-linear constraint solving. In CAV 2003, volume 2725 of LNCS, pages 420–432. Springer, 2003.Google Scholar
- C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995. Google ScholarDigital Library
- P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL 1977, pages 238–252. ACM Press, 1977. Google ScholarDigital Library
- L. M. de Moura and N. Bjørner. Efficient e-matching for smt solvers. In CADE, pages 183–198, 2007. Google ScholarDigital Library
- M. D. Ernst, A. Czeisler, W. G. Griswold, and D. Notkin. Quickly detecting relevant program invariants. In ICSE 2000, pages 449–458. ACM Press, 2000. Google ScholarDigital Library
- C. Flanagan and K. R. M. Leino. Houdini, an annotation assistant for ESC/Java. In FME, volume 2021 of LNCS, pages 500–517. Springer, 2001. Google ScholarDigital Library
- P. Garg, C. Löding, P. Madhusudan, and D. Neider. Learning universally quantified invariants of linear data structures. In CAV 2013, volume 8044 of LNCS, pages 813–829. Springer, 2013.Google Scholar
- P. Garg, C. Löding, P. Madhusudan, and D. Neider. ICE: A robust framework for learning invariants. In CAV 2014, volume 8559 of LNCS, pages 69–87. Springer, 2014. Google ScholarDigital Library
- P. Garoche, T. Kahsai, and C. Tinelli. Incremental invariant generation using logic-based automatic abstract transformers. In NFM 2013, volume 7871 of LNCS, pages 139–154. Springer, 2013.Google Scholar
- B. S. Gulavani, T. A. Henzinger, Y. Kannan, A. V. Nori, and S. K. Rajamani. Synergy: a new algorithm for property checking. In SIGSOFT FSE, pages 117–127. ACM, 2006. Google ScholarDigital Library
- S. Gulwani, S. Srivastava, and R. Venkatesan. Program analysis as constraint solving. In PLDI, pages 281–292. ACM, 2008. Google ScholarDigital Library
- A. Gupta and A. Rybalchenko. Invgen: An efficient invariant generator. In CAV 2009, volume 5643 of LNCS, pages 634–640. Springer, 2009. Google ScholarDigital Library
- F. Ivancic and S. Sankaranarayanan. NECLA Benchmarks. http: //www.nec-labs.com/research/system/systems_ SAV-website/small_static_bench-v1.1.tar.gz.Google Scholar
- R. Jhala and K. L. McMillan. A practical and complete approach to predicate refinement. In TACAS, volume 3920 of LNCS, pages 459–473. Springer, 2006. Google ScholarDigital Library
- M. Kawaguchi, P. M. Rondon, and R. Jhala. Type-based data structure verification. In PLDI 2009, pages 304–315. ACM, 2009. Google ScholarDigital Library
- M. J. Kearns and U. V. Vazirani. An introduction to computational learning theory. MIT Press, Cambridge, MA, USA, 1994. ISBN 0-262-11193-4. Google ScholarDigital Library
- S. Kong, Y. Jung, C. David, B.-Y. Wang, and K. Yi. Automatically inferring quantified loop invariants by algorithmic learning from simple templates. In APLAS. Springer, 2010. Google ScholarDigital Library
- S. Krishna, C. Puhrsch, and T. Wies. Learning invariants using decision trees. CoRR, abs/1501.04725, 2015.Google Scholar
- N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2(4):285–318, 1987. Google ScholarDigital Library
- K. L. McMillan. Interpolation and SAT-Based model checking. In CAV, volume 2725 of LNCS, pages 1–13. Springer, 2003.Google Scholar
- K. L. McMillan. Lazy abstraction with interpolants. In CAV 2006, volume 4144 of LNCS, pages 123–136. Springer, 2006. Google ScholarDigital Library
- T. M. Mitchell. Machine learning. McGraw-Hill, 1997. ISBN 978-0- 07-042807-2. Google ScholarDigital Library
- D. Neider. Applications of Automata Learning in Versification and Synthesis. PhD thesis, RWTH Aachen University, April 2014.Google Scholar
- T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to discover polynomial and array invariants. In ICSE, pages 683–693. IEEE, 2012. Google ScholarDigital Library
- E. Pek, X. Qiu, and P. Madhusudan. Natural proofs for data structure manipulation in c using separation logic. In PLDI, page 46, 2014. Google ScholarDigital Library
- X. Qiu, P. Garg, A. Stefanescu, and P. Madhusudan. Natural proofs for structure, data, and separation. In PLDI, pages 231–242, 2013. Google ScholarDigital Library
- J. R. Quinlan. Induction of decision trees. Machine Learning, 1(1): 81–106, 1986. Google ScholarCross Ref
- J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993. ISBN 1-55860-238-0. Google ScholarDigital Library
- A. Reynolds, C. Tinelli, A. Goel, S. Krstic, M. Deters, and C. Barrett. Quantifier instantiation techniques for finite model finding in SMT. In Automated Deduction - CADE-24 - 24th International Conference on Automated Deduction, Lake Placid, NY, USA, June 9-14, 2013. Proceedings, pages 377–391, 2013.. URL http://dx.doi.org/ 10.1007/978-3-642-38574-2_26. Google ScholarDigital Library
- F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65(6): 386–408, 1958.Google ScholarCross Ref
- C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27:379–423, 623–656, 1948.Google ScholarCross Ref
- R. Sharma and A. Aiken. From invariant checking to invariant inference using randomized search. In CAV 2014, volume 8559 of LNCS, pages 88–105. Springer, 2014. Google ScholarDigital Library
- R. Sharma, A. V. Nori, and A. Aiken. Interpolants as classifiers. In CAV, volume 7358 of LNCS, pages 71–87. Springer, 2012. Google ScholarDigital Library
- R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, and A. V. Nori. A data driven approach for algebraic loop invariants. In ESOP, volume 7792 of LNCS, pages 574–592. Springer, 2013. Google ScholarDigital Library
- R. Sharma, S. Gupta, B. Hariharan, A. Aiken, and A. V. Nori. Verification as learning geometric concepts. In SAS, volume 7935 of LNCS, pages 388–411. Springer, 2013.Google Scholar
- A. Solar Lezama. Program Synthesis By Sketching. PhD thesis, EECS Department, University of California, Berkeley, Dec 2008.Google Scholar
- A. Thakur, A. Lal, J. Lim, and T. Reps. Posthat and all that: Attaining most-precise inductive invariants. Technical Report TR1790, University of Wisconsin, Madison, WI, Apr 2013.Google Scholar
- A. Vardhan and M. Viswanathan. Learning to verify branching time properties. Formal Methods in System Design, 31(1):35–61, 2007. Google ScholarDigital Library
Index Terms
- Learning invariants using decision trees and implication counterexamples
Recommendations
Learning invariants using decision trees and implication counterexamples
POPL '16: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming LanguagesInductive invariants can be robustly synthesized using a learning model where the teacher is a program verifier who instructs the learner through concrete program configurations, classified as positive, negative, and implications. We propose the first ...
Horn-ICE learning for synthesizing invariants and contracts
We design learning algorithms for synthesizing invariants using Horn implication counterexamples (Horn-ICE), extending the ICE-learning model. In particular, we describe a decision-tree learning algorithm that learns from nonlinear Horn-ICE samples, ...
Agnostically learning decision trees
STOC '08: Proceedings of the fortieth annual ACM symposium on Theory of computingWe give a query algorithm for agnostically learning decision trees with respect to the uniform distribution on inputs. Given black-box access to an *arbitrary* binary function f on the n-dimensional hypercube, our algorithm finds a function that agrees ...
Comments