ABSTRACT
Concolic execution and fuzzing are two complementary coverage-based testing techniques. How to achieve the best of both remains an open challenge. To address this research problem, we propose and evaluate Legion. Legion re-engineers the Monte Carlo tree search (MCTS) framework from the AI literature to treat automated test generation as a problem of sequential decision-making under uncertainty. Its best-first search strategy provides a principled way to learn the most promising program states to investigate at each search iteration, based on observed rewards from previous iterations. Legion incorporates a form of directed fuzzing that we call approximate path-preserving fuzzing (APPFuzzing) to investigate program states selected by MCTS. APPFuzzing serves as the Monte Carlo simulation technique and is implemented by extending prior work on constrained sampling. We evaluate Legion against competitors on 2531 benchmarks from the coverage category of Test-Comp 2020, as well as measuring its sensitivity to hyperparameters, demonstrating its effectiveness on a wide variety of input programs.
- Animesh Basak Chowdhury, Raveendra Kumar Medicherla, and Venkatesh R. 2019. VeriFuzz: Program Aware Fuzzing. In Tools and Algorithms for the Construction and Analysis of Systems, Dirk Beyer, Marieke Huisman, Fabrice Kordon, and Bernhard Steffen (Eds.). Springer International Publishing, Cham, 244--249.Google Scholar
- D. Beyer. 2020. Second Competition on Software Testing: Test-Comp 2020. In Proc. FASE (LNCS). Springer. https://www.sosy-lab.org/research/pub/2020-FASE.Second_Competition_on_Software_Testing_Test-Comp_2020.pdfGoogle Scholar
- Dirk Beyer and Marie-Christine Jakobs. 2019. CoVeriTest: Cooperative verifier-based testing. In International Conference on Fundamental Approaches to Software Engineering. Springer, 389--408.Google ScholarCross Ref
- Dirk Beyer, Stefan Löwe, and Philipp Wendler. 2019. Reliable benchmarking: Requirements and solutions. International Journal on Software Tools for Technology Transfer 21, 1 (2019), 1--29.Google ScholarDigital Library
- Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2329--2344.Google ScholarDigital Library
- Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2017. Coverage-based greybox fuzzing as Markov chain. IEEE Transactions on Software Engineering 45, 5 (2017), 489--506.Google ScholarCross Ref
- C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton. 2012. A Survey of Monte Carlo Tree Search Methods. IEEE Transactions on Computational Intelligence and AI in Games 4, 1 (March 2012), 1--43. Google ScholarCross Ref
- Cristian Cadar, Daniel Dunbar, Dawson R Engler, et al. 2008. KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs.. In OSDI, Vol. 8. 209--224.Google ScholarDigital Library
- Marek Chalupa, Tomáš Jašek, Lukáš Tomovič, Martin Hruška, Veronika Šoková, Paulína Ayaziová, Jan Strejček, and Tomáš Vojnar. 2020. Symbiotic 7: Integration of Predator and More. In Tools and Algorithms for the Construction and Analysis of Systems, Armin Biere and David Parker (Eds.). Springer International Publishing, Cham, 413--417.Google Scholar
- Peng Chen and Hao Chen. 2018. Angora: Efficient fuzzing by principled search. In 2018 IEEE Symposium on Security and Privacy (SP). IEEE, 711--725.Google ScholarCross Ref
- R. Dutra, K. Laeufer, J. Bachrach, and K. Sen. 2018. Efficient Sampling of SAT Solutions for Testing. In 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE). 549--559. Google ScholarDigital Library
- Shuitao Gan, Chao Zhang, Xiaojun Qin, Xuwen Tu, Kang Li, Zhongyu Pei, and Zuoning Chen. 2018. Collafl: Path sensitive fuzzing. In 2018 IEEE Symposium on Security and Privacy (SP). IEEE, 679--696.Google ScholarCross Ref
- Patrice Godefroid, Nils Klarlund, and Koushik Sen. 2005. DART: Directed Automated Random Testing. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (Chicago, IL, USA) (PLDI '05). Association for Computing Machinery, New York, NY, USA, 213--223. Google ScholarDigital Library
- Patrice Godefroid, Michael Y. Levin, and David Molnar. 2012. SAGE: Whitebox Fuzzing for Security Testing. Queue 10, 1 (Jan. 2012), 20--27. Google ScholarDigital Library
- Patrice Godefroid, Hila Peleg, and Rishabh Singh. 2017. Learn&fuzz: Machine learning for input fuzzing. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 50--59.Google ScholarCross Ref
- Joxan Jaffar, Rasool Maghareh, Sangharatna Godboley, and Xuan-Linh Ha. 2020. TracerX: Dynamic symbolic execution with interpolation (competition contribution). In International Conference on Fundamental Approaches to Software Engineering. Springer, 530--534.Google ScholarDigital Library
- Levente Kocsis and Csaba Szepesvári. 2006. Bandit based Monte-Carlo planning. In European Conference on Machine Learning. Springer, 282--293.Google ScholarDigital Library
- Hoang M Le. 2020. Llvm-based hybrid fuzzing with LibKluzzer (competition contribution). In International Conference on Fundamental Approaches to Software Engineering. Springer, 535--539.Google ScholarDigital Library
- Nicholas Nethercote and Julian Seward. 2007. Valgrind: a framework for heavy-weight dynamic binary instrumentation. ACM Sigplan notices 42, 6 (2007), 89--100.Google ScholarDigital Library
- H. Peng, Y. Shoshitaishvili, and M. Payer. 2018. T-Fuzz: Fuzzing by Program Transformation. In 2018 IEEE Symposium on Security and Privacy (SP). 697--710.Google Scholar
- BobJoyce Reed Hastings. 1991. Purify: Fast detection of memory leaks and access errors. In In Proc. of the Winter 1992 USENIX Conference. Citeseer.Google Scholar
- Sebastian Ruland, Malte Lochau, and Marie-Christine Jakobs. 2020. HybridTiger: Hybrid model checking and domination-based partitioning for efficient multi-goal test-suite generation (competition contribution). In International Conference on Fundamental Approaches to Software Engineering. Springer, 520--524.Google ScholarDigital Library
- Andrey Ryabinin. 2014. UBSan: run-time undefined behavior sanity checker. E-mail publié sur la liste de développement du noyau Linux 20 (2014).Google Scholar
- Kostya Serebryany. 2015. libFuzzer-a library for coverage-guided fuzz testing. LLVM project (2015).Google Scholar
- Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. 2012. AddressSanitizer: A fast address sanitychecker. In Presented as part of the 2012 {USENIX} Annual Technical Conference ({USENIX} {ATC} 12). 309--318.Google Scholar
- Yan Shoshitaishvili, Ruoyu Wang, Christopher Salls, Nick Stephens, Mario Polino, Andrew Dutcher, John Grosen, Siji Feng, Christophe Hauser, Christopher Kruegel, et al. 2016. Sok: (State of) The Art of War: Offensive techniques in binary analysis. In 2016 IEEE Symposium on Security and Privacy (SP). IEEE, 138--157.Google ScholarCross Ref
- David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529, 7587 (2016), 484.Google Scholar
- Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna. 2016. Driller: Augmenting Fuzzing Through Selective Symbolic Execution.. In NDSS, Vol. 16. 1--16.Google Scholar
- István Szita, Guillaume Chaslot, and Pieter Spronck. 2010. Monte-Carlo Tree Search in Settlers of Catan. In Advances in Computer Games, H. Jaap van den Herik and Pieter Spronck (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 21--32.Google Scholar
- Guy Van den Broeck, Kurt Driessens, and Jan Ramon. 2009. Monte-Carlo tree search in poker using expected reward distributions. In Asian Conference on Machine Learning. Springer, 367--381.Google ScholarDigital Library
- Xinyu Wang, Jun Sun, Zhenbang Chen, Peixin Zhang, Jingyi Wang, and Yun Lin. 2018. Towards Optimal Concolic Testing. In Proceedings of the 40th International Conference on Software Engineering (Gothenburg, Sweden) (ICSE '18). Association for Computing Machinery, New York, NY, USA, 291--302. Google ScholarDigital Library
- C.D. Ward and Peter Cowling. 2009. Monte Carlo search applied to card selection in Magic: The Gathering. 9 -- 16. Google ScholarCross Ref
- Xiaofei Xie, Lei Ma, Felix Juefei-Xu, Minhui Xue, Hongxu Chen, Yang Liu, Jianjun Zhao, Bo Li, Jianxiong Yin, and Simon See. 2019. DeepHunter: A Coverage-Guided Fuzz Testing Framework for Deep Neural Networks. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (Beijing, China) (ISSTA 2019). Association for Computing Machinery, New York, NY, USA, 146--157. Google ScholarDigital Library
- C. Yeh, H. Lu, J. Yeh, and S. Huang. 2017. Path Exploration Based on Monte Carlo Tree Search for Symbolic Execution. In 2017 Conference on Technologies and Applications of Artificial Intelligence (TAAI). 33--37. Google ScholarCross Ref
- Insu Yun, Sangho Lee, Meng Xu, Yeongjin Jang, and Taesoo Kim. 2018. {QSYM}: A practical concolic execution engine tailored for hybrid fuzzing. In 27th {USENIX} Security Symposium ({USENIX} Security 18). 745--761.Google Scholar
- Michal Zalewski. 2017. American fuzzy lop (AFL) fuzzer.Google Scholar
- Lei Zhao, Yue Duan, Heng Yin, and Jifeng Xuan. 2019. Send Hardest Problems My Way: Probabilistic Path Prioritization for Hybrid Fuzzing.. In NDSS.Google Scholar
- Legion: best-first concolic testing
Recommendations
Achieving both model and code coverage with automated gray-box testing
A-MOST '07: Proceedings of the 3rd international workshop on Advances in model-based testingWe have devised a novel technique to automatically generate test cases for a software system, combining black-box model-based testing with white-box parameterized unit testing. The former provides general guidance for the structure of the tests in the ...
Automatically performing weak mutation with the aid of symbolic execution, concolic testing and search-based testing
Automating software testing activities can increase the quality and drastically decrease the cost of software development. Toward this direction, various automated test data generation tools have been developed. The majority of existing tools aim at ...
Conformance Testing of Distributed Concurrent Systems with Executable Designs
Formal Methods for Components and ObjectsThis paper presents a unified approach to test case generation and conformance test execution in a distributed setting. A model in the object-oriented, concurrent modeling language Creol is used both for generating test inputs and as a test oracle. For ...
Comments