DOI: 10.1145/3324884.3416629
Research article · Open Access

Legion: best-first concolic testing

Published: 27 January 2021

ABSTRACT

Concolic execution and fuzzing are two complementary coverage-based testing techniques. How to achieve the best of both remains an open challenge. To address this research problem, we propose and evaluate Legion. Legion re-engineers the Monte Carlo tree search (MCTS) framework from the AI literature to treat automated test generation as a problem of sequential decision-making under uncertainty. Its best-first search strategy provides a principled way to learn the most promising program states to investigate at each search iteration, based on observed rewards from previous iterations. Legion incorporates a form of directed fuzzing that we call approximate path-preserving fuzzing (APPFuzzing) to investigate program states selected by MCTS. APPFuzzing serves as the Monte Carlo simulation technique and is implemented by extending prior work on constrained sampling. We evaluate Legion against competitors on 2531 benchmarks from the coverage category of Test-Comp 2020 and measure its sensitivity to hyperparameters, demonstrating its effectiveness on a wide variety of input programs.
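In standard MCTS, the best-first selection the abstract describes is typically governed by the UCB1 rule from the UCT algorithm: each node is scored by its mean observed reward plus an exploration bonus of the form C * sqrt(ln(parent visits) / visits). The Python sketch below illustrates that selection-simulation-backpropagation loop over a toy tree. It is an illustration of the general technique under stated assumptions, not Legion's implementation: the Node class, the exploration constant, and the random-reward simulate stub (standing in for APPFuzzing) are all hypothetical.

import math
import random

# Minimal sketch of MCTS-style best-first selection over program states.
# Names and structure are illustrative, not Legion's actual API.

EXPLORATION = math.sqrt(2)  # a common default for the UCB1 constant

class Node:
    """Stands in for a program state in the search tree."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def ucb1(self):
        """Mean observed reward plus an exploration bonus."""
        if self.visits == 0:
            return float("inf")  # explore unvisited states first
        exploit = self.total_reward / self.visits
        explore = EXPLORATION * math.sqrt(
            math.log(self.parent.visits) / self.visits)
        return exploit + explore

def select(root):
    """Best-first descent: repeatedly take the highest-scoring child."""
    node = root
    while node.children:
        node = max(node.children, key=Node.ucb1)
    return node

def simulate(node):
    """Placeholder for the simulation step. In Legion this role is played
    by APPFuzzing; here a random reward merely keeps the sketch runnable."""
    return random.random()

def backpropagate(node, reward):
    """Propagate the observed reward from the leaf up to the root."""
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent

if __name__ == "__main__":
    root = Node()
    root.children = [Node(parent=root) for _ in range(3)]
    for _ in range(100):
        leaf = select(root)
        backpropagate(leaf, simulate(leaf))
    print([round(c.total_reward / c.visits, 2) for c in root.children])

In Legion proper, one would expect the reward to reflect coverage gained by fuzzing the selected state, so that states whose neighbourhoods keep yielding new coverage accumulate higher scores and are revisited preferentially.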
