2011 | OriginalPaper | Chapter
Using Datalog for Fast and Easy Program Analysis
Authors: Yannis Smaragdakis, Martin Bravenboer
Published in: Datalog Reloaded
Publisher: Springer Berlin Heidelberg
Our recent work introduced the Doop framework for points-to analysis of Java programs. Although Datalog has been used for points-to analyses before, Doop is the first implementation to express full end-to-end context-sensitive analyses in Datalog. This includes key elements such as call-graph construction as well as the logic dealing with various semantic complexities of the Java language (native methods, reflection, threading, etc.).
The findings from the Doop research effort have been surprising. We set out to create a framework that would be highly complete and elegant without sacrificing performance “too much”. By the time Doop reached maturity, it was a full order of magnitude faster than Lhoták and Hendren’s Paddle, the state-of-the-art framework for context-sensitive points-to analyses. For the exact same logical points-to definitions (and, consequently, identical precision), Doop is more than 15x faster than Paddle for a 1-call-site sensitive analysis, with lower but still substantial speedups for other important analyses. Additionally, Doop scales to very precise analyses that are impossible with prior frameworks, directly addressing open problems in past literature. Finally, our implementation is modular and can be easily configured for analyses with a wide range of characteristics, largely due to its declarativeness.
Although this performance difference is largely attributable to architectural choices (e.g., the use of an explicit representation vs. BDDs), we believe that our ability to optimize our implementation efficiently was due mainly to the declarative specifications of analyses. Working at the Datalog level eliminated much of the artificial complexity of a points-to analysis implementation, allowing us to concentrate on indexing optimizations and on the algorithmic essence of each analysis.
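To make the idea of a declaratively specified points-to analysis concrete, the following is a minimal sketch, not Doop's actual rule set. It assumes a far simpler, context-insensitive analysis with only two hypothetical input relations, `Alloc(var, heap)` and `Assign(to, from)`, and evaluates two Datalog-style rules to a fixpoint in Python over explicit sets of tuples. The relation names, the function `points_to`, and the index `assign_by_source` are illustrative assumptions; the index is only meant to hint at the kind of join indexing the text mentions.

```python
from collections import defaultdict


def points_to(alloc, assign):
    """Compute VarPointsTo as the least fixpoint of two Datalog-style rules.

    VarPointsTo(v, h)  :- Alloc(v, h).
    VarPointsTo(to, h) :- Assign(to, frm), VarPointsTo(frm, h).

    Both inputs are sets of pairs; the relations are hypothetical and
    much simpler than those of a full Java analysis.
    """
    # Index Assign by its source variable so the join in rule 2
    # becomes a dictionary lookup instead of a scan.
    assign_by_source = defaultdict(set)
    for to, frm in assign:
        assign_by_source[frm].add(to)

    var_points_to = set(alloc)  # rule 1: every allocation is a base fact
    delta = set(alloc)          # facts derived in the previous round

    # Semi-naive evaluation: each round joins only the newly derived facts.
    while delta:
        new_facts = set()
        for frm, heap in delta:
            for to in assign_by_source.get(frm, ()):
                fact = (to, heap)
                if fact not in var_points_to:
                    new_facts.add(fact)
        var_points_to |= new_facts
        delta = new_facts
    return var_points_to


# Example: a = new H1; b = new H2; c = a; c = b; d = c;
example_alloc = {("a", "h1"), ("b", "h2")}
example_assign = {("c", "a"), ("c", "b"), ("d", "c")}
result = points_to(example_alloc, example_assign)
```

In this sketch the analysis logic is confined to the two rules in the docstring, while the evaluation strategy (explicit tuple sets, an index on the join column, semi-naive iteration) can be tuned separately. A context-sensitive, end-to-end analysis of the kind described above would need many more relations and rules than shown here.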