Sprattus - a Unified Framework for Rapid Prototyping of Static and Dynamic Program Analyses
Introduction
Sprattus is a framework for static and dynamic program analysis of of low-level C and C++ programs on LLVM IR. It uses Symbolic Abstraction with the Z3 Theorem Prover to provide a flexible interface for designing program analyses in a compositional way.
Program analyses can be specified concisely with a semantic specification of the analysis domain without the need of stating the effects of each LLVM instruction in the analysis. With this specification, Sprattus can perform sound static program analyses as well as dynamic analyses based on code instrumentation.
Download and Installation
Sources of Sprattus can be downloaded here. The README
file inside contains compilation and installation instructions.
Using the Docker Hub
A virtual system image with prebuilt Sprattus along with all the dependencies is available on Docker Hub. After installing Docker it can be launched with a simple command:
docker run -it tdudziak/sprattus
The image contains a user account with home directory in /home
. Inside, you will find Sprattus sources in /home/src
and binaries in /home/build
.
Basic Usage
This section gives an overview of the basic usage of the tools in the Sprattus framework. For this purpose, consider the following example C program:
1 #include <inttypes.h> 2 #include <stdint.h> 3 #include <stdlib.h> 4 #include <stdio.h> 5 6 uint64_t foo(uint8_t a) 7 { 8 uint64_t x = a; 9 x = x | (x << 32); 10 x = x | (x << 16) | (x << 48); 11 return x; 12 } 13 14 int main(int argc, char** argv) 15 { 16 if (argc > 1) { 17 uint64_t res = foo(atoi(argv[1])); 18 printf("%" PRIx64 "\n", res); 19 } 20 21 return 0; 22 }It defines a function
foo
that takes an 8 bit unsigned integer as an input, performs bit-level manipulations on it and returns a 64 bit unsigned integer.
The main
function calls foo
with an argument taken from the command line.
This short program is distributed with Sprattus as a test case—you can find it in example/pattern.c
. To analyze it, we need to compile it to LLVM bitcode first. Inside the docker image we provide this can be dome with:
$ cd ~/src/examples $ clang -O2 -c -g -emit-llvm -mllvm -inline-threshold=0 -o out.bc pattern.c $ opt -mem2reg -instnamer out.bc -o pattern.bc
This will produce a bitcode file pattern.bc
.
To perform any kind of analysis, Sprattus needs a configuration object. This object is constructed in a configuration file which is a Python script allowed to also make Sprattus API calls and use the domain constructors and parameterization strategies that they provide. A simple configuration file was included with the example:
$ cat pattern.py config.FragmentDecomposition = {'Strategy': 'Body'} config.AbstractDomain = BitMask.Single
The two lines of this file define the fragment decomposition and the abstract domain used during the analysis. Sprattus computes invariants at certain program points and the fragment decomposition used determines where these invariants are placed. For the value Body
above, abstraction points will be placed as sparsely as possible and the loop invariants will be derived at the end of each loop body. This is different from the default strategy, Edges
, which abstract after every basic block. Full list of options available in the configuration file and their default values are available in ~/src/configs/default.py
.
The abstract domain determines the form of the invariants detected by the tool. The value of the AbstractDomain
option in the configuration file can be a list of multiple domains. All predefined domains can be determined by running spranalyze -list-domains
.
The domain chosen in the above confguration will classify each bit of each value appearing in the program as either "always 1", "always 0", or unknown. This can be used to provide a rough estimate of the range of possible values as well as other information like alignment. To statically analyze the function foo
use:
$ spranalyze -config=pattern.py -function=foo pattern.bc a = ******** conv(x@pattern.c:8) = 00000000000000000000000000000000000000000000000000000000******** or(x@pattern.c:9) = 000000000000000000000000********000000000000000000000000******** or2(pattern.c:10) = 00000000********00000000********000000000000000000000000******** or4(x@pattern.c:10) = 00000000********00000000********00000000********00000000******** shl(pattern.c:9) = 000000000000000000000000********00000000000000000000000000000000 shl1(pattern.c:10) = 00000000********000000000000000000000000********0000000000000000 shl3(pattern.c:10) = 00000000********000000000000000000000000000000000000000000000000
By default, the output will show properties associated with the function exit. You can use the command line argument -where=everywhere
to see the results for all program locations, although this doesn't change much in this particular case.
For the BitMask
domain, the invariant takes a form of a bit pattern with "*
" signifying that the bit at this position may take any value.
The output invariants are printed with both LLVM IR names and original C names and source locations. For values corresponding to anonymous expressions, the description contains the file and line number—e.g. shl(pattern.c:9)
is the subexpression (x << 32)
. A C variable name is also printed if it's available, like in the case of or4(x@pattern.c:10)
, the value assigned to the variable x
in line 10.
Instrumentation and dynamic analysis
Dynamic analysis with the same domain can be performed by preprocessing the bitcode file with the Sprattus instrumentation tool:
$ sprinstrument -config=pattern.py pattern.bc dyndata.db instrumented.bc $ clang -lsprattus instrumented.bc -o pattern
This produces an executable pattern
compiled from an instrumented LLVM bitcode file instrumented.bc
. The additional database file dyndata.db
stores the results of dynamic analysis. This database will be used by spranalyze
if we pass instrumented.bc
as its argument. Purely dynamic analysis can be performed by setting -mode=only-dynamic
as follows:
$ spranalyze -mode=only-dynamic -function=foo instrumented.bc
Initially, the result will be bottom
since we haven't run the executable yet. You can try running it with different arguments, e.g. ./pattern 170
and ./pattern -171
, and see how this influences the results.
Note that even when the parameter -mode
is set to full
, the analyzer will still use the results of dynamic analysis to speed up its static counterpart.
Authors
Sprattus was developed by Tomasz Dudziak, Fabian Ritter, and Heiko Becker at Saarland University.