Sprattus - a Unified Framework for Rapid Prototyping of Static and Dynamic Program Analyses

Introduction

Sprattus is a framework for static and dynamic program analysis of of low-level C and C++ programs on LLVM IR. It uses Symbolic Abstraction with the Z3 Theorem Prover to provide a flexible interface for designing program analyses in a compositional way.

Program analyses can be specified concisely with a semantic specification of the analysis domain without the need of stating the effects of each LLVM instruction in the analysis. With this specification, Sprattus can perform sound static program analyses as well as dynamic analyses based on code instrumentation.

Download and Installation

Sources of Sprattus can be downloaded here. The README file inside contains compilation and installation instructions.

Using the Docker Hub

A virtual system image with prebuilt Sprattus along with all the dependencies is available on Docker Hub. After installing Docker it can be launched with a simple command:

docker run -it tdudziak/sprattus

The image contains a user account with home directory in /home. Inside, you will find Sprattus sources in /home/src and binaries in /home/build.

Basic Usage

This section gives an overview of the basic usage of the tools in the Sprattus framework. For this purpose, consider the following example C program:

 1 #include <inttypes.h>
 2 #include <stdint.h>
 3 #include <stdlib.h>
 4 #include <stdio.h>
 5
 6 uint64_t foo(uint8_t a)
 7 {
 8     uint64_t x = a;
 9     x = x | (x << 32);
10     x = x | (x << 16) | (x << 48);
11     return x;
12 }
13
14 int main(int argc, char** argv)
15 {
16     if (argc > 1) {
17         uint64_t res = foo(atoi(argv[1]));
18         printf("%" PRIx64 "\n", res);
19     }
20
21     return 0;
22 }
It defines a function foo that takes an 8 bit unsigned integer as an input, performs bit-level manipulations on it and returns a 64 bit unsigned integer. The main function calls foo with an argument taken from the command line.

This short program is distributed with Sprattus as a test case—you can find it in example/pattern.c. To analyze it, we need to compile it to LLVM bitcode first. Inside the docker image we provide this can be dome with:

$ cd ~/src/examples
$ clang -O2 -c -g -emit-llvm -mllvm -inline-threshold=0 -o out.bc pattern.c
$ opt -mem2reg -instnamer out.bc -o pattern.bc

This will produce a bitcode file pattern.bc. To perform any kind of analysis, Sprattus needs a configuration object. This object is constructed in a configuration file which is a Python script allowed to also make Sprattus API calls and use the domain constructors and parameterization strategies that they provide. A simple configuration file was included with the example:

$ cat pattern.py
config.FragmentDecomposition = {'Strategy': 'Body'}
config.AbstractDomain = BitMask.Single

The two lines of this file define the fragment decomposition and the abstract domain used during the analysis. Sprattus computes invariants at certain program points and the fragment decomposition used determines where these invariants are placed. For the value Body above, abstraction points will be placed as sparsely as possible and the loop invariants will be derived at the end of each loop body. This is different from the default strategy, Edges, which abstract after every basic block. Full list of options available in the configuration file and their default values are available in ~/src/configs/default.py.

The abstract domain determines the form of the invariants detected by the tool. The value of the AbstractDomain option in the configuration file can be a list of multiple domains. All predefined domains can be determined by running spranalyze -list-domains.

The domain chosen in the above confguration will classify each bit of each value appearing in the program as either "always 1", "always 0", or unknown. This can be used to provide a rough estimate of the range of possible values as well as other information like alignment. To statically analyze the function foo use:

$ spranalyze -config=pattern.py -function=foo pattern.bc
a = ********
conv(x@pattern.c:8) = 00000000000000000000000000000000000000000000000000000000********
or(x@pattern.c:9) = 000000000000000000000000********000000000000000000000000********
or2(pattern.c:10) = 00000000********00000000********000000000000000000000000********
or4(x@pattern.c:10) = 00000000********00000000********00000000********00000000********
shl(pattern.c:9) = 000000000000000000000000********00000000000000000000000000000000
shl1(pattern.c:10) = 00000000********000000000000000000000000********0000000000000000
shl3(pattern.c:10) = 00000000********000000000000000000000000000000000000000000000000

By default, the output will show properties associated with the function exit. You can use the command line argument -where=everywhere to see the results for all program locations, although this doesn't change much in this particular case. For the BitMask domain, the invariant takes a form of a bit pattern with "*" signifying that the bit at this position may take any value.

The output invariants are printed with both LLVM IR names and original C names and source locations. For values corresponding to anonymous expressions, the description contains the file and line number—e.g. shl(pattern.c:9) is the subexpression (x << 32). A C variable name is also printed if it's available, like in the case of or4(x@pattern.c:10), the value assigned to the variable x in line 10.

Instrumentation and dynamic analysis

Dynamic analysis with the same domain can be performed by preprocessing the bitcode file with the Sprattus instrumentation tool:

$ sprinstrument -config=pattern.py pattern.bc dyndata.db instrumented.bc
$ clang -lsprattus instrumented.bc -o pattern

This produces an executable pattern compiled from an instrumented LLVM bitcode file instrumented.bc. The additional database file dyndata.db stores the results of dynamic analysis. This database will be used by spranalyze if we pass instrumented.bc as its argument. Purely dynamic analysis can be performed by setting -mode=only-dynamic as follows:

$ spranalyze -mode=only-dynamic -function=foo instrumented.bc

Initially, the result will be bottom since we haven't run the executable yet. You can try running it with different arguments, e.g. ./pattern 170 and ./pattern -171, and see how this influences the results.

Note that even when the parameter -mode is set to full, the analyzer will still use the results of dynamic analysis to speed up its static counterpart.

Authors

Sprattus was developed by Tomasz Dudziak, Fabian Ritter, and Heiko Becker at Saarland University.