IncA: Real-Time Feedback through Incremental Program Analysis

IncA is a program analysis framework. It comes with a DSL for the definition of program analyses and the runtime system evaluates program analyses incrementally to achieve the performance that is needed for real-time feedback in IDEs. When code gets changed, the IncA runtime system incrementally updates the results instead of the repeated recomputation from scratch. IncA program analysis code is translated into graph patterns and we reuse existing incremental graph pattern matching solutions to evaluate the analysis code on the AST of the analyzed program.

IncA consists of a program analysis language, an optimizing compiler and an efficient runtime system. We have created a concrete implementation of the system on top of the MPS language workbench and we reuse the incremental graph pattern matching component of the IncQuery project.

1 - Highlights

IncA provides a DSL for the definition of program analyses.
The compiler of IncA translates the program analysis code into a matching graph pattern representation.
The compiler performs an interprocedural data-flow analysis on the IncA program analysis code to conservatively approximate the types of AST nodes which can affect the analysis result. The runtime system uses the information to prune the AST changes that cannot affect the results thus reducing the runtime and memory requirement.
The IncA analysis DSL can be effectively used to define complex program analyses (see Case Studies).
The IncA runtime system can efficiently evaluate incremental program analyses on large code bases (see Evaluation).

2 - Getting Started

The IncA source code is available on GitHub: git clone https://github.com/szabta89/IncA Once cloned, you can simply open the project under the code folder in MPS. The project contains the DSL language implementation and MPS solutions for the IncA runtime system. Inside the tests folder there are examples for simple program analyses.

IncA + mbeddr for the case studies

Using IncA with the case studies requires mbeddr as well, which is set of integrated languages for embedded software development also built on top of MPS. The following steps are required to get you started:

Clone mbeddr https://github.com/mbeddr/mbeddr.core.git and checkout the feature/inca branch of the repository.
The IncA system is integrated into mbeddr as a git submodule because we already started to use it for the development of mbeddr. In order to checkout IncA as a submodule run the git submodule update --recursive --init command in the mbeddr checkout folder.
Navigate to the code/languages folder in the mbeddr checkout folder and execute the buildLanguages.sh or buildLanguages.bat file based on your platform. This will build the language artifacts of IncA and complete mbeddr. On a modern machine this requires 15-20 mins and this operation is only required once before the first usage of mbeddr.
Open the project under code/languages/com.mbeddr.core from the mbeddr checkout folder in MPS. This opens up mbeddr, which is an IDE for C software development.
In order for MPS to find all the generated code of mbeddr, you need to set up a path variable with the name mbeddr.github.core.home which must point to the checkout folder of mbeddr on your machine. Setting a path variable is possible from the Preferences/Settings page's Build, Execution, Deployment tab.

Navigating in mbeddr

You can play around with mbeddr by navigating to the test.ts.core.dataflow module (Navigate -> Go to Module). This module contains mbeddr C code snippets which can be used to try out IncA program analyses. Whenever you need to find a file in MPS simply press CMD/CTRL + N and type in the name of the file you are looking for, then hit ENTER.

The following YouTube video also guides you through the first steps.

3 - Case Studies

IncA program analysis code can be viewed and edited in MPS. However, MPS uses projectional editing and does not store the program code in simple text format which makes it hard to read the code outside of MPS. To ease the browsing of the program analyses code of our three case studies we make them available for download as a formatted pdf file and also in simple text format. Developers of IncA code can use a JavaDoc like documentation extension and we used it document every function in the analysis code to explain how the analyses work.

Control Flow Analysis

We implemented an incremental control flow analysis that handles all of mbeddr C, including conditionals (if and switch), loops (for, while, and do while), and jumps (break and continue). The control flow analysis produces the sets of CFG edges where a CFG node represents a possible control flow point like a statement, alternative switch case branches or else if parts of if statements. The analysis code can be viewed as a declarative description of CFG source - target relationships in the context of the various control statements.

The ControlFlowAnalysis module contains the implementation in MPS (look it up with the CMD/CRTL + N keys).

Points-to Analysis

Given a variable that stores a pointer, the goal of a points to analysis is to identify the possible targets of the variable. A well-known algorithm for computing the points-to tuples is Andersen's algorithm. The algorithm considers four basic kinds of assignments and derives the points-to relation for the whole program from them.

Our points-to analysis in IncA builds on Andersen’s rules but extends them in three ways. First, by implementing the analysis in IncA we immediately improve the runtime after code changes through incrementality. Second, we add flow-sensitivity by building on top of our incremental control flow analysis. Third, we do not require the code to only use the four kinds of assignments in Andersen’s rules, rather support all of mbeddr C except pointer arithmetics.

The PointsToAnalysis module contains the implementation in MPS.

Well-formedness Checks for mbeddr C Code

We implemented four well-formedness checks for mbeddr C and its language extensions.

CYCLE: mbeddr C provides modules for organizing code. This check detects cyclic dependencies between modules.
GLOBAL: This check detects conflicting global variables with the same name across modules.
REC: This check detects recursive functions by construction and inspection of a call graph. In embedded systems with constrained memory, the stack space required for recursive functions is often unacceptable.
COMP mbeddr C supports interfaces and composable components. This check detects components that fail to imple- ment all functions declared by their interfaces.

The WellFormedness module contains the implementation in MPS.

FindBugs for Java

We implemented 10 FindBugs rules with IncA. The original identifiers of the rules are as follows; CI_CONFUSED_INHERITANCE, EQ_ABSTRACT_SELF, CO_SELF_NO_OBJECT, SE_NO_SUITABLE_CONSTRUCTOR, IMSE_DONT_CATCH_IMSE, UUF_UNUSED_FIELD, FI_PUBLIC_SHOULD_BE_PROTECTED, DM_RUN_FINALIZERS_ON_EXIT, HE_EQUALS_USE_HASHCODE, ES_COMPARING_STRINGS_WITH_EQ.

The FindBugs module contains the implementation in MPS.

4 - Evaluation

We evaluated the performance of IncA program analyses on three case studies with programmatically introduced program changes:

We used the CFG and points-to analyses to analyze the Toyota ITC benchmark code. The mbeddr C equivalent of the Toyota ITC benchmark as an MPS project is available under the code/plugins/inca/benchmark/toyota-itc folder in the mbeddr checkout directory.
We evaluated the well-formedness rules on the code base of the SmartMeter project. This project is commercial, thus the code base is not available.
We used the IncA FindBugs rules to check the mbeddr importer code base for bug patterns. The mbeddr importer is a commercial project, thus the code base is not available.

We repeated each measurement five times and discarded the results of the first and second run to account for the Java VM warm-up. The raw memory and runtime measurement results are available for all case studies: [runtime][memory].

The following YouTube video guides you through the case studies and the performance evaluation.

Table of contents:

Software Requirements: