User:Liuhaots:JLLVM

From Trusted Cloud Group

Motivation

The LLVM project is a great project, but it mainly targets C++ developers on Linux and Unix systems. It is difficult for Java developers to analyze LLVM IR (intermediate representation).

Our motivation is to build a lightweight, platform-independent LLVM Core for Java developers. The product is named JLLVM. Although the community LLVM is far more powerful than ours, JLLVM still works for some simple tasks.

Introduction

JLLVM is a Java version of LLVM Core. It is lightweight and platform independent. JLLVM recognizes LLVM IR and stores the elements of the IR in Java classes.

What can JLLVM do?

  • Recognize an LLVM IR file, store all of its elements in Java classes, and return a Module, which has passed some simple syntax analysis and contains the CFG (control flow graph).
  • Use the Module however you like. Our purpose is to use JLLVM as a platform on which to build program analysis tools. You are also welcome to improve JLLVM so that it can do more powerful work.
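
As an illustration of the kind of analysis a CFG supports, the sketch below computes which basic blocks are reachable from an entry block. It does not use the JLLVM API: the CFG is modelled as a plain map from a block label to its successor labels, and the class name CfgReachability and all labels are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; JLLVM's own Module and BasicBlock classes are not used here.
class CfgReachability {
    // Returns the labels of all blocks reachable from `entry`, in visit order.
    static Set<String> reachable(Map<String, List<String>> successors, String entry) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        worklist.push(entry);
        while (!worklist.isEmpty()) {
            String block = worklist.pop();
            if (!seen.add(block)) {
                continue; // already visited
            }
            // Blocks without an entry in the map are treated as having no successors.
            List<String> next = successors.getOrDefault(block, Collections.<String>emptyList());
            for (String label : next) {
                worklist.push(label);
            }
        }
        return seen;
    }
}
```

Blocks that never appear in the result are unreachable from the entry block, which is a common first check before running a heavier analysis.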

Design

We use ANTLR to recognize LLVM IR and generate a parser. The parser analyzes LLVM IR and stores LLVM elements (Instruction, Function, Type, etc.) in Java classes. The ANTLR grammar is written in the LLVM.g file. The .g file generates two Java classes: Lexer.java and Parser.java. Java code added to the .g file is copied into Lexer.java and Parser.java automatically. To use JLLVM, you don't need to know ANTLR. But if you want to add your own code to the Parser, you should learn ANTLR and edit the .g file.

The ANTLR parser covers the whole LLVM IR grammar. We store all the IR elements in Java classes. The important elements we care about get their own Java classes, such as Type, Instruction, BasicBlock, and Function. Elements we don't care about, such as Linkage Types and Parameter Attributes, are simply stored as Strings.
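
The split between dedicated classes and plain Strings can be pictured with a minimal sketch. The classes below are hypothetical stand-ins written for this page, not the real JLLVM classes, whose fields and constructors may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; the real JLLVM classes may differ.
class SketchInstruction {
    final String opcode; // e.g. "ret"

    SketchInstruction(String opcode) {
        this.opcode = opcode;
    }
}

class SketchBasicBlock {
    final String label;
    final List<SketchInstruction> instructions = new ArrayList<>();

    SketchBasicBlock(String label) {
        this.label = label;
    }
}

class SketchFunction {
    final String name;
    // An element the analysis does not care about is kept as raw text.
    final String linkage;
    final List<SketchBasicBlock> blocks = new ArrayList<>();

    SketchFunction(String name, String linkage) {
        this.name = name;
        this.linkage = linkage;
    }
}
```

Keeping rarely used attributes as Strings keeps the class hierarchy small, at the cost of having to re-parse the text if an analysis later needs it.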

At present, JLLVM has all the features needed to analyze LLVM IR. As LLVM is a large project, we need more time to complete other advanced features and make JLLVM more powerful.

Download JLLVM Source Code

Source code: File:JLLVM.zip

To build the project, follow these steps:

  • Import the project source code into Eclipse using "Existing Projects into Workspace".
  • Include the ANTLR library (download ANTLRWorks; the latest version at the time of writing is antlrworks-1.4.3.jar). To include a library: right-click the project -> "Build Path" -> "Add External Archives" -> choose "antlrworks-1.4.3.jar".


The source code has two parts: JLLVM Core and Tools.

JLLVM Core

All the core code is in VMCore. The target project to be analyzed must be built into a single LLVM IR file. The code is organized like the C++ version of LLVM Core but is not complete, and many functional features are missing. But there are two necessary features that we have finished:

Getting a Module is very easy. You just need the following code:

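import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;

// buffer is a byte[] that holds the contents of the source LLVM IR file.
LLVMLexer l = new LLVMLexer(new ANTLRStringStream(new String(buffer)));
CommonTokenStream ct = new CommonTokenStream(l);
LLVMParser p = new LLVMParser(ct);
Module cfg = p.program();  // parse the whole file and return the Module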

For details on using Module, see the source code of the VMCore packages.
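
The snippet above assumes that buffer is already filled. One possible way to fill it is sketched below using only the standard library. The class name IrFileLoader is hypothetical, and decoding as UTF-8 is an assumption (the original snippet uses the platform default charset).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for reading an LLVM IR file; not part of JLLVM.
class IrFileLoader {
    // Reads the whole file into memory as raw bytes.
    static byte[] load(Path irFile) throws IOException {
        return Files.readAllBytes(irFile);
    }

    // Decodes the bytes as text; UTF-8 is assumed here.
    static String asText(byte[] buffer) {
        return new String(buffer, StandardCharsets.UTF_8);
    }
}
```

The returned byte[] can be used as buffer in the code above. Because the whole file is held in memory, large IR files need a correspondingly large heap.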

OutOfMemoryError: This happens when the source IR file is too large (for example, Apache is 19MB). Add some arguments to the run configuration of the project. For example, I added "-Xms512m -Xmx1024m" to "VM arguments" under "Run Configurations" in Eclipse.
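
To confirm that the VM arguments took effect, a small check like the following can be run with the same run configuration. It uses only the standard Runtime API; the class name HeapCheck is hypothetical.

```java
// Hypothetical helper for checking the heap limit; not part of JLLVM.
class HeapCheck {
    // Maximum heap the JVM will attempt to use, in MiB (governed by -Xmx).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

With "-Xmx1024m" the reported value should be close to 1024, although some JVMs report slightly less than the configured limit.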

Tools

We have already built two tools on the JLLVM platform to verify the locking safety of programs:

(1) Lockset Analysis Tool, based on the theory of RacerX.

(2) ESP Lock Analysis Tool, based on the theory of ESP.

You can run the Lockset Analysis Tool using TestLocksetTraverse.java or run the ESP tool with ESP_GUI.java.

Current State

It is not as powerful as LLVM, as I did this work alone.

If you have any questions, please contact me: liuhaots@gmail.com.

Thanks
