Example C++ to LIR Translation

Going from C++ to bitcode to LIR involves many translation steps. This post walks through one concrete example to tie them together. Consider ActionScript source that adds two variables and assigns the sum to a third variable:

var a;
var b;
var sum = a + b; 

 

The LIR that is normally generated is:

left = load leftp[0]
right = load rightp[0]
add = icall #add ( left, right )
 
store vars[32] = add

 

First, the two variables left and right, which are a and b in the AS3 source code, are loaded. Next, a call to the VM C++ method add is generated. The result of add is stored into LIR vars[32], which is a location in memory and represents the AS3 variable sum.

Instead of calling add, the JIT should inline the method. To do that, the C++ has to be converted to LIR. The VM C++ method that performs the actual add has the source:

Atom Toplevel::add(Atom left, Atom right)
{
    BITCODE_INLINEABLE  // Indicate that we want to translate this method to LIR
    if (areNumbers(left, right)) {
        return addNumbers(left, right);
    }
    else {
        return addUnknown(left,right);
    }
}

 

The add method is part of a C++ object named Toplevel. ActionScript values are modeled as Atoms in C++. An Atom is a 32-bit integer with the bottom 3 bits used for type information. The C++ source decides what the "+" operator does based on the types of the two values. Once Tamarin is compiled with LLVM, it produces bitcode that looks like:

define i32 @add2(%"struct.avmplus::Toplevel"* %this, i32 %left, i32 %right) {
entry:
    call void @enableBitcodeInlining() ; The macro expansion of BITCODE_INLINEABLE

    %0 = call i8 @areNumbers(%"struct.avmplus::Toplevel"* %this, i32 %left, i32 %right)
    %toBool = icmp ne i8 %0, 0

    ; if true go to addNumbers, otherwise addUnknown
    br i1 %toBool, label %addNumbers, label %addUnknown
 
addNumbers:     
    %1 = call i32 @addNumbers(%"struct.avmplus::Toplevel"* %this, i32 %left, i32 %right) 
    ret i32 %1
 
addUnknown:       
    %2 = call i32 @addUnknown(%"struct.avmplus::Toplevel"* %this, i32 %left, i32 %right) 
    ret i32 %2
}
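
The i32 arguments in the bitcode above are Atoms. As a minimal sketch of what "bottom 3 bits used for type information" could look like, the snippet below packs a small integer into the upper 29 bits and tests the tag. The tag constants and the helper names (tagOf, makeIntAtom, intValue, isNumber) are invented for illustration and are not taken from the Tamarin source; areNumbers here is only a guess at the kind of check the real method performs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an Atom: a 32-bit word whose bottom 3 bits hold a type tag.
// The tag values are assumptions made for this sketch, not Tamarin's actual constants.
using Atom = std::int32_t;

constexpr Atom kTagMask   = 0x7;  // selects the bottom 3 bits
constexpr Atom kIntTag    = 0x6;  // assumed tag for small integers
constexpr Atom kDoubleTag = 0x7;  // assumed tag for boxed doubles

// Extract the 3-bit type tag.
constexpr Atom tagOf(Atom a) { return a & kTagMask; }

// Pack a small integer payload into the upper 29 bits and attach the integer tag.
constexpr Atom makeIntAtom(std::int32_t value) {
    return static_cast<Atom>((static_cast<std::uint32_t>(value) << 3) | kIntTag);
}

// Recover the payload. Relies on arithmetic right shift for negative values,
// which mainstream compilers provide and C++20 guarantees.
constexpr std::int32_t intValue(Atom a) { return a >> 3; }

// A guess at the shape of the type check: both operands carry a numeric tag.
constexpr bool isNumber(Atom a) {
    return tagOf(a) == kIntTag || tagOf(a) == kDoubleTag;
}
constexpr bool areNumbers(Atom left, Atom right) {
    return isNumber(left) && isNumber(right);
}

// Small self-check showing the round trip through the tagged form.
inline void checkAtomSketch() {
    assert(intValue(makeIntAtom(21)) == 21);
    assert(areNumbers(makeIntAtom(1), makeIntAtom(2)));
}
```

Because the tag lives in the low bits, the type test is a mask and a compare on a plain integer, which is why the bitcode can pass Atoms around as i32.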

 

The LLVM bitcode is in SSA form and retains all of the type information and control flow. The call to enableBitcodeInlining is the expansion of the C++ macro BITCODE_INLINEABLE. The static translator looks for a call to enableBitcodeInlining as an indicator to translate the method to LIR. The LLVM type i32 represents a 32-bit integer, which is what a C++ Atom really is. Finally, once the C++ add method is inlined rather than called, the resulting LIR is:

left = load leftp[0]
right = load rightp[0]
 
inline( left, right )
    retVal = stackAlloc 4
    isNumberAdd = icall #areNumbers (left, right)
 
    eq1 = eq isNumberAdd, 0
    jump true eq1 -> addUnknownLabel
    jump false eq1 -> addNumbersLabel
 
addNumbersLabel:
    addNumbers = icall #addNumbers (left, right)
    store retVal[0] = addNumbers
    jump -> endInline 
 
addUnknownLabel:
    addUnknown = icall #addUnknown (left,right)
    store retVal[0] = addUnknown
    jump -> endInline 
 
endInline:
 
ld5 = load retVal[0]
store vars[32] = ld5

 

The LIR still needs to load the left and right operands before the inlined method body. Space is then allocated on the stack for the method's return value. Each C++ return statement becomes a LIR store to the allocated stack location followed by a jump to the end of the inlined body. The C++ call statements remain LIR call statements, unless they too are explicitly inlined. In the original LIR, the value returned from the call to add was stored directly into vars[32]. Now the return value is loaded from retVal and then stored into vars[32], completing the inlining of C++ add.
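
The return handling described above can be sketched as a small C++ routine that emits LIR-like text. This is only an illustration under stated assumptions: ReturnBlock and lowerReturns are hypothetical names, the output is plain strings rather than real LIR instructions, and the actual translator may be structured quite differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a basic block that ends in "return <value>".
struct ReturnBlock {
    std::string label;  // LIR label for the block
    std::string value;  // name of the value the block returns
};

// Emit LIR-like text: one stack slot for the return value, a store plus a jump
// in place of every return, and a final load that feeds the destination.
std::vector<std::string> lowerReturns(const std::vector<ReturnBlock>& blocks,
                                      const std::string& dest) {
    std::vector<std::string> out;
    out.push_back("retVal = stackAlloc 4");
    for (const ReturnBlock& b : blocks) {
        out.push_back(b.label + ":");
        out.push_back("store retVal[0] = " + b.value);  // replaces the return
        out.push_back("jump -> endInline");
    }
    out.push_back("endInline:");
    out.push_back("ld = load retVal[0]");
    out.push_back("store " + dest + " = ld");
    return out;
}

// Small self-check: a single returning block yields seven lines of output.
inline void checkLowerReturns() {
    assert(lowerReturns({{"onlyLabel", "v"}}, "vars[0]").size() == 7);
}
```

Calling lowerReturns with the two blocks of the add example and the destination "vars[32]" reproduces the store, jump, and load pattern of the inlined LIR, minus the calls and the branch.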

Although this example has none, the translator handles Phi functions the same way it handles return statements. LIR doesn't have a Phi opcode. Instead, a store is pushed up into each basic block from which a value flows into the Phi function. The Phi itself turns into a load from that store location.
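
A rough sketch of that Phi lowering, again emitting LIR-like text from C++. The types Phi, PhiIncoming, and LoweredPhi, the function lowerPhi, and the "Slot" naming scheme are all hypothetical and exist only to illustrate the idea of replacing a Phi with stores in the predecessor blocks and a single load.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one incoming edge of a Phi function.
struct PhiIncoming {
    std::string predBlock;  // predecessor block the value flows in from
    std::string value;      // name of the value on that edge
};

struct Phi {
    std::string name;
    std::vector<PhiIncoming> incoming;
};

struct LoweredPhi {
    std::string alloc;                                              // stack slot for the Phi
    std::map<std::string, std::vector<std::string>> storesByBlock;  // stores to add per predecessor
    std::string replacement;                                        // load that replaces the Phi
};

// Replace a Phi with one stack slot, a store in every predecessor block,
// and a load at the point where the Phi used to be.
LoweredPhi lowerPhi(const Phi& phi) {
    LoweredPhi result;
    const std::string slot = phi.name + "Slot";
    result.alloc = slot + " = stackAlloc 4";
    for (const PhiIncoming& in : phi.incoming) {
        result.storesByBlock[in.predBlock].push_back(
            "store " + slot + "[0] = " + in.value);
    }
    result.replacement = phi.name + " = load " + slot + "[0]";
    return result;
}

// Small self-check: a Phi with no incoming edges still gets a slot and a load.
inline void checkLowerPhi() {
    LoweredPhi r = lowerPhi(Phi{"p", {}});
    assert(r.storesByBlock.empty());
    assert(r.replacement == "p = load pSlot[0]");
}
```

The trade-off is the same as for returns: every value that would flow through a Phi makes a round trip through memory, which is what a dedicated LIR Phi instruction would avoid.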

Also, the translator always follows LLVM semantics and creates both the true and false branches in LIR. Future work is to optimize away one of the branches and add a LIR_phi instruction.