When TraceMonkey was born, the team forked the NanoJIT backend from Tamarin. Over the summer, the TraceMonkey and Tamarin teams decided to merge their changes back into a shared repository, and the intermediate representation (LIR) changed a bit along the way. So what does it look like now?
Here is what a basic LIR instruction looks like:
// force sizeof(LIns)==8 and 8-byte alignment on 64-bit machines.
// this is necessary because sizeof(Reservation)==4 and we want all
// instances of LIns to be pointer-aligned.
What about Reservation?
// The opcode is not logically part of the Reservation, but we include it
// in this struct to ensure that opcode plus the Reservation fits in a
// single word.
uint32_t arIndex:16; // index into stack frame. displ is -4*arIndex
Register reg:7; // register UnknownReg implies not in register
uint32_t used:1; // when set, the reservation is active
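To make the bit budget concrete, here is a minimal, self-contained sketch of a Reservation-like struct. It is not the NanoJIT definition: the name ReservationSketch, the 8-bit width of the opcode field, the kUnknownReg value, and the displacement helper are all assumptions for illustration. Only the arIndex, reg, and used widths come from the comments above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for NanoJIT's Reservation. Every field uses the same
// underlying type so that mainstream compilers pack them into one 32-bit word
// (bit-field layout is implementation-defined).
struct ReservationSketch {
    uint32_t arIndex : 16; // index into the stack frame; displacement is -4*arIndex
    uint32_t reg     : 7;  // register number; kUnknownReg means "not in a register"
    uint32_t used    : 1;  // set while the reservation is active
    uint32_t opcode  : 8;  // assumed width: whatever is left of the 32 bits
};

// Assumed sentinel: the largest value that fits in the 7-bit reg field.
const uint32_t kUnknownReg = 127;

// 16 + 7 + 1 + 8 = 32 bits, so opcode plus reservation fit in a single word.
static_assert(sizeof(ReservationSketch) == 4,
              "opcode plus reservation must fit in 32 bits");

// Stack displacement implied by arIndex, following the comment on that field.
inline int32_t displacement(const ReservationSketch& r) {
    return -4 * static_cast<int32_t>(r.arIndex);
}
```

With arIndex set to 3, displacement returns -12, matching the "-4*arIndex" rule in the field comment.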
So a LIR instruction is little more than a 32-bit Reservation, which also holds the opcode, padded out to pointer alignment on 64-bit machines. But where are the operands to an instruction?
This is the biggest difference after the merge, at least from Tamarin's perspective. Prior to the NanoJIT merge, LIR instructions were inserted into a contiguous chunk of memory. Each LIR instruction was one 32-bit word. The top 8 bits were reserved for the opcode, while the lower 24 bits held the operands. Each operand was represented as an 8-bit offset relative to the instruction's own location in memory. The LIR instruction structure itself had no notion of pointers.
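The old encoding can be sketched roughly as follows. This is an illustration, not the original Tamarin code: the helper names are hypothetical, and the sketch assumes the 24 operand bits are split into three 8-bit offsets, each counting how many words back from the current instruction the operand lives.

```cpp
#include <cassert>
#include <cstdint>

// Assumed packing of one pre-merge instruction word:
//   [ opcode:8 | offset1:8 | offset2:8 | offset3:8 ]
inline uint32_t packOld(uint8_t opcode, uint8_t off1, uint8_t off2, uint8_t off3) {
    return (uint32_t(opcode) << 24) | (uint32_t(off1) << 16) |
           (uint32_t(off2) << 8)    |  uint32_t(off3);
}

// The opcode lives in the top 8 bits.
inline uint8_t oldOpcode(uint32_t word) {
    return uint8_t(word >> 24);
}

// Extract the n-th operand offset (n = 0, 1 or 2) from the low 24 bits.
inline uint8_t oldOperandOffset(uint32_t word, int n) {
    return uint8_t(word >> (16 - 8 * n));
}

// Resolve an operand by walking backwards in the instruction buffer.
// No pointer is stored in the instruction; only the distance is.
inline const uint32_t* oldOperand(const uint32_t* ins, int n) {
    return ins - oldOperandOffset(*ins, n);
}
```

One consequence of this scheme, under the assumptions above, is that an operand more than 255 words away cannot be referenced directly, which is one reason a pointer-based layout is attractive.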
Now LIR instructions point directly to their operands. But where are the operands? There are multiple LIR instruction types, depending on the number of operands an instruction requires. For example, return instructions only need to point to the value they are returning. NanoJIT has a LInsOp1 class for instructions that have only one operand:
// 1-operand form. Used for LIR_ret, unary arithmetic/logic ops,
friend class LIns;
"Ins" here points to "this" instruction. NanoJIT also has LInsOp2 for instructions with two operands:
// 2-operand form. Used for loads, guards, branches, comparisons, binary
There is one last form for instructions that have three operands. This means LIR instructions are variable-length and can be up to 4 words long. Now what about constant values such as the number 6? NanoJIT has a few other specialized LIR instructions, such as LInsI:
// Used for LIR_int and LIR_ialloc.
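Putting the forms side by side, the post-merge layout can be sketched like this. It is a simplified illustration rather than the code in LIR.h: all of the *Sketch types, the member names, and the toOp1/toImm helpers are assumptions, and the sketch assumes the operand fields are stored immediately before the shared instruction word so that the enclosing form can be recovered from a pointer to that word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for LIns: the word shared by every instruction form.
struct InsSketch {
    uint32_t opcode;
};

// 1-operand form: one pointer, then the instruction word.
struct Op1Sketch {
    InsSketch* oprnd_1;
    InsSketch  ins;
};

// 2-operand form: two pointers, then the instruction word.
struct Op2Sketch {
    InsSketch* oprnd_2;
    InsSketch* oprnd_1;
    InsSketch  ins;
};

// Immediate form: a 32-bit constant, then the instruction word.
struct ImmSketch {
    int32_t   imm32;
    InsSketch ins;
};

// Recover the enclosing form from a pointer to its instruction word.
// The caller must know (from the opcode) which form the word belongs to.
inline Op1Sketch* toOp1(InsSketch* ins) {
    return reinterpret_cast<Op1Sketch*>(
        reinterpret_cast<char*>(ins) - offsetof(Op1Sketch, ins));
}

inline ImmSketch* toImm(InsSketch* ins) {
    return reinterpret_cast<ImmSketch*>(
        reinterpret_cast<char*>(ins) - offsetof(ImmSketch, ins));
}
```

In this sketch, a "return 6" would be an ImmSketch holding 6 and an Op1Sketch whose oprnd_1 points at the immediate's ins member; following that pointer and calling toImm yields the constant again.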
If you want to get into the gritty details, check out the NanoJIT merge MDC article. If you want to see how it is all implemented, check out LIR.h in the Mercurial repository; there is a nice comment at line 210.