Contact
Search
Essays

Entries in SquirrelFish Internals (8)

Thursday
Mar262009

Nitro's Garbage Collector

The remaining piece in Apple's Nitro (Formerly SquirrelFish) is the garbage collector. Within the past month or two, I think the SquirrelFish team moved from a reference counting to a conservative mark-and-sweep GC with free lists . Prior to every allocation of an object, Nitro checks to see if there is enough space for a new object. If there is no longer any space, it begins to mark then sweep the heap. There are two heaps: the primary and number heap. The primary heap is for JavaScript objects. The number heap looks like its for numbers. The mark phase touches both heaps while the sweeping is done independently for each heap.

The primary and number heap are instances of the CollectorHeap. The CollectorHeap contains a list of memory blocks currently being used in addition to some metadata such as the number of blocks used, where the next free block is, and how many objects are currently live. The list of blocks are of type CollectorBlock. The CollectorBlock points a CollectorCell which finally stores JsCell objects. JsCells wrap around all objects and values. One thing to note is that for the number heap, the amount of memory each cell has is halved.

Collector.h
struct CollectorHeap {
CollectorBlock** blocks;
};

class CollectorBlock {
public:
CollectorCell cells[CELLS_PER_BLOCK]; // CELLS_PER_BLOCK = 2031
CollectorCell* freeList;
Heap* heap;
};

struct CollectorCell {
union {
double memory[CELL_ARRAY_LENGTH]; // CELL_ARRAY_LENGTH = 4
struct {
ptrdiff_t next;
} freeCell;
} u;
};


The overall algorithm to mark/sweep seems pretty standard. When Nitro enters the sweep phase, it recursively scans and marks all live objects starting from the root set. The marking includes objects in both the primary and number heaps. Sweeping occurs first on the primary, then number heap. Nitro iterates over each CollectorBlock, looking at all the cells in each block. When the object in a CollectorCell is considered garbage, that cell is added to the list of free cells. Thus the GC does not copy or compact anything. Everything is maintained in a free list. Finally, allocating space for a new object is done by following a pointer to the next free space.

Collector.cpp
// find a free spot in the block and detach it from the free list
Cell* newCell = targetBlock->freeList;

// "next" field is a cell offset -- 0 means next cell, so a zeroed block is already initialized
targetBlock->freeList = (newCell + 1) + newCell->u.freeCell.next;

// Allocation for the new object
return newCell;

Sunday
Feb152009

SquirrelFish Objects and Property Access

Fetching and setting the value of a JavaScript object is an extensive task. The type of an object can change at any time. Because this rarely occurs, it would be nice to optimize for the standard case where object properties are assumed to type stable. SquirrelFish Extreme does this by having multiple methods of getting/setting a property depending on a few conditions. However, before we can learn about accessing properties, we have to check out how SquirrelFish implements JavaScript objects.

Conceptually, an object in SquirrelFish is a wrapper around a hashtable. All strings have a hash associated with them. When accessing a property, the requested property name's hash is used as a key within a hashtable. The value of the hashtable leads us to the address in memory where the property exists.

Ahh if life was so simple. The time consuming process is always learning the details. So let's look at the following JavaScript source code:

var test = new Object();
test.message = "SFX FTW";
print(test.message);
The resulting bytecode simplified with all moves deleted:
[   0] enter
[ 13] get_by_id r5, r4, prototype // Returns an empty object
[ 21] construct r-14, r4, 1, 15, r5, r6 // Construct the object
[ 31] put_by_id r-14, message,"SFX" // Set property message value to "SFX"
[ 42] resolve_func r4, r3, print // Find the print function
[ 46] get_by_id r5, r-14, message // Get the value of property "message"
[ 54] call r3, r3, 2, 14 // Call the method print
[ 59] end r3

Identifiers:
id0 = Object
id1 = prototype
id2 = message
id3 = print

Take a look at the Identifiers. Identifier objects are created during the parsing of the JavaScript source. Identifiers are really wrappers around another class UString for U(nicode) string. The UString class contains both the hash of the string it represents and a pointer to the actual string data.

When a property access occurs, there are three important items required: The object containing the property, the identifier which contains a hash representing the string, and the value of the property. This data is used by three data structures in SquirrelFish:

  1. JsObject - A class that represents an actual JavaScript Object.
  2. Structure - a field within a JsObject that contains all the methods necessary to get/set values in a hashtable.
  3. PropertySlots - holds all the necessary information to directly access a property's value.
As a general workflow, manipulating a property requires looking for the actual JsObject where the property exists. The JsObject contains a Structure which does the hash look up to find the property's address in memory. This information is then stored into a PropertySlot object. Finally, the JsObject reads/writes to the property through the PropertySlot object. JsObject and Structure are intricately linked to actually store and fetch property values. These data structures are defined as follows:
JsObject.h
typedef JSValuePtr* PropertyStorage;
PropertyStorage m_propertyStorage; // Location where property values are stored

JsCell.h
Structure* m_structure; // Access to the Hashtable

Structure.h:
PropertyMapHashTable* m_propertyTable; // The actual hash table

Actual variable locations in memory are represented by a PropertySlot:
PropertySlot.h
JSValuePtr m_value; // Not used if a slot property
size_t m_offset; // offset from JsObject.m_propertyStorage
JSValuePtr m_slotBase; // the object this property is located on

union {
JSObject* getterFunc;
JSValuePtr* valueSlot;
Register* registerSlot;
unsigned index;
} m_data; // The actual value of the property

SquirrelFish Extreme has multiple types of PropertySlot. "PutPropertySlot" is used when assigning a value to a property. It contains a few additional fields:
PutPropertySlot.h
Type m_type; // New, Existing, or Invalid Property type
JSObject* m_base; // The object this property belongs to
bool m_wasTransition; // Not sure yet
size_t m_offset; // Offset from JsObject.m_
propertyStorage
The thing to note is that the actual address where values are stored is in JsObject Instance->m_propertyStorage[PropertySlot->offset]. So the Structure in JsObject is really a key->value pair mapping PropertyName->offset. We finally have enough information to actually store a value into a property on an object. When we want to add the "message" property to the "test" object, memory is allocated for a PutPropertySlot instance. The PutPropertySlot instance is used for the "test" object where the property should be added and passed into the following method:
JsObject.cpp
void JSObject::put(ExecState* exec, Identifier& propertyName, JSValuePtr value, PutPropertySlot& slot)
{
// Check if there are any getters or setters in the prototype chain
JSValuePtr prototype;
for (JSObject* obj = this; !obj->structure()->hasGetterSetterProperties(); obj = asObject(prototype)) {
prototype = obj->prototype();

// No Getters/Setters so set the property on "this"
if (prototype.isNull()) {
// slot is still an empty structure with no meaningful data
// putDirect("message", "SFX", 0, true, slot);
putDirect(propertyName, value, 0, true, slot); // Slot filled, value entered
return;
}
}
}

The putDirect method creates the hashtable for "this" object. Space is allocated for the property on the JsObject through the m_propertyStorage. This information is written into the PutPropertySlot object, which finally stores the value to (m_propertyStorage + offset).

Actually retrieving the value uses the standard PropertySlot object. Retrieving the value calls the get method in JsObject:

JsObject.cpp
JSValuePtr JSValuePtr::get(ExecState* exec, Identifier& propertyName, PropertySlot& slot)
{
if (// Unlikely Event "this" is not a heap allocated object) { }
else {
JSCell* cell = asCell();
while (true) {
if (cell->fastGetOwnPropertySlot(exec, propertyName, slot)) // Fill the slot information
return slot.getValue(exec, propertyName); // Fetch the actual value
}
}
}
It follows the same control flow of finding the offset from the hashtable, storing that information in a PropertySlot, and finally retrieving the value. You may wonder why slot.getValue needs the property name variable. For this particular example, there is no need but is instead used in other property access cases. Thus, there are many other conditions where this execution path is invalid. I'll be getting into other topics such as what happens when the type of a property changes, or if you have a getter/setter at a later date.

* SFX uses http://www.azillionmonkeys.com/qed/hash.html for their hash function

Wednesday
Feb042009

Calling the print Method

Why is it tradition that when you learn a new language, you have to say "hello world"? For some reason computer scientists have this fascination with "hello world". Every book that tries to teach you a new programming language has one example: print hello world to the screen. I'm going to continue this proud tradition by looking at how SquirrelFish actually executes "hello world".

One of the most interesting things I've learned about VMs is that printing hello world to the screen means you actually have gotten quite a few things working. It's quite an endeavor to successfully implement "hello world" and is a joyous occasion when such an event occurs. Let's take a look at the bytecode for print("hello world"):

[   0] enter
[ 1] mov r3, r0
[ 4] resolve_func r4, r3, print(@id0)
[ 8] mov r5, r1
[ 11] call r3, r3, 2, 14
[ 16] end r3
The moves are actual x86 move instructions. Within the JITed code, resolve_func and call are x86 calls to C methods that do the actual work of the bytecode. Thus, the really interesting bytecodes to look at are resolve_func and call. Let's first take a look at the bytecode resolve_func which calls the C method cti_op_resolve_func:
Interpreter.cpp::cti_op_resolve_func
do {
base = *iter;
if (base->getPropertySlot(callFrame, ident, slot)) {
JSObject* thisObj = base->toThisObject(callFrame);

// Returns the function pointer to "print" method
JSValuePtr result = slot.getValue(callFrame, ident);

RETURN_PAIR(thisObj, JSValuePtr::encode(result));
}
++iter;
} while (iter != end);

The variable iter starts at the beginning of the scope chain and ends at the end of the current scope chain. Thus we're looking for a property through the scope chain as described in the Ecmascript spec. This case is the simplest since only the global scope exists. However, where does the global scope initially define the property print? Where is print defined? Let's checkout the Global Object:
jsc.cpp
GlobalObject::GlobalObject(const Vector<UString>& arguments)
: JSGlobalObject()
{
putDirectFunction(... "debug"), functionDebug));
putDirectFunction(... "print"), functionPrint));
putDirectFunction(... "quit"), functionQuit));
putDirectFunction(... "gc"), functionGC));
putDirectFunction(... "version"), functionVersion));
putDirectFunction(... "run"), functionRun));
putDirectFunction(... "load"), functionLoad));
putDirectFunction(... "readline"), functionReadline));

*... are parameters passed. Shortened for clarity.

Any references to the property "print" actually refer to the C function "functionPrint. Thus resolve_func returns a point to "functionPrint". Then within the compiled context threaded code, there is a test to see if the resolved method is a JavaScript defined function or a native function. Since this is a "native" function, or one implemented in C, the bytecode call invokes cti_op_call_NotJSFunction. Let's take a look at this function:
Interpreter.cpp
JSValueEncodedAsPointer* Interpreter::cti_op_call_NotJSFunction(STUB_ARGS)
{
// Setup call frame
returnValue = callData.native.function(callFrame, asObject(funcVal), thisValue, argList); // Points to functionPrint
return JSValuePtr::encode(returnValue);
}

jsc.cpp:
JSValuePtr functionPrint(ExecState* exec, JSObject*, JSValuePtr, const ArgList& args)
{
printf("%s", args.at(exec, i).toString(exec).UTF8String().c_str());
putchar('\n');
fflush(stdout);
return jsUndefined();
}

cti_op_call_NotJSFunction finally calls functionPrint. Whew! There is the long road to hello world in a virtual machine.

Thursday
Jan152009

Garbage Collection

Garbage collection is a pretty critical decision when building a virtual machine. It determines a bunch of other decisions such as object layout in memory. For a good overview of different garbage collection techniques checkout wikipedia. To see some of the other issues that you face when actually building a GC, there is a really good FAQ on the I.E.C.C GC List FAQ.


One of the central issues that a garbage collector has to figure out is how to distinguish between what is a pointer and an integer? Given any value, is it a real reference or just a really big number? It's difficult to tell. One method to tell if a value is a reference is to check if the pointer resides in an area of allocated memory. If so, we just assume that this value is really a pointer, but there is no real proof that it isn't an actual integer. Another method is to use some of the bits in all values as a type tag. This is what SquirrelFish does.

The lowest 2 bits in a 32 bit field are for the type tag. If the two least significant bits are 0, the next 30 bits refer to a pointer address. If the least significant bit is a 1, the next 31 bits are used as a signed integer. For example, consider the signed integer 4.

Normal binary would be:
0000 0100

SquirrelFish internal representation:
0000 1001

There are other tags in SquirrelFish as well. The implementation details as well as the other comments are in JSImmediate.h..

 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX                     00
[ high 30 bits: pointer address ] [ low 2 bits -- always 0 ]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 1
[ high 31 bits: signed integer ] [ lowest bit -- always 1 ]

000000000000000000000000000 01 10
[ boolean value ] [ bool ] [ tag 'other' ]

000000000000000000000000000 10 10
[ zero ] [ undefined ] [ tag 'other' ]

000000000000000000000000000 00 10
[ zero ] [ zero ] [ tag 'other' ]

* Tip from Aaron. Pointers are guaranteed to point to aligned word boundaries. 

Tuesday
Jan132009

SquirrelFish JIT code

I'm really surprised at SquirrelFish's simplicity. I was expecting some kind of register allocator, even a linear scan. Linear scan register allocation is fast because you only have to iterate over all the code once instead of building an interference graph. An interference graph has nodes which represents each physical register on the machine. If two registers are needed at the same time, the two nodes are connected in the graph. Then you color the graph with however many physical registers you have to assign registers. If you can't color the graph, you remove a node by "spilling" it and removing the node from the graph. A spilled node means that the value is stored in memory instead of a register. However, SquirrelFish says forget that! Let's just store everything in memory. Doing an operation involves loading something from memory, the operation, then storing it back. Primitive bytecode operations are done in direct x86. Addition for example, are done with an actual x86 addition. More complex opcodes such as resolv_func have C functions. Here's some sample x86 code for the bytecode resolve_func:

010E01CE  mov         dword ptr [edi+8],0Ah
010E01D5 mov eax,0Ah
010E01DA mov dword ptr [edi+18h],eax
010E01DD mov dword ptr [esp+4],0BAFB10h
010E01E5 mov ecx,esp
010E01E7 mov dword ptr [esp+38h],edi
010E01EB call JSC::Interpreter::cti_op_resolve_func (4A1340h)
For all it's simplicity, it's smoking fast. wow. If you want more on Register Allocation, checkout my favorite source, Wikipedia!