Reducing the number of guards with Shapes

When Andreas, Michael, and I first started doing trace compilation on JavaScript, we had some interesting unexpected issues. What we discovered with our first implementation, was that there were too many guards during property look ups, and too many guards regarding the type of a specific variable/property. With so many guards, the amount of bailout code increased, which led to in general, an explosion of code size, and a performance hit.

We looked into a way of compressing the type and structure of objects into a single check. The structure of an object is what properties an object has and their corresponding types, not their values. This way, we could remove the type check guards for every property access, and instead replace them with a single check which ensures that an object's properties did not change.

The idea Andreas came up with, was to encapsulate this information into a "Shape", which is a global unique id among all objects in the system. So if you have two different objects which contain the same properties, and all the properties have the same type, they would share the same shape id. How does this help with the guards? How does this affect a trace? You still have issues of type conversions during specific operations, for example with an add. If you want to do integer addition, you still need a check to ensure that the integer doesn't overflow into a double as the semantics of JavaScript say a number is always a double.

Now the question is, where do you insert these checks? This was a bit confusing for a very long time, so please feel free to ask questions if its not clear. As always, its easier to explain if we have an example. The code example we'll use is this:.

var i = 0;
var j = new Object;
j.x = 0;
var temp = 0;

for (i = 0; i < 100000000; i++) {
temp += i;
j.x = temp;
}


Sometime during execution, the var temp will overflow into a double. However, assuming the loop threshold to start a trace is low enough, a trace should have been compiled such that j.x is typed as an integer, temp is an integer, and the generated code does an integer addition. Let's assume now that the current "Shape" for variable j is 10. If a new property is added, or the type of a property changes, the "Shape" would increase to 11. One way of looking at a trace is as follows:

: topOfLoop

Guard to check i < 10000000
Guard to ensure j is shape 10
integer add temp + i

Guard to ensure temp + i is an integer
assign temp to temp + i

Guard to ensure temp is an integer
assign j.x to var temp;

jump topOfLoop


In SSA this would not work since j.x would point directly to the addition result, but let's assume a simple compiler just to make a point. The previous trace works just fine as long as temp stays within the bounds of an INTEGER. It also makes logical sense to always guard on the type of the value being assigned to a given property. If we didn't, you could technically have a case where you assigned some property a value which has a different type than the shape indicated. In this case, the shape identifier of "10" would no longer valid if the property j.x is no longer an integer. So what happens when j.x finally overflows into a double?

The trace would jump into the interpreter, in which case the shape of j would change to "11" since j.x is now a double. Now upon recording, we would insert a guard to ensure that the shape of j is 11, the addition becomes a double with no check since there is nothing else to convert a double to, in terms of adding numbers. The resulting trace would look something like this:

: topOfLoop
Guard to ensure j is shape 11
Guard to check i &lt; 10000000

double add temp + i

assign temp to temp + i
assign j.x to var temp;

jump topOfLoop


However, we could greatly optimize the integer case with the following trace structure:
Guard to ensure j is shape 10

: topOfLoop

Guard to check i < 10000000

integer add temp + i
Guard type to ensure temp + i is an integer

assign temp to temp + i
assign j.x to var temp;

jump topOfLoop

We removed the guard to ensure that j.x was being assigned an integer, and pulled the guard to ensure j was shape 10 outside of the loop. Here is the important part. We no longer need the guard to ensure j.x is being assigned an integer. We had one check outside of the loop which checked j.x is still an integer type. Next, since the value being assigned was ALREADY checked for overflow, we don't need another redundant guard when assigning j.x. If j.x is supposed to overflow, it would be detected by the guard to ensure that the addition result was still an integer. The compiled code would never reach the code to assign the value to j.x. We can carry this idea to any assignment for a property. The type check for a property will always be done by a previous guard at the actual operation time, not the property assignment time.

The most interesting portion, is that SpiderMonkey generally has the same idea, in a slightly different fashion. To get an in depth look at whats going on, there is a 2 page comment in the jsscope.h file of SpiderMonkey. Even though there is an explanation of whats going on, it is difficult to understand, and the code is pretty icky. I'll dissect the actual implementation later on. We also want to implement something like this into Tamarin-Tracing.