Twitter
Essays
Random
« The Long Tail | Main | SquirrelFish JIT code »
Thursday
Jan152009

Garbage Collection

Garbage collection is a pretty critical decision when building a virtual machine. It determines a bunch of other decisions such as object layout in memory. For a good overview of different garbage collection techniques checkout wikipedia. To see some of the other issues that you face when actually building a GC, there is a really good FAQ on the I.E.C.C GC List FAQ.


One of the central issues that a garbage collector has to figure out is how to distinguish between what is a pointer and an integer? Given any value, is it a real reference or just a really big number? It's difficult to tell. One method to tell if a value is a reference is to check if the pointer resides in an area of allocated memory. If so, we just assume that this value is really a pointer, but there is no real proof that it isn't an actual integer. Another method is to use some of the bits in all values as a type tag. This is what SquirrelFish does.

The lowest 2 bits in a 32 bit field are for the type tag. If the two least significant bits are 0, the next 30 bits refer to a pointer address. If the least significant bit is a 1, the next 31 bits are used as a signed integer. For example, consider the signed integer 4.

Normal binary would be:
0000 0100

SquirrelFish internal representation:
0000 1001

There are other tags in SquirrelFish as well. The implementation details as well as the other comments are in JSImmediate.h..

 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX                     00
[ high 30 bits: pointer address ] [ low 2 bits -- always 0 ]

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 1
[ high 31 bits: signed integer ] [ lowest bit -- always 1 ]

000000000000000000000000000 01 10
[ boolean value ] [ bool ] [ tag 'other' ]

000000000000000000000000000 10 10
[ zero ] [ undefined ] [ tag 'other' ]

000000000000000000000000000 00 10
[ zero ] [ zero ] [ tag 'other' ]

* Tip from Aaron. Pointers are guaranteed to point to aligned word boundaries. 

PrintView Printer Friendly Version

EmailEmail Article to Friend

Reader Comments (2)

it's worth noting here that this trick works because the things to which the pointers point are guaranteed to be aligned to word boundaries, thus the bottom two bits of the pointer would be zero anyway.

This is a fairly old trick and is used by many programming language implementations (off the top of my head Matz's Ruby, OCaml, and a handful of lisps all do this).

January 23, 2009 | Unregistered CommenterAaron

Hi Aaron,

Thanks for the tip, it's a pretty cool trick.

January 24, 2009 | Unregistered CommenterMason Chang

PostPost a New Comment

Enter your information below to add a new comment.

My response is on my own website »
Author Email (optional):
Author URL (optional):
Post:
 
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>