Thursday, July 2, 2009

Blissful Lulu Carpenter's Coffee

Ahh Santa Cruz, a piece of the beach in northern California. It's such a gorgeous city, with the most amazing lazy atmosphere. I wonder how anyone gets anything done in the city. Especially when you can stroll to a bistro, have a delicious luncheon outside, and finish with the home made artisan LuLu coffee.

LuLu's has an almost perfect location: on a side street in downtown Santa Cruz. The cafe has a sunroof, providing plenty of natural light for the earthy toned interior. Each chair has an electrical outlet underneath, ideal for the work-in-the-cafe types. Above all, though, LuLu's is serious about coffee.

As I pondered what to order, I looked at the list of coffees they roasted on site. One was named "Black Cat", a famous blend by Intelligentsia for the coffee geeks out there. I asked the barista about it. She said it was actually their own roasted blend, and that a lot of people asked her about it. That's when you know hardcore coffee people visit. And I wasn't disappointed.

LuLu's is the first commercial cafe I've been to where each drink is made individually. Every shot is pulled, and the milk steamed, specifically for one drink. If you go to Starbucks, you can see the employees holding a big jug of milk, steaming a gallon at a time. Not at LuLu's. They are like the In-N-Out of cafes: it takes a while to get your drink, but it's delicious. The coffee is excellent, a bit on the weak and bitter side, but still satisfying. The barista poured each drink carefully, drawing latte art with magnificent skill. LuLu's lives and breathes coffee, and it clearly shows. Next time you are in Santa Cruz and need a caffeine fix, stop by LuLu's and enjoy.

Sunday, June 7, 2009

TraceMonkey PLDI Paper

TraceMonkey, the tracing JavaScript virtual machine in Firefox 3.5, has a Programming Language Design and Implementation (PLDI) paper entitled "Trace-Based Just-in-Time Type Specialization for Dynamic Languages". Since the conference is finally coming up, we can release the PDF. You can get it at David Mandelin's blog.

Congrats :).

Wednesday, May 27, 2009

Storing Type Information

Tamarin stores the type information for a variable in the Traits object. The Traits information is stored within the actual ABC file. You can find the definition in the AVM2 manual. However, in the actual implementation, the type information is stored in two different fields in a Traits instance. The traits.builtinType field is used by the verifier, although not very often, so it isn't really interesting. Another structure, the SlotInfo, is used during the LIR and assembly stages.

Traits.h
struct SlotInfo
{
    // lower 3 bits is type information
    // upper 29 bits is the offset / 4.
    // eg for (24)ebp, upper 29 bits should be 6
    uint32_t offsetAndSST;

    inline SlotStorageType sst() const {
        return SlotStorageType(offsetAndSST & 7);
    }

    inline uint32_t offset() const {
        return (offsetAndSST >> 3) << 2;
    }
};

enum SlotStorageType
{
    SST_atom,
    SST_string,
    SST_namespace,
    SST_scriptobject,

    SST_int32,
    SST_uint32,
    SST_bool32,
    SST_double
};

SlotInfo.offsetAndSST contains all the type information for a local variable at the LIR/x86 stages. The lower 3 bits contain the type information while the upper 29 bits give the variable's offset on the machine stack, divided by 4. This information is valuable because the type tells the assembler how much space to allocate on the stack for the variable. For example, an integer only needs 4 bytes while a double needs 8, since doubles are 64-bit floating point values. The actual byte offset is returned by the offset() method.
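To make the bit layout concrete, here is a minimal standalone sketch of packing and unpacking the field. SlotInfo and SlotStorageType mirror the excerpt above. makeSlotInfo() and slotSize() are hypothetical helpers written for this example, not Tamarin functions, and the sizes assume a 32-bit build where everything except a double fits in 4 bytes.

```cpp
#include <cassert>
#include <cstdint>

enum SlotStorageType {
    SST_atom, SST_string, SST_namespace, SST_scriptobject,
    SST_int32, SST_uint32, SST_bool32, SST_double
};

struct SlotInfo {
    // lower 3 bits: storage type, upper 29 bits: byte offset / 4
    uint32_t offsetAndSST;

    SlotStorageType sst() const { return SlotStorageType(offsetAndSST & 7); }
    uint32_t offset() const { return (offsetAndSST >> 3) << 2; }
};

// Hypothetical helper: pack a 4-byte-aligned byte offset and a storage type.
inline SlotInfo makeSlotInfo(uint32_t byteOffset, SlotStorageType sst) {
    assert(byteOffset % 4 == 0);  // the encoding drops the low 2 bits
    SlotInfo info;
    info.offsetAndSST = ((byteOffset / 4) << 3) | uint32_t(sst);
    return info;
}

// Hypothetical helper: bytes needed for a slot, assuming a 32-bit build.
inline uint32_t slotSize(SlotStorageType sst) {
    return sst == SST_double ? 8 : 4;
}
```

With this sketch, makeSlotInfo(24, SST_int32) stores 6 in the upper 29 bits, matching the (24)ebp comment in the header, and offset() gives back 24.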

Tuesday, May 12, 2009

Quick Compiler Links

While scanning reddit, I found two compiler links. The first is a free book on the basics of compiler design. It goes over parsing, type checking, code generation, register allocation, and even bootstrapping! It has a lot of ugly Greek symbols, but seems like a good reference.


The other is a talk by Paul Biggar, one of the main developers for phc, the open source PHP compiler. It's on compiling and optimizing scripting languages.




Enjoy!

Friday, May 8, 2009

Smooth Coffee in Downtown Mountain View

Red Rock Coffee literally looks like a rock, a brick building with slabs of red. This gem in downtown Mountain View delivers terrific coffee.


I was actually quite surprised with Red Rock. Most people on CoffeeGeek agreed that Red Rock was ehh. Dana Street Roasting Company, a coffee shop just a few blocks away, was supposed to be the best Mountain View had to offer. But Red Rock has better coffee. Period.

I ordered my regular single shot cappuccino and was greeted by deliciously thick foam. Not too airy, not too creamy. Perfect. Red Rock's espresso complements the warm, airy milk, creating a soothing drink with chocolaty notes, a subtle sourness, that finishes with a pleasant tinge of bitterness. My only gripe is that the espresso may be a bit too subdued, which is ideal for those who like sweeter coffees. But next time, I need a double shot :).

Tuesday, May 5, 2009

Creating LIR for Fun and Profit

Low-level Intermediate Representation (LIR) is the IR that's fiddled with and optimized in Tamarin. Tamarin takes in ABC, converts it to LIR, does all the compilation optimizations on LIR, then feeds the LIR into an assembler, which finally generates machine code. All the cool stuff happens in LIR. If you want to make Tamarin faster, you need to work with the LIR, so it's important to learn how to use and generate it.

Let's start with the basics: x = a + b. We want to build some LIR that adds two numbers.

// load var a. lhs = left hand side
LIns* lhs = loadAtomRep(sp-1);

Remember that ABC is bytecode for a stack-based machine. This line creates a new LIR instruction that represents the Atom one item below the top of the stack. LIR is in Static Single Assignment form and only works on Atoms. Atoms are the internal representation of data items in Tamarin. The bottom 3 bits are used for type tags and the top 29 are used for data:

<---- Data Bits here ----------> <Type>
aaaaaaaa bbbbbbbb cccccccc ddddd zzz

E.g. the number 10 as an integer: // type tag 6 (binary 110) represents an integer
00000000 00000000 00000000 01010 110
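The tagging scheme can be sketched in a few lines of standalone C++. This is an illustration under assumptions, not Tamarin's code: Atom is taken to be a 32-bit word as in the diagram, the integer tag is 6 as stated above, and kTagMask, kIntegerTag, makeIntAtom(), isIntAtom() and intValue() are names made up for this example.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t Atom;          // assumption: 32-bit atoms, as in the diagram

const Atom kTagMask    = 7;     // hypothetical name: low 3 bits hold the tag
const Atom kIntegerTag = 6;     // hypothetical name: tag value 6 marks an integer

// Pack a small integer into the top 29 bits and tag it as an integer.
inline Atom makeIntAtom(int32_t value) {
    assert(value >= -(1 << 28) && value < (1 << 28));  // must fit in 29 bits
    return (Atom(value) << 3) | kIntegerTag;
}

// Read only the tag bits and compare them with the integer tag.
inline bool isIntAtom(Atom a) {
    return (a & kTagMask) == kIntegerTag;
}

// Drop the tag to recover the payload.
inline int32_t intValue(Atom a) {
    assert(isIntAtom(a));
    // Right-shifting a negative value is an arithmetic shift on mainstream
    // compilers (and guaranteed from C++20 on); this sketch relies on that.
    return int32_t(a) >> 3;
}
```

For the number 10 this gives (10 << 3) | 6 = 86, which is the bit pattern 01010 followed by the tag 110.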

Now the next LIR:

// Same thing but top of stack
LIns* rhs = loadAtomRep(sp);

// TopLevel contains a bunch of helper functions that are too complex for x86
LIns* toplevel = loadToplevel();

Nothing too big here, same thing as loading the lhs. The TopLevel is in TopLevel.cpp and contains a bunch of helper functions that are called in the compiled x86. They are mostly functions that are too annoying or complex to build in actual x86.

// Call the function TopLevel::add2 with parameters lhs and rhs
LIns* out = callIns(FUNCTIONID(add2), 3, toplevel, lhs, rhs);

This calls the method TopLevel::add2(lhs, rhs). add2() does all the type checks of lhs and rhs and finally adds the two numbers. What's nice about this is that you don't even have to generate any LIR to sort out parameters. The assembler will figure everything out, move variables and parameters into the right place and generate the correct x86. The LIns* out instruction points to the result of add2. Finally we want to write the result back to the stack:

// store the result in memory
localSet(sp-1, atomToNativeRep(result, out), result);

The add2() method returns an Atom. atomToNativeRep() converts Atoms to their known types if they have been assigned one. The localSet() function creates the necessary instructions to store the resulting atom to memory. The final sequence of instructions actually used by Tamarin to add two numbers is:

CodegenLIR.cpp
LIns* lhs = loadAtomRep(sp-1);
LIns* rhs = loadAtomRep(sp);
LIns* toplevel = loadToplevel();
LIns* out = callIns(FUNCTIONID(add2), 3, toplevel, lhs, rhs);
localSet(sp-1, atomToNativeRep(result, out), result);

Building Branches

It would be a shame if we couldn't branch, but generating LIR to branch is pretty easy! Let's go back to the add example and say we want to build a piece of x86 code that does the following:

if (lhs == int && rhs == int) {
    int add
    jump to LABEL
}
else {
    call add2()
}

LABEL

Let's focus on building only the check to make sure lhs == int and creating the jump to LABEL. We can check that lhs is really an integer by reading the type tag in the bottom 3 bits. If ((lhs & 7) == 6) is true, then the atom is an integer. Note the parentheses (== binds tighter than & in C++) and the mask of 7: masking with 6 alone would also accept an atom whose tag is 7.

// Check that the lhs is an integer.
// InsConst(7) creates the constant 7 (binary 111), which masks the 3 tag bits.
// atom & 7 gives us the type tag. kIntegerType is the constant 6.
LIns *lhs_type = binaryIns(LIR_and, lhs, InsConst(7));

// check to see if it is equal to an integer
LIns *lhs_is_int = binaryIns(LIR_eq, lhs_type, InsConst(kIntegerType));

// if not, jump to the normal add2. Notice no label reference.
LIns *lhs_not_integer = branchIns(LIR_jf, lhs_is_int);

// Make sure rhs is an integer
// do an integer add
// jump somewhere else

// tell the lhs_not_integer branch instruction to jump to this location
// Create a LIR_label instruction and point previous branch instructions
// to use this label
LIns *genericAddLabel = Ins(LIR_label);

// Tell lhs_not_integer to point to this new address
lhs_not_integer->target(genericAddLabel);

// Generate the add2() method call

Tada, like magic! The backend will find all the labels and generate the appropriate jumps. You can just generate LIR, branch to wherever you want, and Tamarin figures out the rest. It's actually quite nice that the backend does all the register allocation and figures everything out for method calls and parameters.
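If the "branch now, point it at a label later" trick feels mysterious, a toy model helps. The sketch below is not Tamarin's LIR or its backend: ToyOp, ToyIns, ToyBuffer and resolveTargets() are invented for this example. It only shows the general idea that a branch is emitted with an empty target, gets pointed at a label instruction created later, and is resolved to a concrete position in a final pass.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy instruction kinds; these are not Tamarin's real LIR opcodes.
enum ToyOp { TOY_BRANCH_FALSE, TOY_LABEL, TOY_OTHER };

struct ToyIns {
    ToyOp op;
    ToyIns* target;  // only meaningful for branches; null until patched
};

struct ToyBuffer {
    // A deque keeps element addresses stable across push_back, so the
    // pointers handed out by emit() stay valid as the buffer grows.
    std::deque<ToyIns> code;

    ToyIns* emit(ToyOp op) {
        ToyIns ins = { op, nullptr };
        code.push_back(ins);
        return &code.back();
    }

    // Toy "backend" pass: for every branch, find the position of its label.
    // Returns one entry per instruction, -1 for anything that is not a branch.
    std::vector<int> resolveTargets() const {
        std::vector<int> out(code.size(), -1);
        for (std::size_t i = 0; i < code.size(); ++i) {
            if (code[i].op != TOY_BRANCH_FALSE) continue;
            assert(code[i].target != nullptr);  // every branch must be patched
            for (std::size_t j = 0; j < code.size(); ++j) {
                if (&code[j] == code[i].target) out[i] = int(j);
            }
        }
        return out;
    }
};
```

Usage follows the same shape as the LIR above: emit the branch first, emit the fast path, emit a TOY_LABEL, then set the branch's target to the label before calling resolveTargets().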

Thursday, April 16, 2009

Home Made Latte Art Perfection



  © Header Image by Scott Beale / Laughing Squid


  © Customized by Evan Chen
