Friday, October 30, 2009

Adding More Cowbell to Tamarin

Many dynamic language virtual machines (VMs) written in C++ work by compiling code that passes parameters into VM C++ methods, which do most of the heavy lifting. For example, adding two values becomes two x86 pushes onto the stack followed by an x86 call into a C++ method add. When I last looked at Apple's SquirrelFish, that's all it did. The generated x86 code in Tamarin more or less does the same thing with a few optimizations. Having such a simple compiler works surprisingly well, but it only takes you so far. Sooner or later, you'll want to generate better, faster machine code.
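To make the "push, push, call" shape concrete, here is a minimal sketch in C++. Everything in it is an assumption for illustration: `Value`, `vm_add`, and `jitted_add` are hypothetical names, not Tamarin's or SquirrelFish's actual types or helpers, and a `std::vector` stands in for the machine stack.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical boxed value: a number or a string (not Tamarin's real representation).
using Value = std::variant<double, std::string>;

// Hypothetical VM helper that does the heavy lifting for `a + b`.
Value vm_add(const Value& a, const Value& b) {
    if (std::holds_alternative<double>(a) && std::holds_alternative<double>(b))
        return std::get<double>(a) + std::get<double>(b);
    // Otherwise fall back to string concatenation.
    auto str = [](const Value& v) {
        return std::holds_alternative<std::string>(v)
                   ? std::get<std::string>(v)
                   : std::to_string(std::get<double>(v));
    };
    return str(a) + str(b);
}

// Stand-in for what a simple JIT emits for `a + b`: two pushes and a call.
Value jitted_add(const Value& a, const Value& b) {
    std::vector<Value> stack;
    stack.push_back(a);          // x86: push a
    stack.push_back(b);          // x86: push b
    assert(stack.size() == 2);   // both operands are on the stack
    Value rhs = stack.back(); stack.pop_back();
    Value lhs = stack.back(); stack.pop_back();
    return vm_add(lhs, rhs);     // x86: call add
}
```

The point of the sketch is that the generated code knows nothing about what `vm_add` does internally; all of the type checks live behind the call.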

An effective way of generating faster code is method inlining. Instead of generating calls, the JIT should inline the methods that do the real work. The problem for Tamarin's JIT is that it doesn't know how to inline C++ methods: NanoJIT only compiles LIR (its low-level intermediate representation), not C++. Enter the C++ to LIR translator.
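As a rough illustration of why the callee has to be available as IR before it can be inlined, here is a sketch of an inlining pass over a toy instruction list. The `Instr` struct and the opcode strings are hypothetical and far simpler than NanoJIT's LIR; this is not how NanoJIT itself is implemented.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy IR instruction.
struct Instr {
    std::string op;   // e.g. "push", "call", "addd", "ret"
    std::string arg;  // callee name for "call", otherwise an operand or empty
};
using Code = std::vector<Instr>;

// Replace each "call f" with f's body whenever that body is available as IR.
// Calls to methods without an IR body stay as opaque calls.
Code inline_calls(const Code& caller, const std::map<std::string, Code>& bodies) {
    Code out;
    for (const Instr& in : caller) {
        auto it = (in.op == "call") ? bodies.find(in.arg) : bodies.end();
        if (it == bodies.end()) {
            out.push_back(in);  // nothing to inline, keep the instruction
            continue;
        }
        // This sketch assumes every callee body ends in a single "ret".
        assert(!it->second.empty() && it->second.back().op == "ret");
        for (const Instr& b : it->second)
            if (b.op != "ret") out.push_back(b);  // splice body, drop the return
    }
    return out;
}
```

If `add` has no entry in `bodies`, which is the situation when it only exists as compiled C++, the call survives untouched and the JIT cannot optimize across it.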

We do this by statically compiling Tamarin with LLVM. LLVM is a pretty awesome GCC replacement that is nicely designed, object oriented, and generally a pleasure to work with. LLVM's output here is like an object file, except that it contains LLVM bitcode instead of machine code. The new translator then converts that bitcode into LIR. At runtime, Tamarin compiles the LIR, which really represents a VM C++ method. Boom, inlinable C++ methods.
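The translation step can be pictured as a mapping from one instruction set to another. The sketch below is only that, a picture: the `BitcodeOp` names mirror a few real LLVM instructions (add, fadd, load, store, ret, invoke), but the `LirOp` names are made up for illustration and are not NanoJIT's actual opcodes. A real translator also has to handle types, control flow, and calls.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Input opcodes, named after a handful of LLVM instructions.
enum class BitcodeOp { Add, FAdd, Load, Store, Ret, Invoke };

// Output opcodes. Hypothetical names, not NanoJIT's real LIR opcode set.
enum class LirOp { AddInt, AddDouble, Load, Store, Ret };

// Map one bitcode-style op to its LIR-style counterpart.
LirOp translate_op(BitcodeOp op) {
    switch (op) {
        case BitcodeOp::Add:    return LirOp::AddInt;
        case BitcodeOp::FAdd:   return LirOp::AddDouble;
        case BitcodeOp::Load:   return LirOp::Load;
        case BitcodeOp::Store:  return LirOp::Store;
        case BitcodeOp::Ret:    return LirOp::Ret;
        case BitcodeOp::Invoke: break;  // no direct counterpart in this sketch
    }
    throw std::invalid_argument("unsupported bitcode op");
}

// Translate a whole (straight-line) function body, one op at a time.
std::vector<LirOp> translate(const std::vector<BitcodeOp>& fn) {
    std::vector<LirOp> out;
    out.reserve(fn.size());
    for (BitcodeOp op : fn) out.push_back(translate_op(op));
    assert(out.size() == fn.size());  // one output op per input op
    return out;
}
```

Once a VM method exists in this second form, the JIT can treat it like any other code it compiles.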

The translator isn't limited to method inlining. In essence, Tamarin compiles VM C++ methods as if they were ActionScript methods. Application ActionScript, such as a YouTube video player, is translated into LIR prior to execution and then compiled into machine code. We also translate most of Tamarin itself, written in C++, into LIR, which NanoJIT can then compile into machine code. NanoJIT now has the power to compile, inline, and optimize VM C++ methods for the performance win.

You may be thinking, hmm, this sounds awfully like a self-interpreting VM. A self-interpreting VM is a VM written in the language it executes. For example, writing a JavaScript VM in JavaScript would make it self-interpreting. Well, you're right. This approach hacks self-interpretation/JITting into a VM written in C++.

Consider the three images above to clarify the differences. The "standard" VM approach, which is what Tamarin did before, is to JIT code that calls into C++ methods. The host C++ compiler (Visual Studio/GCC) does a lot of the work, and the JIT has no idea what is going on inside those methods. In a self-interpreting VM, the JIT compiles everything, application and VM code, and knows everything about itself. This approach mimics a self-interpreting VM by translating C++ methods into LIR and JITting both application and VM code.

Over the next few weeks, I'll be detailing the implementation/design decisions in Tamarin.

Thanks to Michael Bebenita for proofreading and creating the images. Check out this great SNL skit if you are wondering what More Cowbell is.


Reader Comments (4)

Very interesting - look forward to the next post!

Would it be possible to create a "performance" scripting language, more static and less dynamic, which targeted LLVM/LIR directly, skipping ActionScript/JavaScript and ABC entirely? It could be used to create libs/modules where performance was required, which could be called from ActionScript/JavaScript, but without the pain of C/C++.

October 31, 2009 | Unregistered Commenter ian c

I think that would be possible. LLVM Bitcode/LIR is just an intermediate representation. You could write a front end which converts your language from syntax into that IR, and use whatever backend you wanted including LLVM/NanoJIT. I think sooner or later we're going to start having "scripting" languages with optional typing exactly because of performance needs. I need to start reading more on optional static typing though since it's been done before.

November 2, 2009 | Registered Commenter Mason Chang

What kind of performance are you observing with this?

November 3, 2009 | Unregistered Commenter siva s

The whole thing is still in a very research/prototype phase. The fundamental idea works but we don't have any real numbers yet. We view this as fundamentally an infrastructure update. It allows you to do a whole bunch of other stuff with the JIT that wasn't possible before like runtime feedback. Another issue is that Visual Studio's compiler is a way better optimizing compiler than NanoJIT is at the moment, so the code VS/GCC generates is going to be faster than what NanoJIT can do.

November 3, 2009 | Registered Commenter Mason Chang
