Calling into a native method

So far we've gone into a lot of detail about how a native method is created and mapped into the Forth VM. This post finally glues the two together. All method calls get compiled into the ABC opcode CallProperty, which is implemented in Forth as follows:

EXTERN: OP_callproperty callprop NEXT ;

: callprop ( obj [ns] [name] args )
prepnameandbind callpropvec ;

CASE: callpropvec
( 0 BKIND_NONE ) callprop_none
( 1 BKIND_METHOD ) callprop_method
// Others

: callprop_method bind2method 2nip callenv ;

: callenv ( xobj xargs argc env -- result )
PUSHFRAME ( obj args argc env )
methodinfopc SETPC ( obj args argc )
ckint ckstk ckargc
coerce_actual_args ( obj args argc )
// param0 is "special"
prepparam0 ( obj args argc obj teobjtype )
coerce ( obj args argc obj' )
replacereceiver0 ( obj' args argc )
ENV getforthimpl ( obj args argc wimpl )
EXECUTE ( result )

This is a long chain, but callenv is the important word. The other words, such as the bind words (prepnameandbind, bind2method), look up the method name and the object it is associated with. Both Forth words "ENV" and "getforthimpl" are implemented as C functions. If we look at getforthimpl:
wcodep FASTCALL getforthimpl(MethodEnv* env)
{
    return env->getForthWord();
}
Which does the following:
wcodep MethodEnv::getForthWord() const
{
    if (NativeMethodInfop(m_body_pos)->isForth)
        return wcodep(NativeMethodInfop(m_body_pos)->body);
    return CODE_POS(w_enterimpl_native);
}
So here, if the "native" method is implemented in Forth, we jump into the Forth implementation, which again lives in the code_pool defined in vm_min.cpp. That covers how Forth native methods are called. The weird part is that to call a C native method, we have to jump BACK into Forth, which then jumps BACK into C. So it still makes sense to call "getForthWord": we get the Forth word that jumps into the C code. Looking at w_enterimpl_native:
extern_entry(w_enterimpl_native, 8045)
So let's look at the Forth implementation of what happens here:
EXTERN: w_enterimpl_native ( obj args argc -- result )
PENDX CALLNATIVE_X ckpendingexception ; ( result )
Looking at CALLNATIVE_X:
// obj args argc pendx -- result
PRIM(void, CALLNATIVE_X)(Frame*& f, Boxp& sp)
{
    const int32_t argc = sp[-1].i;
    sp[-argc-2].q = callnative_x_glue(sp[0], f->env);
    sp -= argc+2;
}
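The stack arithmetic in CALLNATIVE_X is easier to follow with concrete slots. Here is a minimal sketch, assuming a simplified Box that only holds an integer and a made-up native (sum_native) that adds up its arguments. Neither is Tamarin's real type or glue code; only the index math mirrors the primitive.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Tamarin's Box: a single integer slot.
struct Box { int64_t q; };

// Hypothetical native function: sums its integer arguments.
static int64_t sum_native(const Box* args, int32_t argc) {
    int64_t total = 0;
    for (int32_t i = 0; i < argc; ++i) total += args[i].q;
    return total;
}

// Mirrors the index math of CALLNATIVE_X.
// On entry the stack is: ... obj arg0..argN-1 argc pendx, with sp at pendx.
// On exit the stack is:  ... result, with sp at result.
static Box* callnative_sketch(Box* sp) {
    const int32_t argc = static_cast<int32_t>(sp[-1].q);
    assert(argc >= 0);
    Box* receiver = sp - argc - 2;                  // same slot as sp[-argc-2]
    receiver->q = sum_native(receiver + 1, argc);   // result overwrites obj
    return sp - (argc + 2);                         // same as sp -= argc+2
}
```

With three arguments, the receiver sits five slots below pendx, so the result lands in the receiver's slot and everything above it is popped.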

Finally, we can look at callnative_x_glue:
NativeMethodInfop nmi = env->getNativeMI();
Which finally calls the native C function. WHEW.
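To recap the branch in getForthWord, here is a rough sketch. The struct NativeMethodInfoSketch, the tiny code_pool, and both offsets are invented for illustration; only the isForth decision follows the code shown earlier.

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t Token;          // assumption: one byte per Forth token
typedef const Token* wcodep;

// Placeholder for the compiled Forth code; the real pool is generated.
static const Token code_pool[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };

// Made-up offsets, not the real ones from vm_min.h.
enum { w_enterimpl_native_offset = 2, w_toString_offset = 5 };

// Hypothetical, trimmed-down method record.
struct NativeMethodInfoSketch {
    uintptr_t body;   // address of a Forth word, or of a C function
    bool isForth;
};

// A stand-in C native, used only for its address.
static int native_stub() { return 0; }

// Forth natives return their own word. C natives return the Forth
// trampoline word, which later calls back into C.
static wcodep get_forth_word(const NativeMethodInfoSketch& nmi) {
    if (nmi.isForth)
        return reinterpret_cast<wcodep>(nmi.body);
    return code_pool + w_enterimpl_native_offset;
}
```

Either way the caller gets back a Forth word to EXECUTE, which is why the C case has to detour through w_enterimpl_native.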

A Detailed Inspection of the Native Maps Part Deux: Forth

Now time for the Forth mapping. This portion makes a little bit more sense if we use the ABC from Builtin: Doing Object.toString(). Let's see the source code in
private static native function _toStr(o:*):*
The hint to Flex allows it to use the FORTH_METHOD macro in Builtin.h:
Let's see what FORTH_METHOD does:
#define FORTH_METHOD(m) { uintptr_t(wcodep(code_pool) + m##_offset), true, 0, 0 },
Again, we get the address of the Forth method, but it's not the address of machine code. code_pool is the compiled Forth code, stored as a C array. So what is wcodep?
typedef byte Token;
typedef const Token* wcodep;
typedef wcodep* wcodepp;
So it is really just a byte pointer. Where is w_toString_offset declared? Going back to vm_min.h:
extern_entry(w_toString, 8592)
The offset was generated by the Forth compiler. The extern_entry macro is defined here:
#define extern_entry(x,y) enum { x##_offset = y };
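Putting the two macros side by side shows how an offset becomes an address. This is a small sketch under a few assumptions: the 16-byte code_pool, the word w_demo with its offset of 9, and the MethodEntry struct are all made up, and uint8_t stands in for byte. The two macros are copied from the snippets above.

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t Token;            // assumption: `byte` is an 8-bit unsigned type
typedef const Token* wcodep;

// Same macro as in vm_min.h: each entry becomes an enum constant.
#define extern_entry(x,y) enum { x##_offset = y };

// Placeholder pool; the real one is emitted by the Forth compiler.
static const Token code_pool[16] = { 0 };

extern_entry(w_demo, 9)           // hypothetical word at a made-up offset

// Hypothetical record matching the four fields FORTH_METHOD fills in.
struct MethodEntry { uintptr_t body; bool isForth; int unused0; int unused1; };

// Same macro as in Builtin.h.
#define FORTH_METHOD(m) { uintptr_t(wcodep(code_pool) + m##_offset), true, 0, 0 },

static const MethodEntry demo_map[] = {
    FORTH_METHOD(w_demo)
};
```

The stored body is simply the start of the pool plus the enum value, which is why it points at a Forth token rather than at machine code.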
Finally, let's see what the Forth implementation does:
EXTERN: w_toString ( Object$ obj argc=1 -- result )
isclass IF
traitsenvname // never returns null
sbox! nip ;
Finally, we have a solid view of the glue between C, Forth, and ActionScript. The last kicker is how this native map information is used from ActionScript to call into the map.

Forth Language Interpreter in C

This one is an interesting implementation detail. I'm going to go over how the Forth language interpreter works, specifically how a given Forth word becomes a C++ function. Forth word implementations are denoted by the label "prim" everywhere. Looking in vm_min.h we have code such as:

prim_proto(int32_t, AND, (const int32_t, const int32_t))
which declares the Forth word "AND" and creates the C function prototype. The prim_proto macro itself is defined in interpreter.cpp:
#define prim_proto(ret,nm,args) PRIM(ret,nm) args ;
And here is the macro definition of PRIM:
#define PRIM(ret,x) AVMPLUS_FORCE_INLINE static ret prim_##x
AVMPLUS_FORCE_INLINE forces every primitive to be inlined at its call site. When fully expanded, the C function prototype generated by the C preprocessor is:
INLINE static int32_t prim_AND(const int32_t, const int32_t);
The actual implementation is then supplied later in the file:
PRIM(int32_t, AND)(const int32_t a, const int32_t b)
{
    return a & b;
}
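The macro chain can be tried in isolation. In this sketch, plain inline stands in for AVMPLUS_FORCE_INLINE (the macro name SKETCH_FORCE_INLINE is my own, not Tamarin's); the PRIM and prim_proto definitions are otherwise the ones quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: plain `inline` replaces the real AVMPLUS_FORCE_INLINE.
#define SKETCH_FORCE_INLINE inline

#define PRIM(ret,x) SKETCH_FORCE_INLINE static ret prim_##x
#define prim_proto(ret,nm,args) PRIM(ret,nm) args ;

// Expands to: inline static int32_t prim_AND (const int32_t, const int32_t) ;
prim_proto(int32_t, AND, (const int32_t, const int32_t))

// Expands to the definition of prim_AND.
PRIM(int32_t, AND)(const int32_t a, const int32_t b)
{
    return a & b;
}
```

Calling prim_AND(12, 10) returns 8, since 1100 & 1010 is 1000 in binary.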
Again, these are only for Forth words. Now we have the function definitions for the Forth words used in "vm.fs". However, to actually implement the Forth semantics, a few more things need to happen, like maintaining the Forth stack. vm_min_interp.h contains a long series of code generated by the Python Forth compiler. Continuing our example with the Forth word "AND":

INTERP_CALLPRIM(int32_t, i_1_0, AND, (sp[-1].i, sp[0].i))
INTERP_COPY(sp[-1].i, i_1_0)

What look like function calls here are actually macros, which get expanded in Interpreter.cpp as shown here:

#define INTERP_PRIMCASE(x) case x: { TIMING_START(t_interp)

#define INTERP_CALLPRIM(rtype,ret,x,args) INTERP_CALLFUNCBYNAME(rtype, ret, prim_##x, args)
#define INTERP_CALLFUNCBYNAME(_rettype,_result,_funcname,_args) const _rettype _result = _funcname _args;
#define INTERP_COPY(dst,src) dst = src;

#define INTERP_ADJUSTSP(d) sp += (d);
#define INTERP_NEXT(x) } goto top_of_loop;

There are two more macros used within this:
#if defined(_DEBUG) || defined(AVMPLUS_VERBOSE)
#define INTERP_PRE inner_pre_interp(f, ip, sp, rp);
#else
#define INTERP_PRE
#endif

So unless one of the debug flags is set, INTERP_PRE does nothing. Here's the macro for INVALIDATE_BOX_TYPE, which likewise has a real variant and an empty one:

#define INVALIDATE_BOX_TYPE(x) do { (x).parts.type = BoxedInvalid; } while (0)
#define INVALIDATE_BOX_TYPE(x) do { } while (0)
So after all the macro expansion, the equivalent hand-typed code would look like:
case AND: {
const int32_t i_1_0 = prim_AND(sp[-1].i, sp[0].i);
sp[-1].i = i_1_0;
do { (sp[-1]).parts.type = BoxedInvalid; } while (0);
sp += -1;
} goto top_of_loop;

Finally, this is all wrapped in a switch statement:
#define NEXTIP        (*ip++)

switch (NEXTIP) {
#include "vm_min_interp.h"
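To see the whole loop run, here is a toy version of the dispatch. It reuses the INTERP_* macros quoted above (minus TIMING_START, INTERP_PRE, and the box-type invalidation), but the token set, the meaning I give LIT8 (push the next byte), the HALT token, and the Box struct are assumptions of mine, not the generated Tamarin code.

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t Token;
struct Box { int32_t i; };                 // hypothetical, integer-only box

enum : uint8_t { LIT8, AND, HALT };        // made-up token numbering

static inline int32_t prim_AND(const int32_t a, const int32_t b) { return a & b; }

#define INTERP_PRIMCASE(x) case x: {
#define INTERP_CALLFUNCBYNAME(_rettype,_result,_funcname,_args) const _rettype _result = _funcname _args;
#define INTERP_CALLPRIM(rtype,ret,x,args) INTERP_CALLFUNCBYNAME(rtype, ret, prim_##x, args)
#define INTERP_COPY(dst,src) dst = src;
#define INTERP_ADJUSTSP(d) sp += (d);
#define INTERP_NEXT(x) } goto top_of_loop;
#define NEXTIP (*ip++)

// Runs tokens until HALT and returns the top of the stack.
static int32_t run(const Token* ip) {
    Box stack[16] = {};
    Box* sp = stack;                       // sp points at the top slot; slot 0 is a sentinel
top_of_loop:
    switch (NEXTIP) {
    INTERP_PRIMCASE(LIT8)
        sp[1].i = NEXTIP;                  // assumed meaning: push the following byte
        INTERP_ADJUSTSP(1)
    INTERP_NEXT(LIT8)
    INTERP_PRIMCASE(AND)
        INTERP_CALLPRIM(int32_t, i_1_0, AND, (sp[-1].i, sp[0].i))
        INTERP_COPY(sp[-1].i, i_1_0)
        INTERP_ADJUSTSP(-1)
    INTERP_NEXT(AND)
    default:                               // HALT or an unknown token
        return sp[0].i;
    }
}
```

Running the token stream { LIT8, 12, LIT8, 10, AND, HALT } pushes two literals, ANDs them, and returns 8.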

Tamarin Interpreter Architecture Overview

I finally have a general interpreter architecture view of Tamarin in my head. I'll get into the tracing aspect of it later this week. Tamarin is pretty convoluted, so you may want to make yourself some coffee prior to jumping into the rabbit hole.

By my count so far, five languages are involved. In the order I'll explain them: Java, ActionScript, Forth, C, and Python. They fall into two groups: support code, which covers Java and ActionScript, and the actual interpreter/tracing JIT, which is written in Python, Forth, and C.

The first half, which I call support code, helps Tamarin get up and running. Java is the easiest to understand. Java is used by Adobe Flex to compile JavaScript/ActionScript source files into .abc files.

At compile time, Flex is used to compile the native ActionScript files into .abc files. These native ActionScript files contain some of the support code of the ActionScript spec, such as Math.min()/Math.max(). They also define all the native function calls for some objects. For example, Array::Splice is defined in Some of these functions also have hooks into C, such as Math.acos, which is defined with:

    public static native function acos(x:Number):Number;

Such calls are expanded into C function calls during compilation time. They are all glued together via, which compile all the ActionScript into .abc files.

Finally, the really convoluted part is how Python, Forth, and C interact with each other. Here is where the meat and potatoes are. Tamarin does double interpretation, meaning any .abc bytecode being executed passes through two interpreters: one written in Forth and one written in C. The ABC bytecode implementations are in Forth. Therefore, the whole JavaScript implementation is in Forth. However, the Forth interpreter is written in C, and the primitive Forth words are implemented as C functions. So now you have two different languages being interpreted: ABC via Forth, and Forth via C. In the end, you have a C interpreter which interprets Forth, which interprets the ABC, so every ABC bytecode goes through two layers of interpretation before any real work happens in C.
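The layering is easier to see in miniature. The sketch below is my own toy model, not Tamarin's structure: an outer loop walks ABC opcodes, each opcode maps to a Forth word made of several tokens, and an inner loop runs those tokens as C primitives. The token names echo the bitand definition in vm.fs (2toint & ibox NEXT), but here 2toint and ibox are treated as no-ops, and only the opcode value 0xa8 for bitand comes from the AVM2 spec.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<int32_t> Stack;

// Layer 3: a C primitive that does the real work.
static void prim_and(Stack& s) {
    assert(s.size() >= 2);
    const int32_t b = s.back();
    s.pop_back();
    s.back() &= b;
}

// Layer 2: Forth words as token sequences (made-up token numbering).
enum ForthToken : uint8_t { T_2TOINT, T_AND, T_IBOX, T_NEXT };
static const uint8_t word_bitand[] = { T_2TOINT, T_AND, T_IBOX, T_NEXT };

// Inner interpreter: C code stepping through Forth tokens.
static void run_forth(const uint8_t* ip, Stack& s) {
    for (;; ++ip) {
        switch (*ip) {
        case T_2TOINT: break;              // no-op here: values are already ints
        case T_AND:    prim_and(s); break;
        case T_IBOX:   break;              // no-op here: no boxing in this sketch
        case T_NEXT:   return;             // hand control back to the outer loop
        default:       return;
        }
    }
}

// Layer 1: outer interpreter, one Forth word per ABC opcode.
enum AbcOp : uint8_t { ABC_bitand = 0xa8 };

static void run_abc(const uint8_t* code, size_t len, Stack& s) {
    for (size_t i = 0; i < len; ++i) {
        switch (code[i]) {
        case ABC_bitand: run_forth(word_bitand, s); break;
        default:         return;           // unknown opcode: stop
        }
    }
}
```

One ABC byte triggers four Forth tokens, and only one of those ends up calling a C function that touches the data.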

Here is a quick flowchart showing the execution of a single ABC bytecode:

* Side note: Forth words are like functions in other languages

Python acts as the glue between Forth and C. During compilation, Python reads the JavaScript Interpreter Forth source file, and outputs two C header files: One which contains the Forth language interpreter, and one which represents the JavaScript interpreter implemented in Forth. A quick diagram is as follows:

The labels of inner/outer interpreter are taken from the Tamarin source. Hopefully this isn't too confusing. I'll be getting into implementation specific details over the week. Some of the hooks are pretty nifty.

Mapping Tamarin opcodes between C and Forth

After a good while reading, I've figured out a little piece of how Forth hooks into C. All the files discussed here are in the "core" directory of the tamarin-tracing folder. To make this easier to understand, we're going to use a specific example: the opcode "bitand". As defined in the AVM2 PDF, the opcode is (bitand = 168 (0xa8)). Looking in vm.fs, we see that there is an EXTERN: keyword next to it:

EXTERN: OP_bitand            2toint & ibox NEXT ;

We also see that the actual "bitwise anding" is done in Forth. If we look in vm_min.cpp we see this:
// OP_bitand offset=7175
SNEST, /* 2toint -29 */ 227, AND, LIT8, uint8_t(BoxedInt), SETRAWBOXTYPE, NEXT,

Looking for offset 7175 in the table below:
extern const short opcode_offsets[] = {
0, 0, 5, 6, 7, 1593, 2039, 2042,
2044, 2052, 0, 0, 2056, 2205, 2211, 2217,
2223, 2227, 2232, 2237, 2521, 2526, 2532, 2538,
2544, 2550, 2747, 2752, 2757, 2773, 2775, 3063,
3178, 3184, 0, 3190, 3285, 3286, 3287, 3293,
3299, 3303, 3304, 3305, 3306, 3307, 3308, 3309,
3317, 3322, 3330, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0,
3491, 3558, 4077, 4315, 4316, 4338, 4433, 4915,
4920, 4933, 4950, 0, 5097, 0, 5167, 5172,
0, 0, 0, 0, 0, 5177, 5182, 5187,
5207, 5272, 5280, 0, 0, 5294, 5903, 5913,
5928, 6028, 6072, 6073, 6074, 6078, 6085, 0,
6093, 0, 6141, 0, 6346, 6357, 6385, 6396,
6407, 6411, 6418, 6425, 6432, 6439, 6446, 6453,
6457, 0, 0, 0, 0, 0, 0, 0,
6458, 6462, 6469, 6470, 6477, 6484, 6488, 6668,
6699, 6706, 0, 0, 0, 0, 0, 0,
6713, 6721, 6731, 6746, 6756, 6771, 6853, 6861,
0, 0, 0, 0, 0, 0, 0, 0,
6869, 7104, 7112, 7120, 7128, 7137, 7155, 7165,
7175, 7182, 7189, 7196, 7203, 7210, 7218, 7226,

The bottom left entry, "7175", is at index 168 (0xa8) in the table, which is the bitand opcode. Also, looking at vm_min.h, we see extern_entry(OP_bitand, 7175). Since everything in Forth is based on the address of a word, I think these offsets represent how far each word is from offset 0 of the compiled code. It seems that the actual binary is a collection of jumps between Forth words, laid out in the order the words appear in vm_min.cpp. The actual implementations of these words are in vm.fs. As to why there are seemingly random numbers in vm_min.cpp, I am still lost. But at least now I have a footing.
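The table lookup is plain arithmetic, sketched below with only the last row shown in the listing (indices 168 to 175). The helper names are mine. The second helper checks one small observation: the generated comment reads "2toint -29" next to the byte 227, and 227 read as a signed byte is exactly -29. That hints that at least some of the stray numbers could be relative offsets to other words, but this is a guess on my part, not something I have confirmed in the compiler.

```cpp
#include <cassert>
#include <cstdint>

// Last row of the opcode_offsets listing: entries for opcodes 168..175.
static const short row_168[] = { 7175, 7182, 7189, 7196, 7203, 7210, 7218, 7226 };

// With 8 entries per printed row, an opcode sits at (opcode / 8, opcode % 8).
static int table_row(int opcode) { return opcode / 8; }
static int table_col(int opcode) { return opcode % 8; }

// Offset lookup restricted to the one row reproduced here.
static int offset_for_opcode(int opcode) {
    assert(opcode >= 168 && opcode <= 175);
    return row_168[opcode - 168];
}

// Reads an unsigned byte as a two's-complement signed 8-bit value.
static int as_signed_byte(uint8_t b) {
    return b < 128 ? int(b) : int(b) - 256;
}
```

Opcode 0xa8 lands on row 21, column 0 (counting from zero), which is the first entry of the last row shown, and its offset matches the extern_entry for OP_bitand.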