Table of Contents for Programming Languages: a survey

Other intermediate target languages designed for implementing a single high-level language

CPython bytecode

Two stacks:

Stack ops:



quoted and paraphrased from

General instructions:

Unary operations:

Unary operations take the top of the stack, apply the operation, and push the result back on the stack.

Binary operations:

Binary operations remove the top of the stack (TOS) and the second top-most stack item (TOS1) from the stack. They perform the operation, and put the result back on the stack.

In-place operations:

In-place operations are like binary operations, in that they remove TOS and TOS1, and push the result back on the stack, but the operation is done in-place when TOS1 supports it, and the resulting TOS may be (but does not have to be) the original TOS1.


Miscellaneous opcodes:


Miscellaneous opcodes with arguments:

All of the following opcodes expect arguments. An argument is two bytes, with the more significant byte last.


    Prefixes any opcode which has an argument too big to fit into the default two bytes. ext holds two additional bytes which, taken together with the subsequent opcode’s argument, comprise a four-byte argument, ext being the two most-significant bytes.


    Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the variable argument list, followed by keyword and positional arguments.


    Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the keyword arguments dictionary, followed by explicit keyword and positional arguments.


    Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the keyword arguments dictionary, followed by the variable-arguments tuple, followed by explicit keyword and positional arguments.

Note: as of Python 3.6, "The Python interpreter now uses a 16-bit wordcode instead of bytecode which made a number of opcode optimizations possible. (Contributed by Demur Rumed with input and reviews from Serhiy Storchaka and Victor Stinner in bpo-26647 and bpo-28050.)" [1].

(rationale: "The change is to have all instructions take an argument. This removes the branch on each instruction on whether to load oparg. It then also aligns instructions to always be 2 bytes rather than 1 or 3 by having arguments only take up 1 byte. In the case that an argument to an instruction is greater than 255, it can chain EXTENDED_ARG up to 3 times. In practice this rarely occurs, mostly only for jumps, & abarnert measured stdlib to be ~5% smaller

The rationale is that this offers 3 benefits: Smaller code size, simpler instruction iteration/indexing (One may now scan backwards, as peephole.c does in this patch), which between the two results in a small perf gain (the only way for perf to be negatively impacted is by an increase in EXTENDED_ARGs, when I post benchmarking I'll also post a count of how many more EXTENDED_ARGs are emitted)

This also means that if I want to create something like a tracer that tracks some information for each instruction, I can allocate an array of codesize/2 bytes, then index off of half the instruction index. This isn't currently done in peephole.c, nor does this include halving jump opargs

" ) -- [2]


Dalvik VM


(todo is this wrong?) interestingly, Dalvik only has 16 registers.


" The Dalvik instruction set implements an interesting compromise: it is register based, but there are a finite number of them as opposed to the theoretically infinite registers of LLVM or Parrot. Dalvik supports 65,536 registers, a vast number compared to hardware CPUs and presumably sufficient to implement SSA (if desired) in reasonably large functions.

Even more interestingly, not all Dalvik instructions can access all registers. Many Dalvik instructions dedicate 4 bits to the register number, requiring their operands to be stored in the first 16 registers. A few more instructions have an 8 bit instruction number, to access the first 256. There are also instructions to copy the value to or from any of the 65,536 registers to a low register, for a subsequent instruction to access.

It took a while to understand the rationale for this choice, and I'm still not confident I fully get it. Clearly the Dalvik designers believe that keeping data in one of the high registers will be faster than explicitly storing it to memory, even if the vast number of registers end up mostly residing in RAM. Addressing data as register numbers instead of memory addresses should make it easier for the VM to dynamically remap Dalvik registers to the real hardware registers. For example, if it can predict that virtual register 257 will likely be used in the near future it can be kept in a CPU register instead of being immediately stored to memory. " --

Non-deprecated, non-implementation-specific instructions (many instructions have _2ADDR forms, and many also have _LIT8 and _LIT16 forms, and many have _WIDE forms, and many have _FLOAT, _INT, and _LONG forms, and many have _BOOLEAN, _BYTE, _CHAR, _OBJECT, _SHORT, and _WIDE forms, and many have _16 and _32 forms, and many of these are mixed, e.g. ADD_INT_2ADDR. I'll only list the base instruction in these cases)





opcodes (about 268 of them):


MoarVM? is a VM built for Perl6's Rakudo implementation (the most canonical Perl6 implementation as of this writing).

MoarVM introduction and features

"The heart of the VM": "

"What does a VM typically do?

" Some key features provided by MoarVM? include:


12 different node types:

-- [9]

MoarVM 6model

6model according to and :

"An object is a blob of memory. The only constraint is that the first thing in the blob must be a pointer/reference to a Shared Table data structure." --

Object is composed of:

STable is composed of:

on meta-objects: 6model "provides one meta-object that implements objects with attributes (state) and methods (behavior) - and that's about it. It doesn't enforce one definition of method dispatch, inheritance, interfaces, introspection and so forth. These are all built up by implementing meta-objects that specify their semantics." -- [12]

on representations: " Rather than committing to one view of how to lay out an object in memory, 6model supports "representations". Representations define how attributes are actually stored and accessed, how primitive types (integers, floating point values and strings) are boxed and unboxed - or perhaps both. Additionally, representations are orthogonal to meta-objects, meaning it is possible to define one type (e.g. a class) and use it with different storage strategies. This is known as representation polymorphism...An object may in the abstract be a blob of memory that starts with a pointer to a Shared Table, but of course something has to know what the rest of it means. That's what representations do. A representation is responsible for object allocation, attribute storage and access (both in terms of memory layout and operation), boxing from and unboxing to native types and (depending on the VM) GC interaction. Representations may be like singletons, or they may act more like instances. How a representation is implemented is VM-specific; in fact, pretty much everything the representation has to do is also VM-specific." -- [13]

on meta-objects and caches:

"In an ideal world, every single method dispatch we perform would be conducted by delegating to the meta-object's method that implements method dispatch semantics. In the real world, that's not practical from a performance point of view. Thus 6model provides various mechanisms that a meta-object can "opt in" to in order to allow for muchly increased performance. However, it considers all of these to really just be a kind of "cache". A v-table is just an array of invokable objects published by the meta-object, which it is responsible for maintaining. Similar things apply to type-checking." -- [14]

" REPR API has a common part (allocation, GC marking) along with several sub - protocols for different ways of using memory: Representations are orthogonal to type (and thus dis - interested in method dispatch, type check, etc.) and also non - virtual (if you know the REPR, can inline stuff) "

there are various calls to functions prefixed with 'MVM_6model' in

these seem to be declared in

the list seems to be (for readabilty, i removed the 'MVM' prefixes from everything in this list, and also the '6model' prefixes): describes various primitive types in MoarVM?: strings, arrays, hashes, (external/foreign) functions. All of these are needed for fields within the 'primitive' meta-object type, KnowHOW?. KnowHOW? meta-objects are then constructed for each of the other primitive types. The meta-object for the KnowHOW? class itself is itself a KnowHOW?.

"Every object in MoarVM? is a 6model object - one object system for the whole VM. By object, we mean what you think of as objects (Arrays, Hashes, Boxed integers, floats, etc., Threads, handles)..." [15]

Presumably much of MoarVM?'s 6model implementation is in

See also

MoarVM links

Haskell GHC languages

Note that Haskell is implemented a bit differently from many languages; this stems from Haskell's properties:

It is therefore assumed that Haskell programs will, at runtime, very frequently be dealing with unevaluated thunks, with closures which represent unevaluated or partially evaluated functions, and with the application of functions whose arity was unknown at compile time.

GHC Core

Possible extensions:

Core in one slide: " data Expr b -- "b" for the type of binders, = Var Id

Lit Literal
App (Expr b) (Arg b)
Lam b (Expr b)
Let (Bind b) (Expr b)
Case (Expr b) b Type [Alt b]
Type Type
Cast (Expr b) Coercion
Coercion Coercion
Tick (Tickish Id) (Expr b)

data Bind b = NonRec? b (Expr b)

Rec [(b, (Expr b))]

type Arg b = Expr b

type Alt b = (AltCon?, [b], Expr b)

data AltCon? = DataAlt? DataCon?

" -- [16]
LitAlt? Literal DEFAULT

"Core is technically a variant of a System FC (which is itself a variant of System F)" -- [17]

" Values: Both data and functions Unboxed: Primitive (machine level) types, not "boxed up" on the heap Closure: Heap allocated data associated with a method Thunk: A suspended computation Continuations: The cousin of closures. Unlike a closure you aren't simply calling a function, you are continuing a saved execution state. Basically though, if you have closures and tail calls as Haskell does, continuations can be built.

Yes, closures, thunks and continuations are all very similar. One implementation can capture them all, however the terminology is used to capture the different use cases.


    reducible expression. A expression that can be evaluated further
        normal form:
        an expression without an redexes
        head normal form:
        an expression where the top level (head) is neither a redex NOR a lambda abstraction with a reducible body
        weak head normal form:
        an expression where the top level (head) isn't a redex
    unfolding of a function f is just the body of f.
        Unfolding = Inlining.


    evaluation strategies:
        call-by-value: arguments evaluated before function entered (copied)
        call-by-name: arguments passed unevaluated
        call-by-need: arguments passed unevaluated but an expression is only evaluated once (sharing)
    no-strict evaluation Vs. lazy evaluation:
        non-strict: Includes both call-by-name and call-by-need, general term for evaluation strategies that don't evaluate arguments before entering a function
        lazy evaluation: Specific type of non-strict evaluation. Uses call-by-need (for sharing).... scrutinee: The expression you are case'ing on in a case statement " -- [18]

"Graph Reduction: The way that lazy functional languages like Haskell are implemented is through a technique called graph reduction. Its best to use the graph reduction model as an intuitive way to think about how Haskell is evaluated, the actual way GHC implements Haskell is pretty close to how an imperative language works. " -- [19]


spineless tagless G-machine

GHC Bytecode (interpreter)

from :

" -- Messing with the stack = STKCHECK Word

   -- Push locals (existing bits of the stack)
   | PUSH_L    !Word16{-offset-}
   | PUSH_LL   !Word16 !Word16{-2 offsets-}
   | PUSH_LLL  !Word16 !Word16 !Word16{-3 offsets-}
   -- Push a ptr  (these all map to PUSH_G really)
   | PUSH_G       Name
   | PUSH_PRIMOP  PrimOp
   | PUSH_BCO     (ProtoBCO Name)
   -- Push an alt continuation
   | PUSH_ALTS          (ProtoBCO Name)
   | PUSH_ALTS_UNLIFTED (ProtoBCO Name) ArgRep
   -- Pushing literals
   | PUSH_UBX  (Either Literal (Ptr ())) Word16
	-- push this int/float/double/addr, on the stack. Word16
	-- is # of words to copy from literal pool.  Eitherness reflects
	-- the difficulty of dealing with MachAddr here, mostly due to
	-- the excessive (and unnecessary) restrictions imposed by the
	-- designers of the new Foreign library.  In particular it is
	-- quite impossible to convert an Addr to any other integral
	-- type, and it appears impossible to get hold of the bits of
	-- an addr, even though we need to assemble BCOs.
   -- various kinds of application
SLIDE Word16{-this many-} Word16{-down by this much-}
   -- To do with the heap
   | ALLOC_AP  !Word16 -- make an AP with this many payload words
   | ALLOC_AP_NOUPD !Word16 -- make an AP_NOUPD with this many payload words
   | ALLOC_PAP !Word16 !Word16 -- make a PAP with this arity / payload words
   | MKAP      !Word16{-ptr to AP is this far down stack-} !Word16{-number of words-}
   | MKPAP     !Word16{-ptr to PAP is this far down stack-} !Word16{-number of words-}
   | UNPACK    !Word16 -- unpack N words from t.o.s Constr
   | PACK      DataCon !Word16
			-- after assembly, the DataCon is an index into the
			-- itbl array
   -- For doing case trees
   | LABEL     LocalLabel
   | TESTLT_I  Int    LocalLabel
   | TESTEQ_I  Int    LocalLabel
   | TESTLT_W  Word   LocalLabel
   | TESTEQ_W  Word   LocalLabel
   | TESTLT_F  Float  LocalLabel
   | TESTEQ_F  Float  LocalLabel
   | TESTLT_D  Double LocalLabel
   | TESTEQ_D  Double LocalLabel
   -- The Word16 value is a constructor number and therefore
   -- stored in the insn stream rather than as an offset into
   -- the literal pool.
   | TESTLT_P  Word16 LocalLabel
   | TESTEQ_P  Word16 LocalLabel
JMP LocalLabel?
   -- For doing calls to C (via glue code generated by libffi)
   | CCALL            Word16    -- stack frame size
                      (Ptr ())  -- addr of the glue code
                      Word16    -- whether or not the call is interruptible
                                -- (XXX: inefficient, but I don't know
                                -- what the alignment constraints are.)
   -- For doing magic ByteArray passing to foreign calls
   | SWIZZLE          Word16 -- to the ptr N words down the stack,
                      Word16 -- add M (interpreted as a signed 16-bit entity)
   -- To Infinity And Beyond
   | ENTER
   | RETURN		-- return a lifted value
   | RETURN_UBX ArgRep -- return an unlifted value, here's its rep
   -- Breakpoints 
   | BRK_FUN          (MutableByteArray# RealWorld) Word16 BreakInfo"


Haskell YHC


Robert Dockins and Samuel Z. Guyer, Bytecode Verification for Haskell table 1

YHC bytecode

the following is paraphrased/quoted from


Haskell Hugs


Quoted or paraphrased from The implementation of the Gofer functional programming system:

Instructions for constructing values:

The following instructions are used to construct frag- ments of program graph, using the stack to store tem- porary values as well as the function parameters and the results of calls to the evaluator.

Control flow:

The following instructions are used to call the eval- uator, to test the values that it returns, to indicate the end of a particular reduction, or to signal an ir- reducible expression.



The Functional Language Research Compiler / Haskell Research Compiler has an IL called 'MIL'.


Smalltalk Slang

From , the operations available in Slang are:

"&" "

"+" "-" "" "
" min: max: bitAnd: bitOr: bitXor: bitShift: "<" "<=" "=" ">" ">=" "~=" "==" isNil notNil whileTrue: whileFalse: to:do: to:by:do: ifTrue: ifFalse: ifTrue:ifFalse: ifFalse:ifTrue: at: at:put: 2 bitInvert32 preIncrement integerValueOf: integerObjectOf: isIntegerObject:
" and: or: not


The Smalltalk-80 Bytecodes Range Bits Function 0-15 0000iiii Push Receiver Variable #iiii 16-31 0001iiii Push Temporary Location #iiii 32-63 001iiiii Push Literal Constant #iiiii 64-95 010iiiii Push Literal Variable #iiiii 96-103 01100iii Pop and Store Receiver Variable #iii 104-111 01101iii Pop and Store Temporary Location #iii 112-119 01110iii Push (receiver, true, false, nil, -1, 0, 1, 2) [iii] 120-123 011110ii Return (receiver, true, false, nil) [ii] From Message 124-125 0111110i Return Stack Top From (Message, Block) [i] 126-127 0111111i unused 128 10000000 jjkkkkkk Push (Receiver Variable, Temporary Location, Literal Constant, Literal Variable) [jj] #kkkkkk 129 10000001 jjkkkkkk Store (Receiver Variable, Temporary Location, Illegal, Literal Variable) [jj] #kkkkkk 130 10000010 jjkkkkkk Pop and Store (Receiver Variable, Temporary Location, Illegal, Literal Variable) [jj] #kkkkkk 131 10000011 jjjkkkkk Send Literal Selector #kkkkk With jjj Arguments 132 10000100 jjjjjjjj kkkkkkkk Send Literal Selector #kkkkkkkk With jjjjjjjj Arguments 133 10000101 jjjkkkkk Send Literal Selector #kkkkk To Superclass With jjj Arguments 134 10000110 jjjjjjjj kkkkkkkk Send Literal Selector #kkkkkkkk To Superclass With jjjjjjjj Arguments 135 10000111 Pop Stack Top 136 10001000 Duplicate Stack Top 137 10001001 Push Active Context 138-143 unused 144-151 10010iii Jump iii + 1 (i.e., 1 through 8) 152-159 10011iii Pop and Jump 0n False iii +1 (i.e., 1 through 8) 160-167 10100iii jjjjjjjj Jump(iii - 4) *256+jjjjjjjj 168-171 101010ii jjjjjjjj Pop and Jump On True ii *256+jjjjjjjj 172-175 101011ii jjjjjjjj Pop and Jump On False ii *256+jjjjjjjj 176-191 1011iiii Send Arithmetic Message #iiii 192-207 1100iiii Send Special Message #iiii 208-223 1101iiii Send Literal Selector #iiii With No Arguments 224-239 1110iiii Send Literal Selector #iiii With 1 Argument 240-255 1111iiii Send Literal Selector #iiii With 2 Arguments


" By default, Lua has a maximum stack frame size of 250. This is encoded as MAXSTACK in llimits.h. The maximum stack frame size in turn limits the maxi mum number of locals per function, which is set at 200, encoded as LUAI_MAXVARS in luaconf.h. Other limits found in the same file include the maximum number of upval ues per function (60), encoded as LUAI_MAXUPVALUES , call depths, the minimum C stack size, etc. Also, wit h an sBx field of 18 bits, jumps and control structures cannot exceed a jump distance of about 131071. "

0 MOVE Copy a value between registers 1 LOADK Load a constant into a register 2 LOADBOOL Load a boolean into a register 3 LOADNIL Load nil values into a range of registers 4 GETUPVAL Read an upvalue into a register 5 GETGLOBAL Read a global variable into a register 6 GETTABLE Read a table element into a register 7 SETGLOBAL Write a register value into a global variable 8 SETUPVAL Write a register value into an upvalue 9 SETTABLE Write a register value into a table element 10 NEWTABLE Create a new table 11 SELF Prepare an object method for calling 12 ADD Addition operator 13 SUB Subtraction operator 14 MUL Multiplication operator 15 DIV Division operator 16 MOD Modulus (remainder) operator 17 POW Exponentiation operator 18 UNM Unary minus operator 19 NOT Logical NOT operator 20 LEN Length operator 21 CONCAT Concatenate a range of registers 22 JMP Unconditional jump 23 EQ Equality test 24 LT Less than test 25 LE Less than or equal to test 26 TEST Boolean test, with conditional jump (if not (R(A) <=> C) then PC++) 27 TESTSET Boolean test, with conditional jump and assignment (if (R(B) <=> C) then R(A) := R(B) else PC++) 28 CALL Call a closure 29 TAILCALL Perform a tail call 30 RETURN Return from function call 31 FORLOOP Iterate a numeric for loop 32 FORPREP Initialization for a numeric for loop 33 TFORLOOP Iterate a generic for loop 34 SETLIST Set a range of array elements for a table 35 CLOSE Close a range of locals being used as upvalues 36 CLOSURE Create a closure of a function prototype 37 VARARG Assign vararg function arguments to registers

this table is taken almost verbatim from




" The suffix(es) of the instruction name distinguish variants of the same basic instruction:

Here are the possible operand types:


Comparison ops: IS{LT/GT/LE/GE] (<, >, <=, >=), ISEQ{V/S/N/P} (==, where 2nd operand is variable, string constant, number constant, primitive type), ISNE{V/S/N/P} (!=). note: "for floating-point comparisons (x < y) is not the same as not (x >= y) in the presence of NaNs?"

Unary Test and Copy ops: IST (jump if true), ISF (jump if false), ISTC (Copy D to A and jump, if D is true), ISFC (Copy D to A and jump, if D is false)

Unary ops: MOV, NOT, UNM (unary minus), LEN (object length)

Binary ops: ADD{VN, NV, VV} (variable and number, number and variable, variable and variable), SUB{VN, NV, VV}, MUL{VN, NV, VV}, DIV{VN, NV, VV}, MOD{VN, NV, VV}, POW, CAT

Constant ops: K{STR, CDATA, SHORT, NUM, PRI} (set A to constant D, where constant is of type string, cdata, short, number, primitive), KNIL (Set slots A thru D to nil)

Upvalue and Function ops: UGET (set A to upvalue D), USET{V, S, N, P} (set upvalue to variable/string constant, number constant, primitive constant), UCLO (Close upvalues for slots ≥ rbase and jump to target D), FNEW (Create new closure from prototype D and store it in A)

Table ops: TNEW (Set A to new table with size D), TDUP (Set A to duplicated template table D), GGET (A = _G[D]), GSET (_G[D] = A), TGET{V, S, B} (A = B[C], where index C is of type variable, string constant, unsigned byte literal), TSET{V, S, B}, TSETM ((A-1)[D], (A-1)[D+1], ... = A, A+1, ..) (note: "MULTRES from the previous bytecode gives the number of table slots to fill...MULTRES is an internal variable that keeps track of the number of results returned by the previous call or by VARG instructions with multiple results. It's used by calls (CALLM or CALLMT) or returns (RETM) with multiple arguments and by a table initializer (TSETM).")

Calls and Vararg Handling: CALL (A, ..., A+B-2 = A(A+1, ..., A+C-1)), CALLM (A, ..., A+B-2 = A(A+1, ..., A+C+MULTRES)), CALLT (Tailcall: return A(A+1, ..., A+D-1)), CALLMT (Tailcall: return A(A+1, ..., A+D+MULTRES)), ITERC (Call iterator: A, A+1, A+2 = A-3, A-2, A-1; A, ..., A+B-2 = A(A+1, A+2)), ITERN (Specialized ITERC, if iterator function A-3 is next()), VARG (Vararg: A, ..., A+B-2 = ...), ISNEXT (Verify ITERN specialization and jump; see

Returns: "All return instructions copy the results starting at slot A down to the slots starting at one below the base slot (the slot holding the frame link and the currently executing function)." RET (return A, ..., A+D-2), RETM (return A, ..., A+D+MULTRES-1), RET0 (return), RET1 (return A)

Loops and branches: FORI (numeric for loop init), FORL (numeric for loop), ITER (iterator for loop), LOOP ("actually no-ops...only used by the JIT-compiler to speed up data-flow and control-flow analysis...The bytecode instruction itself is needed so the JIT-compiler can patch it to enter the JIT-compiled trace for the loop."). Many of these also have 'J' and 'I' variants which mark JIT-compiled loops and loops which cannot be JIT-compiled, respectively (the default variants "do hotspot detection. Trace recording is triggered if the loop is executed often enough"). See .

Function headers: FUNCF (Fixed-arg Lua function), FUNCV (Vararg Lua function), FUNCC (Pseudo-header for C functions), FUNCCW (Pseudo-header for wrapped C functions), FUNC* (Pseudo-header for fast functions). FUNCF and FUNCV each have 'J' and 'I' variants which mark JIT-compiled and un-JITable functions, as in the 'loops and branches' section.


Erlang BEAM VM

The Erlang virtual machine. Elixir also targets it.


quoted and paraphrased from : appears to be copied from there. claims that it documents (most of) the instruction set used in BEAM v5.9.1 (which is different from the "generic" instruction set that is shown in dissassembly).

Argument Types

Types of instruction arguments are encoded by single character codes. The meaning of these codes is given in the table below.

    Type Description
    t    An arbitrary term, e.g. {ok,[]}
    I    An integer literal, e.g. 137
    x    A register, r.g. R1
    y    A stack slot
    c    An immediate term, i.e. atom/small int/nil
    a    An atom, e.g. 'ok'
    f    A code label
    s    Either a literal, a register or a stack slot
    d    Either a register or a stack slot
    r    A register R0
    P    A unsigned integer literal
    j    An optional code label
    e    A reference to an export table entry
    l    A floating-point register

Model Elements

The model of the BEAM virtual machine contains the following elements:

    1. Registers R0-R255. R0 is a special 'fast' register.
    2. IP (instruction pointer) contains the reference to the instruction being executed.
    3. CP (continuation pointer) contains a return address of the current function call.
    4. EP (stack pointer) points to the last value pushed on stack.
    5. HTOP (heap top) points to the first empty word of the heap.
    6. The message queue. SAVE pointer marks the message being currently matched.
    7. tmpA and tmpB variables that temporarily hold values for the imminent arithmetic or logical operation.
    8. Floating-point registers FR0-FR15.

'Live' refers to the current number of registers that hold values still needed by the process. Many instructions take this parameter to be able to perform garbage collection, if necessary.


Allocation and initialization:

Stack manipulation:

Control flow:

Exception handling:

Composite data manipulation:


Message queues:



Moves and Loads and stores:



Processes, meta:


Binary data:


(applied to the data stack)
  int CNDJMP(int addr=0);    # if TOS == 0, then jump to specified address
  int JMP(int addr=0);
  void POP(void);
  void LD_FUNC(const char* fname,int hash,int paran,void* pfunc,   # fname is the string name of a fn; without a longer look, i can't tell if this pushes a reference to a partially applied function onto the data stack, or if it actually calls the function
               struct G__ifunc_table_internal* ifunc, int ifn);
  void LD_FUNC_BC(struct G__ifunc_table* ifunc,int ifn,int paran,void *pfunc);
  void LD_FUNC_VIRTUAL(struct G__ifunc_table* ifunc,int ifn,int paran,void *pfunc);
  void RETURN(void);                                        # half
  void CAST(int type,int tagnum,int typenum,int reftype);
  void CAST(G__TypeInfo& x);
  void OP1(int opr);              # one of +,-
  void LETVVAL(void);             # something about assigning and lvalues?
  void ADDSTROS(int os);          # ? adds 'os' (offset) to a variable named G__store_struct_offset; dunno what this is for
  void LETPVAL(void);		  # ? i dunno, some kind of assignment? looks at the top two(?) items on the stack and pops the top one
  void TOPNTR(void);              # calls G__val2pointer on the TOS
  void NOT(void);                 # applies NOT to TOS
  void BOOL(void);                # i think this coerces the TOS to bool
  int ISDEFAULTPARA(int addr=0);  # if the stack is not empty, then JMP to addr, otherwise continue
  void LD_VAR(struct G__var_array* var,int ig15,int paran,int var_type);  # i dont understand this at all, but i think this is about loading a variable, which may be an array, into TOS? It first consumes a quantity of 'paran' parameters from the stack, but afaict it then ignores these? Then in the call to G__getvariable, the first parameter, which looks like it is supposed to be a variable name, is always the empty string, which looks like it should cause G_getvariable, in , to return the empty string? So i'm confused.
  void ST_VAR(struct G__var_array* var,int ig15,int paran,int var_type);  # 
  void LD_MSTR(struct G__var_array* var,int ig15,int paran,int var_type);
  void ST_MSTR(struct G__var_array* var,int ig15,int paran,int var_type);
  void LD_LVAR(struct G__var_array* var,int ig15,int paran,int var_type);
  void ST_LVAR(struct G__var_array* var,int ig15,int paran,int var_type);
  void CMP2(int operator2);
  void PUSHSTROS(void);
  void SETSTROS(void);
  void POPSTROS(void);
  void SETTEMP(void);
  void FREETEMP(void);
  void GETRSVD(const char* item);
  void REWINDSTACK(int rewind);
  int CND1JMP(int addr=0);
  void LD_IFUNC(struct G__ifunc_table* p_ifunc,int ifn,int hash,int paran,int funcmatch,int memfunc_flag);
  void NEWALLOC(int size,int isclass_array);
  void SET_NEWALLOC(int tagnum,int var_type);
  void SET_NEWALLOC(const G__TypeInfo& type);
  void DELETEFREE(int isarray);
  void SWAP();
  void BASECONV(int formal_tagnum,int baseoffset);
  void STORETEMP(void);
  void ALLOCTEMP(int tagnum);
  void POPTEMP(int tagnum);
  void REORDER(int paran,int ig25);
  void LD_THIS(int var_type);
  void RTN_FUNC(int isreturn);
  void SETMEMFUNCENV(void);
  void RECMEMFUNCENV(void);
  void ADDALLOCTABLE(void);
  void DELALLOCTABLE(void);
  void BASEDESTRUCT(int tagnum,int isarray);
  void REDECL(struct G__var_array* var,int ig15);
  void TOVALUE(G__value* pbuf);
  void INIT_REF(struct G__var_array* var,int ig15,int paran,int var_type);
  void PUSHCPY(void);
  void LETNEWVAL(void);
  void SETGVP(int pushpop);
  void TOPVALUE(void);
  void CTOR_SETGVP(struct G__var_array* var,int ig15,int mode); 
  int TRY(int first_catchblock=0,int endof_catchblock=0);
  void TYPEMATCH(G__value* pbuf);
  void ALLOCEXCEPTION(int tagnum);
  void THROW(void);
  void CATCH(void);
  void SETARYINDEX(int newauto);
  void RESETARYINDEX(int newauto);
  void GETARYINDEX(void);
  void PAUSE();
  void NOP(void);
  // new instructions
  void ENTERSCOPE(void);
  void EXITSCOPE(void);
  void PUTAUTOOBJ(struct G__var_array* var,int ig15);
  void CASE(void* x);
  /* void SETARYCTOR(int num); */
  void MEMCPY();
  void MEMSETINT(int mode,map<long,long>& x);
  int JMPIFVIRTUALOBJ(int offset,int addr=0);
  void VIRTUALADDSTROS(int tagnum,struct G__inheritance* baseclass,int basen);
  void cancel_VIRTUALADDSTROS();

see also for some code that executes (some of?) these (and other things generated by the optimizer?)


" CIL (for C Intermediate Language) sits in between GENERIC and GIMPLE: control structures are kept and side-effects are hoisted out of expression trees by introducing instructions. It is originally the intermediate language used in safe compiler CCured. While GENERIC is best suited for source code syntactic analyses and GIMPLE for optimization, CIL is ideally suited for source code semantic analyses. As such, it is the frontend of many analyses for C programs, ranging from type-safe compilation to symbolic evaluation and slicing. " -- Yannick Moy. Automatic Modular Static Safety Checking for C Programs. PhD thesis, Université Paris-Sud, January 2009, section 2.1, page 42

"CIL is both lower-level than abstract-syntax trees, by clarifying ambiguous constructs and removing redundant ones, and also higher-level than typical intermediate languages designed for compilation, by maintaining types and a close relationship with the source program. The main advantage of CIL is that it compiles all valid C programs into a few core constructs with a very clean semantics...CIL features a reduced number of syntactic and conceptual forms. For example, all looping constructs are reduced to a single form, all function bodies are given explicit return statements, syntactic sugar like "->" is eliminated and function arguments with array types become pointers. (For an extensive list of how CIL simplifies C programs, see Section 4.) This reduces the number of cases that must be considered when manipulating a C program. CIL also separates type declarations from code and flattens scopes within function bodies. This structures the program in a manner more amenable to rapid analysis and transformation. CIL computes the types of all program expressions, and makes all type promotions and casts explicit. CIL supports all GCC and MSVC extensions except for nested functions and complex numbers. Finally, CIL organizes C’s imperative features into expressions, instructions and statements based on the presence and absence of side-effects and control-flow. Every statement can be annotated with successor and predecessor information. Thus CIL provides an integrated program representation that can be used with routines that require an AST (e.g. type-based analyses and pretty-printers), as well as with routines that require a CFG (e.g., dataflow analyses). CIL also supports even lower-level representations (e.g., three-address code), see Section 8." [20]


Ray Toal's "Case Study: An IR for C"

" C is such a simple language, an IR is fairly easy to design. Why is C so "simple"?

    Only base types are several varieties of ints and floats
    No true arrays — e1[e2] just abbreviates *(e1+e2); all indicies start at zero, there's no bounds checking
    All parameters passed by value, and in order
    Block structure is very restricted: all functions on same level (no nesting); variables are either truly global or local to a top-level function; implementation drastically simplified because no static links are ever needed and functions can be easily passed as parameters without scoping trouble.
    Structure copy is bitwise
    Arrays aren't copied element by element
    Language is small; nearly everything interesting (like I/O) is in a library 

Thus, the following tuples are sufficient:

        x <- y                        x <- y[i]
        x <- &y                       x[i] <- y
        x <- *y                       goto L
        *x <- y                       if x relop y goto L
        x <- unaryop y                param x
        x <- y binaryop z             call p, n
        unaryop is one of: +, -, !, ~, ...
        binaryop is one of: +, -, *, /, %, &, |, ^, ., &&, ||, ...
        relop is one of ==, !=, <, <=, >, >=
        x[i] means i memory location x + i
        call p,n means call p with n bytes of arguments" -- [21]


"Newspeak [96] is at the same level as MSIL, while maintaining some control structures and expression trees. These design decisions notably facilitate source code analyses. It is the intermediate language used in static analyzer Penjili." -- Yannick Moy. Automatic Modular Static Safety Checking for C Programs. PhD thesis, Université Paris-Sud, January 2009, section 2.1, page 42


CompCert (Clight and Cminor)

"The CompCert? project [22] for building a verified compiler presents a chain of intermediate languages for compilation of C programs, from Clight, a large subset of C, to plain assembly, through Cminor, which is similar to MSIL and Newspeak" -- Yannick Moy. Automatic Modular Static Safety Checking for C Programs. PhD thesis, Université Paris-Sud, January 2009, section 2.1, page 42



"C0 is a Pascal-like subset of C, similar to MISRA-C. It excludes the features of C that make full verification problematic: pointer arithmetic, unions, pointer casts. It is the intermediate language used in the Verisoft project for pervasive formal verification of computer systems.

Simpl is a very generic Sequential IMperative Programming Language. It offers high-level constructs like closures, pointers to procedures, dynamic method invocation, all above an abstract state that can be instantiated differently depending on the language analyzed. In his PhD? thesis, Schirmer presents a two-fold embedding of a programming language in HOL theorem prover through Simpl. On the one hand, Simpl statements are deeply embedded inside HOL, which allows one to formally verify correctness of the anal- ysis on statements. On the other hand, the abstract state is shallowly embedded inside HOL, for an easier application to program analysis of many different languages. Simpl is used as target language in the C0 compiler of the Verisoft project." -- Yannick Moy. Automatic Modular Static Safety Checking for C Programs. PhD thesis, Université Paris-Sud, January 2009, section 2.1, page 43

Ruva VM (Ruby VM)

Guile 2.1

Guile Tree IL

a selection, quoted and paraphrased from

Tree Intermediate Language (Tree-IL) is a structured intermediate language that is close in expressive power to Scheme. It is an expanded, pre-analyzed Scheme.

Tree-IL is a convenient compilation target from source languages. It can be convenient as a medium for optimization, though CPS is usually better. The strength of Tree-IL is that it does not fix order of evaluation, so it makes some code motion a bit easier.

Tree-IL is “structured” in the sense that its representation is based on records, not S-expressions. This ... ensures that compiling to a lower-level language only requires a limited set of transformations. For example, the Tree-IL type <const> is a record type with two fields, src and exp. Instances of this type are created via make-const. Fields of this type are accessed via the const-src and const-exp procedures. There is also a predicate, const?. See Records, for more information on records.

All Tree-IL types have a src slot, which holds source location information for the expression.

Although Tree-IL objects are represented internally using records, there is also an equivalent S-expression external representation for each kind of Tree-IL. For example, the S-expression representation of #<const src: #f exp: 3> expression would be:

(const 3)

Users may program with this format directly at the REPL:

scheme@(guile-user)> ,language tree-il
Happy hacking with Tree Intermediate Language!  To switch back, type `,L scheme'.
tree-il@(guile-user)> (call (primitive +) (const 32) (const 10))
⇒ 42

One may create Tree-IL objects from their external representations via calling parse-tree-il, the reader for Tree-IL.

from "The job of the Scheme compiler is to expand all macros and all of Scheme to its most primitive expressions. The definition of “primitive expression” is given by the inventory of constructs provided by Tree-IL, the target language of the Scheme compiler: procedure calls, conditionals, lexical references, and so on. This is described more fully in the next section.

The tricky and amusing thing about the Scheme-to-Tree-IL compiler is that it is completely implemented by the macro expander. Since the macro expander has to run over all of the source code already in order to expand macros, it might as well do the analysis at the same time, producing Tree-IL expressions directly. "


(void): An empty expression. In practice, equivalent to Scheme’s (if #f #f). 
(const exp): A constant. 
(primitive name): A reference to a “primitive”.
(lexical name gensym): A reference to a lexically-bound variable. The name is the original name of the variable in the source program. gensym is a unique identifier for this variable.
(set! (lexical name gensym) exp): Sets a lexically-bound variable. 
( @ module name): A reference to a variable in a specific module.
(set! (@ mod name) exp): Sets a variable in a specific module. 
(toplevel name): References a variable from the current procedure’s module. 
(set! (toplevel name) exp): Sets a variable in the current procedure’s module. 
(define (toplevel name) exp): Defines a new top-level variable in the current procedure’s module. 
(if test then else): A conditional. Note that else is not optional. 
(call proc . args): A procedure call. 
(primcall name . args): A call to a primitive. Equivalent to (call (primitive name) . args). This construct is often more convenient to generate and analyze than <call>. As part of the compilation process, instances of (call (primitive name) . args) are transformed into primcalls. 
(seq head tail): A sequence. The semantics is that head is evaluated first, and any resulting values are ignored. Then tail is evaluated, in tail position. 
(lambda meta body): A closure. meta is an association list of properties for the procedure. body is a single Tree-IL expression of type <lambda-case>. As the <lambda-case> clause can chain to an alternate clause, this makes Tree-IL’s <lambda> have the expressiveness of Scheme’s case-lambda. 
(lambda-case ((req opt rest kw inits gensyms) body) [alternate]): One clause of a case-lambda. A lambda expression in Scheme is treated as a case-lambda with one clause. req is a list of the procedure’s required arguments, as symbols. opt is a list of the optional arguments, or #f if there are no optional arguments. rest is the name of the rest argument, or #f. kw is a list of the form, (allow-other-keys? (keyword name var) ...), where keyword is the keyword corresponding to the argument named name, and whose corresponding gensym is var. inits are tree-il expressions corresponding to all of the optional and keyword arguments, evaluated to bind variables whose value is not supplied by the procedure caller. Each init expression is evaluated in the lexical context of previously bound variables, from left to right. gensyms is a list of gensyms corresponding to all arguments: first all of the required arguments, then the optional arguments if any, then the rest argument if any, then all of the keyword arguments. body is the body of the clause. If the procedure is called with an appropriate number of arguments, body is evaluated in tail position. Otherwise, if there is an alternate, it should be a <lambda-case> expression, representing the next clause to try. If there is no alternate, a wrong-number-of-arguments error is signaled. 
(let names gensyms vals exp): Lexical binding, like Scheme’s let. names are the original binding names, gensyms are gensyms corresponding to the names, and vals are Tree-IL expressions for the values. exp is a single Tree-IL expression. 
(letrec names gensyms vals exp): A version of <let> that creates recursive bindings, like Scheme’s letrec
(prompt escape-only? tag body handler): A dynamic prompt. Instates a prompt named tag, an expression, during the dynamic extent of the execution of body, also an expression. If an abort occurs to this prompt, control will be passed to handler, also an expression, which should be a procedure. The first argument to the handler procedure will be the captured continuation, followed by all of the values passed to the abort. If escape-only? is true, the handler should be a <lambda> with a single <lambda-case> body expression with no optional or keyword arguments, and no alternate, and whose first argument is unreferenced. See Prompts, for more information. 
(abort tag args tail): An abort to the nearest prompt with the name tag, an expression. args should be a list of expressions to pass to the prompt’s handler, and tail should be an expression that will evaluate to a list of additional arguments. An abort will save the partial continuation, which may later be reinstated, resulting in the <abort> expression evaluating to some number of values. 

There are two Tree-IL constructs that are not normally produced by higher-level compilers, but instead are generated during the source-to-source optimization and analysis passes that the Tree-IL compiler does. Users should not generate these expressions directly, unless they feel very clever, as the default analysis pass will generate them as necessary.

(let-values names gensyms exp body): Like Scheme’s receive – binds the values returned by evaluating exp to the lambda-like bindings described by gensyms. That is to say, gensyms may be an improper list. <let-values> is an optimization of a <call> to the primitive, call-with-values. 
(fix names gensyms vals body): Like <letrec>, but only for vals that are unset lambda expressions. fix is an optimization of letrec (and let). 

See for some details of some optimizations that Guile performs on Guile Tree IL.

Guile CPS

Continuation-passing style (CPS) is Guile’s principal intermediate language, bridging the gap between languages for people and languages for machines. CPS gives a name to every part of a program: every control point, and every intermediate value. This makes it an excellent medium for reasoning about programs, which is the principal job of a compiler.

An Introduction to CPS


" Consider the following Scheme expression:

  (display "The sum of 32 and 10 is: ")
  (display 42)

Let us identify all of the sub-expressions in this expression, annotating them with unique labels:

  (display "The sum of 32 and 10 is: ")
  |k1      k2
  (display 42)
  |k4      k5

Each of these labels identifies a point in a program. One label may be the continuation of another label. For example, the continuation of k7 is k6. This is because after evaluating the value of newline, performed by the expression labelled k7, we continue to apply it in k6.

Which expression has k0 as its continuation? It is either the expression labelled k1 or the expression labelled k2. Scheme does not have a fixed order of evaluation of arguments, though it does guarantee that they are evaluated in some order. Unlike general Scheme, continuation-passing style makes evaluation order explicit. In Guile, this choice is made by the higher-level language compilers.

Let us assume a left-to-right evaluation order. In that case the continuation of k1 is k2, and the continuation of k2 is k0.

With this example established, we are ready to give an example of CPS in Scheme:

(lambda (ktail)
  (let ((k1 (lambda ()
              (let ((k2 (lambda (proc)
                          (let ((k0 (lambda (arg0)
                                      (proc k4 arg0))))
                            (k0 "The sum of 32 and 10 is: ")))))
                (k2 display))))
        (k4 (lambda _
              (let ((k5 (lambda (proc)
                          (let ((k3 (lambda (arg0)
                                      (proc k7 arg0))))
                            (k3 42)))))
                (k5 display))))
        (k7 (lambda _
              (let ((k6 (lambda (proc)
                          (proc ktail))))
                (k6 newline)))))

Holy code explosion, Batman! What’s with all the lambdas? Indeed, CPS is by nature much more verbose than “direct-style” intermediate languages like Tree-IL. At the same time, CPS is simpler than full Scheme, because it makes things more explicit.

In the original program, the expression labelled k0 is in effect context. Any values it returns are ignored. In Scheme, this fact is implicit. In CPS, we can see it explicitly by noting that its continuation, k4, takes any number of values and ignores them. Compare this to k2, which takes a single value; in this way we can say that k1 is in a “value” context. Likewise k6 is in tail context with respect to the expression as a whole, because its continuation is the tail continuation, ktail. CPS makes these details manifest, and gives them names. "



Guile’s CPS language is composed of terms, expressions, and continuations.


A term can either evaluate an expression and pass the resulting values to some continuation, or it can declare local continuations and contain a sub-term in the scope of those continuations.

$continue k src exp: Evaluate the expression exp and pass the resulting values (if any) to the continuation labelled k. The source information associated with the expression may be found in src, which is either an alist as in source-properties or is #f if there is no associated source.

$letk conts body: Bind conts, a list of continuations ($cont instances), in the scope of the sub-term body. The continuations are mutually recursive.

Additionally, the early stages of CPS allow for a set of mutually recursive functions to be declared as a term. This $letrec type is like Tree-IL’s <fix>. The contification pass will attempt to transform the functions declared in a $letrec into local continuations. Any remaining functions are later lowered to $fun expressions. $letrec names syms funs body: Declare the mutually recursive set of functions denoted by names, syms, and funs within the sub-term body. names and syms are lists of symbols, and funs is a list of $fun values. syms are globally unique.


Here is an inventory of the kinds of expressions in Guile’s CPS language. Recall that all expressions are wrapped in a $continue term which specifies their continuation.

$void: Continue with the unspecified value.

$const val: Continue with the constant value val.

$prim name: Continue with the procedure that implements the primitive operation named by name.

$fun src meta free body: Continue with a procedure. src identifies the source information for the procedure declaration, and meta is the metadata alist as described above in Tree-IL’s <lambda>. free is a list of free variables accessed by the procedure. Early CPS uses an empty list for free; only after closure conversion is it correctly populated. Finally, body is the $kentry $cont of the procedure entry.

$call proc args: Call proc with the arguments args, and pass all values to the continuation. proc and the elements of the args list should all be variable names. The continuation identified by the term’s k should be a $kreceive or a $ktail instance.

$callk label proc args: $callk is for the case where the call target is known to be in the same compilation unit. label should be some continuation label, though it need not be in scope. In this case the proc is simply an additional argument, since it is not used to determine the call target at run-time.

$primcall name args: Perform the primitive operation identified by name, a well-known symbol, passing it the arguments args, and pass all resulting values to the continuation. The set of available primitives includes all primitives known to Tree-IL and then some more; see the source code for details.

$values args: Pass the values named by the list args to the continuation.

$prompt escape? tag handler: Push a prompt on the stack identified by the variable name tag, which may be escape-only if escape? is true, and continue with zero values. If the body aborts to this prompt, control will proceed at the continuation labelled handler, which should be a $kreceive continuation. Prompts are later popped by pop-prompt primcalls.


The remaining element of the CPS language in Guile is the continuation. In CPS, all continuations have unique labels. Since this aspect is common to all continuation types, all continuations are contained in a $cont instance:

CPS $cont k cont: Declare a continuation labelled k. All references to the continuation will use this label.

$kargs names syms body: The most common kind of continuation binds some number of values, and then evaluates a sub-term. $kargs is this kind of simple lambda. Bind the incoming values to the variables syms, with original names names, and then evaluate the sub-term body.

Variable names (the names in the syms of a $kargs) should be globally unique, and also disjoint from continuation labels. To bind a value to a variable and then evaluate some term, you would continue with the value to a $kargs that declares one variable. The bound value would then be available for use within the body of the $kargs.

$kif kt kf: Receive one value. If it is true for the purposes of Scheme, branch to the continuation labelled kt, passing no values; otherwise, branch to kf. For internal reasons, only certain terms may continue to a $kif. Compiling $kif avoids allocating space for the test variable, so it needs to be preceded by expressions that can test-and-branch without temporary values. In practice this condition is true for $primcalls to null?, =, and similar primitives that have corresponding br-if-foo VM operations; see the source code for full details. When in doubt, bind the test expression to a variable, and continue to the $kif with a $values expression. The optimizer should elide the $values if it is not needed. Calls out to other functions need to be wrapped in a $kreceive continuation in order to adapt the returned values to their uses in the calling function, if any.

$kreceive arity k: Receive values on the stack. Parse them according to arity, and then proceed with the parsed values to the $kargs continuation labelled k. As a limitation specific to $kreceive, arity may only contain required and rest arguments. $arity is a helper data structure used by $kreceive and also by $kclause, described below.

CPS Data: $arity req opt rest kw allow-other-keys?: A data type declaring an arity. req and opt are lists of source names of required and optional arguments, respectively. rest is either the source name of the rest variable, or #f if this arity does not accept additional values. kw is a list of the form ((keyword name var) ...), describing the keyword arguments. allow-other-keys? is true if other keyword arguments are allowed and false otherwise. Note that all of these names with the exception of the vars in the kw list are source names, not unique variable names.

Additionally, there are three specific kinds of continuations that can only be declared at function entries.

$kentry self tail clauses: Declare a function entry. self is a variable bound to the procedure being called, and which may be used for self-references. tail declares the $cont wrapping the $ktail for this function, corresponding to the function’s tail continuation. clauses is a list of $kclause $cont instances.

$ktail: A tail continuation.

$kclause arity cont: A clause of a function with a given arity. Applications of a function with a compatible set of actual arguments will continue to cont, a $kargs $cont instance representing the clause body.

See for some details of some optimizations that Guile performs on Guile CPS.

Guile 2.1 bytecode



Guile Links

See also Guile section of proj-plbook-plChLispLangs, proj-plbook-plChImplementationCaseStudies, proj-plbook-plChMiscIntermedLangs, proj-plbook-plChTargetsImpl.

WebKit JavaScriptCore

Old documentation of their bytecode, this may be out of date:

Those docs may be out of date because they don't match this, but maybe i am misunderstanding and this is something else:


SpiderMonkey Javascript bytecode


Variables and scopes:





OCaml has both a compiler and a bytecode interpreter.

The bytecode interpreter is based on an earlier CAML project/version called ZINC.

OCaml Lambda Form IL

ZINC abstract machine (ZAM) (1990)

Two stacks, the argument stack and the return stack.



ZINC abstract machine (ZAM) with cache (1990)



type expression = mutable Zident of expr_ident (* variable "x" or "module.y" *)

Zconstant of struct_constant (* constant "3.14" *)
Zconstruct of constr_desc global & expression list (* constructor application *)
Zapply of expression & expression list (* multiple application *)
Zlet of (pattern & expression) list & expression (* local binding *)
Zletrec of (pattern & expression) list & expression (* local recursive binding *)
Zfunction of (pattern list & expression) list (* functions (with pattern matching) *)
Ztrywith of expression & (pattern & expression) list (* exception handling *)
Zsequence of expression & expression (* sequence *)
Zcondition of expression & expression & expression (* if...then...else... *)
Zwhile of expression & expression (* "while" loop *)
Zsequand of expression & expression (* sequential "and" *)
Zsequor of expression & expression (* sequential "or" *)
Zconstraint of expression & type_syntax (* type constraint *)
Zassign of string & expression (* assignment *)

and expr_ident = Zglobal of value_desc global (* global variable, with its descriptor *)

Zlocal of string (* local variable *)

type pattern = Zwildpat (* underscore "_" *)

Zvarpat of string (* variable *)
Zaliaspat of pattern & string (* alias "... as y" *)
Zconstantpat of atomic_constant (* constant *)
Zconstructpat of constr_desc global & pattern list (* construction *)
Zorpat of pattern & pattern (* alternative *)
Zconstraintpat of pattern & type_syntax (* type constraint *)

type atomic_constant = ACnum of num

and struct_constant = SCatom of atomic_constant
ACint of int
ACfloat of float
ACstring of string
SCblock of constr_tag & struct_constant list


ZINC CAML Enriched lambda-calculus (1990)

type lambda = Lvar of num (* local variable (de Bruijn’s number) *)

Lconst of struct_constant (* constants *)
Lapply of lambda & lambda list (* multiple application *)
Lfunction of num & lambda (* curried function (num = arity) *)
Llet of lambda list & lambda (* local binding *)
Lletrec of lambda list & lambda (* local recursive binding *)
Lprim of primitive & lambda list (* primitive operation *)
Lcond of lambda & (atomic_constant & lambda) list (* equality tests with constants *)
Lswitch of num & lambda & (constr_tag & lambda) list (* jump through table *)
Lstaticfail (* as "raise failure", but with static scope *)
Lstatichandle of lambda & lambda (* handler for static failures *)
Lhandle of lambda & lambda (* handler for "real" exceptions (dynamic scope) *)
Lifthenelse of lambda & lambda & lambda (* conditional *)
Lsequence of lambda & lambda (* sequence *)
Lwhile of lambda & lambda (* "while" loop *)
Lsequand of lambda & lambda (* sequential "and" *)
Lsequor of lambda & lambda (* sequential "or" *)

and primitive = Pget_global of string & string (* access to global variables *)

Pset_global of string & string (* modification of global variables *)
Pmakeblock of constr_tag (* building a tuple in the heap *)
Pfield of num (* accessing one field of a tuple *)
Psetfield of num (* modifying one field of a tuple *)
Pccall of num (* calling a C primitive *)
Praise (* raising an exception *)
Ptest of bool_test (* comparisons *)
... (* arithmetic operations, and much more *)

and bool_test = Pequal_test

Pnum_test of num prim_test
Pint_test of int prim_test
Pfloat_test of float prim_test
Pstring_test of string prim_test
Peqtag_test of constr_tag
Pnoteqtag_test of constr_tag

and ’a prim_test = PTeq

PTnoteqimm of ’a


ZINC CAML Linear code (1990)


ZINC CAML bytecode (1990)

Constants and literals:

Function handling:

Environment handling:

Building and destructuring blocks:


Floating-point numbers:



Eq, Equal:

Branches and conditional branches:



Caml bytecode (Zinc bytecode) 2010

version 3.11.2

the following is quoted or paraphrased from

" The virtual machine executing a caml bytecode is stack-centered. The caml stack, augmented with an accumulator, stores values. A caml value can be either:

Along with the stack, the virtual machine contains seven registers:

Stack and accumulator operations:


Conditionals and local branches:

Other control flow:

Memory and constant and variable loads/stores and allocations:

Composite data:






Redundancies for efficiency (push then something else, or constants instead of parameters):



AliceML core

Copied or paraphrased from :

"Alice Abstract Code"

data types with examples:

'restricted value set': integers, tuples, closures, tagged values and exceptions

See Guido Tack’s Diploma thesis (Linearisation, minimisation and transformation of data graphs with transients. Diploma thesis, Programming Systems Lab, Universit ̈at des Saarlan-des, Saarbr ̈ucken, May 2003) for the representation format of these data types in the SEAM Abstract Store (graph store).

Code is stored both as Alice Abstract Code (so that it can be pickled), and also as compiled code (either compiled to bytecode or to native code). Compiled code is represented as binary 'chunk' nodes in the SEAM Abstract Store; since these are opaque to SEAM, the garbage collector can't tell which other nodes they reference, therefore binary code is wrapped by a pair node which contains (points to) both a binary chunk node and also to an "immediate environment" which is a node holding references to other nodes that are referred to by the binary code. Binary code is position-independent in anticipation of being moved around by the garbage collector.

" Alice represents code as a directed acyclic graph. The acyclic structure is achieved by transforming all loops into recursion. The nodes of the graph encode the instructions of the abstract code. Except for the root node, each node has either one or two predecessors. A node has n ≥ 0 successors. The code graph explicitly encodes the control flow, which means that each instruction knows about its continuations.

The static Alice compiler removes all information about types. Static type safety guaranties that the abstract code instructions operate on correctly typed arguments. In particular, there are no polymorphic instructions, which means that the instructions do not have to test the type of its arguments. "

AliceML Abstract Code Instructions

Copied or paraphrased from :


The abstract code explicitly differentiates between defining and applied occurrences of identifiers. Let id be the set of identifiers and value denote the set of all Alice values. Defining occurrences of identifiers can have the following form: def ::= IdDef? {id : id}


There is either a local definition of an identifier or a wildcard. A wildcard is only a hint that the result of the definition is not used.

There are three forms of applied occurrences: ref ::= Global {id : id}

Immediate Global {value : value}Local {id : id}

Each abstract code instruction is parameterized over ref and def, respectively. This yields a compact definition of the instructions. However, execution speed suffers because each identifier access is preceded by a case distinction. This is a first hint that the primary design focus of the abstract code is not execution speed but compact definition and ease of (just-in-time) compilation.

join points:

A node in the abstract code graph is a join point if it has two predecessors. The abstract code explicitly encodes join points as special nod es. So by definition, all other nodes (except the root node) have exactly one predecessor. The Shared instruction is just a marker for a join point. It contains a stamp that distinguishes it from other join points in the code graph.

allocation of data structures:

creation plus initialization

The style is the same for all values: first the destination is given, then several data elements for initialization are provided, and the last component next identifies the successor instruction.

access to data structures:


Both instructions: 1. extract the tag t from the test value. 2. look up the tag in a vector of alternatives. 3. The corresponding entry in the vector contains a list of local identifiers and a successor instruction. The local identifiers are bound to the fields of the tagged value, and then we jump to the successor instruction.

The general test and the compact test only differ in the second step. A linear search on the vector is performed to find an entry with tag t. If there is no corresponding entry, the 'else' branch is taken.

Compact tests are used if the test cases cover a contiguous range of tags from 0 to k. Then, the branches can be stored compactly in a vector, such that their position defines the associated integer tag. After a range check 0<=t<=k, the tag t serves as an index into the vector, yielding constant time lookup. The range check can be removed if the test is exhaustive, which means that the test table covers all possible cases. The static Alice compiler can find out about exhaustive tests on the basis of type information. For all exhaustive tests, the 'else' branch is set to NONE.

procedure application:


join point wrapper:


Encoded as procedure calls (using Apply and ApplyTail?) to a library of built-in procedures (incuding, for example, arithmetic, an instruction to initiate concurrent computation, string ops). For speed, the bytecode offers primitives for some of these, and the native compiler inlines some of them.


AliceML bytecode

Copied or paraphrased from :

"Alice byte code can be roughly described as a linearizion and specialization of the abstract code."



Data Structure Access:

All instructions that require the actual value of an argument contain an internal test on transient values. If the test finds out that the argument is not yet determined, the interpreter interrupts execution and issues a request to the scheduler. There is, however, a single situation in which we need an explicit transient test: pattern matching on the empty tuple, i.e. the unit value.

Waiting on a potential transient:

Local control flow and conditionals:

Procedure Application:


Calling convention conversion:

There must be some way to load procedure arguments out of the g lobal register bank into byte code registers. Due to the calling convention of Al ice (Section 2.2.3), the number of arguments that are in the register bank does not nec essarily coincide with the receiver’s expectation. So in some cases, the receiver h as to convert the arguments before loading them into byte code registers. The two CCC instructions in the byte code perform the conversion internally and then load the arg uments into the dedicated registers.

Inlined primitives:

Other bytecodes for efficiency: "A concise set of byte code instructions is nice for presentation and it eases compiler implementation. However, for many instructions, there are special cases that occur quite often. Adding specialized instructions to the instru ction set noticeably increases execution speed. Alice offers tuples of fixed arity. In practice, most of the tup les are pairs and triples. Therefore, it pays off to introduce specialized byte code ins tructions for construction and deconstruction of pairs and triples. Other examples for specializations are call instructions. The number of arguments rarely exceeds three , such that performance benefits from specialized call instructions for argument nu mbers from zero 2 to three ar- guments. There are also specialized arithmetic instructio ns, for example, an increment and a decrement instruction for integer arguments. The sema ntics of Alice prescribes overflow and underflow checks for all arithmetic operations. The general instruction to add two integers checks on the result whether it falls below t he smallest representable number or exceeds the biggest representable number. The inc rement instruction only needs to check for overflow, and the decrement instruction on ly needs to check for underflow."

"Super-instructions combine several virtual machine instr uctions into one, thus reduc- ing code size as well as dispatch and argument access overhea d. The literature distin- guishes two ways of implementing super-instructions, name ly a static and a dynamic approach [19]. In the static approach the interpreter write r identifies common in- struction sequences and adds a new implementation for each s uper-instruction. In the dynamic approach super-instructions are created at run-ti me. The Alice byte code system uses static super-instructions. There is, for instance, an instruction that combines pair construction and initializ ation."


Ruby YARV VM bytecode

Instruction list:

Other lists of those instructions and related (possibly out of date):

Instruction list:

Berkeley Packet Filter (BPF) bytecode (Linux variants)

BPF is a VM language used for packet filtering. There is a later variant "eBPF" (or "internal BPF"). describes the bytecodes for both of these, starting with section "BPF engine and instruction set".

BPF has two special registers A and X, and 16 memory locations (or GPRs (general purpose registers)). They are 32-bit width.

Instructions are 64-bits. A 16-bit opcode, one 32-bit operand, and two special 8-bit jump targets ("one for condition "jump if true", the other one "jump if false""). There are 10 or so addressing modes (different subsets of them are applicable to different instructions) (where are the addressing modes encoded? i'm not sure). The addressing modes include constant, A, X, the GPRs, and various addressing modes to address locations within the current packet (remember, this is a language for packet filtering).

eBPF "is modelled closer to the underlying architecture to mimic native instruction sets..It is designed to be JITed with one to one mapping, which can also open up the possibility for GCC/LLVM compilers to generate optimized eBPF code through an eBPF backend that performs almost as fast as natively compiled code."

eBPF has 64-bit 10 registers instead of 2 32-bit (A and X), plus a read-only frame pointer. It also contains a CALL instruction and a calling convention. The convention is: "

Note that only functions that take 5 or less arguments are supported by eBPF CALL. This was chosen as a lowest-common-denominator after a survey of calling conventions associated with popular 64-bit architectures: "Natively, x86_64 passes first 6 arguments in registers, aarch64/ sparcv9/mips64 have 7 - 8 registers for arguments; x86_64 has 6 callee saved registers, and aarch64/sparcv9/mips64 have 11 or more callee saved registers."

eBPF has "conditional jt/jf targets replaced with jt/fall-through".

eBPF is limited to 4096 instructions (however this limit can be subverted by use of BPF_MAP_TYPE_PROG_ARRAY), and in the past, can only jump forward (no loops; i'm not sure if original non-e BPF had this restriction). Now, "the only jumps which are allowed to go backwards are the end of a loop with a fixed number of iterations" [22]; specifically, there is a limit of 32 nested calls.

eBPF has some new instructions:

eBPF has 'maps' which are key-value associative arrays/dictionaries that support the following operations:

eBPF does some verification to disallow:



Rust MIR

Note: Rust MIR is still in development, yet i wrote the following in present tense.

Rust compiles Rust -> HLL AST -> MIR -> LLVM.

Rust's MIR is after type checking. Types are explicit in MIR. Borrow checking is done on MIR. Basic blocks are explicit in MIR (MIR is a control-flow graph (CFG)). In Mir, control flow is translated to GOTO and IF and SWITCH and function calling (i think) (and a function call can specify an panic handler). Monomorphization is done after MIR (on LLVM).



An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes.

" Each bytecode in the byte array represents a single byte-sized virtual machine instruction,and is divided into two parts: a 3-bit opcode and a 5-bit object array index...The only opcodes used to represent SELFprograms are the following:

SELF push self onto the execution stack LITERAL <value index> push a literal value onto the execution stack SEND <message name index> send a message, popping the receiver and arguments off the execution stack and pushing the result SELF SEND <message name index> send a message to self, popping the arguments off the execution stack and pushing the result SUPER SEND <message name index> send a message to self, delegated to all parents, popping the arguments off the execution stack and pushing the result DELEGATEE <parent name index> delegate the next message send to the named parent NON-LOCAL RETURN execute a non-local return from the lexically-enclosing method activation INDEX-EXTENSION <index extension> extend the next index by prepending the index extension

The index for the opcodes is an index into the accompanying object array. The 5-bit offset allows the first 32 message names and literals to be referred to directly;indices larger than 32 are constructed using extra INDEX-EXTENSION instruc-tions. In SELF source code, primitive operations are invoked with the same syntax usedto send a message, except that the message name begins with an underscore (“_”). " -- [23]

"The byte codes needed to express SELF programs fall into only three classes: base values (LITERAL and SELF), message sends, and non-local return." ((and index extension, which is encoding))

"Smalltalk-80 defines a much larger set of byte codes [12], tuned to minimize space and maximize interpretation speed, and includes byte codes to fetch and store local, instance, class, pool, and global vari- ables and shortcut byte codes for common-case operations such as loading constants like nil, true, and 0. Smalltalk-80 systems also use special control flow byte codes to implement common boolean messages like ifTrue:ifFalse: and whileTrue:; the Small- talk parser translates the message sends into conditional and unconditional branch byte codes, open-coding the argument blocks. Similarly, the == message is auto- matically translated into the identity comparison primitive operation byte code. A similar optimization is included for messages like + and <, which the parser trans- lates into special byte codes. When executed, these byte codes either directly invoke the corresponding integer primitive operation (if the receiver is an integer), or perform the message send (if the receiver isn’t an integer). "

"Each byte code in the byte array represents a single byte-sized virtual machine instruction, and is divided into two parts: a 3-bit opcode and a 5-bit object array index. " eg for self.x, the five bits are the index of method 'x' within object 'self'.

Scheme 79 chip S-code


from the Appendix (things that start with 'primitive-'):


Fantom fcode

Fantom compiles to JVM and CLR. 'fcode' is its IR.

Fantom fcode operation argument types

Fantom fcode instructions

FalseTrueIntFloatDecimalStrDurationTypeUri): load literal onto stack (some are immediate, some are LOADK eg load a constant from a constant table)
Store)Var (Register): takes a local var register index (0 is this)
Store)Instance (FieldRef?): load/store field from/to storage
Store)Static (FieldRef?): load/store static field from/to storage
NE) (TypePair?): a.equals(b)
LTGTGE) (TypePair?): <= 0, etc



Haxe AST

Haxe is a language that compiles to many target platforms.

The Haxe AST is exposed to Haxe Macros [24].

types of AST nodes (from ExprDef?):

other stuff in that file:



"Scheme 48...The name was intended to reflect the idea that the implementation should be so clear and simple that you could believe the system could have been written in 48 hours (or perhaps, when S48 got complex enough, this was altered to "could have been *read* in 48 hours")....Its innovations were its module system, the language in which its VM was defined ("pre-scheme"), and its stack-management technology. ... Pre-scheme was quite interesting. Kelsey published a paper on it, as well, I believe. It was Scheme in the sense that you could load it into a Scheme system and run the code. But it was restrictive -- it required you to write in a fashion that allowed complete Hindley-Milner static type inference, and all higher-order procedures were beta-substituted away at compile time, meaning you could *straightforwardly* translate a prescheme program into "natural" C code with C-level effiency. That is, you could view prescheme as a really pleasant alternative to C for low-level code. And you could debug your prescheme programs in the interactive Scheme development environment of your choice, before flipping a switch and translating to C code, because prescheme was just a restricted Scheme. The Scheme 48 byte-code interpreter was written in prescheme. Prescheme sort of died -- beyond the academic paper he wrote, Kelsey never quite had the time to document it and turn it into a standalone tool that other people could use (Ian Horswill's group at Northwestern is an exception to that claim -- they have used prescheme for interesting work). There are ideas there to be had, however. " -- [25]

prescheme appears to be implemented in

From the Wikipedia page: " Variants

Macro-Free PreScheme? Obtained from Full PreScheme? by expanding all macros. Pure PreScheme? A tail-recursive, closure-free dialect of PreScheme?, obtained from Macro-Free PreScheme? by hoisting lambda expressions and beta expansion. VLISP PreScheme? Simple PreScheme? "



BCPL O-code

The BCPL compiler was the first to define a virtual machine for the purpose of portability [26]. The language of the VM is called OCODE.



Later, a lower-level assembly-like virtual machine was added beneath OCode, called INTCODE.

" Unlike a conventional computer, the INTCODE machine is not fully specified, and such details as the word-length, byte-size, and instruction format are left undefined. The machine has 6 control registers as follows: A and B are accumulators for computing expressions, C is the sequence control register giving the location of the next instruction to be obeyed, D is a register used to hold the computed address of the current instruction, and P and G are index registers. All these registers are the size of a machine word.

 An instruction has a 3 bit function field, and an address
 field of unspecified size, 2 bits for index modification and an
 indirection bit.  These fields may be laid out in the word in
 any way that is convenient for the interpreter.  An instruction
 is executed as follows.  Firstly, it is fetched from the store
 and C is incremented, then, the computed address is formed by
 assigning the address field to D, conditionally adding P or G
 as specified by the modification field, and indirecting if
 required.  Finally, the operation specified by the function
 field is performed.
 The 8 machine functions are: LOAD, ADD, STORE, JUMP,
 CALL is used in the compilation of a BCPL
 function or routine call. ...EXECUTE OPERATION provides a miscellaneous collection of arithmetic,
 relational, logical, and control functions, the actual function
 being determined by D. Most of the functions operate on B and
 A, usually leaving a result in A.  For example, X7 will cause the
 remainder after the integer division of B by A to be assigned to
 A.  There are 23 execute operations in the basic INTCODE machine,
 but for practical use, a further 5 to 10 are needed in order to
 provide an adequate interface with the operating system.
 " -- []

Later on, presumably INTCODE was refined into CINTCODE, which is much more elaborate (see page 155 (PDF page 163) of [27]).


Featherweight Erlang


CIL (C Intermediate Language)

" exp::=Constant(const)

Lvalue(lvalue)SizeOfExp?(exp)SizeOfType?(type)AlignOfExp?(exp)AlignOfType?(type)UnOp?(unop,exp)BinOp?(binop,exp,exp)Cast(type,exp)AddressOf?(lvalue)StartOf?(lvalue)instr::=Set(lvalue,exp)Call(lvalue option,exp,exp list)Asm(raw strings,lvalue list,exp list)

Fig. 4.The syntax of CIL expressions and instructions "

" stmt::=Instr(instr list)

Return(exp option)Goto(stmt)BreakContinueIf(exp,stmt list,stmt list)Switch(exp,stmt list,stmt list)Loop(stmt list)Fig. 5.The syntax of CIL statements.

" type::=Void

Int(intKind)Float(floatKind)Ptr(type)Array(type,exp)Fun(type,variable list)Enum(enumInfo)Named(string,type)Struct(compInfo)Union(compInfo)enumInfo::= (string,item list)compInfo::= (string,field list)Fig. 6.The abstract syntax of CIL types

mruby bytecode






3. /


 tmpB. The direction of the shift depends on the sign of tmpB. Arg1 is Live. Jumps to Arg0 on error if Arg0 is not zero.
tmpB. Arg1 is Live. Jumps to Arg0 on error if Arg0 is not zero.


Combined instructions for efficiency:

Core Erlang

BEAM variants

UCSD Pascal p-code

Stack-based. "Responsible for wide adoption of Pascal in the 1970s." [29]


Cint, a C/C++ interpreter

VM instructions (from cint-5.18.00/cint/src/bc_inst.h ). In some cases i've written comments on what i think these instructions might mean, these are only guesses from skimming the source code very quickly, as i couldn't find any documentation (although some instructions had code comments, which i often copied into here):

  void LD(G__value* pval);
  void LD(int a);            # push a copy of the value found at depth a in the data stack onto the data stack
  void CL(void);             # clear stack pointer; Take a possible breakpoint, and then clear the data stack, and the structure offset stack.
  void OP2(int opr);         # one of +,-,*,/,%,@,