# Other intermediate languages (ILs) designed to be targeted by multiple high-level languages, or that are pedagogical or minimalistic

## Parrot

Parrot started out as a runtime for Perl 6, then refocused on being an interoperable VM target for a variety of languages. However, it hasn't been very successful (too few volunteers were motivated to put in the hours), and even Perl 6 is moving away from it.

Even if unsuccessful, it is still of interest because it is one of the few VMs designed with interoperation between multiple HLLs in mind.

Note that multiple core Parrot devs claim that all core Parrot devs hate Parrot's object model: http://whiteknight.github.io/2011/09/10/dust_settles.html http://www.modernperlbooks.com/mt/2012/12/the-implementation-of-perl-5-versus-perl-6.html

It's register-based. It provides garbage collection.

It has a syntactic-sugar IR language called PIR (which handles register allocation and supports named registers), an assembly language called PASM, an AST serialization format called PAST, and a bytecode format called PBC.

Its objects are called PMCs (Polymorphic Containers).

Its set of opcodes is extensible (a program written in Parrot can define custom opcodes). Parrot itself contains a lot of opcodes: http://docs.parrot.org/parrot/devel/html/ops.html

At one point there was an effort called M0 to redefine things in terms of a small core set of opcodes, but I don't know what happened to it; this appears to be the list of M0 opcodes: https://github.com/parrot/parrot/blob/m0/src/m0/m0.ops . I don't know if the M0 project is still ongoing; see http://leto.net/dukeleto.pl/2011/05/what-is-m0.html https://github.com/parrot/mole http://reparrot.blogspot.com/2011/07/m0-roadmap-goals-for-q4-2011.html http://gerdr.github.io/on-parrot/rethinking-m0.html . The repo seems to be at https://github.com/parrot/parrot/tree/m0 . There is also an IL that compiles to M0: https://github.com/parrot/m1/blob/master/docs/pddxx_m1.pod .

There was an earlier effort at some sort of core language called L1: http://wknight8111.blogspot.com/2009/06/l1-language-of-parrot-internals.html . Not sure what happened with that either.

Instructions/Opcodes:

The M0 opcodes:

https://github.com/parrot/parrot/blob/m0/docs/pdds/draft/pdd32_m0.pod

control flow: noop goto (go to a fixed offset in the current bytecode segment) goto_if (conditionally go to a fixed offset in the current bytecode segment) goto_chunk (go to an offset in another chunk)

arithmetic: add_i add_n sub_i sub_n mult_i mult_n div_i div_n mod_i (remainder) mod_n isgt_i isgt_n isge_i isge_n convert_i_n (convert from integer to numeric) convert_n_i

bitwise arithmetic: ashr (right bitshift with sign extension) lshr (right bitshift without sign extension) shl (left bitshift) and or xor
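The ashr/lshr distinction only matters for values with the sign bit set. A quick sketch in Python, assuming 64-bit registers (an assumption for illustration; M0's register width is not stated here):

```python
WIDTH = 64
MASK = (1 << WIDTH) - 1  # registers modeled as 64-bit patterns

def to_signed(x):
    """Interpret a 64-bit pattern as a signed integer."""
    return x - (1 << WIDTH) if x & (1 << (WIDTH - 1)) else x

def ashr(x, n):
    """Right bitshift with sign extension (arithmetic shift)."""
    return (to_signed(x) >> n) & MASK

def lshr(x, n):
    """Right bitshift without sign extension (logical shift)."""
    return (x & MASK) >> n

neg2 = (-2) & MASK                        # bit pattern of -2
assert to_signed(ashr(neg2, 1)) == -1     # sign bit copied in
assert lshr(neg2, 1) == (1 << 63) - 1     # zero shifted in
assert ashr(8, 2) == lshr(8, 2) == 2      # no difference for positives
```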

Memory/GC ops: gc_alloc sys_alloc sys_free copy_mem

todo: set set_imm deref set_ref set_byte get_byte set_word get_word csym ccall_arg ccall_ret ccall print_s print_i print_n exit

The _i suffix means the op operates on integer registers, and _n on numeric (floating-point) registers: "Treat *$2 and *$3 as integer or floating-point values, (operate on) them and store the result in *$1."
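To make the three-register form concrete, here is a toy dispatch loop in Python. The tuple encoding, the dict register file, and the initialization convention are all invented for illustration; they do not match M0's real chunk format:

```python
def run(code, regs):
    """Toy register-VM dispatch loop. Each instruction is a tuple
    (op, a, b, c); opcode names echo M0's, but the encoding is invented."""
    pc = 0
    while pc < len(code):
        op, a, b, c = code[pc]
        pc += 1
        if op == "add_i":
            regs[a] = regs[b] + regs[c]
        elif op == "sub_i":
            regs[a] = regs[b] - regs[c]
        elif op == "mult_i":
            regs[a] = regs[b] * regs[c]
        elif op == "isgt_i":
            regs[a] = 1 if regs[b] > regs[c] else 0
        elif op == "goto_if":          # jump to offset a if register b is nonzero
            if regs[b]:
                pc = a
    return regs

# compute 5! -- r0: accumulator, r1: counter, r2: constant 1, r4: constant 0
program = [
    ("mult_i", 0, 0, 1),   # r0 *= r1
    ("sub_i", 1, 1, 2),    # r1 -= 1
    ("isgt_i", 3, 1, 4),   # r3 = (r1 > 0)
    ("goto_if", 0, 3, 0),  # loop back to offset 0 while r3 is nonzero
]
regs = run(program, {0: 1, 1: 5, 2: 1, 3: 0, 4: 0})
assert regs[0] == 120
```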

todo: explain the cryptic ones; descriptions are here: https://github.com/parrot/parrot/blob/m0/docs/pdds/draft/pdd32_m0.pod

"

Parrot

Parrot is also a register based virtual machine. It defines four types of registers:

Integers
Numbers (i.e. floating point)
Strings
Polymorphic Containers (PMCs), which reference complex types and structures

Like LLVM, Parrot does not define a maximum number of registers: each function uses as many registers as it needs. Functions do not re-use registers for different purposes by storing their values to memory, they specify a new register number instead. The Parrot runtime will handle assignment of virtual machine registers to CPU registers.

So far as I can tell, integer registers are the width of the host CPU on which the VM is running. A Parrot bytecode might find itself using either 32 or 64 bit integer registers, determined at runtime and not compile time. This is fascinating if correct, though it seems like BigNum handling would be somewhat complicated by this. " -- http://codingrelic.geekhold.com/2010/07/virtual-instruction-sets-opcode.html
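The "no maximum number of registers" point can be made concrete with a toy code generator that, SSA-style, allocates a fresh virtual register for every intermediate value instead of reusing registers. The instruction format below is invented for illustration, not PIR or PASM:

```python
class Emitter:
    """Toy code generator in Parrot's style: every intermediate value
    gets a fresh virtual register; a later pass (in Parrot, the runtime)
    maps the unbounded virtual registers onto real ones."""
    def __init__(self):
        self.next_reg = 0
        self.code = []

    def fresh(self):
        r = self.next_reg
        self.next_reg += 1
        return r

    def compile(self, expr):
        if isinstance(expr, int):            # literal: load into a fresh register
            r = self.fresh()
            self.code.append(("set", r, expr))
            return r
        op, lhs, rhs = expr                  # ("add" | "mul", e1, e2)
        r = self.fresh()
        self.code.append((op, r, self.compile(lhs), self.compile(rhs)))
        return r

e = Emitter()
e.compile(("add", 1, ("mul", 2, 3)))         # 1 + 2*3
# every destination register is distinct: no reuse, no spills
dests = [ins[1] for ins in e.code]
assert len(dests) == len(set(dests))
```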

## NekoVM

Instructions:

• AccNull AccTrue AccFalse AccThis AccInt AccStack AccGlobal AccEnv AccField AccArray AccIndex AccBuiltin
• SetStack SetGlobal SetEnv SetField SetArray SetIndex SetThis
• Push Pop
• Call Ret ObjCall
• Jump JumpIf JumpIfNot
• Trap EndTrap
• MakeEnv MakeArray
• Bool IsNull IsNotNull
• Add Sub Mult Div Mod Shl Shr UShr Or And Xor Eq Neq Gt Gte Lt Lte Not
• TypeOf Compare Hash New JumpTable Apply AccStack0 AccStack1 AccIndex0 AccIndex1 PhysCompare TailCall Loop MakeArray2 AccInt32 Last

Used as a target for Haxe, but being replaced by the HashLink VM in this capacity. Neko, Haxe, and HashLink have the same lead dev (Neko precedes Haxe). Why is Haxe switching to HashLink, and why isn't HashLink just called 'Neko 3.0'? https://haxe.org/blog/hashlink-indepth/ explains that "Neko was my first complete virtual machine. It was designed before Haxe was even created and was meant to run various programming languages. In the end only two languages ended up targeting Neko: NekoML, the language of the Neko compiler, and of course Haxe. Back then, the Neko virtual machine was not especially designed to run Haxe and suffered from some limitations, the main one being performance....Neko is very slow. The main reason for this is that it's dynamically typed". HashLink, by contrast, is "strictly-typed register-based" (Neko is stack-based / a stack machine: [1]).
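The stack-vs-register contrast can be shown by compiling the same expression both ways. The opcode names below are invented for illustration; they are not Neko's or HashLink's actual encodings:

```python
# a*b + c, stack style (Neko-like): operands flow through an operand stack
stack_code = [("push", "a"), ("push", "b"), ("mul",),
              ("push", "c"), ("add",)]

def run_stack(code, env):
    st = []
    for ins in code:
        if ins[0] == "push":
            st.append(env[ins[1]])
        elif ins[0] == "mul":
            b, a = st.pop(), st.pop(); st.append(a * b)
        elif ins[0] == "add":
            b, a = st.pop(), st.pop(); st.append(a + b)
    return st[-1]

# same expression, register style (HashLink-like): fewer, wider instructions
reg_code = [("mul", 3, 0, 1),   # r3 = r0 * r1
            ("add", 4, 3, 2)]   # r4 = r3 + r2

def run_regs(code, regs):
    for op, d, a, b in code:
        regs[d] = regs[a] * regs[b] if op == "mul" else regs[a] + regs[b]
    return regs[4]

env = {"a": 2, "b": 3, "c": 4}
assert run_stack(stack_code, env) == 10
assert run_regs(reg_code, {0: 2, 1: 3, 2: 4}) == 10
```

The register form executes fewer, larger instructions and never touches an operand stack, which is part of why register VMs tend to dispatch less often per expression.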

---

## Tiny

Pedagogical language

http://marvin.cs.uidaho.edu/Teaching/CS445/tinyGrammar.pdf

http://marvin.cs.uidaho.edu/Teaching/CS445/tmDescription35.txt

---

## C- language

Pedagogical language. Not to be confused with C--.

http://marvin.cs.uidaho.edu/Teaching/CS445/c-Grammar.pdf

---

## Dis

Primitive types: byte, word, string, real, big (and some short versions of some of these)

Addressing modes: immediate, indirect, and double indirect

Garbage Collection

There are separate instructions (omitted below) for calling into a different module (and for allocating a stack frame prior to that call).

### ISA

An x at the end of an instruction name (e.g. orx, subx) stands for a family of instructions of this form for different types.

todo: the rest of these:

new, newz - Allocate object

Syntax:		new	src, dst
newz	src, dst
Function:	dst = malloc(src->size);
initmem(dst, src->map);

The new instruction allocates and initializes storage to a new area of memory. The size and locations of pointers are specified by the type descriptor number given as the src operand. A pointer to the newly allocated object is placed in dst. Any space not occupied by pointers has undefined value.

The newz instruction additionally guarantees that all non-pointer values are set to zero. It is not used by Limbo.

newa, newaz - Allocate array

Syntax:		newa	src1, src2, dst
newaz	src1, src2, dst
Function:	dst = malloc(src2->size * src1);
for(i = 0; i < src1; i++)
initmem(dst + i*src2->size, src2->map);

The newa instruction allocates and initializes an array. The number of elements is specified by the src1 operand. The type of each element is specified by the type descriptor number given as the src2 operand. Space not occupied by pointers has undefined value. The newaz instruction additionally guarantees that all non-pointer values are set to zero; it is not used by Limbo.

newcx - Allocate channel

Syntax:		newcw	dst
newcb	dst
newcl	dst
newcf	dst
newcp	dst
newcm	src, dst
newcmp	src, dst
Function:	dst = new(Channel)

The newc instruction allocates a new channel of the specified type and stores a reference to the channel in dst. For the newcm instruction the source specifies the number of bytes of memory used by values sent on the channel (see the movm instruction above). For the newcmp instruction the first operand specifies a type descriptor giving the length of the structure and the location of pointers within the structure (see the movmp instruction above).

orx - Logical OR

Syntax:		orb	src1, src2, dst
orw	src1, src2, dst
orl	src1, src2, dst
Function:	dst = src1 | src2

These instructions compute the bitwise OR of the two operands addressed by src1 and src2 and store the result in the dst operand.

recv - Receive from channel

Syntax:		recv	src, dst
Function:	dst = <-src

The recv instruction receives a value from some other thread on the channel specified by the src operand. Communication is synchronous, so the calling thread will block until a corresponding send or alt is performed on the channel. The type of the received value is determined by the channel type and the dst operand specifies where to place the received value.

ret - Return from function

Syntax:		ret
Function:	npc = link(fp)
mod = mod(fp)
fp = frame(fp)
pc = npc

The ret instruction returns control to the instruction after the call of the current function.

send - Send to channel

Syntax:		send	src, dst
Function:	dst <-= src

The send instruction sends a value from this thread to some other thread on the channel specified by the dst operand. Communication is synchronous, so the calling thread will block until a corresponding recv or alt is performed on the channel. The type of the sent value is determined by the channel type and the src operand specifies where to retrieve the sent value.

shlx - Shift left arithmetic

Syntax:		shlb	src1, src2, dst
shlw	src1, src2, dst
shll	src1, src2, dst
Function:	dst = src2 << src1

The shl instructions shift the src2 operand left by the number of bits specified by the src1 operand and store the result in the dst operand. Shift counts less than 0 or greater than the number of bits in the object have undefined results.

shrx - Shift right arithmetic

Syntax:		shrb	src1, src2, dst
shrw	src1, src2, dst
shrl	src1, src2, dst
Function:	dst = src2 >> src1

The shr instructions shift the src2 operand right by the number of bits specified by the src1 operand and store the result in the dst operand. Shift counts less than 0 or greater than the number of bits in the object have undefined results.

slicea - Slice array

Syntax:		slicea	src1, src2, dst
Function:	dst = dst[src1:src2]

The slicea instruction creates a new array, which contains the elements from the index at src1 to the index src2-1. The new array is a reference array which points at the elements in the initial array. The initial array will remain allocated until both arrays are no longer referenced.

slicec - Slice string

Syntax:		slicec	src1, src2, dst
Function:	dst = dst[src1:src2]

The slicec instruction creates a new string, which contains characters from the index at src1 to the index src2-1. Unlike slicea, the new string is a copy of the elements from the initial string.

slicela - Assign to array slice

Syntax:		slicela	  src1, src2, dst
Function:	dst[src2:] = src1

The src1 and dst operands must be arrays of equal types. The src2 operand is a non-negative integer index. The src1 array is assigned to the array slice dst[src2:]; src2 + nelem(src1) must not exceed nelem(dst).

spawn - Spawn function

Syntax:		spawn	src, dst
Function:	fork();
if(child)
dst(src);

The spawn instruction creates a new thread and calls the function specified by the dst operand. The argument frame passed to the thread function is specified by the src operand and should have been created by the frame instruction.

subx - Subtract

Syntax:		subb	src1, src2, dst
subf	src1, src2, dst
subw	src1, src2, dst
subl	src1, src2, dst
Function:	dst = src2 - src1

The sub instructions subtract the operands addressed by src1 and src2 and store the result in the dst operand. For subb, the result is truncated to eight bits.

tail - Tail of list

Syntax:		tail	src, dst
Function:	dst = src->next

The tail instruction takes the list specified by the src operand and creates a reference to a new list with the head removed, which is stored in the dst operand.

tcmp - Compare types

Syntax:		tcmp	src, dst
Function:	if(typeof(src) != typeof(dst))
error("typecheck");

The tcmp instruction compares the types of the two pointers supplied by the src and dst operands. The comparison will succeed if the two pointers were created from the same type descriptor or the src operand is nil; otherwise, the program will error. The dst operand must be a valid pointer.

xorx - Exclusive OR

Syntax:		xorb	src1, src2, dst
xorw	src1, src2, dst
xorl	src1, src2, dst
Function:	dst = src1 ^ src2

These instructions compute the bitwise exclusive-OR of the two operands addressed by src1 and src2 and store the result in the dst operand.
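The aliasing distinction between slicea and slicec above can be mimicked in Python, where a memoryview aliases the underlying array while a bytes copy does not (Python slicing is half-open, matching Dis's src1..src2-1 range):

```python
# slicea: the slice is a view onto the original array's elements
arr = bytearray(b"abcdef")
view = memoryview(arr)[1:4]      # like: slicea 1, 4, arr
arr[2] = ord("X")                # mutate the original array...
assert bytes(view) == b"bXd"     # ...and the slice sees the change

# slicec: the string slice is an independent copy
s = bytearray(b"abcdef")
copy = bytes(s[1:4])             # like: slicec 1, 4, s
s[2] = ord("X")
assert copy == b"bcd"            # the copy is unaffected
```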

## Guile's VM

### Guile 2.0: stack machine

About 180 instructions: http://www.gnu.org/software/guile/manual/html_node/Instruction-Set.html#Instruction-Set

A selection of some of the instructions (quotes and paraphrased quotes from the documentation):

Lexical Environment Instructions:

These instructions access and mutate the lexical environment of a compiled procedure—its free and bound variables.

• local-ref, local-set, local-bound?: Push onto the stack the value of the local variable located at x within the current stack frame; Pop from the stack and assign to local variable; Push #t on the stack if the x-th local variable has been assigned, or #f otherwise.
• free-ref: Push the value of the captured variable located at position x within the program’s vector of captured variables.
• make-closure: Pop x values and a program object off the stack in that order, and push a new program object closing over the given free variables.
• fix-closure: Fix up the free variables array of the closure stored in the x-th local variable. This instruction will pop as many values from the stack as are in the corresponding closure’s free variables array. The topmost value on the stack will be stored as the closure’s last free variable, with other values filling in free variable slots in order. fix-closure is part of a hack for allocating mutually recursive procedures. The hack is to store the procedures in their corresponding local variable slots, with space already allocated for free variables. Then once they are all in place, this instruction fixes up their procedures’ free variable bindings in place. This allows most letrec-bound procedures to be allocated unboxed on the stack.

Top-Level Environment Instructions:

These instructions access values in the top-level environment: bindings that were not lexically apparent at the time that the code in question was compiled. The location in which a toplevel binding is stored can be looked up once and cached for later. The binding itself may change over time, but its location will stay constant. Currently only toplevel references within procedures are cached, as only procedures have a place to cache them, in their object tables.

• toplevel-ref, toplevel-set: Push the value of the toplevel binding whose location is stored in at position x in the current procedure’s object table; Pop a value off the stack, and set it as the value of the toplevel variable stored at x in the object table
• define: Pop a symbol and a value from the stack, in that order. Look up its binding in the current toplevel environment, creating the binding if necessary. Set the variable to the value.
• link-now: Pop a value, x, from the stack. Look up the binding for x, according to the rules for toplevel-ref, and push that variable on the stack. If the lookup fails, an error will be signalled.
• variable-ref, variable-set, variable-bound?: Dereference the variable object which is on top of the stack and replace it by the value of the variable it represents; Pop off two objects from the stack, a variable and a value, and set the variable to the value; Pop off the variable object from top of the stack and push #t if it is bound, or #f otherwise.
• make-variable: Replace the top object on the stack with a variable containing it (ie. box a value). Used in some circumstances when compiling letrec expressions.

Procedure Call and Return Instructions:

• new-frame: Push a new frame on the stack, reserving space for the dynamic link, return address, and the multiple-values return address. The frame pointer is not yet updated, because the frame is not yet active – it has to be patched by a call instruction to get the return address.
• call(nargs), apply(nargs): Call the procedure located at sp[-nargs] with the nargs arguments located from sp[-nargs + 1] to sp[0]. This instruction requires that a new frame be pushed on the stack before the procedure, via new-frame. It patches up that frame with the current ip as the return address, then dispatches to the first instruction in the called procedure, relying on the called procedure to return one value to the newly-created continuation. Because the new frame pointer will point to sp[-nargs + 1], the arguments don’t have to be shuffled around – they are already in place. apply is the same except it takes a single item on the stack, which is a list that contains the arguments to the called function.
• tail-call(nargs), tail-apply: Like call and apply, except reuses the current stack frame rather than creating a new one
• call/nargs, tail-call/nargs: like call and tail-call, except the number of items is given on the stack rather than as an immediate constant in the instruction stream
• mv-call: like call, except that a multiple-value continuation is created in addition to a single-value continuation, ie. the called function can return multiple values
• return: return a single value from a call: free the program’s frame, returning the top value from the stack to the current continuation. (The stack should have exactly one value on it.)
• return/values(nvalues), return/nvalues, return/values*: return multiple values. return/values(nvalues) gives the number of values to return in the instruction stream (an immediate constant), return/nvalues dynamically takes the number of values to return from the top of the stack, return/values* takes a list of values to return
• truncate-values nbinds nrest: Used in multiple-value continuations, this instruction takes the values that are on the stack (including the number-of-values marker) and truncates them for a binding construct. For example, a call to (receive (x y . z) (foo) ...) would, logically speaking, pop off the values returned from (foo) and push them as three values, corresponding to x, y, and z. In that case, nbinds would be 3, and nrest would be 1 (to indicate that one of the bindings was a rest argument). Signals an error if there is an insufficient number of values.
• call/cc, tail-call/cc: Capture the current continuation, and then call (or tail-call) the procedure on the top of the stack, with the continuation as the argument.

Function Prologue Instructions: todo

Trampoline Instructions:

Branch Instructions:

Data Constructor Instructions:

• make-int8/16/uint8/uint16, make-int8:0 (constant 0), make-int8:1
• make-false, make-true
• make-nil
• make-eol
• make-char8/32
• make-symbol
• make-keyword
• list: Pops off the top n values off of the stack, consing them up into a list, then pushes that list on the stack.
• vector
• make-struct: Make a new struct from the top n values on the stack. The values are popped, and the new struct is pushed. The deepest value is used as the vtable for the struct, and the rest are used in order as the field initializers.
• make-array: Pop an array shape from the stack, then pop the remaining n values, pushing a new array.
• object-ref: Push nth value from the current program’s object vector

Dynamic Environment Instructions:

Miscellaneous Instructions:

Inlined Scheme Instructions:

• not, not-not
• eq?, not-eq?, eqv?, equal?: eq? is #t if x and y are the same object, except for numbers and characters, which are *unspecified*. eqv? is #t if x and y are the same object, or for characters and numbers the same value. equal? is structural equality (return #t if x and y are the same type, and their contents or value are equal)
• null?, not-null?
• pair?, list?, struct?
• set-car!(pair,x), set-cdr!(pair,x)
• cons, car, cdr
• vector-ref(x,y), vector-set(x,n,y), struct-ref(x,y), struct-set(x,n,y), slot-ref(x,y), slot-set(x,n,y)
• struct-vtable, class-of

Inlined Mathematical Instructions:

Inlined Bytevector Instructions:

### Guile 2.2's VM: register machine

A selection of some of the instructions (quotes and paraphrased quotes from the documentation):

Each VM instruction starts by indicating which operation it is, and then follows by encoding its source and destination operands. Each procedure declares that it has some number of local variables, including the function arguments. These local variables form the available operands of the procedure, and are accessed by index.

The local variables for a procedure are stored on a stack. Calling a procedure typically enlarges the stack, and returning from a procedure shrinks it. Stack memory is exclusive to the virtual machine that owns it.

In addition to their stacks, virtual machines also have access to the global memory (modules, global bindings, etc) that is shared among other parts of Guile, including other VMs.

The registers that a VM has are as follows:

ip - Instruction pointer
sp - Stack pointer
fp - Frame pointer

The structure of the top stack frame is as follows:

 ...
+==================+ <- fp + 2 = SCM_FRAME_PREVIOUS_SP (fp)
| Dynamic link     |
+------------------+
| Return address   |
+==================+ <- fp
| Local 0          |
+------------------+
| Local 1          |
+------------------+
| ...              |
+------------------+
| Local N-1        |
\------------------/ <- sp

The dynamic link is the offset of the fp that was in effect before this program was applied, relative to the current fp.
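A toy model of following dynamic links to walk the frame chain. The slot assignments (return address at fp, dynamic link at fp+1) and the stack values are invented for illustration; the real layout is the diagram's:

```python
# A toy stack: indices are word offsets. For each frame, the word at
# fp holds the return address and the word at fp+1 holds the dynamic
# link -- the offset of the caller's fp relative to this frame's fp.

stack = [0] * 32

def push_frame(stack, fp, new_fp, ra):
    stack[new_fp] = ra                # return address
    stack[new_fp + 1] = fp - new_fp   # dynamic link, relative to new fp
    return new_fp

fp = push_frame(stack, 0, 8, ra=100)    # outer call
fp = push_frame(stack, fp, 16, ra=200)  # nested call

def walk(fp, stack):
    """Return frame pointers from innermost to outermost."""
    chain = []
    while fp > 0:
        chain.append(fp)
        fp = fp + stack[fp + 1]       # follow the dynamic link
    return chain

assert walk(fp, stack) == [16, 8]
```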

A function call in Guile is very cheap: the VM simply hands control to the procedure. The procedure itself is responsible for asserting that it has been passed an appropriate number of arguments. For example, only calls to keyword-argument procedures “pay” for the cost of parsing keyword arguments. (At the time of this writing, calling procedures with keyword arguments is typically two to four times as costly as calling procedures with a fixed set of arguments.)

Consider the following Scheme code as an example:

(define (foo a)
(lambda (b) (list foo a b)))

Within the lambda expression, foo is a top-level variable, a is a lexically captured variable, and b is a local variable.

Another way to refer to a and b is to say that a is a “free” variable, since it is not defined within the lambda, and b is a “bound” variable.

Going back to our example, b may be allocated on the stack, as it is never mutated.

a may also be allocated on the stack, as it too is never mutated. Within the enclosed lambda, its value will be copied into (and referenced from) the free variables vector.

foo is a top-level variable, because foo is not lexically bound in this example.

Guile allocates all variables on the stack. When a lexically enclosed procedure with free variables—a closure—is created, it copies those variables into its free variable vector. References to free variables are then redirected through the free variable vector.

If a variable is ever set!, it needs to be heap-allocated rather than stack-allocated, so that different closures that capture the same variable see the same value, and so that continuations can capture a reference to the variable rather than its value at one point in time. For these reasons, set! variables are allocated in "boxes" (actually, in variable cells).

Thus perhaps counterintuitively, what would seem “closer to the metal”, viz set!, actually forces an extra memory allocation and indirection. Sometimes Guile’s optimizer can remove this allocation, but not always.
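The box indirection can be seen in miniature in Python, using a one-element list as the "variable cell" so that two closures share the variable's identity rather than a snapshot of its value. (Python closures share cells natively; the explicit box just makes the mechanism visible.)

```python
def make_counter():
    box = [0]                      # the set!-style variable lives in a box
    def incr():                    # both closures capture the box itself...
        box[0] += 1
    def read():
        return box[0]              # ...so they share one cell, not a copy
    return incr, read

incr, read = make_counter()
incr(); incr()
assert read() == 2                 # both closures observed the same cell
```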

A compiled procedure is a compound object consisting of its bytecode and a reference to any captured lexical variables. In addition, when a procedure is compiled, it has associated metadata written to side tables, for instance a line number mapping, or its docstring. You can pick apart these pieces with the accessors in (system vm program). A procedure may reference data that was statically allocated when the procedure was compiled.

We can see how these concepts tie together by disassembling the foo function we defined earlier to see what is going on:

scheme@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
scheme@(guile-user)> ,x foo
Disassembly of #<procedure foo (a)> at #xea4ce4:


0    (assert-nargs-ee/locals 2 0)    ;; 2 slots (1 arg)    at (unknown file):1:0
1    (make-closure 1 7 1)            ;; anonymous procedure at #xea4d04 (1 free var)
4    (free-set! 1 0 0)               ;; free var 0
6    (mov 0 1)
7    (return-values 2)               ;; 1 value

Disassembly of anonymous procedure at #xea4d04:

0    (assert-nargs-ee/locals 2 2)    ;; 4 slots (1 arg)    at (unknown file):1:16
1    (toplevel-box 1 74 58 68 #t)    ;; foo'
6    (box-ref 1 1)
7    (make-short-immediate 0 772)    ;; ()                 at (unknown file):1:28
8    (cons 2 2 0)
9    (free-ref 3 3 0)                ;; free var 0
11    (cons 3 3 2)
12    (cons 2 1 3)
13    (return-values 2)               ;; 1 value

Lexical Environment Instructions:

• mov
• long-fmov: Copy a value from one local slot to another, but addressing slots relative to the fp instead of the sp. This is used when shuffling values into place after multiple-value returns.
• make-closure: Make a new closure, and write it to dst. The code for the closure will be found at offset words from the current ip. offset is a signed 32-bit integer. Space for nfree free variables will be allocated.
• free-ref: Load free variable idx from the closure src into local slot dst
• free-set!: Set free variable idx from the closure dst to src. This instruction is usually used when initializing a closure’s free variables, but not to mutate free variables, as variables that are assigned are boxed.
• box: Create a new variable holding src, and place it in dst.
• box-ref: Unpack the variable at src into dst, asserting that the variable is actually bound.
• box-set!: Set the contents of the variable at dst to src.

Top-Level Environment Instructions:

• current-module: Store the current module in dst.
• resolve: Resolve sym in the current module, and place the resulting variable in dst. An error will be signalled if no variable is found. If bound? is true, an error will be signalled if the variable is unbound.
• define!: Look up a binding for sym in the current module, creating it if necessary. Set its value to val.
• toplevel-box: Load a value. The value will be fetched from memory, var-offset 32-bit words away from the current instruction pointer. var-offset is a signed value. Up to here, toplevel-box is like static-ref. Then, if the loaded value is a variable, it is placed in dst, and control flow continues. Otherwise, we have to resolve the variable. In that case we load the module from mod-offset, just as we loaded the variable. Usually the module gets set when the closure is created. sym-offset specifies the name, as an offset to a symbol. We use the module and the symbol to resolve the variable, placing it in dst, and caching the resolved variable so that we will hit the cache next time.
• module-box: Like toplevel-box, except mod-offset points at a module identifier instead of the module itself. A module identifier is a module name, as a list, prefixed by a boolean. If the prefix is true, then the variable is resolved relative to the module’s public interface instead of its private interface.

Procedure Call and Return Instructions:

• call: Call a procedure. proc is the local corresponding to a procedure. The two values below proc will be overwritten by the saved call frame data. The new frame will have space for nlocals locals: one for the procedure, and the rest for the arguments which should already have been pushed on. When the call returns, execution proceeds with the next instruction. There may be any number of values on the return stack; the precise number can be had by subtracting the address of proc from the post-call sp.
• call-label: This instruction is just like call, except that instead of dereferencing proc to find the call target, the call target is known to be at label, a signed 32-bit offset in 32-bit units from the current ip.
• tail-call: Tail-call a procedure. Requires that the procedure and all of the arguments have already been shuffled into position. Will reset the frame to nlocals.
• tail-call-label: As call is to call-label, tail-call is to tail-call-label.
• tail-call/shuffle: Tail-call a procedure. The procedure should already be set to slot 0. The rest of the args are taken from the frame, starting at from, shuffled down to start at slot 0. This is part of the implementation of the call-with-values builtin.
• receive: Receive a single return value from a call whose procedure was in proc, asserting that the call actually returned at least one value. Afterwards, resets the frame to nlocals locals.
• receive-values: Receive a return of multiple values from a call whose procedure was in proc.
• return-values: Return a number of values from a call frame. This opcode corresponds to an application of values in tail position. As with tail calls, we expect that the values have already been shuffled down to a contiguous array starting at slot 1. If nlocals is nonzero, reset the frame to hold that number of locals. Note that a frame reset to 1 local returns 0 values.
• call/cc: Capture the current continuation, and tail-apply the procedure in local slot 1 to it. This instruction is part of the implementation of call/cc, and is not generated by the compiler.

Function Prologue Instructions:

• assert-nargs-ee, assert-nargs-ge, assert-nargs-le: If the number of actual arguments is not ==, >=, or <= expected, respectively, signal an error. The number of arguments is determined by subtracting the stack pointer from the frame pointer (fp - sp).
• br-if-nargs-ne, br-if-nargs-lt, br-if-nargs-gt: If the number of actual arguments is not equal, less than, or greater than expected, respectively, add offset, a signed 24-bit number, to the current instruction pointer. These instructions are used to implement multiple arities, as in case-lambda.
• alloc-frame: Ensure that there is space on the stack for nlocals local variables, setting them all to SCM_UNDEFINED, except those values that are already on the stack.
• reset-frame: Like alloc-frame, but doesn’t check that the stack is big enough, and doesn’t initialize values to SCM_UNDEFINED. Used to reset the frame size to something less than the size that was previously set via alloc-frame.
• br-if-npos-gt: Find the first positional argument after nreq. If it is greater than npos, jump to offset. This instruction is only emitted for functions with multiple clauses, and an earlier clause has keywords and no rest arguments.
• bind-kwargs
• bind-rest: Collect any arguments at or above dst into a list, and store that list at dst.
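Putting the prologue pieces together, a sketch of what assert-nargs-ge, bind-rest, and alloc-frame accomplish for one call. The list-of-locals representation and the "SCM_UNDEFINED" placeholder value are illustrative stand-ins:

```python
def prologue(locals_, nreq, nlocals, rest_at=None):
    """Toy function prologue. locals_ arrives holding just the
    actual arguments, as in Guile's calling convention."""
    if len(locals_) < nreq:                # assert-nargs-ge
        raise TypeError("too few arguments")
    if rest_at is not None:                # bind-rest: collect extras into a list
        locals_ = locals_[:rest_at] + [locals_[rest_at:]]
    while len(locals_) < nlocals:          # alloc-frame: pad with undefined
        locals_.append("SCM_UNDEFINED")
    return locals_

# one required arg, a rest parameter, and a 4-slot frame
frame = prologue([1, 2, 3, 4], nreq=1, nlocals=4, rest_at=1)
assert frame == [1, [2, 3, 4], "SCM_UNDEFINED", "SCM_UNDEFINED"]
```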

Branch Instructions:

• br: Add offset to the current instruction pointer.
• br-if-true, br-if-null, br-if-nil, br-if-pair, br-if-struct, br-if-char, br-if-eq, br-if-eqv, br-if-=, br-if-<, br-if-<=
• br-if-tc7: If the value in test has the TC7 given in the second word, add offset to the current instruction pointer. TC7 codes are part of the way Guile represents non-immediate objects, and are deep wizardry. See libguile/tags.h for all the details.
• br-if-logtest: If the bitwise intersection of the integers in a and b is nonzero

Dynamic Environment Instructions:

• abort: Abort to a prompt handler. The tag is expected in slot 1, and the rest of the values in the frame are returned to the prompt handler. This corresponds to a tail application of abort-to-prompt.
• prompt: Push a new prompt on the dynamic stack, with a tag from tag and a handler at handler-offset words from the current ip.
• wind: Push wind and unwind procedures onto the dynamic stack. See dynamic-wind
• unwind: a normal exit from the dynamic extent of an expression. Pop the top entry off of the dynamic stack.
• push-fluid: Dynamically bind value to fluid by creating a with-fluids object and pushing that object on the dynamic stack. A fluid is an object that can store one value per dynamic state. Each thread has a current dynamic state, and when accessing a fluid, this current dynamic state is used to provide the actual value. In this way, fluids can be used for thread local storage, but they are in fact more flexible: dynamic states are objects of their own and can be made current for more than one thread at the same time, or only be made current temporarily, for example. Fluids can also be used to simulate the desirable effects of dynamically scoped variables. See Fluids and Dynamic States.
• pop-fluid
• fluid-ref
• fluid-set
• current-thread: Write the value of the current thread to dst.
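The push-fluid/pop-fluid pair can be sketched as saving and restoring values on a dynamic stack. This is a hypothetical Python model (Fluid, push_fluid, pop_fluid are my names); Guile's real fluids store one value per dynamic state, which is not modeled here.

```python
# Hypothetical sketch of fluid dynamic binding via a dynamic stack of
# (fluid, saved-value) entries.
class Fluid:
    def __init__(self, default=None):
        self.value = default

dynamic_stack = []

def push_fluid(fluid, value):
    dynamic_stack.append((fluid, fluid.value))  # save the outer value
    fluid.value = value                         # install the inner value

def pop_fluid():
    fluid, saved = dynamic_stack.pop()
    fluid.value = saved                         # restore on exit

f = Fluid(default=0)
push_fluid(f, 42)
inner = f.value   # 42 inside the dynamic extent
pop_fluid()
outer = f.value   # 0 restored after the extent
```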

Miscellaneous Instructions:

• halt
• push, pop, drop (drop pops the stack pointer by count words, discarding any values that were stored there)

Inlined Scheme Instructions:

• make-vector, vector-length, vector-ref, vector-set!
• struct-vtable, allocate-struct, struct-ref, struct-set!
• make-array, string-length, string-ref,
• class-of
• cons, car, cdr, set-car!, set-cdr!
• integer->char, char->integer

There are also '/immediate' (immediate addr mode) variants of the vector and struct ops.

Inlined Mathematical Instructions:

• add, sub, mul, div, quo, rem, mod
• shifts: ash
• bitwise boolean ops: logand, logior, logxor, logsub

There are also '/immediate' (immediate addr mode) variants of 'add' and 'sub'.

Unboxed Floating-Point Arithmetic instructions:

• scm->f64: unbox
• f64->scm: box
• load-f64: Load a 64-bit value formed by joining high-bits and low-bits, and write it to dst.
• fadd, fsub, fmul, fdiv

There are also a set of Constant Instructions, a set of Inlined Bytevector Instructions, and a set of Unboxed Integer Arithmetic instructions (interestingly this has both a left shift and a right shift).
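load-f64's joining of two halves can be sketched in Python, assuming (my assumption about the exact layout) that high-bits and low-bits are the upper and lower 32-bit halves of an IEEE-754 double's bit pattern:

```python
import struct

def load_f64(high_bits, low_bits):
    # Join two 32-bit halves into one 64-bit pattern and reinterpret
    # it as an IEEE-754 double.
    raw = (high_bits << 32) | low_bits
    return struct.unpack("<d", struct.pack("<Q", raw))[0]

# 1.0 has the 64-bit pattern 0x3FF0000000000000
x = load_f64(0x3FF00000, 0x00000000)
```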

## Neohapsis Toy VM 101

http://recon.cx/2008/a/craig_smith/Neohapsis-VM-101.pdf

Four General Purpose Registers: r1, r2, r3, r4

Instruction Registers: IP, baseip

Stack Registers: SP, basesp

Flag Registers: flags

Our Virtual CPU Instruction Set:

• MOV r32, r32
• MOV [r1], r32
• MOV r1, [r1]
• MOV r32, value
• CMP r32, value
• INC/DEC r32
• AND/OR r1, value
• XOR r32, r32
• PUSH/POP r32
• JMP (relative / direct)
• JE, JL, JG
• CALL (r1 / value)
• EXITVM

## Mesa and Cedar

Mesa is interesting because a paper was published claiming that its bytecode had a very high code density (roughly 4 bytes per HLL statement; http://en.wikipedia.org/wiki/Mesa_%28programming_language%29#cite_ref-3 ).

## Simplicity

https://blockstream.com/simplicity.pdf

"Simplicity is a typed, combinator-based, functional language without loops and recursion, designed to be used for crypto-currencies and blockchain applications"

Opinions:

much of the following is quoted or paraphrased from [2]

types:

• the unit type which only has one element (they write this element '1' but it's usually written '()')
• sum types A + B where A and B are types (this type contains the tagged union of values from either A or B)
• product types A * B containing pairs where the left item in the pair is from A and the right item is from B

core Simplicity has 9 types of terms:

• unit x: returns the singular value of the unit type and ignores x
• injl t and injr t: create tagged values (injl creates a value of the left type of the sum type, and injr creates a value of the right type of the sum type. Note: an example is given where there is a sum type "bit" whose left and right types represent 0 and 1, respectively)
• case s t: evaluates one of its two sub-expressions based on the tag of the first component of its input
• pair s t: creates pairs
• take t and drop t: access first and second components of a pair respectively
• iden x: x (the identity function for any type)
• comp s t: is function composition
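The 9 core terms can be sketched as Python functions. The encoding here is my own, not official Simplicity: unit is (), sum values are ('L', v) or ('R', v), pairs are 2-tuples. The standard boolean 'not' construction on bits then falls out of case and the injections.

```python
# Core Simplicity terms as Python functions (hypothetical encoding).
unit = lambda x: ()                     # ignore input, return the unit value
iden = lambda x: x                      # identity
injl = lambda t: lambda x: ('L', t(x))  # tag with the left injection
injr = lambda t: lambda x: ('R', t(x))  # tag with the right injection
pair = lambda s, t: lambda x: (s(x), t(x))
take = lambda t: lambda x: t(x[0])      # use the first component of a pair
drop = lambda t: lambda x: t(x[1])      # use the second component of a pair
comp = lambda s, t: lambda x: t(s(x))   # composition: s, then t

def case(s, t):
    # dispatch on the tag of the first component of the input pair
    def run(x):
        (tag, v), rest = x
        return s((v, rest)) if tag == 'L' else t((v, rest))
    return run

# bits: 0 is the left injection, 1 is the right injection
zero, one = ('L', ()), ('R', ())
# boolean 'not': pair the bit with unit, then case-swap the injections
not_ = comp(pair(iden, unit), case(injr(unit), injl(unit)))
```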

Simplicity's 'bit machine' has "two non-empty stacks of data frames. One stack holds read-only frames, and the other stack holds write-only frames. A data frame consists of an array of cells and a cursor pointing either at one of the cells or pointing just beyond the end of the array. Each cell contains one of three values: a zero bit, a one bit, or an undefined value." and 10 instructions:

• nop
• write(b): write bit value b to the cell under the active write frame’s cursor and advance the cursor by one cell. The value b must be either 0 or 1 or else the machine crashes.
• copy(n): copy n cells from the active read frame, starting at its cursor, to the active write frame, starting at its cursor, and advance the active write frame’s cursor by n cells. Note that undefined values can be copied
• skip(n): advance the active write frame’s cursor by n cells.
• fwd(n)/bwd(n): advance/retreat the active read frame's cursor by n
• newFrame(n): allocate a new frame with n cells and push it onto the write-frame stack, making it the new active write frame.
• moveFrame: pop a frame off the write-frame stack and push it onto the read-frame stack
• dropFrame: pop a frame off the read-frame stack and deallocate it
• read: This instruction doesn’t modify the state of the machine. Instead it returns the value of the bit under the active read frame’s cursor to the machine’s operator

I think Simplicity terms are translated into sequences of bit machine instructions?
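The bit machine can be sketched in Python. This is a partial, hypothetical model covering only a few of the 10 instructions; in particular, resetting the cursor to the start of a frame in moveFrame is my assumption.

```python
# Partial sketch of the Simplicity bit machine: two stacks of frames, each
# frame an array of cells (0, 1, or None for the undefined value) plus a cursor.
class Frame:
    def __init__(self, n):
        self.cells = [None] * n   # None models the undefined value
        self.cursor = 0

class BitMachine:
    def __init__(self, read_frame):
        self.reads = [read_frame]   # read-only frame stack
        self.writes = []            # write-only frame stack

    def new_frame(self, n):
        self.writes.append(Frame(n))

    def write(self, b):
        assert b in (0, 1)          # the machine crashes otherwise
        f = self.writes[-1]
        f.cells[f.cursor] = b
        f.cursor += 1

    def copy(self, n):
        r, w = self.reads[-1], self.writes[-1]
        w.cells[w.cursor:w.cursor + n] = r.cells[r.cursor:r.cursor + n]
        w.cursor += n               # note: undefined values can be copied

    def skip(self, n):
        self.writes[-1].cursor += n

    def fwd(self, n):
        self.reads[-1].cursor += n

    def move_frame(self):
        f = self.writes.pop()
        f.cursor = 0                # resetting the cursor is my assumption
        self.reads.append(f)

    def read(self):
        f = self.reads[-1]
        return f.cells[f.cursor]

src = Frame(2); src.cells = [1, 0]
m = BitMachine(src)
m.new_frame(2)
m.copy(2)          # copy both cells from the read frame
m.move_frame()     # the copy becomes the active read frame
```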

## Ethereum Virtual Machine (EVM) (for cryptocurrency smart contracts)

Ethereum is a platform for writing applications on top of a blockchain. The applications can send messages and currency between each other. The applications are run in virtual machines by third parties, and are charged 'gas' for each computational step.

It has a stack, local random-access memory ('local' as in function-local, like a locally scoped variable; this memory appears empty at the beginning of each call (even if this is a reentrant call), and when the call returns to a parent function the parent's memory is intact), and persistent storage. The word size is 256 bits, and all arithmetic is mod 2^256. There is a 1024 call depth limit. Many opcodes return '0' in case of an error (rather than terminating execution). There is no built-in 'cron' or 'timer' facility for programs to be executed periodically; all execution must be triggered externally by a user (this external user may, however, be an external cron script). Each step of execution costs 'gas'; the fees may be seen at http://ether.fund/tool/gas-fees. At any given time, there is a maximal execution time (in terms of 'gas') (the GASLIMIT opcode returns the current limit). The current gas limit may be seen at https://stats.ethdev.com/. At the time of this writing, the limit is 3141592.

https://github.com/ethereum/wiki/wiki/Design-Rationale#virtual-machine talks a bit about some design choices.

### opcodes

A list of 127 (as of this writing) opcodes is at http://gavwood.com/Paper.pdf , appendix H 'Virtual Machine Specification', PDF page 22.

https://github.com/ethereum/wiki/wiki/Ethereum-Development-Tutorial (an older version with 56 opcodes may be found at http://web.archive.org/web/20140820233358/https://github.com/ethereum/wiki/wiki/Ethereum-Development-Tutorial , section 'Virtual machine opcodes'; this older version also has slightly different instruction description phrasing that may be easier to read, and is the source for much of the wording of the instruction descriptions below; differences between the versions include the addition of ADDMOD, MULMOD, EXTCODESIZE, EXTCODECOPY, BLOCKHASH replacing PREVHASH, JUMPDEST, LOG*, CALLCODE, SIGNEXTEND, DUP2+, SWAP2+, the removal of NEG, and the return value of CREATE).

The VM halts with an error if:

• it runs out of 'gas'
• the instruction is (statically) invalid
• stack underflow
• stack overflow (there would be more than 1024 items in the stack after executing the next instruction)
• a jump destination is invalid

These conditions can be checked before executing each instruction, so "no instruction can, through its execution, cause an exceptional halt" -- [3]

Here's the list:

Terminating opcode:

• STOP: stops execution

Basic opcodes:

• ADD
• MUL
• SUB
• DIV
• SDIV: signed integer division
• MOD
• SMOD: signed mod
• ADDMOD: (a + b) % c (this is different from ADD and then MOD because there is no implicit 'mod 2^256' after a+b but before % c; this is used in cryptography [4])
• MULMOD: (a * b) % c (this is different from MUL and then MOD because there is no implicit 'mod 2^256' after a*b but before % c; this is used in cryptography [5])
• EXP: exponentiation
• SIGNEXTEND: Extend length of two's complement signed integer ("the purpose of SIGNEXTEND is to facilitate typecasting from a larger signed integer to a smaller signed integer. Small signed integers are useful because JIT-compiled virtual machines may in the future be able to detect long-running chunks of code that deals primarily with 32-byte integers and speed it up considerably." [6])
• LT: less-than
• GT
• SLT: signed less-than
• SGT
• EQ
• ISZERO
• AND: bitwise AND
• OR: bitwise OR
• XOR: bitwise XOR
• BYTE: pops 2 values a, b, pushes the a-th byte of b (zero if a >= 32)
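A toy Python sketch of a few of these arithmetic opcodes shows the implicit mod 2^256 wraparound, the return-0-on-error convention for DIV, and why ADDMOD differs from ADD followed by MOD (the names and stack helpers are mine, not from the spec):

```python
# Toy sketch of EVM-style arithmetic on a stack of 256-bit words.
MOD = 2 ** 256
stack = []

def push(v): stack.append(v % MOD)   # every result wraps mod 2^256
def pop(): return stack.pop()

def op_add():
    push(pop() + pop())

def op_div():
    a, b = pop(), pop()              # a is the top of stack (the numerator)
    push(a // b if b != 0 else 0)    # division by zero yields 0, no trap

def op_addmod():
    a, b, c = pop(), pop(), pop()
    push((a + b) % c if c != 0 else 0)  # no mod 2^256 between a+b and % c

# ADDMOD vs ADD-then-MOD differ when a+b overflows 256 bits:
push(3); push(2); push(MOD - 1)      # c=3, b=2, a=2^256-1
op_addmod()
addmod_result = pop()                # (2^256 + 1) mod 3 = 2, no truncation

push(0); push(7)                     # numerator 7 on top, denominator 0 below
op_div()
div_by_zero_result = pop()           # 0, not an exception
```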

Cryptographic opcodes:

• SHA3: pops 2 values a, b, pushes SHA3(memory[a: a+b])

Ethereum message opcodes:

• ADDRESS: pushes the address of the currently executing account
• BALANCE: pushes the balance of the given account
• ORIGIN: pushes the original sending account of the transaction that led to the current message (ie. the account that pays for the gas) ("the primary use of the ORIGIN opcode, which provides the sender of a transaction, is to allow contracts to make refund payments for gas." [7])
• CALLER: pushes the sender of the current message
• CALLVALUE: pushes the ether value sent with the current message (note: there is a proposal to deprecate this: see https://github.com/ethereum/EIPs/issues/28#issuecomment-158625666 )
• CALLDATALOAD: pops 1 value a, pushes msgdata[a: a + 32] where msgdata is the message data. All out-of-bounds bytes are assumed to be zero
• CALLDATASIZE: pushes len(msgdata)
• CALLDATACOPY: pops 3 values a, b, c, copies msgdata[b: b+c] to memory[a: a+c]
• CODESIZE: pushes len(code) where code is the contract's code
• CODECOPY: copy currently running code to memory; pops 3 values a, b, c, copies code[b: b+c] to memory[a: a+c]
• GASPRICE: pushes the GASPRICE of the current transaction (note: there is a proposal to deprecate this: see https://github.com/ethereum/EIPs/issues/28#issuecomment-158625666 )
• EXTCODESIZE: size of an account's code ("the primary uses here are to allow contracts to check the code of other contracts against a template, or even simulating them, before interacting with them. See http://lesswrong.com/lw/aq9/decision_theories_a_less_wrong_primer/ for applications." [8])
• EXTCODECOPY: like CODECOPY; copies an account's code to memory

Ethereum blockchain opcodes:

• BLOCKHASH: Get the hash of one of the 256 most recent complete blocks ("used as a semi-secure source of randomness, and to allow contracts to evaluate Merkle tree proofs of state in the previous block without requiring a highly complex recursive "Ethereum light client in Ethereum" construction." [9])
• COINBASE: pushes the coinbase (ie. miner's address) of the current block ("the primary uses of the COINBASE opcode are to (i) allow sub-currencies to contribute to network security if they so choose, and (ii) open up the use of miners as a decentralized economic set for sub-consensus-based applications like Schellingcoin" [10])
• TIMESTAMP: pushes the timestamp of the current block
• NUMBER: pushes the number of the current block
• DIFFICULTY: pushes the difficulty of the current block
• GASLIMIT: pushes the gas limit of the current block

Memory and control flow opcodes:

• POP
• MLOAD: load word from memory; pops one value a and pushes memory[a: a + 32]
• MSTORE: save word to memory; pops two values a, b and sets memory[a: a + 32] = b
• MSTORE8: save byte to memory; pops two values a, b and sets memory[a] = b % 256
• SLOAD: load word from storage
• SSTORE: save word to storage
• JUMP: jump (can only jump to a JUMPDEST, i think; [11] and [12] seem to differ on this point, or maybe i am confused) ("JIT-compiled virtual machines become much easier to implement when jump destinations are restricted to a few indices" [13]) (note also that in order to allow auto-translation of old code upon language improvements, dynamic jumps are discouraged BUT THIS IS NOT CURRENTLY ENFORCED: https://github.com/ethereum/EIPs/issues/28#issuecomment-158625666)
• JUMPI: conditional jump (if not zero) (can only jump to a JUMPDEST, i think; [14] and [15] seem to differ on this point, or maybe i am confused)
• PC: pushes the program counter ("although theoretically not necessary, as all instances of the PC opcode can be replaced by simply putting in the actual program counter at that index as a push, using PC in code allows for the creation of position-independent code (ie. compiled functions which can be copy/pasted into other contracts, and do not break if they end up at different indices)." [16])
• MSIZE: pushes len(memory) (note: memory auto-expands; there is no memory limit)
• GAS: pushes the amount of gas remaining (before executing this operation)
• JUMPDEST: Mark a valid destination for jumps. This is a NOP.
• (proposed) CALLDEPTH to check the current call stack depth (useful in light of the 1024 depth limit)
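The auto-expanding memory model (MLOAD/MSTORE/MSTORE8/MSIZE) can be sketched like this; gas charging for expansion and MSIZE's rounding behavior are omitted, and the helper names are mine:

```python
# Sketch of the EVM's auto-expanding, byte-addressed memory.
memory = bytearray()

def expand(to):
    # memory auto-expands on access; there is no size limit (only gas cost)
    if to > len(memory):
        memory.extend(b"\x00" * (to - len(memory)))

def mstore(addr, value):
    expand(addr + 32)
    memory[addr:addr + 32] = value.to_bytes(32, "big")  # 256-bit big-endian word

def mload(addr):
    expand(addr + 32)
    return int.from_bytes(memory[addr:addr + 32], "big")

def mstore8(addr, value):
    expand(addr + 1)
    memory[addr] = value % 256       # single byte, value taken mod 256

def msize():
    return len(memory)

mstore(0, 0xDEADBEEF)
v = mload(0)
```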

Push Immediate Value and DUP and SWAP opcodes

• PUSH1-PUSH32 (32 different opcodes): PUSH n-byte immediate value onto the stack (n can be from 1 to 32)
• DUP1-DUP16 (16 different opcodes): DUP n-th stack item
• SWAP1-SWAP16 (16 different opcodes): SWAP 1st and n-th stack items

Logging:

• LOG0-LOG4 (5 different opcodes): Append log item with n topics (n from 0 to 4)

Interaction with other messages / call stack opcodes:

• CREATE: create new account with associated code. Pops three values endowment, input_offset, input_size, creates a new contract with initialization code memory[input_offset: input_offset+input_size] and endowment (ie. initial ether sent), and pushes the address of the new account
• CALL: call a message into an account. pops seven values a, b, c, d, e, f, g (gas, to_address, ether_value, in_offset, in_size, out_offset, out_size), and sends a message to address b with a gas and c ether and data memory[d: d+e]. Output is saved to memory[f: f+g], right-padding with zero bytes if the output length is less than g bytes. If execution did not run out of gas pushes 1, otherwise pushes 0.
• CALLCODE: Message-call into this account with alternative account's code ("the purpose of this is to allow contracts to call "functions" in the form of code stored in other contracts, with a separate stack and memory, but using the contract's own storage. This makes it much easier to scalably implement "standard libraries" of code on the blockchain." [17])
• DELEGATECALL (new as of 2015-11). Like CALLCODE, except that "the call created has the same sender and value as the original call...the child code would execute in essentially the same environment (except for reduced gas and increased callstack depth) as the parent".
• RETURN: pops two values a, b, and stops execution, returning memory[a: a + b]

Error handling opcode:

• SUICIDE: pops one value a, sends all remaining ether to that address, returns and flags the contract for deletion as soon as transaction execution ends ("an opcode which allows a contract to quickly delete itself if it is no longer needed. The fact that SUICIDES are processed at the end of transaction execution, and not immediately, is motivated by the fact that having the ability to revert suicides that were already executed would substantially increase the complexity of the cache that would be required in an efficient VM implementation." [18])

### EVM: proposal to remove dynamic jumps to assist static analysis

https://github.com/ethereum/EIPs/issues/615

Proposal to remove the two dynamic jump instructions, JUMP and JUMPI (JUMPI is a conditional jump, not an immediate form of JUMP), and replace them with five primitive instructions: JUMPTO, JUMPIF, BEGINSUB, JUMPSUB, RETURNSUB.

JUMPTO (JUMPTO jump_target) is an immediate jump (that is, a jump to a constant address embedded inline in the program code). JUMPIF (JUMPIF jump_target) is a conditional form of JUMPTO.

BEGINSUB, JUMPSUB, and RETURNSUB are for making subroutines. Each subroutine can only have a single entry point, which is marked with BEGINSUB (BEGINSUB n_args, n_results). RETURNSUB (RETURNSUB) returns. JUMPSUB (JUMPSUB jump_target) jumps into a subroutine.

In addition, the proposal provides the following non-essential (but efficient) new instructions:

BEGINDATA: specifies that all of the following bytes to the end of the bytecode are data, and not reachable code.

JUMPV n, jumpdest ...: (n and each jumpdest are two inline bytes): jump to one of a vector of n JUMPDEST offsets (the index is on top of the stack). "If the index is greater than or equal to n - 1 the last (default) offset is used...used to implement O(1) jumptables"

JUMPSUBV n, beginsub: like JUMPV but like a JUMPSUB instead of a JUMPTO.

GETLOCAL x, PUTLOCAL x: (x is two inline bytes): push/pop the stack, getting/putting the result from/to local variable x. "Local variable n is the n'th stack item below the frame pointer".

The proposal provides for the VM to maintain internally (not accessible to the program) a frame pointer, a return stack (of places to return to when subroutines return), and a frame stack (a stack of frame pointers corresponding to the return pointers in the return stack) (afaict, the return stack and the frame stack could be combined). "The frame pointer FP is set to SP + n_args at entry to the currently executing subroutine....Defining the frame pointer so as to include the arguments is unconventional, but better fits our stack semantics and simplifies the remainder of the proposal."

"The first instruction of an array of EVM bytecode begins execution of a main routine with no arguments, SP and FP set to 0, and with one value on the return stack—code size - 1. (Executing the virtual byte of 0 after this offset causes an EVM to stop. Thus executing a RETURNSUB with no prior JUMPSUB or JUMBSUBV—that is, in the main routine—executes a STOP.)"

The proposal also defines some validity conditions. The preexisting EVM validity conditions (asserted at each instruction) are:

• Gas is sufficient
• no stack overflow: There are no more than 1024 stack items
• no stack underflow
• all jumps are to a JUMPDEST
• no invalid instructions

The new validity conditions are:

• there are no more than 1024 return stack items
• the stack pointer must be less than or equal to FP (note: the convention here is that the stack grows downwards)
• JUMPTO, JUMPIF, JUMPV target only JUMPDESTs
• JUMPSUB, JUMPSUBV target only BEGINSUBs
• "JUMP instructions do not address instructions outside of the subroutine they occur in" (i guess, since they are deprecating JUMP, by 'JUMP instructions' they mean JUMPTO, JUMPIF, JUMPV??)
• "For JUMPSUB and JUMPSUBV the frame size is at least the n_args of the BEGINSUB(s) to jump to"
• "For RETURNSUB the frame size is equal to the n_results of the enclosing BEGINSUB."
• For every instruction in the code the frame size is constant

"In practice, we must test at runtime for conditions 1 and 2—sufficient gas and sufficient stack. We don’t know how much gas there will be, we don’t know how deep a recursion may go, and analysis of stack depth even for non-recursive programs is non-trivial. All of the remaining conditions we validate statically."

"All of the instructions are O(1) with a small constant, requiring just a few machine operations each, whereas a JUMP or JUMPI must do an O(log n) binary search of an array of JUMPDEST offsets before every jump."

The 'Validation' section of [19] gives some example validation code.

### Tooling and variant implementations

"...there is a way to make compilers that are not subject to these sorts of bugs: build the compiler inside of a proof assistant, and prove its correctness using a formal proof that can be checked by a machine. This is exemplified by the CompCert? project, which is built in the proof assistant Coq and has accompanying proofs of correctness. In the CSmith study, no bugs were found in the parts of CompCert? that had been proven correct.

Elle is a project to do this same thing, for a structured programming language (Elle-Core, or just “Elle” for short when clear from context) that compiles down to the EVM.

Elle builds on Eth-Isabelle, a formal specification of the EVM...inside of the Isabelle proof assistant...

On top of this EVM implementation, Elle contains a syntactic definition of the Elle-Core language along with a formal semantics for it. It contains an implementation of a translation from Elle-Core to EVM (described in detail here), as well as a correctness proof linking the semantics of Elle-Core programs to their compiled EVM counterparts (described in detail here).

Elle-Core provides users with a structured programming abstraction, freeing them from having to reason directly about addresses of program locations when describing control-flow (e.g. the jumps and conditional jumps used to implement if-statements and for-loops). With Elle’s structural abstraction, it becomes straightforward to implement conventional control structures such as if, for, unless, etcetera. "

---

## LLL

Intermediate language targeting the EVM.

"LLL was always meant to be very simple and minimalistic; essentially just a tiny wrapper over coding in ASM directly. In my opinion just use serpent; it has direct access to opcodes so it is a superset of LLL but it also has a whole bunch of high-level features as well for when you want them. The downside is that the compiler is more complex and so theoretically might contain more bugs." [20]

examples:

---

## iulia

Intermediate language between Solidity and the EVM.

https://github.com/chriseth/notes/blob/gh-pages/articles/abi_iulia/abi_iulia.md

---

## Michelson (Tezos VM)

https://www.michelson-lang.com/

A smallish, statically typed, functional, concatenative, stack-based language.

most of the following is paraphrased or quoted from https://tezos.com/static/papers/language.pdf

Opinions:

### Motivations

" Why Michelson?

At first sight, Michelson is a strange language. It doesn’t include many features like polymorphism, closures, or named functions. Compared to a language like Haskell or OCaml, it seems underpowered; its stack is not always easy to deal with; there is no standard library. However, these restrictions are largely motivated by the language’s design goals.

Tezos takes a slightly different view from Ethereum regarding the role of smart contracts. We think of our platform more as a way to implement pieces of business logic than as a generic “world computer”. Looking at Ethereum, most contracts implement things like multisig wallets, vesting and distribution rules, etc. Michelson is targeted to these applications, not the case of arbitrary programs.

Michelson is designed as a readable compilation target, though it can be hand written. The goal is that even the output of a compiler can be understood. We intend the language to be simple enough that developers can build their own analysis tools and compilers should they prefer to do so. This is a departure from the EVM’s bytecode which more closely resembles assembly. With a lower-level bytecode, you usually need confidence in both your program and the compiler toolchain. With Michelson you can more easily check over and verify properties of the program that is actually executed. Using a higher-level bytecode also simplifies the process of proving properties about the compiled output. Programs written in Michelson can be reasonably analyzed by SMT solvers and formalized in Coq without the need for more complicated techniques like separation logic. Similarly, the restrictions imposed by the forced indentation and capitalization ensure that the source cannot be obfuscated with indentation and alignment tricks.

Our current implementation of Michelson is based around an OCaml GADT, which we have used to verify the type-soundness of the language. Additionally, the implementation of a stack based language maps directly to the semantics. The same is not true for any efficient implementation of the lambda-calculus. There have also been two formally verified implementations of Michelson, one in Coq and one in F*. One day, we hope to replace our current implementation with a verified one.

Finally, one of the main advantages of Tezos is that the system is amendable. We want to start with a small core language in which we are confident and add features as good use cases are created for them. We don't want to throw everything into the language in at the onset and then break backwards compatibility.

So, why Michelson? To provide a straightforward platform for business logic, to provide a readable bytecode, and to be introspectable. When I was working with Olin Shivers, he was very fond of saying that one should always use a "tool small enough for the job". Michelson has been carefully designed to be that tool. " [21]

### Michelson Types

core primitive constant types:

• string
• int{8|16|32|64}
• uint{8|16|32|64}

other core types:

• bool
• unit: The type whose only value is Unit, to use as a placeholder when some result or parameter is not necessary, for instance when the only goal of a contract is to update its storage.
• list (t): A single, immutable, homogeneous linked list, whose elements are of type (t)
• pair (l) (r): A pair of values of types (l) and (r)
• option (t): Optional value of type (t); None or (Some v)
• or (l) (r): A union of two types: a value holding either a value of type (l) or a value of type (r), that we write (Left a) or (Right b).
• set (t): Immutable sets of values of type (t)
• map (k) (t): Immutable maps from keys of type (k) to values of type (t)

domain-specific types:

• timestamp
• tez: A specific type for manipulating tokens.
• contract 'param 'result: A contract, with the type of its code.
• key: A public cryptography key.
• signature: A cryptographic signature.

### Michelson Instructions

#### Michelson core control structures and instructions

• FAIL: abort the current program
• { I ; C }: Sequence.
• IF bt bf: Conditional branching.
• LOOP body: while loop ('body' returns a bool; if it's True, body is executed again)
• DIP code: pop the top item off the stack (but remember it), execute 'code', then push the remembered item back onto the top of the stack
• DIIP code, DIIIP code, etc: same as DIP but pop and remember multiple items off the top of the stack (as many items as there are 'I's in the instruction name) (syntactic sugar; can be implemented as a macro that compiles to DIPs)
• EXEC: pop an item 'a' from the stack, then pop an item 'f' from the stack. Call function f with a stack consisting only of one item, a, then push the result onto the stack
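DIP and EXEC can be sketched against a Python list used as a stack (top at the end). This is a hypothetical model, not real Michelson; the helper names are mine.

```python
# Sketch of Michelson's DIP and EXEC on a Python list-as-stack.
stack = []

def dip(code):
    saved = stack.pop()   # pop and remember the top item
    code()                # run the protected code on the rest of the stack
    stack.append(saved)   # push the remembered item back

def exec_():
    a = stack.pop()       # the argument (on top)
    f = stack.pop()       # the function (below it)
    stack.append(f(a))    # call f on a, push the result

stack.extend([1, 2, 3])
dip(lambda: stack.append(stack.pop() + stack.pop()))  # add 1+2 under the top
# stack is now [3, 3]: the sum, with the saved top restored above it

stack.append(lambda x: x * 10)   # push a function
stack.append(4)                  # push its argument
exec_()                          # stack is now [3, 3, 40]
```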

#### Michelson core stack instructions

• DROP
• DUP
• DUUP, DUUUP, etc: like DUP but duplicating the n-th stack item, where n is the number of 'U's in the instruction name (syntactic sugar; can be implemented as a macro that compiles to DIPs, DUPs, and SWAPs)
• SWAP
• PUSH 'a k: push constant value k, which must be of type 'a, onto the stack
• UNIT: Push the unit value (the sole value of type 'unit') onto the stack

#### Michelson core comparison instructions

• COMPARE: an ad-hoc polymorphic instruction which takes two items (of the same type, i assume?) (popped from the stack, i assume?) and returns an int64, which is 0 if the two items are equal, negative if the first is less than the second, and positive otherwise. Compare is only defined for some types; these types are called 'comparable'.
• EQ: Checks that the top of the stack EQuals zero (pops an int64, and pushes a bool indicating whether that int64 was equal to 0)
• NEQ: Checks that the top of the stack does Not EQual zero
• LT: Checks that the top of the stack is Less Than zero
• GT: TOS > 0
• LE: TOS <= 0
• GE: TOS >= 0

There is syntactic sugar for merging COMPARE and comparison combinators, and also for branching:

• CMP{EQ|NEQ|LT|GT|LE|GE}: same as COMPARE followed by EQ/NEQ/LT/GT/LE/GE.
• IF{EQ|NEQ|LT|GT|LE|GE}: same as EQ/NEQ/LT/GT/LE/GE followed by IF.
• IFCMP{EQ|NEQ|LT|GT|LE|GE}: same as COMPARE followed by EQ/NEQ/LT/GT/LE/GE followed by IF.

• OR
• AND
• XOR
• NOT

#### Michelson integer instructions

Performing computations between values of different int types must be done via explicit casts.

" For specifying arithmetics, we consider that integers are all stored on 64 bits (the largest integer size) so that we can express the operations, in particular casts, using usual bitwise masks. In this context, the type indicator functions are defined as follows (which can be read both as a constraint on the bitpatttern and as a conversion operation). Uint64 (x) = Int64 (x) = x Uint32 (x) = x & 0x00000000FFFFFFFF Int32 (x) = x & 0x00000000FFFFFFFF

 (x & 0x80000000 ? 0xFFFFFFFF00000000 : 0)
Uint16 (x) = x & 0x000000000000FFFF
Int16 (x) = x & 0x000000000000FFFF
| (x & 0x8000 ? 0xFFFFFFFFFFFF0000 : 0)
Uint8 (x) = x & 0x00000000000000FF
Int8 (x) = x & 0x00000000000000FF
| (x & 0x80 ? 0xFFFFFFFFFFFFFF00 : 0)"
• NEG: negation ("With cycling semantics for overflows (min (t) = -min (t))")
• ABS: absolute value ("With cycling semantics for overflows (abs (min (t)) = min (t))")
• ADD ("With cycling semantics for overflows")
• SUB ("With cycling semantics for overflows")
• MUL ("Unchecked for overflows")
• DIV (divide by zero is like a 'FAIL' instruction)
• MOD (mod by zero is like a 'FAIL' instruction)
• CAST t_to: where t_to is an integer type name; typecasts the item on top of the stack to type t_to
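The type-indicator masks quoted above transcribe directly into Python. This sketch (lowercased names are mine) checks the sign-extension behavior that these casts describe:

```python
# The Michelson type-indicator functions as Python helpers, operating on
# 64-bit patterns (mask, then sign-extend for the signed types).
M = 0xFFFFFFFFFFFFFFFF  # all 64 bits set

def uint32(x): return x & 0x00000000FFFFFFFF
def int32(x):  return uint32(x) | (0xFFFFFFFF00000000 if x & 0x80000000 else 0)
def uint8(x):  return x & 0x00000000000000FF
def int8(x):   return uint8(x) | (0xFFFFFFFFFFFFFF00 if x & 0x80 else 0)

# casting an all-ones pattern (-1) down to int8 keeps the all-ones pattern
v = int8(M)
```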

Alternative instructions that check for overflows (and are like a 'FAIL' if there is an overflow):

• CHECKED_NEG
• CHECKED_ABS
• CHECKED_ADD
• CHECKED_SUB
• CHECKED_MUL
• CHECKED_CAST: fails unless the item on TOS can be round-trip converted between types t_from and t_to without changing the value

Bitwise logical instructions available only on unsigned integers (in addition to bool):

• OR
• AND
• XOR
• NOT

Logical shifts are available only on unsigned integers:

• LSL
• LSR

#### Michelson operations on strings

• CONCAT
• COMPARE: Lexicographic comparison

("Strings are mostly used for naming things without having to rely on external ID databases. So what can be done is basically use string constants as is, concatenate them and use them as keys.")

#### Michelson operations on pairs

• PAIR: Build a pair from the stack's top two elements
• a macro matching the regexp P(A*AI)+R: A syntactic sugar for building nested pairs in bulk. I think it means that, each time you see an 'I', that delimits a level of nesting; and between the 'I's, the 'A's denote how many items on the stack to collect into a 'cons list'. So i think 'PAAIR' means to push (previous-TOS, previous-second-item-on-stack), and i think PAAIAIR means to push (previous-third-item-on-stack, (previous-TOS, previous-second-item-on-stack)). But i'm not sure. The actual definition is given by the rules 'P(\fst=A*)AI(\rest=(A*AI)+)R; C / S => P(\fst)AIR ; P(\rest)R ; C / S' and 'PA(\rest=A*)AIR ; C / S => DIP (P(\rest)AIR) ; C / S'.
• CAR: Access the left part of a pair (like Lisp)
• CDR: Access the right part of a pair (like Lisp)
• C[AD]+R: A macro for accessing fields in nested pairs. I think this expands to a sequence of CARs and CDRs, where, from left to right, each 'A' in the middle is a CAR, and each 'D' is a CDR.
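Under the reading above (left to right, each 'A' applies CAR and each 'D' applies CDR), the macro can be sketched as an expander; the helper name and pair-as-tuple encoding are mine:

```python
# Sketch of the C[AD]+R macro: apply CAR/CDR steps left to right.
def c_ad_r(name, value):
    assert name.startswith("C") and name.endswith("R")
    for letter in name[1:-1]:            # e.g. "CADR" -> "AD"
        value = value[0] if letter == "A" else value[1]
    return value

nested = ((1, 2), (3, 4))
x = c_ad_r("CADR", nested)   # CAR then CDR: ((1,2),(3,4)) -> (1,2) -> 2
```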

#### Michelson operations on sets

Recall that sets are homogeneous and generically typed: the type of a set includes the type of the elements it contains.

• EMPTY_SET 'element-type: Build a new, empty set for elements of a given type. The 'element-type type must be comparable (the COMPARE primitive must be defined over it).
• MEM: Check for the presence of an element in a set. There are two inputs (the element, and the set), and one output (a bool).
• UPDATE: Inserts or removes an element in a set, replacing a previous value. There are three inputs (the element, a bool, and the set) and one output (the new set). Note that this function signature is slightly different from map UPDATE.
• REDUCE: Apply a function on a set passing the result of each application to the next one and return the last. There are three inputs (the function (which takes two inputs, an element and a value, and returns a new value), the set, and the initial value), and one output (the last value).
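
A rough Python model of these set operations (a hedged sketch of the described semantics, not Michelson code; frozensets stand in for Michelson sets, and sorted iteration stands in for the order induced by COMPARE):

```python
# Sketch of the set ops above: MEM, UPDATE (bool flag inserts/removes),
# and REDUCE (a fold threading an accumulator through the elements).

def set_mem(elt, s):
    return elt in s

def set_update(elt, flag, s):
    return s | {elt} if flag else s - {elt}

def set_reduce(f, s, init):
    acc = init
    for elt in sorted(s):   # iterate in COMPARE order
        acc = f(elt, acc)
    return acc

s = set_update(3, True, set_update(1, True, frozenset()))
print(set_mem(1, s), set_reduce(lambda e, a: a + e, s, 0))  # -> True 4
```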

#### Michelson operations on maps

• EMPTY_MAP 'key 'val: Build a new, empty map. The 'key type must be comparable.
• GET: Access an element in a map, returns an optional value to be checked with IF_SOME. Takes two inputs (the key and the map), and returns one output (an option wrapping the value found at that key, if any).
• MEM: Check for the presence of an element in a map.
• UPDATE: Assign or remove an element in a map. There are three inputs (the key, an option wrapping the value to be assigned, if any, and the map) and one output (the new map).
• MAP: Apply a function on a map and return the map of results under the same bindings. There are two inputs (the function and the map) and one output (the result map).
• REDUCE: Apply a function on a map passing the result of each application to the next one and return the last.
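
The option-wrapped UPDATE signature for maps can be modeled like this (an illustrative sketch using Python dicts, with None standing in for the absent option):

```python
# Sketch: GET returns an option (None if absent); UPDATE takes an option --
# a value assigns, None removes the binding.

def map_get(key, m):
    return m.get(key)

def map_update(key, opt_value, m):
    m2 = dict(m)                 # maps behave functionally here
    if opt_value is None:
        m2.pop(key, None)
    else:
        m2[key] = opt_value
    return m2

m = map_update('a', 1, {})
print(map_get('a', m))                          # -> 1
print(map_get('a', map_update('a', None, m)))   # -> None
```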

#### Michelson operations on optional values

• SOME: Pack an optional value which is present. Takes one input (the value to be packed) and returns one output (the packed value)
• NONE 't: Returns the absent optional value of type 't.
• IF_SOME bt bf: Inspect an optional value; executes 'bt' on packed value if one is present, or executes bf if the optional value is absent. Both 'bt' and 'bf' must push the same types of items onto the stack.
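
IF_SOME's branching can be sketched as (illustrative Python, again with None standing in for the absent option):

```python
def if_some(opt, bt, bf):
    # Run bt on the wrapped value if present, otherwise run bf.
    return bt(opt) if opt is not None else bf()

print(if_some(5, lambda v: v + 1, lambda: 0))     # -> 6
print(if_some(None, lambda v: v + 1, lambda: 0))  # -> 0
```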

todo

## AntShares VM

### AntShares VM Instructions

Copied/paraphrased from [22]:

Constants:

• PUSHBYTES1, PUSHBYTES2, ..., PUSHBYTES75: The next n bytes are data to be pushed onto the stack
• PUSHDATA1, PUSHDATA2, PUSHDATA4: The next n bytes contain the number of bytes to be pushed onto the stack
• PUSHM1: Push -1
• PUSH0, PUSH1, PUSH2, ..., PUSH16: push constant small integers
• PUSHF, PUSHT: same as PUSH0, PUSH1

Flow control:

• NOP, JMP, JMPIF, JMPIFNOT, CALL, RET, APPCALL, SYSCALL, TAILCALL

Stack:

• TOALTSTACK: Pop input from main stack, push to alt stack
• FROMALTSTACK: the reverse of TOALTSTACK
• XDROP, XSWAP, XTUCK
• DEPTH: Push the number of stack items onto the stack
• DROP: remove top stack item
• DUP: duplicate top stack item
• NIP: remove 2nd stack item
• OVER: push a copy of the 2nd stack item
• PICK: The item n back in the stack is copied to the top
• ROLL: The item n back in the stack is moved to the top
• ROT: rotate the top 3 stack items
• SWAP: swap the top two stack items
• TUCK: The item at the top of the stack is copied and inserted before the second-to-top item
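
The less obvious shuffles (PICK, ROLL, TUCK) can be sketched on a Python list whose last element is the stack top (my own model of the descriptions above, not AntShares code):

```python
def pick(stack, n):
    stack.append(stack[-1 - n])       # copy the item n back to the top

def roll(stack, n):
    stack.append(stack.pop(-1 - n))   # move the item n back to the top

def tuck(stack):
    stack.insert(-2, stack[-1])       # copy the top below the 2nd item

s = [1, 2, 3]          # 3 is the top of stack
pick(s, 2)
print(s)               # -> [1, 2, 3, 1]
s = [1, 2, 3]
tuck(s)
print(s)               # -> [1, 3, 2, 3]
```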

Strings:

• CAT, SUBSTR, SIZE, LEFT (Keeps only characters left of the specified point in a string), RIGHT

Bitwise:

• INVERT, AND, OR, XOR, EQUAL (1 if the inputs are exactly equal, 0 otherwise)

Arithmetic (Note: Arithmetic inputs are limited to signed 32-bit integers, but may overflow their output):

• INC, DEC
• NEGATE, ABS, NOT, NZ (0 if the input is 0. 1 otherwise), ADD, SUB, MUL, DIV, MOD, SHL, SHR, BOOLAND (If both a and b are not 0, the output is 1. Otherwise 0), BOOLOR, NUMEQUAL, NUMNOTEQUAL, LT, GT, LTE, GTE, MIN (the smaller of a and b), MAX, WITHIN (Returns 1 if x is within the specified range (left-inclusive), 0 otherwise)
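
For instance, WITHIN's left-inclusive range check amounts to the following one-liner (a hedged sketch; the right-exclusive upper bound matches the usual definition of this opcode in the Bitcoin-script family these names appear to derive from):

```python
def within(x, a, b):
    """WITHIN: 1 if x is in the left-inclusive range [a, b), else 0."""
    return 1 if a <= x < b else 0

print(within(5, 2, 8), within(8, 2, 8), within(2, 2, 8))  # -> 1 0 1
```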

Crypto:

• SHA1, SHA256, HASH160, HASH256, CHECKSIG, CHECKMULTISIG

Array:

• ARRAYSIZE, PACK, UNPACK, PICKITEM, SETITEM, NEWARRAY

## Sketch of proposal for smart contracts in Ripple cryptocurrency

https://wiki.ripple.com/Contracts#Language

## CIYAM's proposal for smart contracts/automated transactions in NXT cryptocurrency

https://ciyam.org/at/at.html

## UNCOL and ADNF

https://en.wikipedia.org/wiki/Compiler-compiler

## tinyRAM

"Each instruction contains an opcode, the immediate flag, two register addresses ri and rj, and a w-size third operand A, which is either interpreted as a register address or an immediate depending on the value of the immediate flag. Each instruction fills two w-size words."

instructions shared with Lajos:

• AND, OR, XOR, NOT
• ADD, SUB, MULL, UMULH, SMULH, UDIV, UMOD, ANSWER (halt and return value)

instructions in tinyRAM but not in Lajos:

• SHL, SHR
• CMPE, CMPA (rj > A), CMPAE, CMPG (signed version of CMPA), CMPGE
• MOV, CMOV
• JMP, CJMP, CNJMP (JMP if NOT condition)
• READ (get next value from tape)

https://github.com/pepper-project/tinyram/blob/master/doc/asm/asm.tex

### Lajos (a putative successor/variant to tinyRAM)

"Lajos instructions are w-sized. This allows us to store one instruction per memory location, but it means that we cannot have full w-sized immediate values. If w-sized immediates are necessary, they can be encoded in the program text and thus pre-loaded in memory and accessible via LOAD. (Note that unlike a real processor, there is no additional overhead for memory access)....Lajos instructions contain the same fields as in TinyRAM, but since they are packed into a single w-size word, the immediate size depends on choice of w and k. A convenient choice is w = 32 and k = 32, yielding a 16-bit immediate...Reduced immediate size has an important ramification: immediate values can no longer be used to address all 2^w memory locations. This mainly limits the jump and memory access instructions; as a result, in Lajos these instructions are modified such that immediate values offset a base address."

"Lajos has a zero register: any read from r0 yields 0, and any write to r0 is discarded."

The zero register "allows further instruction set simplification. In particular, Lajos eliminates all 5 CMP* instructions, adding in their place SUBS; along with SUB and XOR, all functionality is retained. Further, MOV can be eliminated, since a zero operand is always available for the ADD, SUB, OR, and XOR instructions."

"having relative jumps allows us to neatly eliminate the CMOV instruction."

"SHR and SHL are both relatively expensive instructions: the limitations of arithmetic circuits dictate that bit shifting a variable distance involves an unrolled loop. Consequently, both SHR and SHL imply a loop of length w in the transition verifier. Lajos reduces this burden by eliminating SHR, since it can be emulated by SHL followed by DIV. The semantics of SHL are also modified from TinyRAM to better accommodate emulated SHR."

"The READ instruction becomes quite costly when the tape lengths are large; since the READ instruction must be encoded in the transition verifier, this cost is repeated for each execution step. However, as with the program text itself, the tapes can be pre-loaded into memory before execution starts, obviating the READ instruction entirely. To facilitate this, the Lajos assembler provides two directives, inTape and auxTape, that set the memory location at which the tapes are loaded. (See the TrAsm documentation for more details.)"

instructions shared with tinyRAM:

• AND, OR, XOR, NOT
• ADD, SUB, MULL, UMULH, SMULH, UDIV, UMOD, ANSWER (halt and return value)

instructions in Lajos but not in tinyRAM:

• SHL, SUBS (signed subtraction)
• CJMP, CNJMP

https://github.com/pepper-project/tinyram/blob/master/doc/asm/asm.tex

TinyRAM was originally a Harvard architecture, but has been modified to vnTinyRAM, a von Neumann variant ("program and data are stored in the same read-write address space; programs may use runtime code generation").[23] [24]

## R1CS

The zk-SNARK framework https://github.com/scipr-lab/libsnark can be used by a party who knows a satisfying instance of an NP problem to prove to another party that a satisfier exists, without revealing the satisfier itself. The NP statement to be proved can be expressed in any of: TinyRAM, R1CS ("(Rank-1 Constraint Systems), which is a language that is similar to arithmetic circuit satisfiability"), BACS ("(Bilinear Arithmetic Circuit Satisfiability). This simplifies the writing of NP statements when the additional flexibility of R1CS is not needed. Internally, it reduces to R1CS"), USCS ("Unitary-Square Constraint Systems"), TBCS ("Two-input Boolean Circuit Satisfiability").

"An instance of the language is specified by a set of equations over a prime field F, and each equation looks like: <A, (1,X)> * <B, (1,X)> = <C, (1,X)>, where A, B, C are vectors over F, and X is a vector of variables.

In particular, arithmetic (as well as boolean) circuits are easily reducible to this language by converting each gate into a rank-1 constraint. See [BCGTV13] Appendix E (and "System of Rank 1 Quadratic Equations") for more details about this. "
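
To make the constraint shape concrete, here is a toy Python checker for an R1CS instance (my own illustration, not libsnark code; the constraint encoding x * x = out and the tiny modulus are made up for the example):

```python
P = 13  # toy prime modulus

def dot(v, assignment):
    # inner product <v, (1, X)>; assignment already starts with the 1
    return sum(c * a for c, a in zip(v, assignment)) % P

def satisfies(constraints, assignment):
    # each constraint (A, B, C) demands <A,(1,X)> * <B,(1,X)> = <C,(1,X)>
    return all(dot(A, assignment) * dot(B, assignment) % P
               == dot(C, assignment)
               for A, B, C in constraints)

# One rank-1 constraint encoding x * x = out, with variables X = (x, out):
constraints = [((0, 1, 0), (0, 1, 0), (0, 0, 1))]
print(satisfies(constraints, (1, 3, 9)))  # x=3, out=9 -> True
print(satisfies(constraints, (1, 3, 8)))  # -> False
```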

## gcc

" GCC uses three main intermediate languages to represent the program during compilation: GENERIC, GIMPLE and RTL. GENERIC is a language-independent representation generated by each front end. It is used to serve as an interface between the parser and optimizer. GENERIC is a common representation that is able to represent programs written in all the languages supported by GCC.

GIMPLE and RTL are used to optimize the program. GIMPLE is used for target and language independent optimizations (e.g., inlining, constant propagation, tail call elimination, redundancy elimination, etc). Much like GENERIC, GIMPLE is a language independent, tree based representation. However, it differs from GENERIC in that the GIMPLE grammar is more restrictive: expressions contain no more than 3 operands (except function calls), it has no control flow structures and expressions with side-effects are only allowed on the right hand side of assignments. See the chapter describing GENERIC and GIMPLE for more details. " -- https://gcc.gnu.org/onlinedocs/gccint/Tree-SSA.html#Tree-SSA
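
The "no more than 3 operands" restriction is essentially three-address form; a toy Python lowering pass (my own illustration, not GCC's gimplifier) shows the flavor:

```python
# Flatten a GENERIC-style expression tree into GIMPLE-style statements,
# introducing fresh temporaries so each statement has at most 3 operands.

def gimplify(expr, out, counter):
    """expr is a variable name or a tuple (op, lhs, rhs)."""
    if isinstance(expr, str):
        return expr
    op, lhs, rhs = expr
    a = gimplify(lhs, out, counter)
    b = gimplify(rhs, out, counter)
    counter[0] += 1
    tmp = f"t{counter[0]}"
    out.append(f"{tmp} = {a} {op} {b}")
    return tmp

# a = b + c * d  becomes two binary statements plus an assignment:
stmts = []
result = gimplify(('+', 'b', ('*', 'c', 'd')), stmts, [0])
stmts.append(f"a = {result}")
print(stmts)  # -> ['t1 = c * d', 't2 = b + t1', 'a = t2']
```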

https://gcc.gnu.org/onlinedocs/gccint/Parsing-pass.html#Parsing-pass

https://en.wikipedia.org/wiki/GNU_Compiler_Collection#GENERIC_and_GIMPLE

https://en.wikibooks.org/wiki/GNU_C_Compiler_Internals/GNU_C_Compiler_Architecture_4_1

https://en.wikibooks.org/wiki/GNU_C_Compiler_Internals/GNU_C_Compiler_Architecture

### gcc GENERIC

https://gcc.gnu.org/onlinedocs/gccint/GENERIC.html#GENERIC

"GENERIC [163] is an abstract syntax representation of C that takes care of parsing and linking issues. It remains at the same level of abstraction as C source: control structures are preferred over control-flow graph; expression trees are preferred over three-address code expressions; typed variables are preferred over pseudo-registers or registers; stack frames remain implicit. It is the first main intermediate language used in the GCC compiler suite" -- Yannick Moy. Automatic Modular Static Safety Checking for C Programs. PhD thesis, Université Paris-Sud, January 2009, section 2.1, page 42

### gcc Gimple

https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html#GIMPLE

http://stackoverflow.com/questions/21660563/can-gcc-compile-gimple

ftp://gcc.gnu.org/pub/gcc/summit/2003/GENERIC%20and%20GIMPLE.pdf

"SIMPLE is an intermediate language in the McCAT? compiler that facilitates alias and dependency analyses. It is simpler than GENERIC, with expression trees translated into three-address code expressions. GIMPLE is a clone of SIMPLE where control structures are translated into control-flow graphs. It is the second main intermediate language used in the GCC compiler suite." -- Yannick Moy. Automatic Modular Static Safety Checking for C Programs. PhD thesis, Université Paris-Sud, January 2009, section 2.1, page 42

### gcc RTL

https://gcc.gnu.org/onlinedocs/gccint/RTL.html#RTL

## GNU Lightning

"a portable, fast and easily retargetable dynamic code generation system....The interface that it exposes is that of a standardized RISC architecture loosely based on the SPARC and MIPS chips."

"There are a few general-purpose registers (six, not including those used to receive and pass parameters between subroutines), and arithmetic operations involve three operands—either three registers or two registers and an arbitrarily sized immediate value. ... There are at least seven integer registers, of which six are general-purpose, while the last is used to contain the frame pointer (FP). The frame pointer can be used to allocate and access local variables on the stack, using the allocai or allocar instruction.

Of the general-purpose registers, at least three are guaranteed to be preserved across function calls (V0, V1 and V2) and at least three are not (R0, R1 and R2). "

Copied or paraphrased from http://www.gnu.org/software/lightning/manual/lightning.html#The-instruction-set :

types: "

• integer types:
  • char
  • short (signed, unsigned)
  • int (signed, unsigned)
  • long (signed)
• floating point types:
  • float
  • double "

Each instruction is composed of:

• an operation, like sub or mul
• most times, a register/immediate flag (r or i)
• an unsigned modifier (u), a type identifier or two, when applicable.

Each instruction takes two or three operands; in most cases, one of them can be an immediate value instead of a register.

most non-memory operations only take integers (either signed or unsigned) as operands; this was done in order to reduce the instruction set, and because most architectures only provide word and long word operations on registers.

I've omitted the variant suffixes 'r' and 'i' which say whether or not the last operand is immediate.

Binary ALU operations:

These accept three operands; the last one can be an immediate. 'add' and 'sub' have three variants, plain, 'c' (set carry), and 'x' (add carry). addx operations must directly follow addc, and subx must follow subc; otherwise, results are undefined.

• add, sub, mul, div, rem
• and, or, xor
• lsh, rsh
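
The carry-chaining contract of the 'c'/'x' variants can be modeled like so (a hedged sketch of the described semantics on 32-bit words, not GNU Lightning's actual C API):

```python
MASK = 0xFFFFFFFF

def addc(a, b):
    """Low-word add: returns (result, carry) -- the 'set carry' variant."""
    s = a + b
    return s & MASK, s >> 32

def addx(a, b, carry):
    """High-word add consuming the carry; must directly follow addc."""
    return (a + b + carry) & MASK

# A 64-bit add built from two 32-bit adds:
lo, c = addc(0xFFFFFFFF, 1)
hi = addx(0, 0, c)
print(hi, lo)  # -> 1 0
```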

Unary ALU operations:

• neg, com (unary negation, and unary complement)
• abs, sqrt (float operands only)

Compare instructions:

These accept three operands. The last two operands are compared, and the first operand, which must be an integer register, is set to either 0 or 1.

• lt, le, gt, ge, eq, ne, unlt, unle, ungt, unge, uneq, ltgt, ord, unord

Transfer operations:

• mov
• ext (convert types)
• truncr (truncate)

Network extensions:

• hton Host-to-network (big endian) order
• ntoh Network-to-host order

Load operations:

• ld: O1 = *O2
• ldx: O1 = *(O2 + O3)

Store operations:

• st: *O1 = O2
• stx: *(O1+O2) = O3

Flow control, argument management, for the caller:

• prepare: begin to push arguments for an upcoming function call (the call-er calls this). Do not call 'prepare' twice without calling 'finish' first
• pusharg: push an argument for an upcoming function call (the call-er calls this)
• ellipsis: if calling a vararg function, call ellipsis to mark the position of the ellipsis
• finish: call the specified function, and clean the stack from pushed parameters
• call: call the specified function, but do not clean the stack from pushed parameters. You still have to call 'finish' later to clean the stack.
• retval: fetch the return value of a previously called function into a register (the call-er calls this)
• jmp: unconditional jump

Flow control, argument management, for the callee:

• prolog: mark the beginning of a function body
• allocai: reserve space on the stack (receives the number of bytes to allocate and returns the offset from the frame pointer register FP to the base of the area)
• arg: "does not actually generate any code: instead, it is a function which returns a value to be passed to getarg"
• getarg: fetch an argument
• ret: return a value
• epilog: mark the end of a function body

Branch instructions:

These take three operands: a jump target and two values to be compared.

• blt (if (O2 < O3) goto O1), ble, bgt, bge, beq, bne
• bunlt (if !(O2 >= O3) goto O1), bunle, bungt, bunge, buneq (if !(O2 < O3) && !(O2 > O3) goto O1), bltgt (if !(O2 >= O3) || !(O2 <= O3) goto O1), bord (if (O2 == O2) && (O3 == O3) goto O1), bunord (if (O2 != O2) || (O3 != O3) goto O1)
• bms (if O2 & O3 goto O1), bmc (if !(O2 & O3) goto O1)
• boadd (O2 += O3, goto O1 if overflow), bxadd (O2 += O3, goto O1 if no overflow), bosub, bxsub

Labels:

• label: simple label
• indirect: special simple label; "indirect is useful when creating jump tables, and tells GNU lightning to not optimize out a label that is not the target of any jump, because an indirect jump may land where it is defined."
• forward: forward label; "forward is used to patch code generation before the actual position of the label is known. "

See documentation for section Labels in http://www.gnu.org/software/lightning/manual/lightning.html#The-instruction-set .

Used by Racket.

## libjit

https://www.gnu.org/software/libjit/

Was part of https://en.m.wikipedia.org/wiki/DotGNU

### libjit instructions

https://www.gnu.org/software/libjit/doc/libjit_8.html#Instructions

## Portable Assembly Forth (PAF)

" An issue that a number of portable assembly languages have had is that they require the code to be organized in functions that follow the standard calling convention (ABI) of the platform, which usually prevents tail-call optimization. PAF provides ABI calls and definitions for interfacing with the rest of the world, but also PAF calls and definitions, which (unlike ABI calls) can be tail-call-optimized and can therefore be used as universal control flow primitives [Ste77] (see Section 3.10 and 3.11).

Another problem is that indirect branches and calls have a high cost, because the compiler has to assume that every branch/call can reach any entry point. PAF introduces tags to specify which branches/calls can reach which entry points (see Section 3.9 and 3.10).

The most significant difference between PAF and Forth is that PAF contains restrictions that ensure that the stack depth is always statically determinable, so stack items can be mapped to registers (Section 3.3 and 3.9). It is interesting that these restrictions are relatively minor and don’t affect much Forth code; it’s also interesting to see an example of Forth code that is affected (see Section 5).

...

The main innovations of PAF are: tags indicate the control flow for indirect branches and calls; and PAF has two kinds of calls and definitions: the ABI ones follow the platform’s calling convention and are useful for interfacing to the outside world, while the PAF ones allow tail-call elimination and are useful for implementing general control structures.

...

2 Previous Work

2.1 ((why not just use C?))

Another problem of C (and probably a reason why it is not used as often as compiler target language as for interpreters) is that its control flow is quite inflexible: Code is divided into C functions, that can be called and from which control flow can return; the only other way to change control flow across functions is longjmp() . Varargs in combination with other language features have led to calling conventions where the caller is responsible for removing the arguments from the stack. This makes it impossible to implement guaranteed tail-call optimization, which would be necessary to use C calls as a general control flow primitive [Ste77] As a result, any control flow that does not fit the C model, such as unlimited tail calls, backtracking, coroutining, and even exceptions is hard to map to C efficiently

...

2.2 ((why not just use LLVM?))

LLVM shares many of the problems of C...you have to divide the code into functions that follow some calling convention, restricting the kind of control flow that is possible. To work around this problem, it is possible to add your own calling convention, but that is not easy

...

2.3 ((why not just use C--?))

C-- has been designed as portable assembly language. Many considerations went into its design, and it appears to be well-designed, if a little too complex for my taste, but the project appears to be stagnant...

2.4 ((why not just use GNU Lightning?))

GNU Lightning cannot perform any register allocation on its own. Therefore the front end has to perform the register allocation. It also does not perform instruction selection; each Lightning instruction is translated to at least one native instruction...

it is...possible to implement your own calling conventions and other control flow, ..., but (from reading the manual) it is not clear if this can be integrated with the stack handling by GNU Lightning and if one can use the processor’s call instruction for your own calling convention.

It is possible to use better code generation technology with the GNU Lightning interface, and also to provide ways to use the processor’s call and return instructions for your own calling convention. With these changes, wouldn’t the GNU Lightning interface be the perfect portable assembly language? It would certainly satisfy the basic requirements of a portable assembly language, but as a replacement for a language like C, it misses conveniences like register allocation...

the relation between PAF and the machine is not as direct as for GNU Lightning: There is register allocation and instruction selection, there may be instruction scheduling, and code replication. Instruction selection and instruction scheduling make better code possible (at the cost of slower compilation); register allocation interacts with these phases, and leaving it to the clients would require duplicated work in the clients, as register allocation is not really language-specific

...

3.2 Target machines...here we define the class of machines that we target with PAF:

PAF targets general-purpose computer architectures, i.e., the architectures that have been designed as compiler targets, such as AMD64, ARM, IA-32, IA-64, MIPS, PowerPC, SPARC.

Memory on the target machines is byte-addressed with a flat address space; e.g., DSPs with separate X and Y address spaces are not target machines.

The target machines use modulo (wrap-around) arithmetic, and signed numbers are represented in 2s-complement representation.

The target machines have a uniform register set for integers and addresses (not, e.g., accumulators with different size than address registers), and possibly separate (but internally also uniform) floating point registers.

...

3.9 Control flow inside definitions

The standard Forth words begin again until ahead if then cs-roll are available in PAF and are useful for building structured control flow, such as if ... then ... elsif ... then ... else ... end . While one can construct any control flow with these words [Bad90], if you want to implement labels and gotos, it’s easier to use labels and gotos. Therefore, PAF (unlike Forth) provides that, too: L: name defines a label and goto: name jumps to it.

PAF also supports indirect gotos: ’ name / tag produces the address of label name, and goto/ tag jumps to a label passed on the stack. The tag indicates which gotos can jump to which labels; a PAF program must not jump to a label address generated with a different tag. E.g., a C compiler targeting PAF could use a separate tag for each switch statement and the labels occurring there.

...

Whichever method of control flow you use, on a control flow join the statically determined stack depth has to be the same on all joining control flows. This ensures that the PAF compiler can always determine the stack depth and can map stack items to registers even across control flow. This is a restriction compared to Forth, but most Forth code conforms with this restriction. Breaking this rule is detected and reported as error by the PAF compiler. So the tags have another benefit in connection with the stack-depth rule: The static stack depth for a given tag must be the same (for all labels and all gotos), but they can be different for different tags. If there were no tags, all labels and gotos in a definition would have to have the same stack depth. ...

3.10 PAF Definitions and PAF calls...

The stack effects of all definitions whose address is taken with the same tag have to be compatible. I.e., there must be one stack effect that describes all of them; e.g., ( x x -- x ) is a valid stack effect of both + and drop (although the minimal stack effect of drop is ( x -- ) ), so + and drop have compatible stack effects.

The use of tags here has two purposes: It informs the PAF compiler about the control flow; and it also informs it about the stack effect of the indirect call (while a Forth compiler usually has to assume that execute can call anything, and have any stack effect). Or conversely, in connection with the stack-depth rule: Tags allow different stack effects for indirectly called definitions with different tags; without tags, all indirectly called definitions would have to have the same stack effect.

...

3.11 ABI definitions and ABI calls

We need to specify the stack effect explicitly as signature of an ABI definition or call. The syntax for such a signature is [xr]*-[xr]*, where x indicates a cell (machine word/integer/address) argument, and r a floating-point argument; the letters before the - indicate parameters, and the letters afterwards the results. The division into x and r reflects the division into general-purpose registers and floating-point registers on real machines, and the role these registers play in many calling conventions.

A definition conforming to the calling convention is defined with abi: sig name

...

You can take the address of an ABI function with abi’ name and call it with abi-exec. sig. There are no tail calls to ABI functions, because we cannot guarantee that tail calls can be optimized in all calling conventions.

...

3.12 Why have two kinds of definitions and two kinds of calls?

The PAF definitions and calls allow to implement various control structures such as backtracking through tail calls [Ste77]. They also allow the compiler to use flexible and possibly more efficient calling interfaces than the ABI calling convention. On the other hand, the ABI counterparts allow interfacing with other languages and using dynamically or statically linked binary libraries, including callbacks, and using PAF to build such libraries (e.g., as plug-ins).

... 3.13 Exceptions

It is possible to build non-local control-flow such as exceptions with tail-calls, but it is often more convenient to let a PAF definition correspond to a source language function/method/procedure (no need to spread locals across several definitions). Exceptions are a common non-local control-flow construct, so PAF includes them

...

4 Non-Features ... Garbage collection...Types PAF does not perform type checking during compilation, nor at run-time; also, there is no overloading of several operations on the same operator based on types....Debugger...SIMD...

...

5 PAF vs. Forth

PAF has restrictions and features that allow the compiler to statically determine the stack depth. As a consequence, in PAF there is no need to implement the stacks in memory, with a stack pointer for each stack (data stack and return stack for cells, floating-point stack for floating-point values). In contrast, Forth needs to have a separate memory area and stack pointer for each stack, and while stack items can be kept in registers for most of the code, there are some words (in particular, execute) and code patterns (unbalanced stack effects on control flow joins) that force stack items into memory and usually also force stack pointer updates.

This property of Forth is avoided in PAF by requiring balanced stack effects on control flow joins (see Section 3.9), and by replacing execute with exec. tag (see Section 3.10); all definition addresses returned for a particular tag are required to have compatible stack effects, so exec. tag has a statically determined stack effect

The effect on real programs is relatively small: most Forth code has balanced stack effects for control flow anyway, and most occurrences of ’ and execute can be converted to their tagged variants, because programmers keep the stack depth statically determinable in order to keep the code understandable. However, there are cases where the restrictions are not so easy to comply with. E.g., object-oriented packages in Forth use execute for words with arbitrary stack effects.

...

6 Related work

We have discussed C, LLVM, C--, and Vcode/GNU Lightning in Section 2. There are projects that are similar to PAF in using a restricted or modified form of a higher-level language as portable assembler:

• The Python system PyPy uses a restricted form of Python called RPython as low-level intermediate language [AACM07].
• Asm.js is a subset of JavaScript that is so restricted that it can serve as portable assembly language.
• PreScheme is a low-level subset of Scheme used as intermediate language for implementing Scheme48 [KR94].

In all these cases the base language is much higher-level than Forth, and it is much more of a stretch to create a low-level subset than for Forth.

Machine Forth (which evolved into colorForth) is a simple variant of Forth created by Chuck Moore, the inventor of Forth. It closely corresponds to the instructions on his Forth CPUs, but he also wrote an implementation for IA-32 that creates native code. The IA-32 compiler is very simple, basically just expanding the words into short machine code sequences.

It does not map stack items beyond the top-of-stack to registers, yet the generated code is relatively compact; this reflects the fact that machine Forth is close to the machine, including IA-32.

...

7 Conclusion

The main contributions of PAF are:

• Tags indicate which indirect branches can reach which labels and which indirect calls can call which definitions. Compared to general indirect branches and calls, this gives more freedom to the front end’s stack usage and to the PAF compiler’s register allocator. Tags need less implementation effort and produce better results than trying to achieve the same result through program analysis.
• Definitions and calls are split into those conforming to the ABI/calling convention of the platform, and others for which the compiler can use any calling interface (and different ones for different sets of callers and callees). This allows tail-call optimization (unlike ABI calling conventions), which in turn means that we can use the calls as a primitive for arbitrary control structures (e.g., coroutining).
• Restrictions (compared to Forth) on the use of stack items make it possible to have a static relation between stack items and registers for all programs, and avoid the need for a separate stack pointer and memory area for each stack. This highlights which Forth features are expensive and where they are used.

...

One may wonder about the “absence” of some operations in PAF; e.g., there are both <? and U<?, but only a single =?, +, -, and *. The reason is that, on the two’s-complement machines that PAF targets, the latter operations are the same for signed and unsigned numbers

"
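
The "statically determinable stack depth" rule from Sections 3.9 and 5 can be sketched as a toy checker (entirely my own illustration, far simpler than a real PAF compiler: it scans linearly and only checks that every label and goto for the same name agree on depth):

```python
def check_depths(code):
    """code: list of ('label', name), ('goto', name), ('push',), ('drop',).
    Return True iff each label/goto name always sees the same depth."""
    depth, seen = 0, {}
    for instr in code:
        if instr[0] in ('label', 'goto'):
            name = instr[1]
            if name in seen and seen[name] != depth:
                return False   # unbalanced stack effect at a join
            seen[name] = depth
        elif instr[0] == 'push':
            depth += 1
        elif instr[0] == 'drop':
            depth -= 1
    return True

ok  = [('push',), ('goto', 'L'), ('drop',), ('push',), ('label', 'L')]
bad = [('push',), ('goto', 'L'), ('push',), ('label', 'L')]
print(check_depths(ok), check_depths(bad))  # -> True False
```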

## UEFI's EBC (EFI Byte Code)

http://www.uefi.org/sites/default/files/resources/UEFI%202_5.pdf Chapter 21, page 985, PDF page 1034 (instruction set listing starts at 21.8, page 994, PDF page 1043)

Many arguments support both direct and indirect indexing. 'Natural values' are special UEFI-specified number representations used to abstract away from bit length.

moves: MOV, MOVI (immediate), MOVIn (immediate indexed value), MOVn (unsigned natural value), MOVsn (signed natural value), MOVREL (relative indexing)

arithmetic: ADD, SUB, MUL, MULU (unsigned MUL), DIV, DIVU, MOD, MODU, NEG

logical: AND, OR, NOT, XOR

shifts: ASHR, SHL, SHR

conversions: EXTNDB (sign-extend a byte (8-bit) value), EXTNDW (sign-extend a 16-bit value), EXTNDD (sign-extend a 32-bit value)

comparisons: CMP (=, <=, >=, unsigned <=, unsigned >), CMPI (CMP immediate)

control: JMP (unconditional, or conditional upon condition bit in FLAGS register; absolute, or relative; possibly immediate; 32-bit immediate value (in which case there is also a register operand, which could be the zero register, and the immediate value is an offset), or 64-bit operand (that must be immediate; no register operand)), JMP8 (relative jump to signed 8-bit offset; unconditional, or conditional upon condition bit in FLAGS register), CALL, RET

stack ops: PUSH, PUSHn (unsigned natural), POP, POPn

misc: BREAK (halt/error; get version; debug breakpoint; system call; create thunk; set compiler version), LOADSP (load FLAGS register with a general-purpose register), STORESP

## Mu

Mu calls itself a 'micro' virtual machine for managed languages (especially dynamic ones): in comparison to VMs such as the JVM and .NET which are "designed for certain kinds of languages", it is "minimalist", "explicitly designed to support the development of new languages", and "target[s] a low level of abstraction"; the IR is "SSA-based rather than stack-based, the type system is much simpler, the concurrency primitives are low-level, and all high-level optimizations are the responsibility of the client run-time system, not the micro virtual machine".

In my opinion, Mu appears to position itself as a better LLVM. In comparison to LLVM, Mu is "not a compiler framework", and "performance-critical language-specific optimizations are performed outside the micro virtual machine in higher layers of the run-time stack" [25] (but it seems to me that if Mu took off as an IR, it would be possible to write an optimizer framework over it, and provide common language-agnostic optimizer passes, just as LLVM does on top of its core LLVM IR). More importantly, Mu has built-in support for garbage collection and concurrency. [26] states that Mu has "just three concerns: hardware (compiler back end), memory (garbage collector), and concurrency (scheduler and memory model for language implementers)." "Mu is designed to support managed languages while the LLVM is designed for C-like languages"; for example, "the LLVM IR type system contains raw pointer types but not reference types". "LLVM’s focus on heavy-duty static optimizing compilation of C/C++ leaves its support for dynamic just-in-time (JIT) compilation as somewhat of an afterthought, receiving less attention to quality and maintenance. In contrast, dynamic languages rely heavily on on-going run-time (re)compilation to achieve good performance". "LLVM’s focus on C means that support for garbage collection, concurrency, and dynamic typing are minimal or weak."

### design criteria

Draining the Swamp: Micro Virtual Machines as Solid Foundation for Language Development lists eight design principles:

• minimal; "any feature or optimization that can be deflected to the higher layers will be"
• multi-language client libraries sitting above Mu compensate for minimalism
• possibility of a formally verifiable Mu instance later
• "Mu’s client is trusted; it is allowed to shoot itself in the foot"
• "We use LLVM IR as a common frame of reference for our IR, deviating only where we find compelling cause to do so"
• Mu is an open specification, not an implementation
• goal is to support diverse languages
• high performance

### types

void

numbers: int, float, double

composites:

• vector
• struct
• array
• tagref64 ("Mu does not have a “union” type because a union of reference and value types will make a memory location ambiguous to the garbage collector with respect to its contents holding a reference or not. However, Mu provides a tagged reference type tagref64 . It uses several bits of a 64-bit word to indicate whether it currently holds an object reference, an integer or a double . In this way, exact garbage collection is still possible.")
• hybrid ("akin to a struct with a fixed prefix F , followed by an array of unspecified size having elements from (non-hybrid) type V . The size of the variable-length array part in a hybrid is specified at allocation time. We expect, for example, that most language clients would implement their string types with a hybrid type. Similarly, a Java client might represent Java arrays as a fixed prefix int holding the size of the array and a variable-length part for the payload.")

calling: func, thread, stack

refs:

• ref
• iref (internal reference; "references to memory locations that may be internal to objects (e.g. array elements or struct fields)")
• weakref
• ptr (native unmanaged pointer)

### misc notes

"The top-level unit of the Mu IR is a code bundle , which contains definitions of types, function signatures, constants, global cells and functions. A function has basic blocks and instructions. The Mu instruction set contains LLVM-like primitive arithmetic and logical instructions as well as Mu-specific garbage-collection-aware memory operations, thread/stack operations, and traps"

#### particularly IL-ish things

(this is meant to be an IL after all)

"SSA-based rather than stack-based"

"For the convenience of the micro virtual machine rather than the client, the types of the operands are explicitly written as type arguments so that Mu does not need to infer the type of any instruction from the types of its operands."

"Mu IR programs must explicitly truncate, extend, convert or cast the arguments to match the signature."

"The client must explicitly generate TAILCALL instructions to utilize this feature. Mu implementations need not automatically convert conventional CALL s into TAILCALL s, though an implementation might."

" Exception handling within a function, for example, a throw statement in a try - catch block in Java, should be translated to branching instructions ( BRANCH and BRANCH2 ) in the Mu IR. In this case, Mu is not aware of any exceptions being thrown. "

exact garbage collection.

### instruction set

• inter-function control: CALL, TAILCALL, THROW, LANDINGPAD (to catch exceptions; a CALL has a pointer to the relevant LANDINGPAD, if any)

## QEMU TCG

The QEMU emulator/virtualization system internally uses a compiler called "Tiny Code Generator" (TCG) to translate the instructions of the processor being emulated into instructions for the host processor. The idea is that when QEMU is extended to emulate a new type of processor, someone writes a translation from the operations of the emulated processor's ISA into the "TCG frontend ops" (TCG frontend instruction set); and when TCG itself is ported to a new host platform, someone writes a translation from the "TCG backend ops" (TCG backend instruction set) into the instructions of the new host processor. Then when QEMU is emulating some software, the software being emulated is translated into frontend ops, the frontend ops are translated into backend ops, and the backend ops are translated into instructions for the host processor.

### QEMU TCG Backend instruction set

• MOV
• ADD, SUB, DIV(U), MUL(U), REM(U), NEG (the U variants are unsigned)
• AND, OR, NOT, EQ ("EQV"), XOR (which is equivalent to NEQ), NAND, NOR, ANDC ("Logical AND one operand with the complement of another"), ORC ("Logical OR one operand with the complement of another")
• ROTL, ROTR, SAR (shift arithmetic right), SHL (logical shift left), SHR (logical shift right)
• BSWAP(16|32|64): byte swap a 16-bit/32-bit/64-bit quantity
• EXT(8|16|32)(S|U): Sign (S) or Zero (U) extend an (8/16/32)-bit operand

moving data between registers and arbitrary host memory:

• LD(8|16|32)(S|U): Load an (8/16/32)-bit quantity from host memory and (sign or zero)-extend
• LD: alias to target native sized load
• ST(8|16|32): Store an (8/16/32)-bit quantity to host memory
• ST: alias to target native sized store

moving data between registers and arbitrary target memory:

• QEMU_(LD|ST)(8|16|32|64)(S|U): same as above

Code Flow:

• BR: branch
• BRCOND: if (arg1 <condition> arg2) goto label
• CALL
• GOTO_TB: goto translation block
• EXIT_TB: exit translation block
• SETCOND: ret = arg1 <condition> arg2 . Conditions are: always, never, eq, neq, lt (<), le (<=), gt, ge, ltu (unsigned <), leu, gtu, geu.
• SET_LABEL: Mark the current location with a label (in assembly this would often be written label:)

Misc:

• NOP

CALL, BR, DISCARD, NOP do not appear to be available in the frontend [27], but perhaps i'm misunderstanding.

The frontend also includes some extra-op helpers:

• tcg_global_mem_new(TCG_AREG0, offsetof(CPUState, reg), "reg"): Declare a named TCG register

temporary registers:

• tcg_temp_new(): Create a new temporary register
• tcg_temp_local_new(): Create a local temporary register. A simple temporary register cannot carry its value across a jump/brcond; only a local temporary can.
• tcg_temp_free(tmp): Free a temporary register

labels:

• gen_new_label(): Create a new label

Note that "As of 2014, QEMU does not support 3D acceleration" [28].

### QEMU TCG Links

NOTE:

• note that another interpretation of XOR is "not equal" (NEQ): if x and y are each either 0 or 1, then (x XOR y) iff (x != y). Similarly, x EQ y == NOT(x XOR y).

## SECD

The Wikipedia description is good and fairly concise, so i'll just quote it here:

" The SECD machine is a highly influential[citation needed] virtual machine and abstract machine intended as a target for functional programming language compilers. The letters stand for Stack, Environment, Control, Dump, the internal registers of the machine. These registers point to linked lists in memory.

The machine was the first to be specifically designed to evaluate lambda calculus expressions. It was originally described by Peter J. Landin as part of his ISWIM programming language definition in 1963. The description published by Landin was fairly abstract, and left many implementation choices open (like an operational semantics). Hence the SECD machine is often presented in a more detailed form, such as Peter Henderson's Lispkit Lisp compiler, which has been distributed since 1980. Since then it has been used as the target for several other experimental compilers.

In 1989 researchers at the University of Calgary worked on a hardware implementation of the machine.[1]

Informal description

When evaluation of an expression begins, the expression is loaded as the only element of C. The environment E, stack S and dump D begin empty.

During evaluation of C it is converted to reverse Polish notation with ap (for apply) being the only operator. For example, the expression F (G X) (a single list element) is changed to the list X:G:ap:F:ap.

Evaluation of C proceeds similarly to other RPN expressions. If the first item in C is a value, it is pushed onto the stack S. More exactly, if the item is an identifier, the value pushed onto the stack will be the binding for that identifier in the current environment E. If the item is an abstraction, a closure is constructed to preserve the bindings of its free variables (which are in E), and it is this closure which is pushed onto the stack.

If the item is ap, two values are popped off the stack and the application done (first applied to second). If the result of the application is a value, it is pushed onto the stack.

If the application is of an abstraction to a value, however, it will result in a lambda calculus expression which may itself be an application (rather than a value), and so cannot be pushed onto the stack. In this case, the current contents of S, E, and C are pushed onto D (which is a stack of these triples), S is reinitialised to empty, and C is reinitialised to the application result with E containing the environment for the free variables of this expression, augmented with the binding that resulted from the application. Evaluation then proceeds as above.

Completed evaluation is indicated by C being empty, in which case the result will be on the stack S. The last saved evaluation state on D is then popped, and the result of the completed evaluation is pushed onto the stack contents restored from D. Evaluation of the restored state then continues as above.

If C and D are both empty, overall evaluation has completed with the result on the stack S. Registers and memory

The SECD machine is stack-based. Functions take their arguments from the stack. The arguments to built-in instructions are encoded immediately after them in the instruction stream.

Like all internal data-structures, the stack is a list, with the S register pointing at the list's head or beginning. Due to the list structure, the stack need not be a continuous block of memory, so stack space is available as long as there is a single free memory cell. Even when all cells have been used, garbage collection may yield additional free memory. Obviously, specific implementations of the SECD structure can implement the stack as a canonical stack structure, so improving the overall efficiency of the virtual machine, provided that a strict bound be put on the dimension of the stack.

The C register points at the head of the code or instruction list that will be evaluated. Once the instruction there has been executed, the C is pointed at the next instruction in the list—it is similar to an instruction pointer (or program counter) in conventional machines, except that subsequent instructions are always specified during execution and are not by default contained in subsequent memory locations, as it is the case with the conventional machines.

The current variable environment is managed by the E register, which points at a list of lists. Each individual list represents one environment level: the parameters of the current function are in the head of the list, variables that are free in the current function, but bound by a surrounding function, are in other elements of E.

The dump, at whose head the D register points, is used as temporary storage for values of the other registers, for example during function calls. It can be likened to the return stack of other machines.

The memory organization of the SECD machine is similar to the model used by most functional language interpreters: a number of memory cells, each of which can hold either an atom (a simple value, for example 13), or represent an empty or non-empty list. In the latter case, the cell holds two pointers to other cells, one representing the first element, the other representing the list except for the first element. ...

Instructions

• nil pushes a nil pointer onto the stack
• ldc pushes a constant argument onto the stack
• ld pushes the value of a variable onto the stack. The variable is indicated by the argument, a pair. The pair's car specifies the level, the cdr the position. So "(1 . 3)" gives the current function's (level 1) third parameter.
• sel expects two list arguments, and pops a value from the stack. The first list is executed if the popped value was non-nil, the second list otherwise. Before one of these list pointers is made the new C, a pointer to the instruction following sel is saved on the dump.
• join pops a list reference from the dump and makes this the new value of C. This instruction occurs at the end of both alternatives of a sel.
• ldf takes one list argument representing a function. It constructs a closure (a pair containing the function and the current environment) and pushes that onto the stack.
• ap pops a closure and a list of parameter values from the stack. The closure is applied to the parameters by installing its environment as the current one, pushing the parameter list in front of that, clearing the stack, and setting C to the closure's function pointer. The previous values of S, E, and the next value of C are saved on the dump.
• ret pops one return value from the stack, restores S, E, and C from the dump, and pushes the return value onto the now-current stack.
• dum pushes a "dummy", an empty list, in front of the environment list.
• rap works like ap, only that it replaces an occurrence of a dummy environment with the current one, thus making recursive functions possible

A number of additional instructions for basic functions like car, cdr, list construction, integer addition, I/O, etc. exist. They all take any necessary parameters from the stack. " -- https://en.m.wikipedia.org/wiki/SECD_machine (this article has additional Further reading links not in the above excerpt)

Reynolds 1993 says that the dump (the D in SECD) is a representation of a continuation.

---

## AMBER

The Amber Machine by Luca Cardelli

"The Amber machine is a stack machine designed as an intermediate language for compiling higher-order languages, in the tradition of SECD machines [Landin 64] and combinator machines [Turner 79] [Curien 86]. This is a revision of the Functional Abstract Machine described in [Cardelli 83, Cardelli 84], and is specialized for the Amber language [Cardelli 86]"

---

## SAIL

http://www.cs.utexas.edu/~isil/sail/

SAIL: Static Analysis Intermediate Language with a Two-Level Representation

"SAIL is an intermediate language designed for static analysis; it explicitly provides both a high- and a low-level representation, with mappings between them, to facilitate both analysis (on the low-level representation) and reporting (through the high-level representation)." [29]

"SAIL is a front-end for program analysis systems that provides a two-level representation, consisting of both a language-specific high-level intermediate language as well as a low-level, language-independent representation. SAIL provides a precise mapping from the low-level instructions to the high-level expressions and statements used in the source code, greatly simplifying tasks such as error reporting and relating reasoning on the low-level representation back to the original source code. SAIL also provides support for control flow graph construction as well as compact and efficient serialization. SAIL currently parses C and C++ code and uses GCC 4.3.4 to generate its high-level intermediate representation." [30]

"In contrast to CIL, Sail provides both a high and a low-level representation; however, Sail's high-level representation is much higher than CIL whereas its low-level representation is much lower. While CIL allows for intricate program transformations, Sail is exclusively targeted for performing static analysis" [31]

---

## SQUID

http://cs.lmu.edu/~ray/notes/squid/

• COPY, COPY_FROM_DEREF (y = *x), COPY_TO_DEREF (*y = x), COPY_FROM_OFS (z = *(x+y)), COPY_TO_OFS (*z = (x+y))
• arith: ADD, SUB, MUL, DIV, NEG, MOD, REM
• more arith: ABS, SQRT, SIN, COS, ATAN, LN, INC, DEC
• INC_DEREF (*x = *x + 1), DEC_DEREF
• shifts: SHL, SHR, SAR (arithmetic right shift)
• bitwise: AND, OR, XOR, NOT, COMP (bitwise complement)
• LT, LE, EQ, NE, GE, GT
• LABEL (label annotation)
• JUMP, JZERO, JNZERO, JLT, JLE, JEQ, JNE, JGE, JGT
• STARTCALL, PARAM, CALLP, CALLF, RETP, RETF (CALLs go like: STARTCALL, PARAM x, PARAM y, CALL(P|F) f, #bytes_passed, (dest)?..., RET(P|F), where P means no return value (and so no 'dest' in CALLP) and F means a return value (and a 'dest' in CALLF))
• INT_TO_STRING, FLOAT_TO_STRING, BOOL_TO_STRING, CHAR_TO_STRING, TO_FLOAT (integer to real)
• ALLOC
• assertions: NULL_CHECK, ASSERT_POSITIVE, BOUND (BOUND x y z = assert y <= x < z)
• NO_OP, EXIT

---

## SWEET16

16 x 16-bit registers. Special registers:

• R0 - accumulator
• R12 - subroutine stack pointer
• R13 - stores the result of all comparison operations for branch testing
• R14 - status register
• R15 - program counter

Instructions:

• SET (load constant)
• LD, ST (loads and stores may be between a register and the accumulator, or between memory and the accumulator; those are different opcodes)
• LDD, STD (load/store double-byte indirect; these include increments)
• POP (pop indirect), STP (store pop indirect), POPD (Pop Double Indirect) (these include decrements)
• ADD, SUB, INR (inc), DCR (dec), CPR (cmp)
• RTN (return to 6502 mode, eg terminate SWEET16)
• BR, BC (branch if carry), BNC (branch if no carry), BP (branch if plus), BM (branch if minus), BZ, BNZ, BM1 (branch if minus 1), BNM1 (branch if not minus 1), BK (break)
• BS (branch to subroutine), RS (return from subroutine)

---

## Morte

"Morte is a minimalist implementation of the calculus of constructions ...You can think of Morte as a very low-level intermediate language for functional languages"

Morte is said to be 'superoptimizing'; it removes recursion by transforming recursive data structures into things that hold information telling you how to do folds on them (Boehm-Berarducci encoding), and represents the program in a lambda calculus. Then it essentially optimizes by inlining; it applies functions to their arguments (beta reduction), and applies eta conversion ( https://wiki.haskell.org/Eta_conversion ).
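The two reductions mentioned, written as rewrite rules on lambda terms:

```latex
% beta reduction: apply a function to its argument by substitution
(\lambda x.\, e)\; a \;\longrightarrow\; e[x := a]

% eta conversion: a function that merely forwards its argument is that function
\lambda x.\, (f\; x) \;\longrightarrow\; f \qquad (x \text{ not free in } f)
```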

---

## Wren VM

Stack-based.

### Wren VM instructions

Much of the following is often quoted or paraphrased from [33].

• constants:
• CONSTANT (push the constant at index [arg] onto the top of the stack)
• NULL, FALSE, TRUE
• LOAD_LOCAL_0-LOAD_LOCAL_8, LOAD_LOCAL (Pushes the value in local slot [arg]), STORE_LOCAL
• LOAD_FIELD (Pops an instance and pushes the value of the field in slot [arg] of it), STORE_FIELD
• LOAD_FIELD_THIS (Pushes the value of the field in slot [arg] of the receiver of the current function. This is used for regular field accesses on "this" directly in methods. This instruction is faster than the more general CODE_LOAD_FIELD instruction), STORE_FIELD_THIS,
• stack ops: POP (Pop and discard the top of stack)
• calls: CALL_0-CALL_16 (Invoke the method with symbol [arg]. The number indicates the number of arguments (not including the receiver)), SUPER_0-SUPER_16 (Invoke a superclass method with symbol [arg])
• control flow:
• JUMP (Jump the instruction pointer [arg] forward), LOOP (Jump the instruction pointer [arg] backward)
• JUMP_IF (Pop and if not truthy then jump the instruction pointer [arg] forward)
• AND (If the top of the stack is false, jump [arg] forward. Otherwise, pop and continue), OR (If the top of the stack is non-false, jump [arg] forward. Otherwise, pop and continue)
• RETURN (Exit from the current function and return the value on the top of the stack)
• closures:
• CLOSURE (Creates a closure for the function stored at [arg] in the constant table. Following the function argument is a number of arguments, two for each upvalue. The first is true if the variable being captured is a local (as opposed to an upvalue), and the second is the index of the local or upvalue being captured)
• CLOSE_UPVALUE (Close the upvalue for the local on the top of the stack, then pop it)
• classes:
• CONSTRUCT (Creates a new instance of a class. Assumes the class object is in slot zero, and replaces it with the new uninitialized instance of that class. This opcode is only emitted by the compiler-generated constructor metaclass methods), FOREIGN_CONSTRUCT (Creates a new instance of a foreign class),
• CLASS (Creates a class. Top of stack is the superclass. Below that is a string for the name of the class. Byte [arg] is the number of fields in the class), FOREIGN_CLASS (Creates a foreign class. Top of stack is the superclass. Below that is a string for the name of the class)
• METHOD_INSTANCE (Define a method for symbol [arg]. The class receiving the method is popped off the stack, then the function defining the body is popped. If a foreign method is being defined, the "function" will be a string identifying the foreign method. Otherwise, it will be a function or closure), METHOD_STATIC (like METHOD_INSTANCE, but for static rather than instance methods)
• misc: END (This pseudo-instruction indicates the end of the bytecode. It should always be preceded by a CODE_RETURN, so is never actually executed)

There are also a bunch of primitives in [34] . These are invoked with the CALL or SUPER opcodes:


class Object (the root class; classes have both a superclass and a metaclass (except that Object has no superclass); Object's metaclass is objectMetaclass ):

• ! (not)
• == (equals)
• !=
• is
• toString
• type

class Class (a subclass of Object, like everything else; its metaclass is itself (Class))

• name
• supertype

class objectMetaclass (a subclass of Class; objectMetaclass's metaclass is Class):

• same

class Bool:

• (nothing new; just overrides some methods in Object ('!' (not) and toString))

class Fiber:

• static/class methods:
• new
• abort
• current
• suspend
• yield
• call
• error
• isDone
• transfer
• transferError
• try

class Fn:

• static/class methods:
• new
• arity
• call (with up to 16 arguments)

class Null:

• (nothing new; just overrides some methods in Object ('!' (not) and toString))

class Num:

• static/class methods:
• pi
• largest
• smallest
• -
• +
• *
• /
• <
• >
• <=
• >=
• &
• ^
• 1
• abs
• acos
• asin
• atan
• ceil
• cos
• floor
• -
• sin
• sqrt
• tan
• log
• %
• ~
• ..
• ...
• atan
• pow
• fraction
• isInfinity
• isInteger
• isNan
• sign
• toString
• truncate

class String:

• static/class methods:
• fromCodepoint
• +
• [_] (subscript)
• byteAt_
• byteCount_
• codePointAt_
• contains
• endsWith
• indexOf
• indexOf
• iterate
• iterateByte_
• iteratorValue
• startsWith

class List:

• static/class methods:
• new
• filled
• [_] (subscript)
• [_]=(_) (subscript setter)
• clear
• count
• insert
• iterate
• iteratorValue
• removeAt

class Map:

• static/class methods:
• new
• [_] (subscript)
• [_]=(_) (subscript setter)
• clear
• containsKey
• count
• remove
• iterate
• keyIteratorValue_
• valueIteratorValue_

class Range:

• from
• to
• min
• max
• isInclusive
• iterate(_)
• iteratorValue(_)

class System:

• static/class methods:
• clock
• gc
• getModuleVariable
• importModule
• writeString_

---

---

## IL4 Lisp-ahtava VM

VM made for demoscene 4k intros.

Lisp syntax.

"Comments start with a # char and go up to the end of the line. "

" The language is totally untyped (well, strictly speaking the type is "32-bits" :). That means you have to be careful to call the right arithmetic and comparison functions (as well as others). The values may very well be pointers, integers, floats or whatever. "

top level constructs:

• constants
• global variables
• Bytecode functions
• Assembly functions: contain a block of (nasm?) assembly code; this code will be compiled and executed when the function is called. Arguments are passed on the stack from left to right (it's the callee's responsibility to pop them before returning); one return value is placed on the stack.
• Compiler directives (currently only 'heapsize')

in the 'bytecode functions': "

• Other function calls
• "(if (expr) (then) (else))" where "expr", "then" and "else" are some code blocks.
• Variable definitions (same as globals but inside functions).
• Variable sets, "(set var-name value)"
• While loops, "(while (expr) code)" This is the only looping mechanism in IL4.
• External symbol pointer lookups, "(external_symbol "_wglCreateContext@4")". This compiles directly to a constant value, which is a pointer to the symbol. This helper form is used to call external C/asm functions.
• Stdcalls/Cdecl calls. Some examples:
(fun glVertex3fv (arr) (stdcall (external_symbol "_glVertex3fv@4") arr))
(fun glViewport (x y w h) (stdcall (external_symbol "_glViewport@16") x y w h))

There are different forms for functions which return a floating point value, namely "stdcall-fp" and "cdecl-fp".

"

• Every assembly function ends with a "jmp ebx"
• Every opcode has a 16bit lookup index to its code
• Every bytecode function has a 16bit lookup index to its code
• Every bytecode function has 16bits of function header and one return bytecode at the end (8bit); total overhead 3 bytes
• Constants currently take 4 bytes each.
• Every global that has a preset value takes 4 bytes (even if the same value is used as a constant elsewhere).
• Globals that have no preset values take no space. "

-- [35]

types of items in the source code are (from il4_reader.ml):

• atom
• int
• float
• str
• lists

types of constants are (from program.ml, 'parse_const'):

• int
• float
• RawArray

types of statements are (from program.ml):

• Nop
• Block
• ConstInt, ConstFloat, ConstRawArray
• GetGlobal, GetLocal, GetParam, GetExtSymbol, SetGlobal, SetLocal, SetParam
• Call, CallAsm, CallC
• If
• While
• Break, Continue
• Return

imbc (intermediate bytecode) opcodes (from imbc.ml):

• Label of string (* Marks a label. *)
• Jump of string (* Unconditional jump. *)
• JumpIfNot of string (* Pops topmost and if it's zero, jumps. *)
• PushLocal of int (* Pushes a parameter (+5 onwards) or a local (-1 backwards) to the stack. *)
• PushPresetGlobal of int (* Pushes a preset global (index given) to the stack. *)
• PushUninitGlobal of int (* Pushes an uninitialized global (index given) to the stack. *)
• PushConstant of int (* Pushes a constant (index to the constant table). *)
• PushConstantByte of int (* Pushes a byte constant (raw value). *)
• Pop (* Pops the topmost item in stack. *)
• StoreLocal of int (* Stores the topmost to a parameter (+2 onwards) or to a local (-1 backwards). *)
• StorePresetGlobal of int (* Stores the topmost to a preset global (index given). *)
• StoreUninitGlobal of int (* Stores the topmost to an uninitialized global (index given). *)
• Return (* Leaves the function and returns the topmost item in stack. *)
• Call of string (* Calls a function. Pops the arguments and pushes the return value to the stack. Arguments are valuated from right to left. *)
• Call_Cdecl of int (* Calls a C function (cdecl calling convention). Argument count is given, and so much is popped (first is the function to call). The return value is pushed back. Arguments are valuated from right to left. *)
• Call_CdeclFP of int (* Calls a C function (cdecl calling convention). Argument count is given, and so much is popped (first is the function to call). The floating point return value is pushed back. Arguments are valuated from right to left. *)
• Call_Stdcall (* Calls a C function (stdcall calling convention). First argument is function to call, the rest are popped after call. The return value is pushed back. Arguments are valuated from right to left. *)
• Call_StdcallFP (* Calls a C function (stdcall calling convention). First argument is function to call, the rest are popped after call. The floating point return value is pushed back. Arguments are valuated from right to left. *)
• AsmFun of string * int * (string list) * (Program.fun_attrs) (* An opcode that is pure assembly. Parameters: descriptive name, parameter count and the assembly code itself. Arguments are valuated from left to right. *)
• fixed parameter opcodes (* same as above, but the value "read" is fixed and not read from the byte stream. *):
• FixedPushLocal of int
• FixedPushPresetGlobal of int
• FixedPushUninitGlobal of int
• FixedPushConstant of int
• FixedPushConstantByte of int
• FixedStoreLocal of int
• FixedStorePresetGlobal of int
• FixedStoreUninitGlobal of int
• FixedCall of string

A demoscene 4k intro, 'roseshank', included with the distribution contains a 'core' library with:

constants:

• true, false
• pi, pihalf, pi2
• integer arithmetic: negi (negation) + - * / %
• floating point arithmetic: negf + - * / absf sinf cosf sqrtf powf i2f (integer to float) f2i
• binary arithmetic: & (and) | (or) ^ (xor) ~ (not) 2> (shr) >> (sar)
• boolean operators: not or and
• integer boolean comparison operators: == != <i >i <=i >=i
• floating point boolean comparison operators: <f >f <=f >=f ==f !=f
• memory operators: mem_clear mem_copy mem_getb mem_setb mem_getw mem_setw mem_getd mem_setd
• memory allocation: get_alloc_ptr_ptr (Returns a pointer to the heap allocation pointer) get_alloc_ptr (Returns the current heap allocation pointer) set_alloc_ptr alloc_raw (Allocates x bytes of memory from the heap. The memory is cleared)
• Array functions: array_alloc (Allocates an array for x elements (each element is 4 bytes). The array is cleared) array_get (Gets x:th element of an array) array_set barray_get (Gets x:th element of a byte array) barray_set
• random numbers: RAND_MAX (constant 65535) gbl_rand_seed (variable) rand rand_range rand_rangef
• misc: debug_break f2d valid_float (Tests if a floating point value is valid (not NaN or +-INF))
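The allocation words above amount to a bump allocator: a single pointer moves forward through the heap, and "freeing" is done by resetting that pointer. A minimal Python sketch of that scheme (the class and method names are just illustrative mirrors of the words above, not the actual implementation):

```python
# Bump-allocator sketch: the heap is a bytearray and the allocation
# pointer only moves forward; alloc_raw clears the memory it hands
# out, matching the "memory is cleared" note above.
class BumpHeap:
    def __init__(self, size):
        self.mem = bytearray(size)
        self.alloc_ptr = 0          # current heap allocation pointer

    def get_alloc_ptr(self):
        return self.alloc_ptr

    def set_alloc_ptr(self, p):
        self.alloc_ptr = p          # e.g. roll back to an earlier mark

    def alloc_raw(self, nbytes):
        if self.alloc_ptr + nbytes > len(self.mem):
            raise MemoryError("heap exhausted")
        p = self.alloc_ptr
        self.mem[p:p + nbytes] = bytes(nbytes)  # clear the block
        self.alloc_ptr += nbytes
        return p

heap = BumpHeap(1024)
a = heap.alloc_raw(16)   # returns 0
b = heap.alloc_raw(8)    # returns 16
heap.set_alloc_ptr(a)    # "frees" both blocks at once
```

The appeal for 4k intros is obvious: allocation is one add, and there is no free list or GC to ship in the binary.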

---

## Win4k

A Forth VM made for demoscene 4k intros.

http://neoscientists.org/~plex/win4k/index.html

Some core words (from ccore.h):

compiler:

• : (define word)
• ; (term word)
• eval (evaluate)

debug:

• see (see word)

exec:

• call (Call word by reference)

mem:

• fill (Fill memory)

stack manipulation:

• dup, over, swap, rot, drop, pick
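Most of these stack words have one-line definitions. A sketch of their conventional Forth stack effects in Python (top of stack at the end of a list; this is the standard semantics, not Win4k's actual implementation):

```python
# Conventional Forth stack-word semantics; top of stack = end of list.
def dup(s):  s.append(s[-1])               # ( a -- a a )
def over(s): s.append(s[-2])               # ( a b -- a b a )
def swap(s): s[-1], s[-2] = s[-2], s[-1]   # ( a b -- b a )
def rot(s):  s.append(s.pop(-3))           # ( a b c -- b c a )
def drop(s): s.pop()                       # ( a -- )
def pick(s): s.append(s[-1 - s.pop()])     # ( ... n -- ... x_n ), 0 pick = dup

s = [1, 2, 3]
rot(s)        # s is now [2, 3, 1]
```

Note that `1 pick` duplicates the second item, i.e. it behaves like `over`.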

float math:

• fadd (float add), fsub, fmul, fdiv, fmod, fsin, fcos, fsqrt, flog

signed int math:

• + (int add), -, *, /, %, & (signed int and), | (or), ^ (xor)

cast:

• i2f (cast int to float), f2i

loops:

• BEGIN, UNTIL (This loop performs its body until a false condition is left on the stack)

heap:

• var (create variable <name>), const (create constant <name>)
• alloc (alloc on heap memory ( bytes -- ptr ))
• ! (store <value> in ), @ (load <value> from )

edit:

• clr (clear stack)
• echo (echo)
• cat (cat word)
• pwd (print working directory)
• inc (include source file)
• exit (;C exit console)
• ds.free (# of free items on datastack)
• ds.used (# of used items on datastack)
• ds.size (size of datastack)
• ds.init (init with #size)
• ds.reset (reset datastack)

repl stuff:

• i. (pop+dump as integer)
• f. (pop+dump as float)
• p. (pop+dump as ptr)
• s. (pop+dump as string ptr)
• .m (dump map)
• .s (dump stack)
• : (define word)
• ; (term word)
• .m ( see words)
• ? (help)

Some graphics constants (from cgl.h):

• GL_MODELVIEW, GL_PROJECTION, GL_POINTS, GL_LINES, GL_QUADS, GL_TRIANGLES, GL_BLEND, GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA, GL_COLOR_BUFFER_BIT, GL_RGB, GL_UNSIGNED_BYTE, GL_UNPACK_ALIGNMENT, GL_PACK_ALIGNMENT

Some graphics functions (from cgl.h):

• glViewport, glClear
• glMatrixMode, glPushMatrix, glPopMatrix
• glEnable, glDisable
• glBlendFunc, glOrtho, glFrustum
• glTranslatef, glScalef, glRotatef
• glPointSize
• glBegin, glEnd,
• glVertex2i, glVertex3i, glVertex3f
• glColor3ub, glColor4ub, glColor4f
• glCallList, glGenLists, glNewList, glEndList
• glDrawPixels
• glRasterPos2i, glRasterPos3i, glRasterPos2f, glRasterPos3f
• glPixelStorei

Some core bytecodes(?) (from code.c, or from [36]):

• code_call (void* wp );
• code_btrue(void* dest );
• code_push (void* imm32 );
• code_ret (void* mark );
• code_imm (void* imm_exec );
• code_throw(void* excid );
• code_exit (void* mark );
• code_dump (void* strptr );

## Cogol

https://github.com/QuestForTetris/tetris-writeup/blob/master/QFTASM_Cogol.md

" Cogol exists at a level just above assembly language. Generally, most lines in a Cogol program each correspond to a single line of assembly, but there are some important features of the language:

• Basic features include named variables with assignments and operators that have more readable syntax. For example, ADD A1 A2 3 becomes z = x + y;, with the compiler mapping variables onto addresses.
• Looping constructs such as if(){}, while(){}, and do{}while(); so the compiler handles branching.
• One-dimensional arrays (with pointer arithmetic), which are used for the Tetris board.
• Subroutines and a call stack. These are useful for preventing duplication of large chunks of code, and for supporting recursion.

The compiler (which I wrote from scratch) is very basic/naive, but I've attempted to hand-optimize several of the language constructs to achieve a short compiled program length.

Here are some short overviews about how various language features work:

Tokenization

The source code is tokenized linearly (single-pass), using simple rules about which characters are allowed to be adjacent within a token. When a character is encountered that cannot be adjacent to the last character of the current token, the current token is deemed complete and the new character begins a new token. Some characters (such as { or ,) cannot be adjacent to any other characters and are therefore their own token. Others (like > or =) are only allowed to be adjacent to themselves, and can thus form tokens like >>> or ==. Whitespace characters force a boundary between tokens but aren't themselves included in the result. The most difficult character to tokenize is - because it can both represent subtraction and unary negation, and thus requires some special-casing. "
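The adjacency-rule tokenizer described above is easy to sketch. The character classes below are illustrative guesses, not Cogol's actual rules:

```python
# Single-pass tokenizer sketch: a new character either extends the
# current token (if its class may sit next to the previous character)
# or ends it and starts a new one. Character classes are illustrative.
ALNUM = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_")
SELF_ONLY = set("><=&|+")   # may only neighbour themselves: >>>, ==, ...
LONERS = set("{}(),;")      # always a token of their own

def can_join(prev, ch):
    if prev in LONERS or ch in LONERS:
        return False
    if ch in SELF_ONLY or prev in SELF_ONLY:
        return prev == ch
    return prev in ALNUM and ch in ALNUM

def tokenize(src):
    tokens, cur = [], ""
    for ch in src:
        if ch.isspace():                    # whitespace forces a boundary
            if cur: tokens.append(cur)
            cur = ""
        elif cur and can_join(cur[-1], ch):
            cur += ch
        else:
            if cur: tokens.append(cur)
            cur = ch
    if cur: tokens.append(cur)
    return tokens

tokenize("z = x >>> 3;")   # -> ['z', '=', 'x', '>>>', '3', ';']
```

Per the quote, `-` would need special-casing on top of this scheme, since adjacency rules alone can't distinguish subtraction from unary negation.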

## Viua

https://viuavm.org/

" parallel, register-based, bytecode virtual machine

...

Feature list:

• compile-time checks of code performed by the assembler
• separate compilation of bytecode modules
• static and dynamic linking of bytecode modules
• flat, module-less structure of code (compilation of various languages' module systems with no problems)
• almost free-form function names, with no length limit
• first-class functions
• mechanisms for static and dynamic dispatch of function calls
• process-based concurrency
• inter-process communication via message passing
• running multiple processes in parallel (on preemptive schedulers)
• virtual process restarting on failure (using user-programmed watchdog process)
• exception-based error handling (every object is throwable and catchable)
• eager, scope-based resource management
• non-blocking foreign function interface (requires at least two-core CPU)
• clean FFI for straightforward cooperation with C++ code
• no-copy function returns
• by-copy and by-move parameter passing to functions
• safe pointers for inside-VM objects
• move semantics for many instructions
• closures with multiple capture mechanisms (by-copy, by-reference, by-move)
• deferred calls
• tail calls
• built-in basic data structures (text, vectors, structures) "

### Viua instructions

• arithmetic: add div fstore (Put a floating-point number inside a register) idec iinc istore izero (Put a 64 bit signed integer with the value of 0 into a register) mul sub
• bits: ashl ashr bitand bitat bitnot bitor bits bitset bitxor rol ror shl shr
• calls: arg (Fetch value from arguments register set) argc (DEPRECATED) call (Call a function) defer (Call a function when current frame is being discarded) frame (Prepare an empty frame for a function call) msg (Call a function of a type using dynamic dispatch (method call)) pamv (Move value to parameter slot) param (Copy value to parameter slot) return tailcall
• casts: itof stof stoi
• closures: capture (Capture a reference to a value for use inside a closure. References are transparent to user-code, and are automatically resolved by the runtime. Using references implies a switch from scope-based resource management into reference-counting scheme; the switch is done on a per-value basis) capturecopy (Capture copy of a value for use inside a closure) capturemove (Moves a value from surrounding scope into a closure) closure (Creates a closure from a function)
• concurrency: join process receive (Receive message) self (Return PID of current process) send (Send message to another process)
• control-flow: if (Checks if the source value is true or false, and takes a branch based on the state of said value; Canonical form: if condition:r-op jump-if-true:instruction-index jump-if-false:instruction-index ) jump
• error-handling: catch (Register a type-catcher in a try-frame) draw (Fetch exception value from exception register and put it in a general-purpose register) enter (Enter a block of instructions) leave (Leave a block of instructions) throw (Throw a value up the stack to signal an exception) try (Prepare exception frame) watchdog (Set up watchdog function for a process)
• functional: function (Create a function object)
• logic: and bool eq gt gte lt lte not or
• miscellaneous: echo (Print value to standard output without adding newline character) halt import nop print (Print value to standard output and adds a newline) ptr (Create a pointer to a value)
• register-manipulation: copy isnull move ress (Switch primary register set) swap
• resource-management: delete (Delete a value)
• strings: streq strstore
• structs: insert new remove
• text: text textat (Return copy of the character at a given index in text) textcommonprefix (Return length of common prefix of two text values) textcommonsuffix (Return length of common suffix of two text values) textconcat texteq textlength textsub (Return a copy of a part of the given text between given indexes)
• typesystem: attach (Attach function to a type to act as a method implementation) class (Create a new type) derive (Mark one type as derived from another) register (deprecated)
• lists ('vectors'): vat (Create a pointer to an element of a vector at given index) vec (create vector) vinsert vlen vpop vpush
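The canonical form of `if` above carries both branch targets, so control flow in the interpreter is always an explicit jump, never an implicit fallthrough. A toy dispatch loop in that style, reusing a few instruction names from the list (the tuple encoding is invented for this sketch):

```python
# Toy dispatch loop where the conditional names both destinations
# explicitly, as in the canonical form: if cond true-idx false-idx.
def run(program):
    r, pc = {}, 0
    while True:
        op, *args = program[pc]
        if op == "istore":      # istore dst value: put integer in register
            r[args[0]] = args[1]; pc += 1
        elif op == "iinc":      # iinc dst: increment register
            r[args[0]] += 1; pc += 1
        elif op == "lt":        # lt dst a b: dst := (a < b)
            r[args[0]] = r[args[1]] < r[args[2]]; pc += 1
        elif op == "if":        # if cond true-idx false-idx
            pc = args[1] if r[args[0]] else args[2]
        elif op == "jump":      # jump idx
            pc = args[0]
        elif op == "halt":
            return r

prog = [
    ("istore", 0, 0),   # 0: counter := 0
    ("istore", 1, 3),   # 1: limit := 3
    ("lt", 2, 0, 1),    # 2: r2 := counter < limit
    ("if", 2, 4, 6),    # 3: both branch targets are explicit
    ("iinc", 0),        # 4: counter += 1
    ("jump", 2),        # 5: back to the test
    ("halt",),          # 6
]
run(prog)   # registers end up as {0: 3, 1: 3, 2: False}
```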

## Project Oberon 2013 RISC.mod

https://www.inf.ethz.ch/personal/wirth/ProjectOberon/Sources/RISC.Mod.txt

• mov lsl asr ror and ann ior xor add sub mul div
• memory instructions: load input ReadInt eot store output
• branch instructions

## Cranelift

A low-level retargetable code generator with its own IR (the same sort of thing as LLVM). Designed for the WebAssembly project. Written in Rust. Formerly called Cretonne.

## RiSC-16

https://ece.umd.edu/~blj/RiSC/RiSC-isa.pdf

" ...the 16-bit Ridiculously Simple Computer (RiSC-16), a teaching ISA that is based on the Little Computer (LC-896) developed by Peter Chen at the University of Michigan. The RiSC-16 is an 8-register, 16-bit computer. All addresses are shortword-addresses (i.e. address 0 corresponds to the first two bytes of main memory, address 1 corresponds to the second two bytes of main memory, etc.). Like the MIPS instruction-set architecture, by hardware convention, register 0 will always contain the value 0. The machine enforces this: reads to register 0 always return 0, irrespective of what has been written there. "

instructions: add addi nand lui sw lw beq jalr
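The register-0 convention quoted above is easy to enforce in a simulator: just discard writes to register 0. A sketch covering a few of the eight instructions (tuple encodings are invented for the sketch; the real ISA uses 16-bit instruction words):

```python
# RiSC-16-style simulator sketch: register 0 reads as 0 no matter
# what is "written" to it, enforced by dropping writes to r0.
def run(prog):
    regs = [0] * 8
    def write(r, v):
        if r != 0:                  # writes to r0 are discarded
            regs[r] = v & 0xFFFF
    pc = 0
    while pc < len(prog):
        op, *a = prog[pc]
        if op == "addi":            # addi dst src imm
            write(a[0], regs[a[1]] + a[2])
        elif op == "add":           # add dst srcA srcB
            write(a[0], regs[a[1]] + regs[a[2]])
        elif op == "nand":          # nand dst srcA srcB
            write(a[0], ~(regs[a[1]] & regs[a[2]]))
        elif op == "beq":           # beq rA rB offset (relative branch)
            if regs[a[0]] == regs[a[1]]:
                pc += a[2]
        pc += 1
    return regs

regs = run([
    ("addi", 1, 0, 5),   # r1 := r0 + 5 = 5 (r0 is a free zero constant)
    ("addi", 0, 0, 7),   # attempted write to r0 is dropped
    ("add", 2, 1, 0),    # r2 := r1 + r0 = 5
])
```

Having a hardwired zero register is what lets such a tiny ISA get away without separate `mov` or `li` instructions: `addi dst, r0, imm` is a load-immediate.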

## WebKit's Air Assembly language

?? Is this the same as:

## IJVM

https://en.wikipedia.org/wiki/IJVM

" an instruction set architecture created by Andrew Tanenbaum for his MIC-1 architecture. It is used to teach assembly basics in his book Structured Computer Organization.

IJVM is mostly a subset of the JVM assembly language that is used in the Java platform. This instruction set is so simple that it's difficult to write complex programs in it (for example, no shift instructions are provided). " -- [37]

"

• BIPUSH: Push a byte onto stack
• DUP: Copy top word on stack and push onto stack
• ERR: Print an error message and halt the simulator
• GOTO name: Unconditional jump
• HALT: Halt the simulator
• IADD: Pop two words from stack; push their sum
• IAND: Pop two words from stack; push Boolean AND
• IFEQ: Pop word from stack and branch if it is zero
• IFLT: Pop word from stack and branch if it is less than zero
• IF_ICMPEQ: Pop two words from stack and branch if they are equal
• IINC: Add a constant value to a local variable
• ILOAD: Push local variable onto stack
• IN: Reads a character from the keyboard buffer and pushes it onto the stack. If no character is available, 0 is pushed
• INVOKEVIRTUAL: Invoke a method, pops object reference and optionally pops arguments from stack.
• IOR: Pop two words from stack; push Boolean OR
• IRETURN: Return from method with integer value
• ISTORE: Pop word from stack and store in local variable
• ISUB: Pop two words from stack; subtract the top word from the second to top word, push the answer;
• LDC_W: Push constant from constant pool onto stack
• NOP: Do nothing
• OUT: Pop word off stack and print it to standard out
• POP: Delete word from top of stack
• SWAP: Swap the two top words on the stack
• WIDE: Prefix instruction; next instruction has a 16-bit index
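To give a flavour of how small the core is, here is a toy evaluator for a handful of these instructions, implementing just the stack effects listed above (not the MIC-1 microarchitecture or its bytecode encoding):

```python
# Stack effects of a few IJVM instructions, on a Python list with
# the top of stack at the end.
def run(code):
    stack, pc = [], 0
    while pc < len(code):
        op, *arg = code[pc]
        if op == "BIPUSH":   stack.append(arg[0])          # push a byte
        elif op == "DUP":    stack.append(stack[-1])
        elif op == "IADD":   b, a = stack.pop(), stack.pop(); stack.append(a + b)
        elif op == "ISUB":   b, a = stack.pop(), stack.pop(); stack.append(a - b)
        elif op == "SWAP":   stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "IFEQ":          # pop word, branch if it is zero
            if stack.pop() == 0:
                pc = arg[0]
                continue
        elif op == "HALT":
            break
        pc += 1
    return stack

# 2 + 3, duplicate it, subtract it from itself: leaves [0]
run([("BIPUSH", 2), ("BIPUSH", 3), ("IADD",), ("DUP",), ("ISUB",), ("HALT",)])
```

As the quote says, with no shift instructions even simple bit manipulation becomes a multi-instruction exercise, which is the pedagogical point.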

There's also a set of special ARRAY instructions:

• NEWARRAY: Create new array on the heap. The count must be of type int. It is popped off the operand stack. The count represents the number of elements in the array to be created. Based on the Sun JVM-spec. The atype parameter is omitted.
• IALOAD: Load from int array. The arrayref must be of type reference and must refer to an array whose components are of type int. The index must be of type int. Both arrayref and index are popped from the operand stack. The int value in the component of the array at index is retrieved and pushed onto the operand stack. Part of the Sun JVM Spec.
• IASTORE: Store into int array. The arrayref must be of type reference and must refer to an array whose components are of type int. Both index and value must be of type int. The arrayref, index, and value are popped from the operand stack. The int value is stored as the component of the array indexed by index. Part of the Sun JVM Spec.
• W_OUT: Pop the value from the stack and write a decimal representation of it to the display. For debugging purposes. Not part of the JVM Spec. " -- [38]

## RCPU

https://speakerdeck.com/ddfreyne/simulating-a-cpu-with-ruby?slide=122

https://github.com/ddfreyne/rcpu

33 instructions (from slide 27):

• call ret
• push pop
• prn (print) halt
• j je jne jg jge jl jle
• cmp (compare and update FLAGS register) mod add sub mul div xor or and shl shr not
• mov li
• lw lh lb sw sh sb

(the github repo documentation also mentions another instruction, SLEEP)

registers:

• 8 GPRs
• PC
• FLAGS
• SP (stack pointer)
• BP (base pointer)
• R (return value)

## HSVM

https://github.com/endeav0r/hsvm

Instructions:

• add, sub, mul, div, mod, and, or, xor
• cmp, je, jne, jl, jle, jg, jge
• call, jmp, ret
• load, loadb, stor, storb, push, pop
• in, out
• mov, hlt, nop, syscall

## IRE

https://github.com/b3orn/ire/

16-bit register-based virtual machine

16 registers

50 instructions:

• NOP
• CPY STO LOD
• LODI
• PSH POP
• CAL JMP IRT
• CALI JMPI IRTI
• RET IRET
• NEG NOT
• CMP ADD SUB MUL DIV MOD AND OR XOR SHL SHR ROL ROR
• CMPI ADDI SUBI MULI DIVI MODI ANDI ORI XORI SHLI SHRI ROLI RORI
• IN OUT INI OUTI
• HLT HLTI
• ESC

## skx simplevm

https://github.com/skx/simple.vm

" The following are examples of all instructions:

:test
:label
goto 0x44ff       # Jump to the given address
goto label        # Jump to the given label
jmpnz label       # Jump to label if Zero-Flag is not set
jmpz label        # Jump to label if Zero-Flag is set

store #1, 33      # store 33 in register 1
store #2, "Steve" # Store the string "Steve" in register 1.
store #1, #3      # register1 is set to the contents of register #3.

exit              # Stop processing.
nop               # Do nothing

print_int #3      # Print the (integer) contents of register 3
print_str #3      # Print the (string) contents of register 3

system #3         # Call the (string) command stored in register 3

add #1, #2, #3    # Add register 2 + register 3 contents, store in reg 1
sub #1, #2, #3    # sub register 2 + register 3 contents, store in reg 1
mul #1, #2, #3    # multiply register 2 + register 3 contents, store in reg 1
concat #1, #2, #3 # store concatenated strings from reg2 + reg3 in reg1.

dec #2            # Decrement the integer in register 2
inc #2            # Increment the integer in register 2

string2int #3     # Change register 3 to have a string from an int
is_integer #3     # Does the given register have an integer content?
int2string #3     # Change register 3 to have an int from a string
is_string #3      # Does the given register have a string-content?

cmp #3, #4        # Compare contents of reg 3 & 4, set the Z-flag.
cmp #3, 42        # Compare contents of reg 3 with the constant 42. sets z.
cmp #3, "Moi"     # Compare contents of reg 3 with the constant string "Moi". sets z.

peek #1, #4       # Load register 1 with the contents of the address in #4.
poke #1, #4       # Set the address stored in register4 with the contents of reg1.
random #2         # Store a random integer in register #2.

push #1           # Store the contents of register #1 in the stack
pop #1            # Load register #1 with the contents of the stack.
call 0xFFEE       # Call the given address.
call my_label     # Call the defined label
ret               # Return from a called-routine. "

## LightVM

https://bitbucket.org/earlz/lightvm/src/master/doc/

## Rusty-jsyc

https://github.com/jwillbold/rusty-jsyc/blob/master/vm/vm.js

"

// Misc
PROPACCESS: 10,
FUNC_CALL: 11,
EVAL: 12,
CALL_BCFUNC: 13,
RETURN_BCFUNC: 14,
COPY: 15,
EXIT: 16,
COND_JUMP: 17,
JUMP: 18,
JUMP_COND_NEG: 19,
BCFUNC_CALLBACK: 20,
PROPSET: 21,
// Comparisons
COMP_EQUAL: 50,
COMP_NOT_EQUAL: 51,
COMP_STRICT_EQUAL: 52,
COMP_STRICT_NOT_EQUAL: 53,
COMP_LESS_THAN: 54,
COMP_GREATHER_THAN: 55,
COMP_LESS_THAN_EQUAL: 56,
COMP_GREATHER_THAN_EQUAL: 57,
// Math
MUL: 101,
MINUS: 102,
DIV: 103}; "

## LIL

https://www.researchgate.net/publication/220817745_LIL_An_Architecture-Neutral_Language_for_Virtual-Machine_Stubs

" Table 2: LIL General-Purpose Instructions (LIL syntax; description)

Arithmetic:

• v = o; Move
• v = uop o; Unary
• v = o1 op o2; Binary

Memory:

• st addr, o; Store
• inc addr; Increment
• cas addr=o1,o2, label; Atomic cmp and swap
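One plausible reading of the `cas` form above: if the value at `addr` equals `o1`, store `o2` and fall through; otherwise branch to `label`. Sketched in Python (single-threaded, so the atomicity requirement is vacuous here; in a real VM stub this would compile to a hardware compare-and-swap plus a conditional jump):

```python
# "cas addr=o1,o2, label": atomically, if mem[addr] == o1 then
# mem[addr] := o2 and fall through; else take the branch to label.
def cas(mem, addr, o1, o2):
    if mem[addr] == o1:
        mem[addr] = o2
        return True          # fall through to the next instruction
    return False             # branch to label

mem = {0x10: 7}
cas(mem, 0x10, 7, 9)   # succeeds: mem[0x10] becomes 9
cas(mem, 0x10, 7, 1)   # fails: value is now 9, so the branch is taken
```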

Calls:

• call o; Ordinary
• tailcall o; Tail call
• call.noret o; No return
• ret; Return

Branches:

• jc cond, label; Conditional
• j label; Always

Table 3: LIL VM-Specific Instructions (LIL syntax; description)

M2nFrames:

• push m2n o ( ,handles ) ? ;
• pop m2n;
• m2n save all;

JNI Handles:

• handles = o;

• v = ts;

Allocation:

• alloc v,n; "

## RockSalt RTL

https://web.archive.org/web/20170830050228/http://www.cse.psu.edu/~gxt29//paper/rocksalt.pdf

"RTL is a small RISC-like language for computing with bit-vectors. The language abstracts over an architecture's definition of machine state, which in the case of the x86 includes the various kinds of registers shown in Figure 1 as well as a memory, represented as a finite map from addresses to bytes. Internally, the language supports a countably infinite supply of local variables that can be used to hold intermediate bit-vector values. The RTL instruction set is sketched in Figure 3 and includes standard arithmetic, logic, and comparison operations for bit vectors; operations to sign/zero-extend and truncate bit vectors; an operation to load an immediate value into a local variable; operations to load/store values in local variables from/to registers; operations to load and store bytes into memory; and a special operation for non-deterministically choosing a bit-vector value of a particular size. We use dependent types to embed the language into Coq and ensure that only bit-vectors of the appropriate size are used in instructions."

Figure 3. The RTL Language

Machine locations: PC, EAX ..., CF ..., SS ...

Local variables: x, y, z ∈ identifier

Arithmetic operators:

• xor
• shl
• ...

Comparison operators:

• lt
• eq
• gt

RTL instructions:

• x := y op z
• x := y cmp z
• x := imm
• x := load loc
• store loc x
• x := Mem[y]
• Mem[x] := y
• x := choose

## UCSD p-code

https://en.wikipedia.org/wiki/P-code_machine#UCSD_p-Machine

## Transputer

" The transputer instruction set consisted of 8-bit instructions assembled from opcode and operand nibbles. The upper nibble contained the 16 possible primary instruction codes, making it one of the very few commercialized minimal instruction set computers. The lower nibble contained the one immediate constant operand, commonly used as an offset relative to the Workspace (memory stack) pointer. Two prefix instructions allowed construction of larger constants by prepending their lower nibbles to the operands of following instructions. Further instructions were supported via the instruction code Operate (Opr), which decoded the constant operand as an extended zero-operand opcode, providing for almost endless and easy instruction set expansion as newer implementations of the transputer were introduced.

All these instructions take a constant, representing an offset or an arithmetic constant. If this constant was less than 16, all these instructions coded to one byte.
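The prefix mechanism can be made concrete: each instruction ORs its low nibble into an operand register, and `pfix` then shifts that register left four bits (`nfix` complements it first, to build negative constants). A sketch of how a 12-bit constant is built from three one-byte instructions (this follows the commonly documented transputer operand-register behaviour; the 32-bit mask is an assumption for a 32-bit transputer):

```python
# Transputer operand building: every instruction ORs its low nibble
# into the operand register; pfix shifts it left 4 bits afterwards,
# nfix complements before shifting (for negative constants).
MASK = 0xFFFFFFFF

def pfix(oreg, nibble):
    return ((oreg | nibble) << 4) & MASK

def nfix(oreg, nibble):
    return (~(oreg | nibble) << 4) & MASK

def operand(oreg, nibble):       # nibble of the final (non-prefix) instruction
    return oreg | nibble

# ldc 0x123 assembles to three bytes: pfix 1; pfix 2; ldc 3
o = 0
o = pfix(o, 0x1)     # operand register: 0x10
o = pfix(o, 0x2)     # operand register: 0x120
operand(o, 0x3)      # -> 0x123
```

So an n-nibble constant costs exactly n one-byte instructions, which is why constants under 16 fit in a single byte as the text notes.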

The first 16 'secondary' zero-operand instructions (using the OPR primary instruction) were:

• REV: Reverse (swap two top items of register stack)
• LB: Load byte
• BSUB: Byte subscript
• ENDP: End process
• DIFF: Difference
• ADD: Add
• GCALL: General Call (swap top of stack and instruction pointer)
• IN: Input (receive message)
• PROD: Product
• GT: Greater Than (the only comparison instruction)
• WSUB: Word subscript
• OUT: Output (send message)
• SUB: Subtract
• STARTP: Start process
• OUTBYTE: Output byte (send one-byte message)
• OUTWORD: Output word (send one-word message)

...

Transputers were intended to be programmed using the programming language occam, based on the communicating sequential processes (CSP) process calculus. The transputer was built to run Occam specifically, more than contemporary CISC designs were built to run languages like Pascal or C.

...

The transputer did not include a memory management unit (MMU) or a virtual memory system...The transputer's lack of support for virtual memory inhibited the porting of mainstream variants of the Unix operating system" -- [39]

---

## VTIL

https://github.com/vtil-project/VTIL-Core/blob/0e0f34925590ff6946595653c3ab5564ff5e2259/VTIL-Architecture/arch/instruction_set.hpp

---

## REIL

https://github.com/Cr4sh/openreil#_3

---

## Schneider's IL

"a first-order language IL that restricts function application to tail position, and does not allow mutually recursive definitions. We give two semantic interpretations to the language: The functional interpretation IL/F given in Section 3.2 yields a standard, first-order functional language with a tail call restriction. The imperative interpretation IL/I given in Section 3.3 reveals a low-level imperative register transfer language.

3.1 Syntax of IL We emphasize the first-order nature of the language by using different alphabets for the names of variables and functions. x ranges over V, the alphabet for variables, which denote values of base type. f ranges over L, the alphabet for labels, which we use to denote first-order functions. The language is parametrized over a structure of simple expressions which we call operations and denote by Op. By convention, e ranges over Op. We assume a partial function [[·]] : (V → V) * V which evaluates an operation given the values of the variables. Evaluation of an operation cannot change the value of a variable, hence operations cannot have side effects. As [[·]] is a partial function, operation evaluation may err: [[e]] E = ⊥, but evaluation is deterministic.

Figure 3.1: Syntax of IL

Exp ∋ s, t ::=
• let x = e in s (first-order let)
• if x then {s} else {t} (conditional)
• x (value)
• fun f x_bar = s in t (second-order recursive let)
• f x_bar (application)

...

The following syntax is an alternative presentation of the syntax of IL, which generates the same abstract syntax trees, but is more suggestive of IL/I's imperative interpretation:

Exp ∋ s, t ::=
• x := e; s (assignment)
• if x then {s} else {t} (conditional)
• return x (value)
• block f x {s}; {t} (block definition)
• goto f x (jump + parallel assignment) "
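The two interpretations coincide because a tail call `f x_bar` is exactly a jump to block `f` preceded by a parallel assignment of the arguments. An illustration in Python (my own example, not Schneider's notation):

```python
# IL/F reading: a first-order function whose recursive call is in
# tail position.
def gauss_functional(n, acc=0):
    if n == 0:
        return acc
    return gauss_functional(n - 1, acc + n)   # tail call: f x_bar

# IL/I reading of the same program: the tail call becomes a jump
# back to the block header plus a parallel assignment to its
# parameters (Python's tuple assignment is exactly parallel).
def gauss_imperative(n):
    acc = 0
    while True:                    # block f
        if n == 0:
            return acc
        n, acc = n - 1, acc + n    # parallel assignment, then "goto f"
```

The tail-position restriction is what makes this translation total: a non-tail call would need a stack frame, which the imperative reading has no place for.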

---