NOP: No operation.
DUP: Duplicate the stack top. This is the only way to allocate stack space.
ONE: Shift the stack top left one bit, shifting one into the least significant bit.
ZERO: Shift the stack top left one bit, shifting zero into the least significant bit.
LOAD: Use the value on the stack top as a memory address; replace it with the contents of the referenced location.
POP: Store the value from the top of the stack in the memory location referenced by the second word on the stack; pop both.
SUB: Subtract the top value on the stack from the value below it, pop both and push the result.
JPOS: If the word below the stack top is positive, jump to the word pointed to by the stack top. In any case, pop both.

Jones' The Ultimate RISC

http://homepage.cs.uiowa.edu/~jones/arch/risc/

Minimal Forths

http://stackoverflow.com/questions/407987/what-are-the-primitive-forth-operators?rq=1

http://newsgroups.derkeiler.com/Archive/Comp/comp.lang.forth/2005-09/msg00337.html

http://code.google.com/p/subtle-stack/downloads/detail?name=jonesforth.s.txt

http://groups.google.com/group/comp.lang.forth/msg/10872cb68edcb526

" A long time ago, I had a book called "Threaded Interpretive Languages", published I think by Byte, that discussed how to implement a Forth-like language (I don't think they ever called it Forth) in Z80 assembly.

You may not have a Z80 handy, or want one, but the book might be instructive.

+1 for Loeliger's book. The book NEVER said the language was FORTH. Between that book and the Ritter & Walker paper in Byte around the same time, I learned enough to build a reentrant direct-threaded code interpreter for a test set project a few years later at General Dynamics. – John R. Strohm Dec 13 '09 at 4:41

+1 for that book which I love, it was the book which hooked me on the idea of inventing and studying languages – Andy Dent Apr 18 '10 at 10:16

I picked up that book used based on this recommendation. It's great! – Barry Brown

Var'aq

A 'Klingon' language. Stack-based.

4 data types:

number
function
list
string

Interesting stack operations:

REMEMBER: place an opaque 'flag' on the stack
FORGET: Clears the stack down to the flag and pops the flag.

Interesting control flow and variable operations:

by default, names are lookup up if defined, like LISP. There is a quote operator that allow the name itself to be specified. 'variables and procedures live in a common namespace, since the act of pushing the content of a variable is essentially the same as executing the variable's name.'
SET: assign to a name
EVAL
IFYES
IFNO
RETURN
REPEAT

Interesting list operations:

SHATTER 'Reduces a list to its component elements and pushes them on the stack in order'
CONSUME 'Pops all objects on the stack down to mark and returns them in a list.'
empty? 'Examines a list on the stack and returns 1 if its value is null (pagh), a 0 if it contains anything.'
split 'Pops a list off the stack and returns the first item and the rest of the list.'
cons 'Takes an object and adds it to the head of a list. Equivalent to the LISP cons operator. '

Interesting string operations:

strtie 'Concatenates the top two strings on the stack into one.'
compose 'Pops objects (executing proc objects if necessary) off the stack until a marker (placed by qaw) is hit and combines them into one string. '
streq 'Pops the top two strings on the stack, compares them, and returns 1 if identical, 0 if not. '
strcut 'Pops two values and a string and returns the section of the string between character startval and character endval.'
strmeasure 'Pops a string off the stack and returns its length in characters. '
explode 'Separates individual "words" in a string by whitespace.'

Links:

---

BF

The real name is Brainf*, a curse word.

There are 8 commands (from http://web.archive.org/web/20060202010638/http://www.iwriteiam.nl/Ha_BF.html ):

'>' : move the memory pointer to the next cell,
'<' : move the memory pointer to the previous cell,
'+' : increment the memory cell under the memory pointer,
'-' : decrement the memory cell under the memory pointer,
',' : fills the memory cell under the memory pointer with the ASCII value of next character from the input,
'.' : writes the contents of the memory cell under the memory pointer as a character with the corresponding ASCII value,
'[' : moves to the command following the matching ']', if the memory cell under the memory pointer is zero, and
']' : moves to the command following the matching '[', if the memory cell under the memory pointer is not zero.

Thue

http://esolangs.org/wiki/Thue

MicroMini VM

Stack machine.

https://github.com/jarcane/MicroMini/blob/master/docs/MicroMini.txt

Instructions:

NOP - No OPeration
HLT - HaLT?
DATA num
ADD, SUB
AND, OR, XOR, NOT
EQ?, LES?, GRT?
PUSH num
PUFA address - PUsh From Address
PUCA - PUsh CArry
PUTI - PUsh TImer
POP
POTA address - POp To Address
JMP address - JuMP?
JSR address - Jump to SubRoutine?
JIF address - Jump IF
RET - Return
TRMI, TRMO - TeRMinal? Input/Output

---

related to the York Reduceron

F-lite

F-lite is a subset of Haskell.

It is lazy. It is untyped. Constructor expressions like 'Cons x xs' can be used, but Cons does not have to be defined first.

It supports:

function application and (non-anonymous) function definition
uniform pattern matching. "Uniform" means that the order of equations doesn't matter. Patterns can be:
- expressed in a partial function definition style or a case expression style.
- nested
- incomplete
- note: "case patterns containing more than one constructor or wild-card matches", are sugar for nested case expressions, "each dealing with a single constructor component" [1]
'let' expressions (but with only variables, not patterns, on the LHS); multiple bindings within a let expression are permitted
cyclic data structures
finite precision integer and string literals (strings are Nil-terminated lists of characters)
the binary functions + -, and the binary boolean functions <= == /= (/= is inequality). These operators must be written in prefix form and cannot be partially applied.
the function 'emit' to print a character to the console and 'emitInt' to print an integer
'if' syntactic sugar for a 'case' expression with True and False cases

Semicolons are required to separate equations. The function 'main', which returns an integer, is run.

Code examples:

  append Nil ys = ys;
  append (Cons x xs) ys = Cons x (append xs ys);


  zipWith f Nil ys = Nil;
  zipWith f (Cons x xs) Nil = Nil;
  zipWith f (Cons x xs) (Cons y ys) = Cons (f x y) (zipWith f xs ys);

  init (Cons x Nil) = Nil;
  init (Cons x (Cons y ys)) = Cons x (init (Cons y ys));

  pow Nil = Cons Nil Nil;
  pow (Cons x xs) = let { rest = pow xs } in
                      append rest (map (Cons x) rest);


  repeat x = let { xs = Cons x xs } in xs;

  negate n = (-) 0 n;

  fromTo n m = case (<=) n m of {
                 True -> Cons n (fromTo ((+) n 1) m);
                 False -> Nil;
               };


  sayHi k = emit 'h' (emit 'i' (emit '!' k))

Full F-lite program to display the 10th fibonacci number:

  fib n = case (<=) n 1 of {
            True  -> 1;
            False -> (+) (fib ((-) n 2)) (fib ((-) n 1));
          };

  emitStr Nil k = k;
  emitStr (Cons x xs) k = emit x (emitStr xs k);

  main = emitStr "fib(10) = " (emitInt (fib 10) (emit '\n' 0));

F-lite has been extended with generics to form G-lite.

Links:

https://www.cs.york.ac.uk/fp/reduceron/jfp-reduceron.pdf Fig. 1. page 3
- https://www.cs.york.ac.uk/fp/reduceron/memos/Memo9.txt

The reduceron's "template code" (its ISA)

[2] section 2.7 page 8

Example from [3] page 33 PDF page 47:

Flite append function translated from Flite to Template code

Figure 3.3:

append xs ys = case (xs) of {
Cons x xs -> Cons x (append xs ys);
Nil       -> ys; }
main = append Nil Nil;

(0,[FUN 2 1,CON 0 1, PTR 0],[[CON 0 1]])
(2,[ARG 0, TAB 2, ARG 1],[])
(2,[ARG 1],[])
(4,[CON 2 0, ARG 0, PTR 0],[[FUN 2 1, ARG 1, ARG 3]])

" The template number zero (0,[FUN 2 1, CON 0 1, PTR 0],[[CON_0_1?]]) , is the main program, of arity zero. The atom FUN 2 1 indicates a function of arity two and index one, FUN 2 1 is applied to two arguments CON 0 1 which is the encoding for Nil . The atom PTR 0 is a reference to the off-spine application number zero, which contains the second argument of the function append.

Now, the template number one is the function append of arity two (2,[ARG 0, TAB 2, ARG 1],[]) . In the spine application ARG 0 is the case subject xs , the TAB 2 atom is a reference to the case alternatives, for example if the case subject ARG 0 evaluates to Nil then the second template will be evaluated. Otherwise, the third template will be evaluated ((i don't understand this, why shouln't 'CON 2 0' (cons) cause the first one to be evaluated, and 'CON 0 1' (nil) cause the second one to be evaluated?)). The atom ARG 1 is the free variable ((ys)) used in all case alternatives.

The template number three is the recursive case alternative (4,[CON 2 0, ARG 0, PTR 0], [[FUN_2_1,_ARG_1,_ARG_3?]]) . The arity of this template is four, the reason is the special encoding which includes the two arguments ARG 0 and ARG 1 representing x and xs , an argument for case table and one free variable. The first atom CON 2 0 is a constructor which expects two arguments and the index of this constructor is zero. The first argument is ARG 0 and the second is a pointer the application [FUN 2 1, ARG 1, ARG 3] . This application corresponds to the recursive application at the high-level (append xs ys) . "

Types of atoms:

data Atom
 = FUN Arity Int   // Function with arity a and address i
 | ARG Int         // Reference to the i-th the function argument
 | PTR Int         // Pointer to the i-th application
 | CON Arity Int   // Constructor of arity a and index i
 | INT Int         // Integer literal
 | PRI String      // Primitive function (named)
 | TAB Int         // Case table

[4] and [5]

"Each template can be viewed as a low-level encoding of a let expression. The let-bindings are encoded as off-spine applications and the body of the let as the spine-application . Each template ( arity;app;apps ) in Reduceron template code is a tuple of 'arity', spine application 'app' and off-spine applications 'apps'. Instead of names the results of the off-spine applications are referred to by position (eg. PTR 0 for the first off-spine application)." [6]

relating to the CON (constructor) atom, "A variant of Scott/Jansen encoding combines data construction and case selection. A constructor of datatype d translates to a function that in addition to components takes as arguments alternative continuations for each possible d-construction." [7].

The rules for how a computation proceeds in this template language are given in The Reduceron Reconfigured section 3 page 4, or The Reduceron Reconfigured talk slides at ICFP 2010, slides starting with 'Reduceron machine state', or Static Methods to Check Low-Level Code for a Graph Reduction Machine by Marco Polo Perez Cervantes (thesis) section 3.4 page 34 PDF page 48, or The Reduceron reconfigured and re-evaluated by Matthew Naylor and Colin Runciman section 3 page 11 (also at https://github.com/tommythorn/Reduceron/blob/666e78d1ec4a9fd42438a6f4f45bd861a5d2c1b0/semantics/Reduceron.lhs#L127).

To summarize, the machine has a program code memory (the templates), a heap (storing various expressions; PTRs are indices into the heap), a reduction stack (the current expression being reduced), and an update stack (stating where on the heap to write back the current expression after it has been reduced). When the TOS (top of stack) of the reduction stack is a PTR, the PTR is dereferenced and the corresponding expression is copied from the heap onto the TOS of the reduction stack, and an entry is pushed onto the update stack to remember to save the result back to that heap location after this expression has been reduced. At each step, we check to see if the TOS of the reduction stack has been reduced, and if so, we write it back to the heap as directed by the update stack. Otherwise, we evaluate primitives, apply functions, and turn case tables into functions to be applied.

Links:

---

Simpl/Com

" The syntax of Simpl is defined by the polymorphic data type ( 0 s, 0 p, 0 f ) com , where 0 s Definition 2.1 Syntax of Simpl I is the state space type, 0 p the type of procedure names and 0 f the type of faults. The constructors are listed in the following table, where c , c 1 and c 2 are Simpl commands of type ( 0 s, 0 p, 0 f ) com and the Boolean condition b and the guard g are modelled as state sets of type 0 s set .

Skip Do nothing Basic f Basic command, where f is a state-update: 0 s ⇒ 0 s Seq c 1 c 2 Sequential composition, also written as c 1 ; c 2 Cond b c 1 c 2 Conditional statement (if-then-else) While b c Loop Call p Procedure call, p is of type 0 p Guard f g c Guarded command, fault f is of type 0 f Throw Initiate abrupt termination Catch c 1 c 2 Handle abrupt termination Spec r Basic (nondeterministic) command defined by the relation r of type ( 0 s × 0 s ) set DynCom? c s Dynamic (state dependent) command, where c s has type 0 s ⇒ ( 0 s, 0 p, 0 f ) com

... Runtime faults are modelled by the guarded command Guard f g c , where g is the guard that watches for runtime faults that may occur in c . If g is violated the fault f is signalled and the computation stops.

The core procedure call Call p is parameterless. Parameter passing is imple- mented in section 2.4 .

Throw and Catch are the basic building blocks to model various kinds of abrupt termination, like break , continue , return and exceptions. In Catch c 1 c 2 the com- mand c 2 can be seen as handler. It is only executed if c 1 terminates abruptly.

Command Spec r defines the potential next states via the relation r . It can be used to model nondeterministic choice or in the context of refinement [ 72 , 28 ] it can represent a specification of a piece of code rather than an implementation.

The dynamic command DynCom? c s allows to abstract a command over the state space. It can be used for di ff erent purposes: To implement scoping, parameter passing, expressions with side-e ff ects or “real” dynamic construct like pointers to procedures or dynamic method invocations. Details follow in section 2.4 .

The set of Simpl commands is not minimal. A Skip can be implemented by Basic ( λ s . s ), the dynamic command DynCom? can be used to implement the condi- tional, and the Spec command can be used to implement the Basic command. This separation reflects the di ff erent concepts behind the commands."

...

2.4.2 Concrete Syntax for Simpl To improve readability of Simpl programs I introduce pseudo-code syntax. First of all let me address the assignment. With the state represented as record, an assignment m = m - 1 is mapped to a Basic command that performs a record update:

Basic ( λ s . s (

m : = m s − 1

)).

...

concrete syntax abstract syntax SKIP Skip m := e Basic ( λ s . s (

c 1 ; c 2 Seq c 1 c 2 IF b THEN c 1 ELSE c 2 FI Cond { s . b } c 1 c 2 IF b THEN c 1 FI Cond { s . b } c 1 Skip WHILE b DO c OD While { s . b } c TRY c 1 CATCH c 2 END Catch c 1 c 2 g → c Guard False g c {

m : = e	))
P	} { s . P }

--- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.388&rep=rep1&type=pdf , section 2.1, page 16, PDF page 30

" Basic models the atomic transitions in the embedded language’s state space, e.g. variable assignment or heap update.

Seq provides sequential composition,

Cond and While conditional and iterative statements respectively. It should be observed that the condition expressions are side-effect free.

Spec can be used to describe non-deterministic transitions, but we do not make any use of this in the following.

Function call semantics map to Call and DynCom? statements.

Throw and Catch give an exception mechanism with which one can model abrupt termination in language statements like return .

Finally, Guard s lead to failure semantics if guard expressions do not hold prior to execution of parameter programs. We drop the χ parameter when describing Guard programs in the following, as from a soundness point-of-view the cause of failure semantics is irrelevant " -- [8]

---

HITECO

A pedagogical self-compiling compiler.

[Chapter 21, Self-compiling Compilers, of Symbolic Processing In Pascal, by Professor Manfred von Thun

---

Bcompiler

A pedagogical self-compiling compiler by Edmund Grimley Evans.

https://github.com/certik/bcompiler

There were a number of bootstrapping stages, divided into 5 larger phases labeled HEX1 through HEX5.

In HEX4, there are labels, 'push constant', 'push label address', 'call label address', and the following predefined functions:

Stack manipulation: drop dup rot pick swap
Arithmetic: + - * / % 1 log
Comparisons: < <= == != >= >
Bitwise logic: &

^ ~

Memory access: @ = c@ c=
Flow of control, using immediate relative address: branch call
Flow of control, using stored absolute address: call
Address conversion: address
Array support: [] []& []= c[] c[]& c[]=
Access of arguments and variables: arg arg& arg= var var& var=
Function support: enter vars xreturnx xreturn0 xreturn1
Dynamic memory: wsize sbrk / malloc free realloc
System calls: exit in out

(and jump, which doesn't appear to be in that table; and presumably other stuff)

HEX5 contains (presumably in addition to HEX4): string constants, named variables, if/then/else, while loops with break and continue, and probably other stuff too.

Lox

Pedagogical language from Crafting Interpreters

http://craftinginterpreters.com/the-lox-language.html

C-style syntax. Dynamic. Automatic memory management.

Types: booleans, numbers, strings, nil

Statements and expressions.

Expressions:

+, -, *, /, also - (as arithmetic negation)

< <= > >= == !=

! and or ('and' and 'or' are short-circuiting and so are also control flow operators)

statements:

'print'

blocks {}

grouping by parens ()

variable declaration with 'var'

variable assignment with '='

if, while, for

functions. Define functions with 'fun' keyword. Functions are first-class. Closures.

classes with (single inheritance) subclasses, fields, methods, 'this', 'init' methods, 'super'. The syntax for class declaration is the 'class' keyword. The syntax for construction is just use the classname, eg "var breakfast = Breakfast();" The syntax for subclasses is '<', eg:

" class Brunch < Breakfast { drink() { print "How about a Blood Mary?"; } } "

stdlib: 'print', 'clock'

token types:

" enum TokenType? { Single-character tokens. LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE, COMMA, DOT, MINUS, PLUS, SEMICOLON, SLASH, STAR,

  // One or two character tokens.
  BANG, BANG_EQUAL,
  EQUAL, EQUAL_EQUAL,
  GREATER, GREATER_EQUAL,
  LESS, LESS_EQUAL,

  // Literals.
  IDENTIFIER, STRING, NUMBER,

  // Keywords.
  AND, CLASS, ELSE, FALSE, FUN, FOR, IF, NIL, OR,
  PRINT, RETURN, SUPER, THIS, TRUE, VAR, WHILE,

  EOF} " -- [9]

---

Lithp

https://web.archive.org/web/20050219051229/http://www.cis.rit.edu/~jerry/Software/lithp/README

"a basic, tiny LISP implementation. It was created to be a configuration/logic file format for a game I am working on."

# # foo comment - skipped during parsing
; ; foo comment - skipped during parsing
A (A) returns the value

Variables, Functions:

set (set A B ... ) sets A to B, evaluating both. Returns B.
setf (setf A B ... ) sets A to B, evaluating only B. Returns B.
setq (setq x 4 y 3) same as 'setf'
defun (defun A B C) defines a function called A with parameter list B and code block C. Returns A.
(A X Y) Calling the above (if there were two parameters.) Wrong number of parameters returns NIL w/o evaluating

Numbers:

+ (+ A B C) add a list
- (- A B C) subtract a list
* (* A B C) multiply a list
/ (/ A B C) divide a list
% (% A B) After A is divided by B, what's the leftover?
1+ (1+ A) returns the number plus one
1- (1- A) returns the number minus one

Comparisons:

< (< A B) returns T if A < B, otherwise NIL
<= (<= A B) returns T if A <= B, otherwise NIL
> (> A B) returns T if A > B, otherwise NIL
>= (>= A B) returns T if A >= B, otherwise NIL
= (= A B) returns T if A == B, otherwise NIL
and (and A B) eval's the arguments until it hits a NIL
or (or A B) eval's the arguments while args are NIL
not (not A) returns the opposite of A (T->NIL),(NIL->T)
null (null A) same as 'not'
if (if A B C) if A is true, then B else C. if there's no C, return NIL
unless (unless A B C) unless A is true, do B, C and any others.
when (when A B C) when A is true, do B, C and any others.
cond (cond (A B C)) if A is true then do B,C... otherwise, try the next set

Evaluations:

eval (eval (A B)) evaluates (A B) as if it were directly input
prog1 (prog1 A B C) evaluates all parts, returns the first's return value
prog2 (prog2 A B C) evaluates all parts, returns the second's return value
progn (progn A B C) evaluates all parts, returns the last's return value

Lists:

quote quote (A B) returns the element instead of evaluating it
' '(A B) same as 'quote'
atom (atom E) returns T if E evaluates to an atom, not a list
equal (equal A B) returns T if A and B have the same structure and atoms
car (car E) returns the head of list E
cdr (cdr E) returns all but the head of list E
cons (cons A B) returns a appended to the head of list B
list (list A B) returns a list of the elements as passed in

Output:

princ (princ A B) print out the list entries and atoms
terpri (terpri) print out a new line (terminate printing)

---

META II 'order codes'

the following is quoted from [10], following figures 6.1 and 6.2 of [11] D. V. Schorre, "META II: A Syntax-Oriented Compiler Writing Language," Proceedings of the 19th ACM National Conference (1964), ACM Press, New York, NY, 41.301-41.3011.] http://doi.acm.org/10.1145/800257.808896

Below is a list of all of the operators of the META II parsing machine as given in [Schorre64]. These are called order codes just as the assembly language operations of a computer of that era were called. This is the language that appears in the Code window. These are the op codes of the META II virtual machine. These are the objects and operations of the domain of META II compiling. They are complete in themselves. To move Meta II to a different language these and only these are the objects and operations that must be implemented in the target language.

Mnemonic Purpose Actions

TST 'string': Test for string in input. After skipping initial whitespace in the input string compare it to the string given as argument. If the comparison is met, skip over the string in the input and set switch. If not met, reset switch.
ID: Identifier token. After skipping initial whitespace in the input string, test if it begins with an identifier, i.e., a letter followed by a sequence of letters and/or digits. If so, copy the identifier to the token buffer; skip over it in the input; and set switch. If not, reset switch.
NUM: Number token. After deleting initial whitespace in the input string, test if it begins with an number, i.e., a sequence of digits. If so, copy the number to the token buffer; skip over it in the input; and set switch. If not, reset switch.
SR: String token. After deleting initial whitespace in the input string, test if it begins with an string, i.e., a single quote followed by a sequence of any characters other than a single quote followed by another single quote. If so, copy the string (including enclosing quotes) to the token buffer; skip over it in the input; and set switch. If not, reset switch.
CLL AAA: Call subroutine. Enter the subroutine beginning at label AAA. Push a stackframe of three cells on the stack containing:

    label 1 cell, initialized to blank
    label 2 cell, initialized to blank
    location cell, set to the return from call location

R: Return from subroutine. Return from CLL call to location on the top of the stack and pop the stackframe of three cells.
SET: Set switch. Set the switch to true.
B AAA: Unconditional branch. Branch unconditionally to the label AAA.
BT AAA: Branch if true. If the switch is true, branch to label AAA.
BF AAA: Branch if false. If the switch is false, branch to label AAA.
BE: Branch to error if false. If the switch is false, report error status and halt.
CL 'string': Copy literal. Copy the variable length string (without enclosing quotes) given as argument to the output buffer.
CI: Copy input. Copy the token buffer to the output buffer.
GN1: Generate label 1. If the label 1 cell in the top stackframe is blank, then generate a unique label and save it in the label 1 cell. In either case output the label.
GN2: Generate label 2. Same as for GN1 except acting on the label 2 cell.
LB: Move to label field. Set the output buffer column to the first column.
OUT: Output record. Output the output buffer with line terminator; clear it; and set the output buffer column to the eighth column.

META II Pseudo Operations (following figure 6.3 of [Schorre64]):

ADR AAA: Starting location. Pseudo operation that specifies the starting label to call.
END: End of source. Pseudo operation that specifies the end of input.

---

VALGOL II 'order codes'

I think VALGOL II is implemented with META II in the paper [12] D. V. Schorre, "META II: A Syntax-Oriented Compiler Writing Language," Proceedings of the 19th ACM National Conference (1964), ACM Press, New York, NY, 41.301-41.3011.] http://doi.acm.org/10.1145/800257.808896 .

NOTE 1: If the top item in the stack is an address, it is replaced by its contents before beginning this operation

NOTE 2: Same as note 1 but applies to the top two items

LD AAA: LOAD. Put the address aaa on top of the stack.
LDL NUMBER: LOAD LITERAL. Put the given number on top of the stack.
SET: SET. Put the integer 1 on top of the stack.
RST: RESET. Put the integer 0 on top of the stack.
ST: STORE. Store the contents of the register, STACK1, in the address which is on top of the stack, then pop up the stack.
ADS: ADD TO STORAGE; NOTE 1. Add the number on top of the stack to the number whose address is next to the top, and place the sum in the register STACK1, then store the contents of that register in that address and pop the top two items out of the stack.
SST: SAVE AND STORE; NOTE 1. Put the number on top of the stack into the register STACK1, then store the contents of that register in the address which is the next to top term of the stack. The top two items are popped out of the stack.
RSR: RESTORE. Put the contents of the register STACK1 on top of the stack.
ADD: ADD; NOTE 2. Replace the two numbers which are on top of the stack with their sum.
SUB: SUBTRACT; NOTE 2. Subtract the number which is on top of the stack fron the number which is next to the top, then replace then by this difference.
NLT: MULTIPLY; NOTE 2. Replace the two numbers which are on top of the stack with their product.
DIV: DIVIDE; NOTE 2. Divide the number which is next to the top of the stack by the number which is on top of the stack, then replace them by this quotient.
NEG: NEGATE NOTE 1. Change the sign of the number on top of the stack.
WHL: WHOLE. Truncate the number which is on top of the stack.
NOT: NOT. If the top term in the stack is the integer 0, then replace it with the integer 1 otherwise, replace it with the integer 0.
LEQ: LESS THAN OR EQUAL; NOTE 2. If the number which is next to the top of the stack is less than the number on top of the stack, then replace them with the integer 1. Otherwise, replace them with the integer 0.
LES: LESS THAN; NOTE 2. If the number which is next to the top of the stack is less than the number on top of the stack, then replace them with the integer 1, otherwise, replace them with the integer 0.
EQU: EQUAL NOTE 2. Compare the two numbers on top of the stack. replace them by the integer 1, if they are equal, or by the integer 0 if they are unequal.
B AAA. BRANCH. Branch to the address AAA.
BT AAA: BRANCH TRUE. Branch to the address AAA if the top term in the stack is not the integer 0, otherwise, continue in sequence. Do not pop up the stack.
BF AAA: BRANCH FALSE. Branch to the address AAA if the top term in the stack is the integer 0, otherwise, continue in sequence. Do not pop up the stack.
BTP AAA: BRANCH TRUE AND POP. Branch to the address aaa if the top term in the stack is not the integer 0. otherwise, continue in sequence. In either case, pop up the stack.
BFP AAA: BRANCH FALSE AND POP. Branch to the address AAA if the top term in the stack is the integer 0, otherwise, continue in sequence. In either case, pop up the stack.
CLL: CALL. Enter a procedure at the address which is below the flag.
LDF: LOAD FLAG. Put the address which is in the flag register on top of the stack, and put the address of the top of the stack into the flag register.
R: AAA: RETURN. Return from procedure.
AIA: ARRAY INCREMENT ADDRESS. Increment the address which is next to the top of the stack by the integer which is on top of the stack, and replace these by the resulting address.
FLP: FLIP. Interchange the top two terns of the stack.
POP: POP. Pop up the stack.
EDT: STRING EDIT; NOTE 1. Round the number which is on top of the stack to the nearest integer N. Move the given string into the print area so that its first character falls on print position N. In case this would cause characters to fall outside the print area, no movement takes place.
PRT: PRINT. Print a line, then space and clear the print area.
EJT: EJECT. Position the paper in the printer to the top line of the next page. Read the first n numbers from a card and store them beginning in the address which is next to the top of the stack. The integer N is the top term of the stack, pop out both the address and the integer. Cards are punched with up to 10 eight digit numbers. Decimal point is assumed to be in the middle. An 11-punch over the rightmost digit indicates a negative number.
RED: READ
WRT: WRITE. Print a line of n numbers beginning in the address which is next to the top of the stack. the integer n is the top term of the stack. Pop out both the address and the integer. twelve character positions are allowed for each number. There are four digits before and four digits after the decimal. Leading zeroes in front of the decimal are changed to blanks. If the number is negative, a minus sign is printed in the position before the first non-blank character.
HLT: HALT. Halt.

CONSTANT AND CONTROL CODES:

SP N: SPACE. N = 1--9; constant code producing n blank spaces.
BLK NNN: BLOCK. Produces a block of NNN eight character words.
END: END. Denotes the end of the program.

---

IBNIZ

A VM designed for size-constrained demoscene 'demos'; "a programming language that generates video and audio from very short strings of code".

" two-stack machine somewhat similar to Forth, but with the major execption that the stack is cyclical and also used at an output buffer. Also, as every IBNIZ program is implicitly inside a loop that pushes a set of loop variables on the stack on every cycle, even an empty program outputs something (i.e. a changing gradient as video and a constant sawtooth wave as audio). ... runs the program in two separate contexts with separate stacks and internal registers: the video context and the audio context ... The scheduling and main-looping logic is the only somewhat complex thing in IBNIZ. All the rest is very elementary, something that can be found as instructions in the x86 architecture or as words in the core Forth vocabulary. Basic arithmetic and stack-shuffling. Memory load and store. An if/then/else structure, two kinds of loop structures and subroutine definition/calling. Also an instruction for retrieving user input from keyboard or pointing device. Everything needs to be built from these basic building blocks. And yes, it is Turing complete, and no, you are not restricted to the rendering order provided by the implicit main loop. " [13]

" Word width: 32 bits (arithmetic in 16.16 fixed-point) Address space: 2^20 words (4 megabytes, ~3 of which free user RAM) Video output: 256x256 pixels at 60 Hz, 32 bits per pixel (VVUU.YYYY) Audio output: 61440 Hz mono (30720 Hz stereo), 16 bits per sample " [14]

Instructions (from [15]):

arithmetic: add sub mul div mod sqrt
bitwise: and or xor right (ROR) left (<<) neg
trig: sin atan
conditional arithmetic: isneg ispos iszero
stack: dup pop exchange (SWAP) trirot (ROT) pick bury
memory: load, store
data segment: getdata (can be used for reading the data segment sequentially. It fetches the given number of next bits from the data segment) startdata (end code segment, start data segment)
control if: if else endif
control loops:
- times (i0 --) loop i0 times (push i0 and insptr on rstack)
- loop decrement RSTACK[top-1], jump back if non-0
- index (-- i) load value from RSTACK[top-1]
- outdex (-- j) load value from RSTACK[top-3]
- do begin loop (push insptr on rstack)
- while (cond --) jump back if cond!=0
- jump (v --) set instruction pointer to value v
control subroutines:
- defsub (i --) define subroutine (store pointer to MEM[i]). The return stack is used for storing the return addresses when visiting subroutines. Defsub ('{') stores the address of the next instruction to the memory address given by the value on top of stack and then skips instructions until '}' or end-of-code is reached.
- return end of subroutine; pop insptr from rstack
- visit (i --) visit subroutine pointed to by MEM[i]
control return stack manipulation:
- retaddr (-- val) (val --) moves from rstack to stack
- pushtors (val --) (-- val) moves from stack to rstack
misc: mediaswitch (switches between audio and video context) whereami (pushes exterior loop variable(s) on stack) userin (get data from input device) terminate (halt)

---

TIS-100

From the game TIS-100

http://www.zachtronics.com/images/TIS-100P%20Reference%20Manual.pdf

https://unwiredben.livejournal.com/308306.html

https://alandesmet.github.io/TIS-100-Hackers-Guide/assembly.html

---

EXA

From the game EXAPUNKS

http://exapunks.wikia.com/wiki/EXA_instructions

---

LC-3

Pedagogical language

Variants:

"The LC-3b ISA describes a modified version of the LC-3 that includes the following changes:

    The machine's word size remains 16 bits, but its memory is now byte-addressable with the same address space.
    The LD and ST instructions (load and store data using PC-relative addressing) have been removed.
    The LDI and STI instructions (indirect loads and stores) use register-based addressing instead of PC-relative addressing.
    Two instructions, LDB and STB, have been added to manipulate individual bytes of memory; the other load and store instructions continue to act on entire words.
    The reserved opcode has been converted into a shift instruction, SHF, that supports arithmetic and logical shifts of arbitrary size in both directions.

" [16]

---

StoneKnifeForth

https://github.com/kragen/stoneknifeforth

bootstrap interpreter in Python:

https://github.com/kragen/stoneknifeforth/blob/master/tinyboot.py

(main part of?) StoneKnifeForth?:

https://github.com/kragen/stoneknifeforth/blob/master/tinyboot1.tbf1

"a very simple language inspired by Forth. It is not expected to be useful; instead, its purpose is to show how simple a compiler can be. The compiler is a bit under two pages of code when the comments are removed. " -- [17]

compile_time_dispatch = { '(': eat_comment, 'v': dataspace_label, ':': define_function, 'b': literal_byte, '#': literal_word, '*': allocate_space, '^': set_start_address, '[': start_conditional, ']': end_conditional, '{': start_loop, '}': end_loop, ' ': eat_byte, '\n': eat_byte, "'": skip_literal_byte, }

...

run_time_dispatch = { '(': jump, 'W': write_out, 'G': read_byte, 'Q': quit, '-': subtract, '<': less_than, '@': fetch, '!': store, # 'f': fetch_byte, not yet needed 's': store_byte, ';': return_from_function, '[': conditional, ']': nop, '{': nop, '}': loop, ' ': nop, '\n': nop, "'": literal_byte, }" -- https://github.com/kragen/stoneknifeforth/blob/master/tinyboot.py

" It’s a dumbed-down Forth with no compile-time execution, in which only the first byte of each word is significant, with some heterodox names for functions. Bytes preceded by ' are literals for their numerical value, e.g. “'A” is 65.

What that means, if you know how to program but don’t know Forth:

Each word of the program is generally a subroutine call. Words are separated by spaces. There are no types. There’s an implicit stack; words pop their input off the stack and push their output onto it. Numbers and literal bytes are pushed onto the stack.

So “2 3 + 4 -” calculates 2 + 3 - 4; “2 3 4 + -” calculates 2 - [3 + 4]. “'Z 'A -” calculates the ASCII value of “Z” minus the ASCII value of “A”, i.e. it calculates 25. [“+” isn’t in the base language implemented by the compiler; it’s a user-defined function. “-” is in the base language.]

“: foo” marks the beginning of a subroutine called “foo”; “;” is the return instruction. So “: . %flush u ;” defines a subroutine called “.” which consists first of calling “%flush”, then calling “u”, then returning.

“var foo” marks the beginning of a variable named “foo”; the word “foo” thereafter will push its address onto the stack. “# 37” includes the number 37 in the program at that point as a 32-bit number, so “var X # 0” creates a variable “X” and puts four bytes of zeroes there.

“@” and “!” fetch from and store to a memory address respectively. So “var X # 0 : dup X ! X @ X @ ;” creates a variable X, gives it 32 bits of storage initialized to 0, and defines a subroutine called “dup”. “dup” takes whatever is on the stack, stores it in those 32 bits of storage at “X”, then fetches it out of X twice, leaving two copies of it on the stack.

Combining all of the above, “var HERE # 0 : forward HERE @ + HERE ! ;” fetches the value at the location “HERE”, adds whatever was on the stack to it, and stores the result back at “HERE”.

Compile-time primitives: ( — defines a comment that extends to the next right-paren v — defines a dataspace label, like for a variable. b — compiles a literal byte, numerically, into data space.

— compiles a literal four-byte little-endian number into data space. ^ — the location where the program should start executing [everything else is just definitions] space, newline — ignored

: — defines a function [ — begins a conditional block, which consumes the value on top of the stack and is skipped if that value is zero ] — ends a conditional block, matching the latest unmatched [ { — begins a loop } — ends a loop — consumes the value on top of the stack and loops back to the matching { if it is nonzero

Run-time primitives: G — gets a byte from stdin W — given an address and size on the stack, writes the specified number of bytes to stdout Q — exits the program with a return code of 0

— subtracts two numbers. [Addition isn't a primitive because it’s easy to synthesize from subtraction.] @ — fetches a word from memory ! — stores to a word in memory whose address is on top of the stack ; — returns from a function < — a less-than test m — not used directly since the interpreter doesn’t implement it, this invokes an arbitrary system call with up to three arguments. The compiler’s implementations of G, W, and Q use this. s — stores to a byte in memory whose address is on top of the stack

Defined in this file, all necessarily run-time: h — the ELF header e — the location of the entry point word in the ELF header o — the location of the word in the ELF program header that specifies the .ORG S — the location of the word in the ELF program header that specifies filesize $ — the start of the area where the program gets compiled + — addition Z — the end of the predefined blob of bytes to output, with slight changes H — the point where the program is currently getting compiled f — increases H by its argument X, Y — two temporary variables d — DUP p — DROP x — SWAP

— an equality test

> — a greater-than test D — dispatch a byte being compiled R — compile the sequence `xchg %ebp, %esp`, buffered by a peephole optimizer . — add a byte to the program being compiled, flushing the peephole buffer u — add a byte to the program being compiled, used to implement the peephole buffer U — the peephole buffer itself, a boolean storing whether there’s a pending `xchg %ebp, %esp` T — table of types of definitions A — table of addresses of definitions F — multiply by four C — register a Colon definition in the tables V — register a Variable definition in the tables a — translate an address from compiler address space to user address space & — compile a byte if it’s a colon or variable definition O — boolean OR N — boolean NOT w — like G, but skips whitespace B — a buffer for w E — stick the item on top of the stack at H as four bytes, advancing H by 4 L — compile code to push the constant on the compile-time stack t — one step in base-ten decoding: x <- 10x + y n — parse a number from input and push it on the compile-time stack J — compile a forward conditional jump [ P — resolve a forward conditional jump ] j — compile a backwards negative conditional jump } M — the location in the compiler’s memory for the system call routine " -- [18]

" The entire “trimmed” source code is 1902 bytes, which is less than half the size of the nearest comparable project that I’m aware of, otccelf, which is 4748 bytes.

As demonstrated by the interpreter written in Python, the programming language itself is essentially machine-independent, with very few x86 quirks:

items on the stack are 32 bits in size;
arithmetic is 32-bit;
stack items stored in memory not only take up 4 bytes, but they are little-endian.

...

Why? To Know What I’m Doing ... "Alan Kay frequently expresses enthusiasm over the metacircular Lisp interpreter in the Lisp 1.5 Programmer’s Manual. ... But if you try to implement a Lisp interpreter in a low-level language by translating that metacircular interpreter into it...you run into a problem. The metacircular interpreter glosses over a large number of things that turn out to be nontrivial to implement — indeed, about half of the code is devoted to things outside of the scope of the metacircular interpreter. Here’s a list of issues that the Lisp 1.5 metacircular interpreter neglects, some semantic and some merely implementational:

memory management
argument evaluation order
laziness vs. strictness
other aspects of control flow
representation and comparison of atoms
representation of pairs
parsing
type checking and type testing (atom)
recursive function call and return
tail-calls
lexical vs. dynamic scoping

...

A metacircular compiler forces you to confront this extra complexity. Moreover, metacircular compilers are self-sustaining in a way that interpreters aren’t; once you have the compiler running, you are free to add features to the language it supports, then take advantage of those features in the compiler.

So this is a “stone knife” programming tool: bootstrapped out of almost nothing as quickly as possible.

Why? To Develop a Compiler Incrementally

When I wrote Ur-Scheme, my thought was to see if I could figure out how to develop a compiler incrementally, starting by building a small working metacircular compiler in less than a day, and adding features from there. I pretty much failed; it took me two and a half weeks to get it to compile itself successfully.

Part of the problem is that a minimal subset of R5RS Scheme powerful enough to write a compiler in — without making the compiler even larger due to writing it in a low-level language — is still a relatively large language. Ur-Scheme doesn’t have much arithmetic, but it does have integers, dynamic typing, closures, characters, strings, lists, recursion, booleans, variadic functions, let to introduce local variables, character and string literals, a sort of crude macro system, five different conditional forms (if, cond, case, and, or), quotation, tail-call optimization, function argument count verification, bounds checking, symbols, buffered input to keep it from taking multiple seconds to compile itself, and a library of functions for processing lists, strings, and characters. And each of those things was added because it was necessary to get the compiler to be able to compile itself. The end result was that the compiler is 90 kilobytes of source code, about 1600 lines if you leave out the comments.

Now, maybe you can write 1600 lines of working Scheme in a day, but I sure as hell can’t. It’s still not a very large compiler, as compilers go, but it’s a lot bigger than otccelf. So I hypothesized that maybe a simpler language, without a requirement for compatibility with something else, would enable me to get a compiler bootstrapped more easily.

So StoneKnifeForth? was born. It’s inspired by Forth, so it inherits most of Forth’s traditional simplifications:

no expressions;
no statements;
no types, dynamic or static;
no floating-point (although Ur-Scheme doesn’t have floating-point either);
no scoping, lexical or dynamic;
no dynamic memory allocation;

And it added a few of its own:

no names of more than one byte;
no user-defined IMMEDIATE words (the Forth equivalent of Lisp macros);
no interpretation state, so no compile-time evaluation at all;
no interactive REPL;
no do loops;
no else on if statements;
no recursion;
no access to the return stack;
no access to the filesystem, just stdin and stdout;
no multithreading.

Surprisingly, the language that results is still almost bearable to write a compiler in, although it definitely has the flavor of an assembler.

Unfortunately, I still totally failed to get it done in a day. It was 15 days from when I first started scribbling about it in my notebook until it was able to compile itself successfully, although git only shows active development happening on six of those days (including some help from my friend Aristotle). So that’s an improvement, but not as much of an improvement as I would like. At that point, it was 13k of source, 114 non-comment lines of code, which is definitely a lot smaller than Ur-Scheme’s 90k and 1600 lines. (Although there are another 181 lines of Python for the bootstrap interpreter.)

It’s possible to imagine writing and debugging 114 lines of code in a day, or even 300 lines. It’s still maybe a bit optimistic to think I could do that in a day, so maybe I need to find a way to increase incrementality further.

My theory was that once I had a working compiler, I could add features to the language incrementally and test them as I went. So far I haven’t gotten to that part.

...

Related work

Andre Adrian’s 2008 BASICO: http://www.andreadrian.de/tbng/

        is a small imperative programming language that is just powerful enough to compile itself (compiler bootstrapping).
        has no GOTO, but has while-break-wend and multiple return
        has C-like string handling.
        is implemented in less then 1000 source code lines for the compiler.
        produces real binary programs for x86 processors, not P-code or Byte-Code.
        uses the C compiler toolchain (assembler, linker)
        uses C library functions like printf(), getchar(), strcpy(), isdigit(), rand() for run-time support.

Actually it produces assembly, not executables.

Version 0.9 was released 15 Jul 2006. The 1000 source lines include a recursive-descent parser and a hand-coded lexer.

Sample code:

return 1 if ch is in s, 0 else func in(ch: char, s: array char): int var i: int begin i = 0 while s[i] # 0 do if ch = s[i] then return 1 endif i = i + 1 wend return 0 end

FIRST and THIRD, from the IOCCC entry.

Ian Piumarta’s COLA system.

Oberon.

Fabrice Bellard’s OTCC.

F-83.

eForth, especially the ITC eForth.

Jack Crenshaw’s Let’s Build a Compiler. This is a how-to book that walks you through an incrementally-constructed compiler for a toy language, written in Pascal, in about 340 pages of text. The text is really easy to read, but it will still take at least three to ten hours to read. It uses recursive-descent parsing, no intermediate representation, and it emits 68000 assembly code.

Ikarus.

Ur-Scheme.

PyPy?.

Bootstrapping a simple compiler from nothing: Edmund GRIMLEY EVANS 2001 http://web.archive.org/web/20061108010907/http://www.rano.org/bcompiler.html

" -- [19]

" /* StoneKnifeForth? (currently) only uses the following 23 * instructions, in decimal: * * - 15 157 192: setge %al * - 15 182 0: movzbl (%eax), %eax * - 15 190 192: movsbl %al, %eax * - 41 4 36: sub %eax, (%esp) * - 80: push %eax * - 88: pop %eax * - 89: pop %ecx * - 90: pop %edx * - 91: pop %ebx * - 116: jz (short, PC-relative) (1-byte operand) * - 117: jnz (short, PC-relative) (1-byte operand) * - 129 237: subtract immediate from %ebp (4-byte operand) * - 133 192: test %eax, %eax * - 135 236: xchg %ebp, %esp * - 136 8: mov %cl, (%eax) * - 137 229: mov %esp, %ebp * - 139 0: mov (%eax), %eax * - 143 0: popl (%eax) * - 184: load immediate %eax (e.g. mov $12345678, %eax) (4-byte operand) * - 195: ret * - 205 128: int $0x80 * - 232: PC-relative call to absolute address (4-byte operand) * - 254 200: dec %al * * Moreover, jz and jnz are always preceded two instructions * previously by a test instruction, and setge is always preceded two * instructions previously by a sub instruction, so of the flags, only * ZF and SF are used, and their state is never saved or restored or * transferred across jumps. * */ " -- [20]

" Register-use conventions: `%eax` — the top of the operand stack `%esp` — normally, points to the rest of the operand stack in memory, in the usual `%esp` way: growing downward, predecrement, postincrement. Temporarily swapped with `%ebp` during `call` and `ret` instructions. `%ebp` — normally, points to the return stack in memory, in the usual `%esp` way. `%ecx` — occasionally used as a scratch register Other registers are only used to pass arguments to Linux system calls. ) " -- [21]

Inertia register machine

https://github.com/Conceptual-Inertia/Inertia/

https://github.com/Conceptual-Inertia/Inertia/blob/master/docs/instr_file.txt

used in A Performance Survey on Stack-based and Register-based VirtualMachines

i am assuming the following details from reading the above-referenced instr_file.txt; i could have misunderstood:

16 instructions: add div mul ltn (less-than) eql and not or inc dec print load goto if (conditional skip) return call

4 registers

3-address code

the three operands fields have 4 modes: immediate constant, register direct, absolute address in memory, register indirect

Wirth p-code example machine

https://en.wikipedia.org/wiki/P-code_machine#Example_machine

CHIFIR

http://www.vpri.org/pdf/tr2015004_cuneiform.pdf

Table 1, pg. 10:

" Table 1. Chifir instruction set Opcode Semantics 1 PC←M?[A] 2 If M[B] = 0, then PC←M?[A] 3 M[A]←PC 4 M[A]←M[B] 5 M[A]←M[M[B]] 6 M[M[B]]←M[A] 7 M[A]←M[B] + M[C] 8 M[A]←M[B] - M[C] 9 M[A]←M[B]×M[C] 10 M[A]←M[B]÷M[C] 11 M[A]←M[B] modulo M[C] 12 If M[B] <M[C], then M[A]←1, else M[A]←0 13 M[A]←NOT(M[B] AND M[C]) 14 Refresh the screen 15 Get one character from the keyboard and store it in M[A] "

---

L2

https://github.com/murisi/L2

---

T3X

http://t3x.org/t3x/

---

INTCODE

http://www.gtoal.com/languages/bcpl/amiga/bcpl/booting.txt

---

Linoleum

https://en.wikibooks.org/wiki/Linoleum

---

Maru

https://www.piumarta.com/software/maru/

---

MIPSter

http://www.cs.uni-salzburg.at/~ck/content/publications/conferences/Onward17-Selfie.pdf

" MIPSter is a tiny subset of MIPS32. There are only 16 MIP-Ster instructions (nop,addu,subu,multu,divu,mfhi,mflo,slt,jr,syscall,addiu,lw,sw,beq,jal,j). MIPSter re-flects what is needed to implement C*. In particular, thereare no bitwise instructions and no sub-word data transferinstructions.The only issue is with signed integer literals in C* (32 bits)that are too large to fit as immediate values in MIPSter (16bits). Such values are decomposed by the compiler into thelargest numerals in base two that can still be loaded into reg-isters and then reconstructed with the available instructions.MIPSter itself can be taught in around two weeks of classeseven to audiences with little architecture background "

From the Selfie project

---

Woolite

Educational language in a compiler course.

http://www.cs.williams.edu/~tom/courses/434/project/woolite/Woolite.pdf

---

Pocketlang

https://thakeenathees.github.io/pocketlang/getting-started-learn-in-15-minutes.html https://thakeenathees.github.io/pocketlang/index.html

---

Bauer and Pretnar's Programming Languages Zoo

https://plzoo.andrej.com/index.html

"a collection of miniature programming languages which demonstrates various concepts and techniques used in programming language design and implementation"

---

nand2tetris Jack

https://www.csie.ntu.edu.tw/~cyy/courses/introCS/14fall/lectures/handouts/lec13_Jack_4up.pdf

" Jack language specification

Syntax
Data types
Variable kinds
Expressions
Statements
Subroutine calling
Program structure
Standard library

(for complete language specification, see the book).

Jack syntax ...

Jack data types

Primitive types (Part of the language; Realized by the compiler):

int
boolean
char

Abstract data types (Standard language extensions; Realized by the OS / standard library):

String
Array ... (extensible)

Jack variable kinds and scope ...

A Jack expression is any one of the following:

A constant
A variable name in scope (the variable may be static, field, local, or a parameter)
The keyword this, denoting the current object
An array element using the syntax arrayName[expression], where arrayNname is a variable name of type Array in scope
A subroutine call that returns a non-void type
An expression prefixed by one of the unary operators – or ~ :
‐expression (arithmetic negation)
~expression (logical negation)
An expression of the form expression op expression where op is one of the following:
+ ‐ * / (integer arithmetic operators)
&

(boolean and and or operators, bit-wise)

< > = (comparison operators)
( expression ) (an expression within parentheses

Jack Statements let varName = expression; or let varName[expression] = expression;

if (expression) { statements } else { statements }

while (expression) { statements } do function-or-method-call;

return expression; or return;

Jack subroutine calls General syntax: subroutineName(arg0, arg1, ... ) where each argument is a valid Jack expression Parameter passing is by-value (primitive types) or by-reference (object types)

Example 1: Consider the function (static method): function int sqrt(int n) This function can be invoked as follows: sqrt(17) sqrt(x) sqrt((b * b) – (4 * a * c)) sqrt(a * sqrt(c ‐ 17) + 3) Etc. In all these examples the argument value is computed and passed by-value

Example 2: Consider the method: method Matrix plus (Matrix other); If u and v were variables of type Matrix, this method can be invoked using: u.plus(v) The v variable is passed by-reference, since it refers to an object.

Noteworthy features of the Jack language

The (cumbersome) let keyword, as in let x = 0;
The (cumbersome) do keyword, as in do reduce();
No operator priority: 1 + 2 * 3 yields 9, since expressions are evaluated left-to-right; To effect the commonly expected result, use 1 + (2 * 3)
Only three primitive data types: int, boolean, char; In fact, each one of them is treated as a 16-bit value
No casting; a value of any type can be assigned to a variable of any type
Array declaration: Array x; followed by x = Array.new();
Static methods are called function
Constructor methods are called "constructor"; Invoking a constructor is done using the syntax ClassName?.new(argsList)

Q: Why did we introduce these features into the Jack language? A: To make the writing of the Jack compiler easy! Any one of these language features can be modified, with a reasonable amount of work, to make them conform to a more typical Java-like syntax.

Jack program structure class ClassName? { field variable declarations; static variable declarations; constructor type { parameterList ) { local variable declarations; statements } method type { parameterList ) { local variable declarations; statements } function type { parameterList ) { local variable declarations; statements } }

About this spec:

Every part in this spec can appear 0 or more times
The order of the field / static declarations is arbitrary
The order of the subroutine declarations is arbitrary
Each type is either int, boolean, char, or a class name

A Jack program:

Each class is written in a separate file (compilation unit)
Jack program = collection of one or more classes, one of which must be named Main
The Main class must contain at least one method, named main()

" -- [22]

Jack variable kinds are: static, field (ie an object's field), local, parameter variables

Hurl

https://ntietz.com/blog/introducing-hurl/

all control flow via resumable exceptions

GUIX boostrapping

https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/

proj-plbook-plChMinLangs

LOOP

GOTO

WHILE

BASIC

Visual Basic

Hypercard

Logo

Jones' A Minimal CISC

Jones' The Ultimate RISC

Minimal Forths

Var'aq

BF

Thue

MicroMini VM

related to the York Reduceron

F-lite

The reduceron's "template code" (its ISA)

Simpl/Com

HITECO

Bcompiler

Links

Lox

Lithp

META II 'order codes'

VALGOL II 'order codes'

IBNIZ

TIS-100

EXA

LC-3

StoneKnifeForth

— an equality test

Inertia register machine

Wirth p-code example machine

CHIFIR

L2

T3X

INTCODE

Linoleum

Maru

MIPSter

Woolite

Pocketlang

Bauer and Pretnar's Programming Languages Zoo

nand2tetris Jack

Hurl

GUIX boostrapping