proj-plbook-plChMinLangs

Table of Contents for Programming Languages: a survey

and esoteric langs

LOOP

http://web.archive.org/web/20120311032544/http://loopgotowhile.eugenkiss.com/

GOTO

http://web.archive.org/web/20120311032544/http://loopgotowhile.eugenkiss.com/

WHILE

http://web.archive.org/web/20120311032544/http://loopgotowhile.eugenkiss.com/

BASIC

Because of its simplicity and historical popularity, BASIC gets its own chapter.

See chapter on BASIC.

Visual Basic

Tutorials:

Hypercard

Logo

some notes on this history of logo are in the middle of:

http://web.augsburg.edu/~crockett/210/210%20Lab%20Set%204/Reed_OOP_Epistemology.pdf

Jones' A Minimal CISC

http://david.carybros.com/html/minimal_instruction_set.html

8 instructions:

Jones' The Ultimate RISC

http://homepage.cs.uiowa.edu/~jones/arch/risc/

Minimal Forths

http://stackoverflow.com/questions/407987/what-are-the-primitive-forth-operators?rq=1

http://newsgroups.derkeiler.com/Archive/Comp/comp.lang.forth/2005-09/msg00337.html

http://code.google.com/p/subtle-stack/downloads/detail?name=jonesforth.s.txt

http://groups.google.com/group/comp.lang.forth/msg/10872cb68edcb526

" A long time ago, I had a book called "Threaded Interpretive Languages", published I think by Byte, that discussed how to implement a Forth-like language (I don't think they ever called it Forth) in Z80 assembly.

You may not have a Z80 handy, or want one, but the book might be instructive.

+1 for Loeliger's book. The book NEVER said the language was FORTH. Between that book and the Ritter & Walker paper in Byte around the same time, I learned enough to build a reentrant direct-threaded code interpreter for a test set project a few years later at General Dynamics. – John R. Strohm Dec 13 '09 at 4:41

+1 for that book which I love, it was the book which hooked me on the idea of inventing and studying languages – Andy Dent Apr 18 '10 at 10:16

I picked up that book used based on this recommendation. It's great! – Barry Brown

"

Var'aq

A 'Klingon' language. Stack-based.

4 data types:

Interesting stack operations:

Interesting control flow and variable operations:

Interesting list operations:

Interesting string operations:

Links:

---

BF

The real name is Brainf*, a curse word.

There are 8 commands (from http://web.archive.org/web/20060202010638/http://www.iwriteiam.nl/Ha_BF.html ):

Thue

http://esolangs.org/wiki/Thue

MicroMini VM

Stack machine.

https://github.com/jarcane/MicroMini/blob/master/docs/MicroMini.txt

Instructions:

---

related to the York Reduceron

F-lite

F-lite is a subset of Haskell.

It is lazy. It is untyped. Constructor expressions like 'Cons x xs' can be used, but Cons does not have to be defined first.

It supports:

Semicolons are required to separate equations. The function 'main', which returns an integer, is run.

Code examples:

  append Nil ys = ys;
  append (Cons x xs) ys = Cons x (append xs ys);


  zipWith f Nil ys = Nil;
  zipWith f (Cons x xs) Nil = Nil;
  zipWith f (Cons x xs) (Cons y ys) = Cons (f x y) (zipWith f xs ys);

  init (Cons x Nil) = Nil;
  init (Cons x (Cons y ys)) = Cons x (init (Cons y ys));

  pow Nil = Cons Nil Nil;
  pow (Cons x xs) = let { rest = pow xs } in
                      append rest (map (Cons x) rest);


  repeat x = let { xs = Cons x xs } in xs;

  negate n = (-) 0 n;

  fromTo n m = case (<=) n m of {
                 True -> Cons n (fromTo ((+) n 1) m);
                 False -> Nil;
               };


  sayHi k = emit 'h' (emit 'i' (emit '!' k))

Full F-lite program to display the 10th fibonacci number:

  fib n = case (<=) n 1 of {
            True  -> 1;
            False -> (+) (fib ((-) n 2)) (fib ((-) n 1));
          };

  emitStr Nil k = k;
  emitStr (Cons x xs) k = emit x (emitStr xs k);

  main = emitStr "fib(10) = " (emitInt (fib 10) (emit '\n' 0));

F-lite has been extended with generics to form G-lite.

Links:

The reduceron's "template code" (its ISA)

[2] section 2.7 page 8

Example from [3] page 33 PDF page 47:

Flite append function translated from Flite to Template code

Figure 3.3:

append xs ys = case (xs) of {
Cons x xs -> Cons x (append xs ys);
Nil       -> ys; }
main = append Nil Nil;

(0,[FUN 2 1,CON 0 1, PTR 0],[[CON 0 1]])
(2,[ARG 0, TAB 2, ARG 1],[])
(2,[ARG 1],[])
(4,[CON 2 0, ARG 0, PTR 0],[[FUN 2 1, ARG 1, ARG 3]])

" The template number zero (0,[FUN 2 1, CON 0 1, PTR 0],[[CON_0_1?]]) , is the main program, of arity zero. The atom FUN 2 1 indicates a function of arity two and index one, FUN 2 1 is applied to two arguments CON 0 1 which is the encoding for Nil . The atom PTR 0 is a reference to the off-spine application number zero, which contains the second argument of the function append.

Now, the template number one is the function append of arity two (2,[ARG 0, TAB 2, ARG 1],[]) . In the spine application ARG 0 is the case subject xs , the TAB 2 atom is a reference to the case alternatives, for example if the case subject ARG 0 evaluates to Nil then the second template will be evaluated. Otherwise, the third template will be evaluated ((i don't understand this, why shouln't 'CON 2 0' (cons) cause the first one to be evaluated, and 'CON 0 1' (nil) cause the second one to be evaluated?)). The atom ARG 1 is the free variable ((ys)) used in all case alternatives.

The template number three is the recursive case alternative (4,[CON 2 0, ARG 0, PTR 0], [[FUN_2_1,_ARG_1,_ARG_3?]]) . The arity of this template is four, the reason is the special encoding which includes the two arguments ARG 0 and ARG 1 representing x and xs , an argument for case table and one free variable. The first atom CON 2 0 is a constructor which expects two arguments and the index of this constructor is zero. The first argument is ARG 0 and the second is a pointer the application [FUN 2 1, ARG 1, ARG 3] . This application corresponds to the recursive application at the high-level (append xs ys) . "

Types of atoms:

data Atom
 = FUN Arity Int   // Function with arity a and address i
 | ARG Int         // Reference to the i-th the function argument
 | PTR Int         // Pointer to the i-th application
 | CON Arity Int   // Constructor of arity a and index i
 | INT Int         // Integer literal
 | PRI String      // Primitive function (named)
 | TAB Int         // Case table

[4] and [5]

"Each template can be viewed as a low-level encoding of a let expression. The let-bindings are encoded as off-spine applications and the body of the let as the spine-application . Each template ( arity;app;apps ) in Reduceron template code is a tuple of 'arity', spine application 'app' and off-spine applications 'apps'. Instead of names the results of the off-spine applications are referred to by position (eg. PTR 0 for the first off-spine application)." [6]

relating to the CON (constructor) atom, "A variant of Scott/Jansen encoding combines data construction and case selection. A constructor of datatype d translates to a function that in addition to components takes as arguments alternative continuations for each possible d-construction." [7].

The rules for how a computation proceeds in this template language are given in The Reduceron Reconfigured section 3 page 4, or The Reduceron Reconfigured talk slides at ICFP 2010, slides starting with 'Reduceron machine state', or Static Methods to Check Low-Level Code for a Graph Reduction Machine by Marco Polo Perez Cervantes (thesis) section 3.4 page 34 PDF page 48, or The Reduceron reconfigured and re-evaluated by Matthew Naylor and Colin Runciman section 3 page 11 (also at https://github.com/tommythorn/Reduceron/blob/666e78d1ec4a9fd42438a6f4f45bd861a5d2c1b0/semantics/Reduceron.lhs#L127).

To summarize, the machine has a program code memory (the templates), a heap (storing various expressions; PTRs are indices into the heap), a reduction stack (the current expression being reduced), and an update stack (stating where on the heap to write back the current expression after it has been reduced). When the TOS (top of stack) of the reduction stack is a PTR, the PTR is dereferenced and the corresponding expression is copied from the heap onto the TOS of the reduction stack, and an entry is pushed onto the update stack to remember to save the result back to that heap location after this expression has been reduced. At each step, we check to see if the TOS of the reduction stack has been reduced, and if so, we write it back to the heap as directed by the update stack. Otherwise, we evaluate primitives, apply functions, and turn case tables into functions to be applied.

Links:

---

Simpl/Com

" The syntax of Simpl is defined by the polymorphic data type ( 0 s, 0 p, 0 f ) com , where 0 s Definition 2.1 Syntax of Simpl I is the state space type, 0 p the type of procedure names and 0 f the type of faults. The constructors are listed in the following table, where c , c 1 and c 2 are Simpl commands of type ( 0 s, 0 p, 0 f ) com and the Boolean condition b and the guard g are modelled as state sets of type 0 s set .

Skip Do nothing Basic f Basic command, where f is a state-update: 0 s ⇒ 0 s Seq c 1 c 2 Sequential composition, also written as c 1 ; c 2 Cond b c 1 c 2 Conditional statement (if-then-else) While b c Loop Call p Procedure call, p is of type 0 p Guard f g c Guarded command, fault f is of type 0 f Throw Initiate abrupt termination Catch c 1 c 2 Handle abrupt termination Spec r Basic (nondeterministic) command defined by the relation r of type ( 0 s × 0 s ) set DynCom? c s Dynamic (state dependent) command, where c s has type 0 s ⇒ ( 0 s, 0 p, 0 f ) com

... Runtime faults are modelled by the guarded command Guard f g c , where g is the guard that watches for runtime faults that may occur in c . If g is violated the fault f is signalled and the computation stops.

The core procedure call Call p is parameterless. Parameter passing is imple- mented in section 2.4 .

Throw and Catch are the basic building blocks to model various kinds of abrupt termination, like break , continue , return and exceptions. In Catch c 1 c 2 the com- mand c 2 can be seen as handler. It is only executed if c 1 terminates abruptly.

Command Spec r defines the potential next states via the relation r . It can be used to model nondeterministic choice or in the context of refinement [ 72 , 28 ] it can represent a specification of a piece of code rather than an implementation.

The dynamic command DynCom? c s allows to abstract a command over the state space. It can be used for di ff erent purposes: To implement scoping, parameter passing, expressions with side-e ff ects or “real” dynamic construct like pointers to procedures or dynamic method invocations. Details follow in section 2.4 .

The set of Simpl commands is not minimal. A Skip can be implemented by Basic ( λ s . s ), the dynamic command DynCom? can be used to implement the condi- tional, and the Spec command can be used to implement the Basic command. This separation reflects the di ff erent concepts behind the commands."

...

2.4.2 Concrete Syntax for Simpl To improve readability of Simpl programs I introduce pseudo-code syntax. First of all let me address the assignment. With the state represented as record, an assignment m = m - 1 is mapped to a Basic command that performs a record update:

Basic ( λ s . s (

m : = m s − 1)).

...

concrete syntax abstract syntax SKIP Skip m := e Basic ( λ s . s (

c 1 ; c 2 Seq c 1 c 2 IF b THEN c 1 ELSE c 2 FI Cond { s . b } c 1 c 2 IF b THEN c 1 FI Cond { s . b } c 1 Skip WHILE b DO c OD While { s . b } c TRY c 1 CATCH c 2 END Catch c 1 c 2 g → c Guard False g c {
m : = e ))
P } { s . P }

--- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.388&rep=rep1&type=pdf , section 2.1, page 16, PDF page 30

" Basic models the atomic transitions in the embedded language’s state space, e.g. variable assignment or heap update.

Seq provides sequential composition,

Cond and While conditional and iterative statements respectively. It should be observed that the condition expressions are side-effect free.

Spec can be used to describe non-deterministic transitions, but we do not make any use of this in the following.

Function call semantics map to Call and DynCom? statements.

Throw and Catch give an exception mechanism with which one can model abrupt termination in language statements like return .

Finally, Guard s lead to failure semantics if guard expressions do not hold prior to execution of parameter programs. We drop the χ parameter when describing Guard programs in the following, as from a soundness point-of-view the cause of failure semantics is irrelevant " -- [8]

---

HITECO

A pedagogical self-compiling compiler.

[Chapter 21, Self-compiling Compilers, of Symbolic Processing In Pascal, by Professor Manfred von Thun

---

Bcompiler

A pedagogical self-compiling compiler by Edmund Grimley Evans.

https://github.com/certik/bcompiler

There were a number of bootstrapping stages, divided into 5 larger phases labeled HEX1 through HEX5.

In HEX4, there are labels, 'push constant', 'push label address', 'call label address', and the following predefined functions:

^ ~

(and jump, which doesn't appear to be in that table; and presumably other stuff)

HEX5 contains (presumably in addition to HEX4): string constants, named variables, if/then/else, while loops with break and continue, and probably other stuff too.

Links

---

Lox

Pedagogical language from Crafting Interpreters

http://craftinginterpreters.com/the-lox-language.html

C-style syntax. Dynamic. Automatic memory management.

Types: booleans, numbers, strings, nil

Statements and expressions.

Expressions:

+, -, *, /, also - (as arithmetic negation)

< <= > >= == !=

! and or ('and' and 'or' are short-circuiting and so are also control flow operators)

statements:

'print'

blocks {}

grouping by parens ()

variable declaration with 'var'

variable assignment with '='

if, while, for

functions. Define functions with 'fun' keyword. Functions are first-class. Closures.

classes with (single inheritance) subclasses, fields, methods, 'this', 'init' methods, 'super'. The syntax for class declaration is the 'class' keyword. The syntax for construction is just use the classname, eg "var breakfast = Breakfast();" The syntax for subclasses is '<', eg:

" class Brunch < Breakfast { drink() { print "How about a Blood Mary?"; } } "

stdlib: 'print', 'clock'

token types:

" enum TokenType? { Single-character tokens. LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE, COMMA, DOT, MINUS, PLUS, SEMICOLON, SLASH, STAR,

  // One or two character tokens.
  BANG, BANG_EQUAL,
  EQUAL, EQUAL_EQUAL,
  GREATER, GREATER_EQUAL,
  LESS, LESS_EQUAL,
  // Literals.
  IDENTIFIER, STRING, NUMBER,
  // Keywords.
  AND, CLASS, ELSE, FALSE, FUN, FOR, IF, NIL, OR,
  PRINT, RETURN, SUPER, THIS, TRUE, VAR, WHILE,
  EOF} " -- [9]

---

Lithp

https://web.archive.org/web/20050219051229/http://www.cis.rit.edu/~jerry/Software/lithp/README

"a basic, tiny LISP implementation. It was created to be a configuration/logic file format for a game I am working on."

Variables, Functions:

Numbers:

Comparisons:

Evaluations:

Lists:

Output:

---

META II 'order codes'

the following is quoted from [10], following figures 6.1 and 6.2 of [11] D. V. Schorre, "META II: A Syntax-Oriented Compiler Writing Language," Proceedings of the 19th ACM National Conference (1964), ACM Press, New York, NY, 41.301-41.3011.] http://doi.acm.org/10.1145/800257.808896

Below is a list of all of the operators of the META II parsing machine as given in [Schorre64]. These are called order codes just as the assembly language operations of a computer of that era were called. This is the language that appears in the Code window. These are the op codes of the META II virtual machine. These are the objects and operations of the domain of META II compiling. They are complete in themselves. To move Meta II to a different language these and only these are the objects and operations that must be implemented in the target language.

Mnemonic Purpose Actions

    label 1 cell, initialized to blank
    label 2 cell, initialized to blank
    location cell, set to the return from call location

META II Pseudo Operations (following figure 6.3 of [Schorre64]):

---

VALGOL II 'order codes'

I think VALGOL II is implemented with META II in the paper [12] D. V. Schorre, "META II: A Syntax-Oriented Compiler Writing Language," Proceedings of the 19th ACM National Conference (1964), ACM Press, New York, NY, 41.301-41.3011.] http://doi.acm.org/10.1145/800257.808896 .

NOTE 1: If the top item in the stack is an address, it is replaced by its contents before beginning this operation

NOTE 2: Same as note 1 but applies to the top two items

CONSTANT AND CONTROL CODES:

---

IBNIZ

A VM designed for size-constrained demoscene 'demos'; "a programming language that generates video and audio from very short strings of code".

" two-stack machine somewhat similar to Forth, but with the major execption that the stack is cyclical and also used at an output buffer. Also, as every IBNIZ program is implicitly inside a loop that pushes a set of loop variables on the stack on every cycle, even an empty program outputs something (i.e. a changing gradient as video and a constant sawtooth wave as audio). ... runs the program in two separate contexts with separate stacks and internal registers: the video context and the audio context ... The scheduling and main-looping logic is the only somewhat complex thing in IBNIZ. All the rest is very elementary, something that can be found as instructions in the x86 architecture or as words in the core Forth vocabulary. Basic arithmetic and stack-shuffling. Memory load and store. An if/then/else structure, two kinds of loop structures and subroutine definition/calling. Also an instruction for retrieving user input from keyboard or pointing device. Everything needs to be built from these basic building blocks. And yes, it is Turing complete, and no, you are not restricted to the rendering order provided by the implicit main loop. " [13]

" Word width: 32 bits (arithmetic in 16.16 fixed-point) Address space: 2^20 words (4 megabytes, ~3 of which free user RAM) Video output: 256x256 pixels at 60 Hz, 32 bits per pixel (VVUU.YYYY) Audio output: 61440 Hz mono (30720 Hz stereo), 16 bits per sample " [14]

Instructions (from [15]):

---

TIS-100

From the game TIS-100

http://www.zachtronics.com/images/TIS-100P%20Reference%20Manual.pdf

https://unwiredben.livejournal.com/308306.html

https://alandesmet.github.io/TIS-100-Hackers-Guide/assembly.html

---

EXA

From the game EXAPUNKS

http://exapunks.wikia.com/wiki/EXA_instructions

---

LC-3

Pedagogical language

Variants:

"The LC-3b ISA describes a modified version of the LC-3 that includes the following changes:

    The machine's word size remains 16 bits, but its memory is now byte-addressable with the same address space.
    The LD and ST instructions (load and store data using PC-relative addressing) have been removed.
    The LDI and STI instructions (indirect loads and stores) use register-based addressing instead of PC-relative addressing.
    Two instructions, LDB and STB, have been added to manipulate individual bytes of memory; the other load and store instructions continue to act on entire words.
    The reserved opcode has been converted into a shift instruction, SHF, that supports arithmetic and logical shifts of arbitrary size in both directions.

" [16]

---

StoneKnifeForth

https://github.com/kragen/stoneknifeforth

bootstrap interpreter in Python:

(main part of?) StoneKnifeForth?:

"a very simple language inspired by Forth. It is not expected to be useful; instead, its purpose is to show how simple a compiler can be. The compiler is a bit under two pages of code when the comments are removed. " -- [17]