Bayle Shanks's website: proj-oot-old-150618-ootCoreNotes2

thinking about the bit on models of computation i just wrote, some primitives for oot: pointer (turing head location in memory) array (turing tape, imperatives) goto, while (imps) integers in variables functions in variable, function application logic gates if 2d arrays (automata) copying an expression in one step (cow) expressions linear lists of code variable environemnts (lambda) mu successor functions with multiple arguments function composition recursion, primitive recursion expression trees selection structural equality nock operator

other common concepts: call stacks strings dicts lists regex db tables relation? types maps, folds what from logic programming? transition table grammar category theory? lattices? ordering relations?

can strings, cellular automata, etc be generalized with a 'topology' (or is this a geometry? i think topology) on top of something unstructured like a graph? is this like oot. pattersn/graph regexs/structural types? relatio to Chu spaces?

---

some more stack ops:

FLAG: place an opaque 'flag' on the stack CLEAR-TO-FLAG, ROT-TO-FLAG, etc; applied in the section from the top of the stack (TOS) down to the first flag

a 'flag' is like an Oot 'boundary' and maybe the command should be called STACKBOUND instead of FLAG.

(http://www.reocities.com/connorbd/varaq/varaqspec.html has this)

---

also might want list-to-stack and stack-to-list ops; e.g. SHATTER to 'Reduces a list to its component elements and pushes them on the stack in order' [1] and CONSUME n or CONSUME-TO-BOUNDARY to take the first n elements of the stack (or all elements up to a BOUNDARY) and return them as a list

(http://www.reocities.com/connorbd/varaq/varaqspec.html has this)

also http://www.reocities.com/connorbd/varaq/varaqspec.html has

empty? 'Examines a list on the stack and returns 1 if its value is null (pagh), a 0 if it contains anything.'
split 'Pops a list off the stack and returns the first item and the rest of the list.'
cons 'Takes an object and adds it to the head of a list. Equivalent to the LISP cons operator. '

---

http://www.reocities.com/connorbd/varaq/varaqspec.html also has some interesting stack-to-string ops:

strtie 'Concatenates the top two strings on the stack into one.'
compose 'Pops objects (executing proc objects if necessary) off the stack until a marker (placed by qaw) is hit and combines them into one string. '
streq 'Pops the top two strings on the stack, compares them, and returns 1 if identical, 0 if not. '
strcut 'Pops two values and a string and returns the section of the string between character startval and character endval.'
strmeasure 'Pops a string off the stack and returns its length in characters. '
explode 'Separates individual "words" in a string by whitespace.'

---

http://www.reocities.com/connorbd/varaq/varaqspec.html control flow and variables:

by default, names are lookup up if defined, like LISP. There is a quote operator that allow the name itself to be specified.
there is a 'set' operator to assign to a name
there is an 'eval'
there is an 'ifyes' and an 'ifno'
there is a 'return' (they call it 'escape')
there is a 'repeat'

---

[Self-proj-plbook-plChModelsOfComputation], and also the notes in section 'some ideas relating to generalizing/translating the instructions of Nock' in [Self-proj-oot-ootNotes12], make me think that defining an Oot core might just be picking one thing for each of these roles;

eg for Nock:

what is the thing that is like an atom in Nock
what is the thing that is like a cell in Nock
what is the thing that is like the state in Nock
what is the thing that is like the cons in Nock
what is the thing that is like the instruction in Nock
what is the thing that is like the select instruction in Nock
what is the thing that is like the arg instruction in Nock
what is the thing that is like the Nock instruction in Nock (the S combinator in combinatorial calc)
what is the thing that is like the isCell instruction in Nock
what is the thing that is like the addition instruction in Nock
what is the thing that is like the structural equality instruction in Nock

and for the "another writeup" section of [Self-proj-plbook-plChModelsOfComputation]:

what is the thing that provides data executed as code?
what is the thing that provides abstraction?
what is the thing that provides extensible memory?
what is the thing that provides conditionals?
what is the thing that provides looping?

and for the lambda calc model:

what is the thing that is like substitution in lambda calc?
what is the thing that is like variables in lambda calc? etc

and for the turing machine model:

what is the thing that is like pointers in turing machines?
what is the thing that is like instruction words in turing machines?
what is the thing that is like the tape in turing machines? etc

and for the imperative language with WHILE model:

what is the thing that is like while loops imperative languages with WHILE?

etc

and for Haskell:

what is the thing that is like lists in Haskell?
what is the thing that is partial function application in Haskell?
what is the thing that is currying in Haskell?
what is the thing that is types in Haskell?

etc

and for Python:

what is the thing that is like lists in Python?
what is the thing that is like dicts in Python?

etc

once (a) these lists are all completed, and (b) you have answer a few of those questions (for instance, if you only provided one thing like each thing in Nock, you'd be done, in terms of Turing completeness), you should have enough for a core language. Since i want Oot Core to not be orthogonal, but instead to have the sum of many basic models of computation, we should instead (c) answer almost all of the questions, for all the models of computation, and take the sum to be Oot Core.

i like the idea of starting with graphs as a data structure. The difference from dicts is just that in our library functions etc, we emphasize the whole graph as one object, not just those elements reachable from directly from the first node; and also that we provide a little more infrastructure than usual (labels, reification, etc) to give our graphs a little more 'flavor' (by which i mean, conventions for the interpretation of structural form). Similarly to how the 4-role (5-role, if you count the address of each instruction) system of syntax for Oot Assembly (see [Self-proj-oot-ootAssemblyNotes4]) gives a lot of 'flavor' to syntax.

---

so i guess mb:

first-class call stack, and first-class closures; goto, switch, but also while, repeat..until, if, foreach; grammar rule 'reductions'

probably not a full C-style 'for' b/c it seems kinda hacky, just syntactic sugar for a fancy 'while'. Perhaps a constrained 'for', though, which is always simply incrementing until it hits a limit

---

'return' to return in the middle of a fn; equivalent to assigning the return args to a temporary variable, then a JMP to the end of the fn, and insertion of a statement at the end that returns from those temp vars

(also 'break', 'continue')

---

keyword args and currying: hold onto future keyword until reqd args are satisfied

---

Oot needs a concept of:

a variable/document that has various versions/revisions (which might be arranged in a chain (linear time), a DAG (version control), etc
distinct from the concept of a particular value

---

starting with BASIC-like:

types: string, bignum integer, bignum float, graph, higher-order functions, ADTs
if
goto
some sort of concurrency thingamajig
function defn
function calling
- with keywords, defaults
macros (what else? syntax transforms? fexprs?)
foreach
continue
break
while
stuff for constructing graphs
patterns and pattern matching
bools
and/or/not
isEq
isPtrEq
var assignment
multidim arrays
error handling stuff
'is'
annotations (capable of representing transactions, labels, type annotations, function annotations, field annotations, assertions, etc)
comments
views

[2] is telling; webassembly is pretty small and the intersection of asmj.s, nacl, and webassembly is similar to webassembly. Suggests a slightly higher-level:

types:

int, float, void, low-level SIMD vectors, fn pointers, pointers

commands:

module
export
function defn
statements
- block
- expression (see below)
- if
- while, switch
- continue, break
- labelled
- goto (?but note webassembly comment below)
Function calling (statements and/or expressions)
- call(fn name literal): specify the callee by index into a function table
- call(fn ptr)
- addressof: obtain a function pointer value for a given function
- return
memory access statements
- heap access (load, store, alloca, load_with_offset, store_with_offset)
- load/store global
expressions
- expression
- literal
- get_local, set_local
- member
- ternary ?: operator
- comma: evaluate and ignore the result of the first operand, evaluate and return the second operand
Arithmetic expressions
- add, sub, mul
- neg
- copysign (? only in webassembly?)
- div, mod, rem
- shifts: shl, shr, sar
- <, <=
- ==, !=
- and, or, not, xor
- Infinity, NaN?
- various trig (note, not in webassembly mvp)
- exp, log, pow
- ceil, floor, round, trunc
- sqrt
- abs
- min, max
- E, PI
- clz (count leading zeros), ctz (count trailing zeros), popcnt (count number of ones)
- coercions
  - float/int (various)
  - ptr/int

other stuff that looks cool from nacl:

threading
- llvm.nacl.read.tp (i8* @llvm.nacl.read.tp(): Returns a read-only thread pointer. The value is controlled by the embedding sandbox’s runtime)
- : nacl: "Threading is explicitly supported through C11/C++11’s threading libraries as well as POSIX threads. Communication between threads should use atomic primitives as described in Memory Model and Atomics." [3]. global variable TLS models are not supported. volatile is not supported. See PNaCl C/C++ Language Support: Memory Model and Atomics.
freaky control flow
- C11 setjmp and longjmp
atomics
- llvm.nacl.atomic.store, llvm.nacl.atomic.load, llvm.nacl.atomic.rmw (read-modify-write operations; compile-time constant specifying the operation as one of: add, sub, or, and, xor, exchange), llvm.nacl.atomic.cmpxchg, llvm.nacl.atomic.fence, llvm.nacl.atomic.fence.all, llvm.nacl.atomic.is.lock.free: See atomic intrinsics
  - note to self: how do these compare to LLVM's? see http://llvm.org/releases/3.3/docs/LangRef.html#memmodel http://llvm.org/releases/3.3/docs/Atomics.html
instrinsics:
- memcpy, memmove, memset
- bswap (?)
- llvm.stacksave, llvm.stackrestore (These intrinsics are used to implement language features like scoped automatic variable sized arrays in C99. llvm.stacksave returns a value that represents the current state of the stack. This value may only be used as the argument to llvm.stackrestore, which restores the stack to the given state)
- llvm.trap (This intrinsic is lowered to a target dependent trap instruction, which aborts execution)
unreachable

webassembly comment regarding goto: " Break and continue statements can only target blocks or loops in which they are nested. This guarantees that all resulting control flow graphs are reducible, which leads to the following advantages:

    Simple and size-efficient binary encoding and compilation.
    Any control flow—even irreducible—can be transformed into structured control flow with the Relooper algorithm, with guaranteed low code size overhead, and typically minimal throughput overhead (except for pathological cases of irreducible control flow). Alternative approaches can generate reducible control flow via node splitting, which can reduce throughput overhead, at the cost of increasing code size (potentially very significantly in pathological cases).
    The signature-restricted proper tail-call feature would allow efficient compilation of arbitrary irreducible control flow."

" If you use only while-loops, for- loops, repeat-loops, if-then(-else), break, and continue, then your flow graph is reducible. ... Why Care About Back/Retreating Edges? 1. Proper ordering of nodes during iterative algorithm assures number of passes limited by the number of “nested” back edges. 2. Depth of nested loops upper-bounds the number of nested back edges " -- http://infolab.stanford.edu/~ullman/dragon/w06/lectures/dfa3.pdf

" Certain control-flow patterns make flowgraphs irreducible. Such patterns are called improper regions, and, in general, they are multiple-entry strongly connected components of a flowgraph ... as long as we avoid gotos, specifically, gotos into loop bodies (my note: based on a footnote at the bottom of the page, this includes loops made up of ifs and gotos) ... statistical studies..have shown that irreducibility is infrequent, even in languages that make...(little)...effort to restrict control-flow constructs...before structured programming became a serious concern...over 90% of a selection of real-world Fortran 77 programs have reducible control flow and all of a set of 50 large Fortran programs are reducible. Thus, irreducible flowgraphs occur rarely in practice...however, they do occur, so we must make sure our approaches to control- and data-flow analysis are capable of dealing with them.

There are three practical approaches to dealing with irreducibility...iterative data-flow analysis, as described in section 8.4, on irreducible regions...The second is to use a technique called node splitting that transforms irreducible regions into reducible ones. If irreducibility were common, node splitting could be very expensive, since it could exponentially increase the size of the flowgraph; fortunately, this is not the case in practice. The third approach is to perform an induced iteration on the lattice of monotone functions from the lattice to itself (see Sections 8.5 and 8.6). " -- Advanced Compiler Design Implementation, section 7.5, page 196 https://books.google.com/books?isbn=1558603204 Steven S. Muchnick - 1997

varargs not supported.

missing (todo see what this is again):

from NaCL?: icmp, fcmp, phi, select, insertelement, extractelement, bitcast, imul

missing (todo, replace vectors with arrays and structs with graphs? or both with graphs?)

from asm.js, sorta: arrays, structs

more on what a 'signature-restricted tail call' might be:

Tail Call Optimization in GCC

To address optimization needs, GCC has introduced the concept called "sibcalls" (short for "sibling calls"). Basically, a sibcall is an optimized tail call, but restricted to functions that share a similar signature. In other words, the compiler considers two functions as being siblings if they share the same structural equivalence of return types, as well as matching space requirements of their arguments.

For example, again assuming the ABI of ix86 Linux/UNIX, a tail call from function foo to bar would be a potential optimization candidate, because both share the same return type. Two arguments of type short are represented internally by using four bytes altogether, which is the same size as one long long argument:

int foo (long long a); int bar (short a, short b);

This restriction is necessary because in a chain of sibcalls, the top-most caller who is calling a tail-recursive function (and being unaware of it) attempts to clean up the callee's arguments when computation has finished. However, if this callee is allowed to exceed its own incoming argument space to perform a sibcall to a function requiring more argument stack space, you would end up with a memory leak when the top-most caller attempts to free the stack slots.

Another reason, related to this, is the shifting of the return address; see Figure 3. Apart from being a technical challenge, it would also break binary compatibility with other programs and libraries that do not support this notion of stack handling. Unaware third-party procedures would not be prepared to perform stack-shifting operations or, alternatively, to let the callee worry about the necessary memory clean ups. "

-- http://www.drdobbs.com/tackling-c-tail-calls/184401756