proj-oot-old-150618-ootCoreNotes2

thinking about the bit on models of computation i just wrote, some primitives for oot: pointer (turing head location in memory) array (turing tape, imperatives) goto, while (imps) integers in variables functions in variable, function application logic gates if 2d arrays (automata) copying an expression in one step (cow) expressions linear lists of code variable environemnts (lambda) mu successor functions with multiple arguments function composition recursion, primitive recursion expression trees selection structural equality nock operator

other common concepts: call stacks strings dicts lists regex db tables relation? types maps, folds what from logic programming? transition table grammar category theory? lattices? ordering relations?

can strings, cellular automata, etc be generalized with a 'topology' (or is this a geometry? i think topology) on top of something unstructured like a graph? is this like oot. pattersn/graph regexs/structural types? relatio to Chu spaces?

---

some more stack ops:

FLAG: place an opaque 'flag' on the stack CLEAR-TO-FLAG, ROT-TO-FLAG, etc; applied in the section from the top of the stack (TOS) down to the first flag

a 'flag' is like an Oot 'boundary' and maybe the command should be called STACKBOUND instead of FLAG.

(http://www.reocities.com/connorbd/varaq/varaqspec.html has this)

---

also might want list-to-stack and stack-to-list ops; e.g. SHATTER to 'Reduces a list to its component elements and pushes them on the stack in order' [1] and CONSUME n or CONSUME-TO-BOUNDARY to take the first n elements of the stack (or all elements up to a BOUNDARY) and return them as a list

(http://www.reocities.com/connorbd/varaq/varaqspec.html has this)

also http://www.reocities.com/connorbd/varaq/varaqspec.html has

---

http://www.reocities.com/connorbd/varaq/varaqspec.html also has some interesting stack-to-string ops:

---

http://www.reocities.com/connorbd/varaq/varaqspec.html control flow and variables:

---

[Self-proj-plbook-plChModelsOfComputation], and also the notes in section 'some ideas relating to generalizing/translating the instructions of Nock' in [Self-proj-oot-ootNotes12], make me think that defining an Oot core might just be picking one thing for each of these roles;

eg for Nock:

and for the "another writeup" section of [Self-proj-plbook-plChModelsOfComputation]:

and for the lambda calc model:

and for the turing machine model:

and for the imperative language with WHILE model:

etc

and for Haskell:

etc

and for Python:

etc

once (a) these lists are all completed, and (b) you have answer a few of those questions (for instance, if you only provided one thing like each thing in Nock, you'd be done, in terms of Turing completeness), you should have enough for a core language. Since i want Oot Core to not be orthogonal, but instead to have the sum of many basic models of computation, we should instead (c) answer almost all of the questions, for all the models of computation, and take the sum to be Oot Core.

i like the idea of starting with graphs as a data structure. The difference from dicts is just that in our library functions etc, we emphasize the whole graph as one object, not just those elements reachable from directly from the first node; and also that we provide a little more infrastructure than usual (labels, reification, etc) to give our graphs a little more 'flavor' (by which i mean, conventions for the interpretation of structural form). Similarly to how the 4-role (5-role, if you count the address of each instruction) system of syntax for Oot Assembly (see [Self-proj-oot-ootAssemblyNotes4]) gives a lot of 'flavor' to syntax.

---

so i guess mb:

first-class call stack, and first-class closures; goto, switch, but also while, repeat..until, if, foreach; grammar rule 'reductions'

probably not a full C-style 'for' b/c it seems kinda hacky, just syntactic sugar for a fancy 'while'. Perhaps a constrained 'for', though, which is always simply incrementing until it hits a limit

---

'return' to return in the middle of a fn; equivalent to assigning the return args to a temporary variable, then a JMP to the end of the fn, and insertion of a statement at the end that returns from those temp vars

(also 'break', 'continue')

---

keyword args and currying: hold onto future keyword until reqd args are satisfied

---

Oot needs a concept of:

---

starting with BASIC-like:


[2] is telling; webassembly is pretty small and the intersection of asmj.s, nacl, and webassembly is similar to webassembly. Suggests a slightly higher-level:

types:

commands:

other stuff that looks cool from nacl:

webassembly comment regarding goto: " Break and continue statements can only target blocks or loops in which they are nested. This guarantees that all resulting control flow graphs are reducible, which leads to the following advantages:

    Simple and size-efficient binary encoding and compilation.
    Any control flow—even irreducible—can be transformed into structured control flow with the Relooper algorithm, with guaranteed low code size overhead, and typically minimal throughput overhead (except for pathological cases of irreducible control flow). Alternative approaches can generate reducible control flow via node splitting, which can reduce throughput overhead, at the cost of increasing code size (potentially very significantly in pathological cases).
    The signature-restricted proper tail-call feature would allow efficient compilation of arbitrary irreducible control flow."

" If you use only while-loops, for- loops, repeat-loops, if-then(-else), break, and continue, then your flow graph is reducible. ... Why Care About Back/Retreating Edges? 1. Proper ordering of nodes during iterative algorithm assures number of passes limited by the number of “nested” back edges. 2. Depth of nested loops upper-bounds the number of nested back edges " -- http://infolab.stanford.edu/~ullman/dragon/w06/lectures/dfa3.pdf

" Certain control-flow patterns make flowgraphs irreducible. Such patterns are called improper regions, and, in general, they are multiple-entry strongly connected components of a flowgraph ... as long as we avoid gotos, specifically, gotos into loop bodies (my note: based on a footnote at the bottom of the page, this includes loops made up of ifs and gotos) ... statistical studies..have shown that irreducibility is infrequent, even in languages that make...(little)...effort to restrict control-flow constructs...before structured programming became a serious concern...over 90% of a selection of real-world Fortran 77 programs have reducible control flow and all of a set of 50 large Fortran programs are reducible. Thus, irreducible flowgraphs occur rarely in practice...however, they do occur, so we must make sure our approaches to control- and data-flow analysis are capable of dealing with them.

There are three practical approaches to dealing with irreducibility...iterative data-flow analysis, as described in section 8.4, on irreducible regions...The second is to use a technique called node splitting that transforms irreducible regions into reducible ones. If irreducibility were common, node splitting could be very expensive, since it could exponentially increase the size of the flowgraph; fortunately, this is not the case in practice. The third approach is to perform an induced iteration on the lattice of monotone functions from the lattice to itself (see Sections 8.5 and 8.6). " -- Advanced Compiler Design Implementation, section 7.5, page 196 https://books.google.com/books?isbn=1558603204 Steven S. Muchnick - 1997

varargs not supported.

missing (todo see what this is again):

missing (todo, replace vectors with arrays and structs with graphs? or both with graphs?)


more on what a 'signature-restricted tail call' might be:

"

Tail Call Optimization in GCC

To address optimization needs, GCC has introduced the concept called "sibcalls" (short for "sibling calls"). Basically, a sibcall is an optimized tail call, but restricted to functions that share a similar signature. In other words, the compiler considers two functions as being siblings if they share the same structural equivalence of return types, as well as matching space requirements of their arguments.

For example, again assuming the ABI of ix86 Linux/UNIX, a tail call from function foo to bar would be a potential optimization candidate, because both share the same return type. Two arguments of type short are represented internally by using four bytes altogether, which is the same size as one long long argument:

int foo (long long a); int bar (short a, short b);

This restriction is necessary because in a chain of sibcalls, the top-most caller who is calling a tail-recursive function (and being unaware of it) attempts to clean up the callee's arguments when computation has finished. However, if this callee is allowed to exceed its own incoming argument space to perform a sibcall to a function requiring more argument stack space, you would end up with a memory leak when the top-most caller attempts to free the stack slots.

Another reason, related to this, is the shifting of the return address; see Figure 3. Apart from being a technical challenge, it would also break binary compatibility with other programs and libraries that do not support this notion of stack handling. Unaware third-party procedures would not be prepared to perform stack-shifting operations or, alternatively, to let the callee worry about the necessary memory clean ups. "

-- http://www.drdobbs.com/tackling-c-tail-calls/184401756