Bayle Shanks's website: proj-oot-ootAssemblyOpsNotes1

fns

APPLY (same as GET?)

sync and time

WAIT (block until something is done) NOP SLEEP

data misc

PERSPECTIVE/VIEW GETTABLE? SETTABLE?

feature negotiation

CPUID -- args in reg, first call CPUID(0) to get highest arg supported: http://en.m.wikipedia.org/wiki/CPUID, vendor id, processor id, feature bits, serial number, core topology, cache config, virtual and physical address sizes

graph regex

MATCH REPLACE FILL-IN-TEMPLATE (like python's '%' on strings, but with graphs)

types

VARTYPE (but with our 'type attributes' this is just a set of attrs, right?) ZEROOFTYPE (get the 'zero' or default member of a type) ISA ISSUBTYPE (is <= in types; mb just use <= tho)

allocs

UNDEF (done with the contents of this register) ENTER, LEAVE (enter and leave block scopes; something similar may be/seems to be present in Ruby YARV bytecode and Perl55555) MALLOC, MEMFREE GENSYM (see below) ISDEFINED

copies and moves

SHALLOWCPY (shallow copy) DEEPCPY (deep copy) COW (copy-on-write) MOV (shallowcopy and promise never to use old one)

aliases

ALIAS, GETALIAS (see [1])

attribute access

GET SET (or mb PUT) DIRECTGET, DIRECTSET (non-indirection versions of array indexing; take the array stride as arguments; 1d arrays only?) (or, just GET and SET with modalities, but that would decrease code density)

objects

PUTSELF (both Lua and Ruby Yarv appear to have an instruction to reference Self/This DEFINEMETHOD DEFINECLASS SEND RECEIVE INVOKESUPER INVOKEBLOCK APPEND_METHOD_TO_INSTANCE (YARV's opt_ltlt)

loops

FINISH (YARV; return from this vm loop)

jumps

JMP (like assembly JMP or basic GOTO; like http://llvm.org/docs/LangRef.html#br-instruction unconditional form)

(note; JMP's argument can be a label; this makes it a 'computed goto', like http://llvm.org/docs/LangRef.html#indirectbr-instruction ; but either you must explicitly provide all possible labels, or all labels in the fn (or module?!?) will be assumed to be possible; note that this implies that labels are first-class (are values that can be held in variables))

conditionals

SWITCH (like HLL switch or like http://llvm.org/docs/LangRef.html#switch-instruction but mb more general if boolean expressions in the conditionals) IF (like HLL IF or like http://llvm.org/docs/LangRef.html#br-instruction conditional form)

tests for conditionals/comparisons

unary: eq0cond poscond negcond odd (and their negations, and unsigned versions of some of them) always (1) never (0)

binary: lt lte eq gt gte testbit (first argument is the thing to be looked at, second argument is the index of the bit to be looked at, where index 0 is the least significant bit) testbitfromleft (first argument is the thing to be looked at, second argument is the index of the bit to be looked at, where index 0 is the most significant bit) (note: you can also do this with a mask using bitwise-AND) (and their negations) (and unsigned versions of these)

what happened during ALU operations: carry overflow (usually there is also negative and zero but i believe this can be emulated by testing the result)

misc control

phi (like http://llvm.org/docs/LangRef.html#phi-instruction )

address spaces

probably don't need ops for these but some languages have them:

LOCAL GLOBAL UPVARIABLES (also enclosing LEXICAL) CONSTANTS DYNAMIC (semiglobals) ALIAS/SYMLINKS SPECIAL INSTANCEVARS CLASSVARS

and special constants and literals: PUTNIL PUTSELF PUTOBJECT (YARV) PUTSTRING (YARV)

errors and exceptions

DEFER (golang-style) THROW/RAISE TRY (? isnt this a lexical thing?) CATCH (? isnt this a lexical thing?) FINALLY (? isnt this a lexical thing?)

non-primitive control

SWITCH

interop

CALL_C_FUNCTION

debugging

TRACE

ordinary fns

stacks

PUSH POP DUP DUPN SWAP REPUT (YARV; ?) STACKGETN (YARV topn) STACKSETN (YARV setn) EMPTYSTACK n (YARV adjuststack)

strings

STRSCONCAT n (n strings)

arithmetic

ADD SUB MUL DIV MAC (multiply-and-accumulate) INC DEC MOD REM NEG (and unsigned versions of many of these)

MIN MAX

ABS SGN (sign) COPYSIGN CEIL FLOOR TRUNC ROUND

POW (exponentiation) EXP (exponentiation with base e; unary op; maybe call EXPE) EXP2 (exponentiation with base 2; unary op; maybe call EXP2) LN SQRT NTHROOT CONSTE CONSTPI

knuth ops (see http://www-cs-faculty.stanford.edu/~uno/fasc1.ps.gz page 11 (PDF page 16)): MUX: combines two bit vectors by looking at a third argument, the multiplex mask, choosing bits in the first argument where the mask is 0 and bits in the second argument where the mask is 1 (note: i reversed the convention from Knuth; Knuth chooses the first argument where the mask is 1 and the second argument where the mask is 0) SADD ("sideways add"): count the number of bit positions in which first argument has a 0 and second argument has a 1 (note: i reversed the convention from Knuth; Knuth counts where the first argument is 1 and the second is zero) MOR: interpret operands as square matrices of bits. MOR is like matrix multiply, but instead of addition between the terms, the operation is OR. MXOR: interpret operands as square matrices of bits. MXOR is like matrix multiply, but instead of addition between the terms, the operation is XOR.

matrix multiply generalized matrix multiply (you provide the functions to be used in place of addition and multiplication) matrix determinant

bit shifts

LSL LSR ASL ASR ROL ROR ROLC (rotate left thru carry) RORC (rotate right thru carry)

and bit counts: CLZ (count leading zeros) CTZ (count trailing zeros) POPCNT (count number of ones)

bit ops

TESTBIT SETBIT

boolean expressions

STRUCT-EQ PTR-EQ LT LTE GT GTE AND OR NOT EQ XOR (NEQ) PROJ0 or X or LEFT (projection of first/left argument) PROJ1 or Y or RIGHT (projection of second/right argument) 0 1 (and bitwise operators for many of those)

coercion

tostr toregexp

arrays

newarray duparray expandarray (YARV) concatarray (YARV) splatarray; play array contents onto the stack (YARV splatarray?) unsplatarray; play stack contents into an array ismemberofarray (YARV's checkincludearray) arrayany (like ismemberofarray True; YARV's checkincludearray with flag set) aref aset length

hashs

newhash

ranges

newrange; create a Range object (YARV) splatrange put a range onto the stack

logic

muKanren: UNIFY NEWVARIABLE DISJUNCT CONJUNCT

miniKanren core: UNIFY FRESH (NEWVARIABLE combined with a conjunction) CONDE (a conjunction in the form of a 'conditional')

todo

stuff about grammars (parsing and producing?) stuff about logic oberon stuff (while loops, etc)? lua ethereum

MISC SELECT K (K combinator/identity) ISATOM STRUCTURAL-EQ CONS IF COMPOSE CALL HINT RAND BRANCH BIND (variable bind) GET-ENVIRONMENT, SET-ENVIRONMENT GET-STACK, SET-STACK (call stack, i mean; but i guess data stacks might be interesting too?) LAMBDA (anonymous function construtor) VAU (anonymous operative constructor from Kernel [2]) GET-AST-OF DEFMACRO

metaprogramming stuff: PARSE (string to ast) COMPILE (ast to an optimized opaque representation of code) EVAL (run code, given as input either an ast or an optimized opaque representation of code; has no side effects (at least not in any 'unmasked' state; debug, logging, profiling, caching etc side effects are probably allowed, depending on the statemask)). Returns a value. RUN (like EVAL but with side effects). Returns a value.

EVAL is just a wrapper for RUN; RUN has an optional arg for statemask (EVAL gives the current statemask); RUN has optional args for capabilities, and RUN has optional args for which kinds of effects the RUN'd code can have upon outside variable environments, and also for which kinds effects metaprogramming within the RUN'd code can have upon the outside (to allow you to RUN code but be sure that, regardless of what weird metaprogramming it uses inside, it won't affect you; or, on the other hand, to allow you to choose to RUN code and invite it to execute metaprogramming in your context)

UNHYGIENIC-CALL: CALLS a function but within the current local variable environment, call stack, and other context. Allows you to have functions that do things like 'save local var x to file' (the input is the name of the local variable), 'save all local vars to file'; these can be emulated with python locals() i guess, but even so i'd want some sugar in the HLL (high level language) if not here. But another thing that UNHYGIENIC-CALL can do is that a RETURN from within the UNHYGIENIC-CALL actually RETURNS from the caller. You might think of this as doing a textual INCLUDE of the function text. They are sort of like Ruby blocks. You might call them 'INLINE-CALL' but that might confuse ppl.

---

The 'core' of the Lua VM instructions imo:

all instructions (from proj-plbook-plChSingleIntermedLangs, from [3]):

move
loads (constant, bool, nil)
get/set (upval, global, table)
misc: newtable, self, close, closure, vararg
arithmetic: add, sub, mul, div, mod, pow, unm (unary minus)
logic: not, test, testset (both test and testset can be used to implement both 'and' and 'or', see http://luaforge.net/docman/83/98/ANoFrillsIntroToLua51VMInstructions.pdf)
arrays: len, concat, setlist
jumps: jmp, call, tailcall, return (also some other conditionals)
conditionals: eq, lt, le (also test and testset)
loops: forloop, forprep, tforloop

unnecessary/redundant (just for efficiency) or non-'core' imo ones or renames:

move: rename to copy
self: redundant
loops: the for loops are redundant (but pretty cool)
testset: redundant
of course some of the arithmetic is redundant in that if you have 'add and 'unm' and control flow then you can implement the rest
some of the logic and conditionals are redundant with each other
concat is redundant but cool
vararg is not redundant, but Oot doesnt have varargs

---

for I/O:

open/send/recv file-like thing (stream with cursor)

open/send/recv tree-like or graph-like thing

open/send/recv relational db-like thing

open/send/recv key/value db-like thing

'document-oriented db's: keys that go to flat/non-hierarchical sets of records; or keys that go to hierarchical records

open/send/recv 'volatile' shared memory-like thing

open/send/recv unordered message-like thing (like UDP)

open/send/recv unordered ordered message stream-like thing (like TCP) (in Golang's channels we also have: async or blocking?)

open/send/recv tuplespace-like thing

open/send/recv object/RPC-like thing

mb see also IPC, and https://en.wikipedia.org/wiki/Category:Inter-process_communication and zmq

https://en.wikipedia.org/wiki/Inter-process_communication#Approaches has a list:

"Shared memory is a general, securable approach to effects. Our shared memory may consist of stream buffers, message queues, publish-subscribe topic lists, tuple spaces, databases, component entity systems, or something application specific" -- https://awelonblue.wordpress.com/

what other kinds of I/O and/or IPC and/or distributed computing data structures are there?

---

it would be interesting to use something like Nock for the stateless transition function in a cellular automata.

It would also be interesting to use something like Nock in a Propagator model (cellular automata with append-only state).

---

it seems to me that Nock should have the equivalent of a MOV command (which is really COPY) in traditional assembly. How about making Hoon's evaluate-with-changes ("centis", %=) a primitive Nock instruction?

The motivation is similar to the motivation for structural equality (this is relatively fundamental, and also can clearly be optimized by implementations that use pointers to memoize subtrees of diverged copies of large trees)

---

thinking about Nock's not being expressive enough to handle native cyclic structures; and comparing to my idea of allowing lazy and cyclic structures; there's something to be said for not allowing infinite lazy structures because if you want structural equality to be a primitive, like in Nock, well a structural equality check may never terminate if both of the operands are identical infinite lazy structures (WILL never terminate unless the implementation can prove their identity in some higher-order way, eg noticing that the pointers to the generator functions are the same)

---

ok if we add something like Hoon's centis (%=), evaluate-with-changes, what would it look like?

we don't need 'evaluate with changes' because we have variables, we are not only combinator calculus like Hoon. So we can just do a search-and-replace and return the result. We'll call ours graph replace (mb grep? that might confuse ppl..), or 'rep' for short.

It does seem important that Hoon allows you to search-and-replace a whole list at once (atomically); eg if you want to swap 'a' and 'b', with Hoon it looks like you may be able to do that in one step, rather than replacing a with c, then b with a, then c with b, because the replacement of a with b won't be 'seen' by the search for 'b's.

i think Hoon centis only matches exact matches. Let's generalize that to have our graph-replace ('rep') take as input a function ('isMatch') that evaluates a node (or a node and its subtree) and decides whether it matches or not.

We can generalize this further to have graph regular expressions (grex) instead of just finding and replacing individual nodes that match with no attention to surrounding graph structure.

A special case would be just using the match predicate without grex. A specialer case would be just doing exact matches, like Hoon centis.

---

i feel like it's not as important to have core instructions for ordinary function; stuff like add, subtract, sqrt, and, or, not are just functions that take an input or two and return an output. These can be in a library. And std libraries can be 'jetted', to use Hoon's terminology. What you really need are all the control structures that you want.

---

awelon bytecode (ABC) speaks of only having 'fixpoint combinators' rather than backwards jumps for looping. So i guess we should offer that.

---

Is a fixpoint combinator for looping just something that iterates an input fn until a fixpoint is reached and then returns the result (the fixpoint)?

if so, with generalized centis and a fixpoint operator (="combinator" in this case, i think) you could probably implement brute force predictive (no backtracking) parsing in as few lines.

And that suggests generalizing the fixpoint operator to do backtracking, too. The user would provide some code that would determine the particular backtracking behavior. Could generalize it further into a search operator (not just fixpoints; user provides code to say when the search is 'done'), but one that can see the previous result (out previous k results); this allows stopping at a fixpoint to be a special case, and it also allows eg local search optimization loops where you stop when the rate of improvement drops below some threshold (an almost-fixpoint, i guess)

to-do figure out if i'm right about what a fixpoint combinator for looping is;

"Loops in AO are ultimately expressed using fixpoint combinators. A loop will repeatedly copy and apply a block (which represents the body of the loop) until a terminating condition is observed. In practice, explicit loops should be avoided whenever possible, except when modeling collections types (streams, lists, etc.). Ultimately, AO should have the flavor of collection-oriented programming languages - i.e. very few loops exposed to programmers - despite the lack of built-in collection types.

 Compared to the recursive functions or built-in loops of more conventional programming languages, using fixpoint combinators is very awkward. Convenient loops are simply a low priority for AO's design. Instead, AO favors developing a few common collection types with collection-oriented words (which covers many use-cases for loops), plus incremental application models such that the outer loop is implicit.
 
 Explicit loops in AO are required to terminate. Due to the halting problem, termination cannot always be proven or disproven statically. However, a non-terminating loop is always a bug, and developers are expected to treat one thusly.

...

@doc.fixpoint "This fixpoint combinator binds a function to receive itself (in fixpoint form) as an argument on the stack. Fixpoint enables expression of recursive behavior.

        [foo] fixpoint
            is equivalent to
        [[foo] fixpoint foo]
            is equivalent to
        [[[foo] fixpoint foo] foo]
 
 When invoked, the function `foo` has access to itself (in fixpoint form) as a block on the stack. 
 
 Loops are expressed this way in Awelon to avoid reliance on a namespace, to support streaming programs and AO's simple 'inline everything' semantics. Unfortunately, expressing loops in this manner is relatively awkward for humans, and a direct interpretation will be less efficient than a stored program's jump based loops. The performance issue can be mitigated by a compiler. The awkwardness should be avoided by developing libraries for collection-oriented programming, such that most loops become implicit to processing of collections.
 
 NOTE: Loops in Awelon are required to terminate. The proof burden for termination ultimately falls on the programmer, but a compiler or linter is allowed to issue errors if it can prove non-termination, or warnings if it cannot prove termination. Long running behaviors in Awelon project are modeled with an *implicit* top-level loop, e.g. with RDP or a process object.~ -- https://github.com/dmbarbour/awelon/blob/master/ao/loops.ao

https://github.com/dmbarbour/awelon/blob/master/AboutABC.md points to:

https://en.wikipedia.org/wiki/Fixed-point_combinator#Strict_fixed_point_combinator

---

diff patch merge substitute/graph-replace crud (create, read, update delete) append sync

---

match regular expressions execution of DFAs, NFAs send recv tell/say get, set pick up, put down move (yourself) posture eg eg stand, sit give, take lend, borrow trade/exchange translate, rotate, deform push, pull position, velocity, acceleration space, time

---

atomicity and transactions compare-and-swap

---

virtualize arbitrary VM, and also virtualize OOT (with memory mapping; with trapping for certain instructions, memory location accesses)

---

assertion of fact (opposite: query fact (and then match pattern to multiassign results)) (you could assert an equation rather than a special value assignment, but you could assert a value assignment too) vs command vs assignment (which is like pronouns) and also (although mb not separate from the above) RESTful interactions like GET address, SET (PUT) document=value; CRUD; VERBs applied to NOUNs, possibly with other arguments too (eg the value being assigned to a document in PUT)

---

ASSERT-FACT QUERY-FACTS COMMAND EVALUATE-EXPRESSION-AGAINST-ENVIRONMENT

---

nock-like binary tree select (displayed by 1) (ie 0 is the whole tree (root cons cell), 1 is head, 2 is tail, 3 is head(head), 4 is tail(head), etc)

           0
       /       \
     1           2
   /   \       /   \
  3     4     5     6     
 / \   / \   / \   / \
7   8 9  10 11 12 13 14

(figure modified from [4])

prefix modifier: custom type id(s) (ie the operand(s) in the prefix are integer identifiers for custom types, which are the types of the operand(s) of the following ('main') instruction) prefix modifier: quote (ie the following ('main') instruction or parethesized block are not to be executed, but rather are just 'quoted' data, representing a linear sequence of code (or maybe an AST?)) prefix modifier: antiquote (to level) (ie when found within a region of quoted data, this says that the following 'main' instruction should be executed, and the result should be placed within the quoted data at this location prefix: modifier: additional flags (ie this provides a way to have additional flags for the following ('main') instruction and each of its operands)

---

for foreach while term-rewriting map filter fold/reduce rep/repmat/tile plus cons/append concat zip dims

---

EVAL PMAP

---

prefix: left-parens right-parens quote antiquote type annotation assert atomic boxed cow

2-operand: jmp while bnz for foreach call-with-stack return tailcall len hint rand

3-operand: rep cons append concat binary-tree-select zip fold filter add sub mul map pmap type coerce/constructor array-select array-set eval-expr call-with-arg open close send recv term-rewrite query assert-fact run-virtualized substitute/graph-replace get put cpy mov create delete match merge sync and or not struct-eq lt le apply if NEWVARIABLE unify GET-ENVIRONMENT, SET-ENVIRONMENT GET-STACK, SET-STACK bind lambda wait nop sleep VARTYPE (but with our 'type attributes' this is just a set of attrs, right?) ZEROOFTYPE (get the 'zero' or default member of a type) ISA ISSUBTYPE (is <= in types; mb just use <= tho) UNDEF (done with the contents of this register) ENTER, LEAVE (enter and leave block scopes; something similar may be/seems to be present in Ruby YARV bytecode and Perl55555) MALLOC, MEMFREE GENSYM (see below) ISDEFINED SHALLOWCPY (shallow copy) DEEPCPY (deep copy) ALIAS, GETALIAS (see [5]) switch try catch

...

already we're way over the 32-instruction limit for 3-operand things! i stopped going back and adding more. gonna have to be (a lot) more selective..

prefix: left-parens right-parens quote antiquote type annotation assert atomic boxed cow

2-operand: jmp while bnz for foreach call-with-stack return tailcall len hint rand

3-operand: and or not structural-eq graph-replace match add sub mul le lt cpy mov cons binary-tree-select array-get array-set graph-get graph-set graph-enumerate eval map filter fold assert-fact query unify send recv spawn apply run-virtualized

fork join sync conditionvar union intersect

relational sqlish primitives forth primitives transitive closure other relational algebra primitives forall other fol type function composition

JMP_CALLER

debuglog stacktrace interactivedebug monad

import importtype importopcode importaddrmode (and also _atruntime variants of these)

ok we have more than 32 3-operand already again. I guess we really gotta consolidate these guys in to fewer, more general instructions. For future reference, one of the things i was trying to do here is:

get the key operations from each computational paradigm (eg. 'apply' and 'lambda' from the functional paradigm; add, sub, bnz, cpy from assembly, also maybe ror etc; leq and or not; from structured programming, call while foreach for; continuations; etc)
get the key operations from each fundamental datastructure (eg for binary trees, cons and binary-tree-select; for arrays, array-create, array-get, array-set, array-len; for stacks, push pop dup and the other Forth things; graph/dict/table get set create enumerate; graph-replace; integer arithmetic; set union and intersection and filtering/comprehensions; map fold/reduce filter; spawn, channel-send, channel-recv; assert-fact query unify forall structural-equality; etc)

remember that all we REALLY need are the control flow operations; functions that take inputs and return outputs but don't affect control flow can always be accomplished via CALLing ordinary fns, once we have call

i guess we also don't need the 'data parallel' versions of things, eg. 'pmap', because those can be ordinary functions.

---

So we actually have 3 kinds of copying:

CPY: immutable value copy
MOV: equivalent to CPY and then UNDEF the old location (UNDEF is a (compile-time checked) promise not to use that value again)
ALIAS: copy the pointer, not the value it points to.

What happens if you try to apply ALIAS to an atomic value, e.g. the integer 5? Well, there are two obvious possibilities: either there's an error, or it replaces 5 with a pointer to a newly allocated integer in the heap. I prefer the latter, because that way the programmer doesn't have to distinguish between things that fit in registers and things that are too big to fit in registers and so are pointers.

LEA (two input arguments are addr mode and operand; note, this LEAs an INPUT operand only) LEA_OUTPUT_0 (LEAs an OUTPUT operand in 'immediate constant' mode, see below, where the first input is the value of the related input operand, and the second input is the value of the output operand) LEA_OUTPUT_1 (LEAs an OUTPUT operand in 'constant table' mode, see below, where the first input is the value of the related input operand, and the second input is the value of the output operand) EVAL1INSTRSTACK (mimics the execution of a single instruction, from 8 arguments from the stack (the four fields, and the four addressing modes) EVALBLOCK (evals a 'block' value)

---

three types of CALLs: (implicit) CALL3 CALL_WITH_PRESERVED_REGS CALL_WITH_NEW_REGS RETURN (mb how many levels to return? maybe a return value? maybe an error code?)

CALL3's arity is limited to 3 (or is 2 possible too?), the latter two use the stack to pass arguments and results

---

EVAL
- needs two forms:
- EVAL_INSTR(X) execute instruction which resides at X location
EVALSTACK execute the instruction indicated by the subfields on the data stack (addressing mode, opcode-operand, addressing mode1, operand1, addressing mode2, operand2, addressing mode3, operand3) (note that at this point the format bits haven't yet been consulted, since the actual opcode has not yet been resolved (since its addr mode may be register direct or reg indirect))

---

some special(ish) instructions:

left-brace: begins a quoted block
right-brace: terminates a quoted block
antiquote (optionally multilevel)
map
pmap, parallel map
lambda-left-brace
(should we have these?) load, store, loadk, loadi: unnecessary duplications of our addressing modes
?: shared memory/IO/do under lock/atomic operation/transaction/memory barriers around this/volatile
EVAL
- needs two forms: EVALINSTR execute instruction at X location; and, EVALSTACK execute the instruction indicated by the subfields on the data stack (format, flags, opcode, addressing mode, opcode, possibly up to three of: type, addressing mode, operand)

---

comefrom phi

(annotations)

---

BRZ: Branch if Zero: int, int -> pointer. If first input is equal to zero, increments the PC by (signed) second input, and returns the previous PC. Second input must be in the range of -2047 <= argument1 <= 2047. TODO: replace with something more universal. limit of 127? or mb just a skip?
BLT: Branch if Less Than: int, int -> pointer. If first input is strictly less than zero, increments the PC by (signed) second input, and returns the previous PC. Second input must be in the range of -2047 <= argument1 <= 2047. TODO: replace with something more universal. limit of 127? or mb just a skip?

---

remember to look at http://wiki.luajit.org/Bytecode-2.0 when designing the base instructions. It has ~128 instructions.

--- " RISC vs. CISC vs. HLL machines

FFT, QUICKSORT, POLY, FP instructions?

VAX INDEX instruction (array access with bounds checking)

...

Small Semantic Gap Examples in VAX

FIND FIRST

Find the first set bit in a bit field

Helps OS resource allocation operations

SAVE CONTEXT, LOAD CONTEXT

Special context switching instructions

INSQUEUE, REMQUEUE

Operations on doubly linked list

INDEX

Array access with bounds checking

STRING Operations

Compare strings, find substrings, ...

Cyclic Redundancy Check Instruction

EDITPC

Implements editing functions to display fixed format output

Digital Equipment Corp., “ VAX11 780 Architecture Handbook , ” 1977-78 "

---

union, intersection, set difference, merge sort, integer hashing, string hashing -- http://dsg.uwaterloo.ca/seminars/notes/2014-15/Lehner.pdf

---

cellular automata (or irregular graph automata) timestep

spiking neural net timestep

grid automata simulation timestep (eg weather simulation)

---

NFA automata

(mb +boolean logic, +counters?)

---

support MISD (like NFA processor) and SIMD (like vector ops on CPUs/GPUs) and MIMD

---

some arith ops :

add sub mul div inc dec hyperoperation div mod rem exp log nroot sqrt abs sgn floor ceil round leq eq

e pi golden ratio

sin cos tan asin acos atan

---

SYSCALL (call out to underlying platform using its calling conventions) (how to make lazy syscalls eg when on a haskell platform? how to make strict syscalls eg when on a haskell platform?)

---

fmap?

3sat linear programming logic matrix multiplication crypto fmax gradient descent pca linear regression supervised learning classification regression convolution sort closure stuff

---

http://shenlanguage.org/Documentation/shendoc.htm#Kl

---

S combinator (and nock Nock operator) K combinator (and other projection)

---

SHA-256

AES

SipHash?

a hash table with an arbitrary hash function and a specialization based on SipHash? (siphash24) note: you just take the 64-bit SipHash? hash and throw out the high-order bits to get the actual hash, assuming you have less than 2^64 slots in your hash table (as noted in the SipHash? paper page 14 [6]) note: what to do about collisions? one solution ('open addressing') is just go to the next empty bucket, and put it there, and rely on dynamic resizing to always have empty buckets available. another solution is 'chaining' or 'separate chaining' which means using a linked list for all the items in each bucket, and just search the list by brute force. [7] suggests moving any item which is retrieved to the front of the linked list (so the list works like a cache) so apparently the advantage of open addressing is one less pointer indirection (and so also better cache locality), and the advantage of separate chaining is graceful degradation as the table gets full (before you dynamically resize it). Seems to me the only way to have good worst case performance is to have separate chaining with a balanced binary tree in each bucket. As an optimization when there is just one element in the bucket you could just put it in the table. As another optimization you could reserve memory for the binary trees when you allocate for the table, so that you never have to call malloc to do an insert, and so the binary trees are near each other, and near the table, in memory (so now we've probably about tripled or quadrupled the amount of memory taken by the table). If you want good time guarantees, do dynamic table resizing in the background and if needed malloc more memory for the binary trees in the meantime (which means you still have to wait for that malloc, though). As another optimization, for a table of size large enough to accomodate all strings of length n or less, map those strings to consecutive integers instead of hashing them. If you don't have a total ordering on the keys, i guess you have to go back to open addressing, or to linked lists in each bucket. note: dynamic resizing: python resizes the hash table when it is 2/3s full. Java does when it is 3/4s full. [8] note: see the Python hashtable implementation. It uses SipHash? [9]. For Python SipHash? implementation, See https://hg.python.org/cpython/file/374f501f4567/Python/pyhash.c#l146 noting that this isnt used for hashing pointers, see https://hg.python.org/cpython/file/374f501f4567/Python/pyhash.c#l131 and http://wayback.archive.org/web/20160808072050/https://stackoverflow.com/questions/35709339/what-is-the-source-code-of-hash-and-eq-of-object-in-python. See also https://hg.python.org/cpython/file/374f501f4567/Include/pyhash.h . Note that the Py_HASH_CUTOFF is set to 0 in pyhash.h; see https://www.python.org/dev/peps/pep-0456/#small-string-optimization stating that the performance benefits of that weren't so great (and it seems to me that if there are a lot of 'small strings', which there are if than doing something like that may negate the security of SipHash?, right?). Still, if it were i, i would consider at least mapping the first 257 strings (all strings of length 1 or less) to the first 257 values in the table, assuming the table is at least that large (maybe extend this to longer lengths as the table gets larger); note that Python already maps to 0. for Python dict implementation, see http://stackoverflow.com/a/9022835/171761 note: to make SipHash? secure, under the covers generate a random number at the beginning of the program (or even periodically!) and use this as the 'secret key' for the SipHash? tables. Note: i guess this will make profiling non-deterministic, so maybe we should abide by the user-given seed? since 'the beginning of the program' predates a random seed set, maybe rehash everything when a random seed is set? or just let performance be non-deterministic? hmm...

http://www.sebastiansylvan.com/post/robin-hood-hashing-should-be-your-default-hash-table-implementation/ suggest "when you probe for a position to insert a new element, if the probe length for the existing element is less than the current probe length for the element being inserted, swap the two elements and keep going"

note "Some chaining implementations use an optimization where the first record of each chain is stored in the table. Although this can increase performance, it is generally not recommended: chaining tables with reasonable load factors contain a large proportion of empty slots, and the larger slot size causes them to waste large amounts of space." -- https://en.m.wikibooks.org/wiki/Data_Structures/Hash_Tables

note "some hash table implementations, notably in real-time systems, cannot pay the price of enlarging the hash table all at once, because it may interrupt time-critical operations....perform the resizing gradually..Each time an insertion is performed, add that element to the new table and also move k elements from the old table to the new table." " -- -- https://en.m.wikibooks.org/wiki/Data_Structures/Hash_Tables

set random seed

a linked list

an array

a https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree (note: hash tables usually have O(1) lookup and insertion, but in the worst case (everything hashes into one bucket) degenerates into linked lists, that is, O(n); self-balancing binary trees are O(log n), so if worst-case behavior is a concern, you may need these instead of hash tables; but also i think though you can use hash tables that use self-balancing binary trees for 'chaining' ('separate chaining') all the items in each bucket, which would ameliorate this? note: i think using a binary trees requires that you can have a total ordering over all possible items to be stored, with no two items comparing equal (mb you could fall back to linear?)

---

"SAXPY stands for “Single-Precision A·X? Plus Y”. It is a function in the standard Basic Linear Algebra Subroutines (BLAS)library. SAXPY is a combination of scalar multiplication and vector addition, and it’s very simple: it takes as input two vectors of 32-bit floats X and Y with N elements each, and a scalar value A. It multiplies each element X[i] by A and adds the result to Y[i]."

"axpy is the type-generic form of saxpy"

---

Python Copperhead primitives:

" map(f, ...) Applies the function f element-wise across a number of sequences. The function f should have arity equal to the number of arguments. Each sequence must have the same length. For practical reasons, this arity is limited to 10. For programmer convenience, list comprehensions are treated equivalently to map, so the following two lines of code are equivalent.

z = [fn(xi, yi) for xi in zip(x, y)] z = map(fn, x, y)

reduce(fn, x, init) Applies binary function fn cumulatively to the items of x so as to reduce them to a single value. The given function fn is required to be both associative and commutative, and unlike Python’s built-in reduce, parallel semantics mean that the elements are not guaranteed to be reduced from left to right.

filter(fn, x) Returns a sequence containing those items of x for which fn(x) returns True. The order of items in sequence x is preserved.

gather(x, indices) Returns the sequence [x[i] for i in indices]

scatter(src, indices, dst) Creates a copy of dst and updates it by scattering each src[i] to location indices[i] of the copy. If any indices are duplicated, one of the corresponding values from src will be chosen arbitrarily and placed in the result. The updated copy is returned.

scan(fn, x) Returns the inclusive scan (also known as prefix sum) of fn over sequence x. Also, rscan, exclusive_scan, and exclusive_rscan.

zip(...) Returns a sequence of tuples, given several sequences. All input sequences must have the same length.

unzip(x) Returns a tuple of sequences, given a sequence of tuples.

sort(fn, x) Returns a sorted copy of x, given binary comparator fn(a,b) that returns True if a < b.

indices(x) Returns a sequence containing all the indices for elements in x.

replicate(x, n) Returns a sequence containing n copies of x, where x is a scalar.

range(n) Returns a sequence containing [0, n).

bounded_range(a, b) Returns a sequence containing [a, b) "

-- [10] (already read the rest of that link, no need to reread)

---

ARM cryptographic extension "provides instructions for the acceleration of encryption and decryption to support the following: AES, SHA1, SHA2-256." -- [11]

AES is still relevant:

"A strong configuration for a cryptographic protocol such as TLS might use ECC (in the form of ECDHE) for key-exchange, AES for the cipher, and SHA-2 for message authentication." -- [12]

"Right now there is no threat, not even a theoretic one, that could require AES to be improved or even superseded by something else." -- [13]

above, i researched SHA-1 and SHA-256 and found that SHA-256 appears to be replacing SHA-1. So, we should probably have ops for SHA-256 and for AES.

---

" Add basic primitives ((*), (.&.), (.

" -- https://github.com/reduceron/Reduceron

.), (.^.), (.1.), ...) DONE: (.&.)

https://hackage.haskell.org/package/york-lava-0.2/docs/Lava-Prelude.html

"Richer set of primitives, including multiplication, shifts, logical and, or, ..."

---