Bayle Shanks's website: proj-oot-ootAssemblyNotes8

https://en.m.wikipedia.org/wiki/Find_first_set gives more info on why count trailing zeros (ctz) and count leading zeros (clz) are popular operations; they are related to log base 2.
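
A small sketch of the log-base-2 connection, assuming a fixed 16-bit width. The width and the helper names (clz16, ctz16, floor_log2) are illustrative choices, not something fixed by these notes:

```python
WIDTH = 16  # assumed word width for this illustration


def clz16(x: int) -> int:
    """Count leading zeros of x viewed as a WIDTH-bit unsigned value."""
    assert 0 <= x < (1 << WIDTH)
    return WIDTH - x.bit_length()


def ctz16(x: int) -> int:
    """Count trailing zeros; defined here (by assumption) as WIDTH when x == 0."""
    assert 0 <= x < (1 << WIDTH)
    if x == 0:
        return WIDTH
    return (x & -x).bit_length() - 1  # x & -x isolates the lowest set bit


def floor_log2(x: int) -> int:
    """floor(log2(x)) for x > 0, expressed via clz."""
    assert x > 0
    return WIDTH - 1 - clz16(x)
```

For a power of two, ctz16 and floor_log2 agree (both give the exponent); for other values clz gives the integer part of the logarithm.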

---

why have meta bits in the graph data representation, rather than only having meta bits in access fns, eg in the arguments to whatever ops we have in program code to mimic __get and __set? b/c we want the graph to be able to 'modify' accesses, ie, views, ie we want a node to be able to have an edge whose target is not just another node, but the 'meta view' of that other node.

---

so is the stuff i'm thinking of meta level 3? stuff like the 'metagraph'? no, actually, right now i think that stuff is still just meta level 2. Recall, metalevel 1 ('hyper') is 'a message about a message'; metalevel 2 ('meta') is 'a message about the connection between (a message about a message M) and the underlying/target message M'; metalevel 3 is about the metalevels, that is, the relation between metalevel1 and metalevel2. Now, you could say that the metagraph is itself metalevel3 because it may contain nodes representing 'views' corresponding to different meta-levels. But it probably contains a node representing a view for metaedges, and another node representing a view for meta-meta-edges. This is distinct from having one node representing 'hyper' and another node representing 'meta'. Also, metalevel 2 already contains the concept of an edge that points to an edge. Also, recall that metalevel 3 is analogous to perspective, eg the way that a 2-d perspective view of a 3-d scene has converging lines to cram 3-d information into 2-d in a 'bent' way; which is analogous to 'rotating' information and presenting it in different ways; a single 'metagraph' doesn't seem to contain this (although perhaps it would if there were ways to transform the metagraph in various ways to represent metainformation in different ways, such as transforming from a representation in which (there is one node representing a view with metaedges and another node representing a view with metametaedges) into a representation in which (there is one node representing metalevel1 and another node representing metalevel2...)).

Similarly, one might think that the 'metagraph' is in fact metalevel4 because it contains many levels in one graph, but i don't think it is, as level 4 is supposed to be even further out there than level 3, and in fact is supposed to possibly generate/include new metalevels.

---

We only allow 12 bit operands in instructions. We are a '16-bit' system; however it is possible for the actual memory to be larger than 2^16, and for the (opaque) pointers to be implemented as larger than 16 bits. However, the 12/16 difference suggests that on at least some naive or small implementations, there will be 4 bits 'left over' in some cases. We could use these as type tags or 'fat pointer' bounds checks [1].
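
A minimal sketch of the 'left over bits' idea, assuming a naive implementation in which a cell is exactly 16 bits, the low 12 bits hold the operand-sized value and the top 4 bits hold a type tag. The placement of the tag in the high bits and the names pack_cell/unpack_cell are assumptions for illustration only:

```python
from typing import Tuple

OPERAND_BITS = 12
TAG_BITS = 4
OPERAND_MASK = (1 << OPERAND_BITS) - 1  # 0x0FFF
TAG_MASK = (1 << TAG_BITS) - 1          # 0xF


def pack_cell(tag: int, operand: int) -> int:
    """Combine a 4-bit tag and a 12-bit operand into one 16-bit cell."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in 4 bits")
    if not 0 <= operand <= OPERAND_MASK:
        raise ValueError("operand does not fit in 12 bits")
    return (tag << OPERAND_BITS) | operand


def unpack_cell(cell: int) -> Tuple[int, int]:
    """Return (tag, operand) from a 16-bit cell."""
    return (cell >> OPERAND_BITS) & TAG_MASK, cell & OPERAND_MASK
```

With 4 bits there is room for 16 distinct tags, or for a small bounds field if the bits were used for 'fat pointer' checks instead.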

How would we use these 4 bits? How does Haskell GHC use pointer tagging? Who else does that?

---

" nabla9 5 hours ago

Memory allocation feature asked is not a language feature. It has to be baked into web assembly if you want to use portable byte masking with pointers.

It's very low level implementation level detail that enables fast execution of dynamically typed languages. Boxing has high cost and it consumes memory. Type tags embedded in pointers can be very fast. For that to work, you need objects that are aligned with power-of-two byte boundaries.

Adding this feature enables efficient execution strategy for scripting and dynamic languages.

"

---

i guess if we only had 4 types, Parrot's "INSP" is a good bet: integers, numbers (floats), strings, other. We could also replace 'floats' with either 'boxed' or 'pointer (reference)' or 'native pointer' or 'dyn'.
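
A sketch of how four types could ride in the low bits of aligned pointers, along the lines of the quoted comment above. It assumes 4-byte alignment (so the 2 low bits of every valid address are zero) and an arbitrary tag numbering for the four INSP-style types; both are assumptions, as are the names Tag, tag_pointer and untag_pointer:

```python
from enum import IntEnum
from typing import Tuple


class Tag(IntEnum):
    # Numbering is arbitrary, chosen only for this illustration.
    INT = 0
    NUM = 1
    STR = 2
    OTHER = 3


ALIGN = 4             # assumed alignment in bytes (a power of two)
TAG_MASK = ALIGN - 1  # low bits that are always zero in an aligned address


def tag_pointer(addr: int, tag: Tag) -> int:
    """Store the tag in the unused low bits of an aligned address."""
    if addr & TAG_MASK:
        raise ValueError("address is not aligned")
    return addr | int(tag)


def untag_pointer(word: int) -> Tuple[int, Tag]:
    """Recover (address, tag) from a tagged word."""
    return word & ~TAG_MASK, Tag(word & TAG_MASK)
```

Checking the type is then a mask and compare, with no memory access and no boxing.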

---

XPC primitive types:

Data
Boolean
Double
String
Signed Integer
Unsigned Integer
Date
UUID
Array
Dictionary
Null

-- [2]

---

some random ideas on what it might take to make an IL (such as WebAssembly?) readable:

http://wllang.com/ https://github.com/wllang/wll

---

"Usually, a method accepts either a string or a number or a function or an object, etc" [3]

---

created ootVm.txt; moved old proposal from ootAssemblyThoughts to below this; copied current proposal from top of ootAssemblyNotes7.txt to ootVm.txt.

todos:

add proposed list of ops

---

perhaps we put Java-esque restrictions on the data stack for safety? We still need to permit call stack manipulation though, so the HLL can easily implement continuations.

A different strategy would be to let the language implementation do whatever it likes, and only check user code libraries for stack safety restrictions.

todo

---

OLD proposal from ootAssemblyThoughts.txt moved to here (new version copied to ootVm.txt):

motivations and goals

The idea here is that Oot might have 3 well-defined compilation stages:

Oot -> Oot Core -> this Assembly/Bytecode -> platform-specific code

"Oot" would be the high-level language. It would be built from Oot Core using Oot Core's metaprogramming facilities.

The rationale for Oot Core is:

to conceptually highlight which high-level operations are important in the language (for example, in Haskell we see that things like partially applied functions, closures (let-bound stuff), variables applied to arguments, and generic thunks are all pretty 'core')
to focus implementation effort around these without worrying about VM impedance mismatch until the next level down

The rationale for Oot Assembly is:

to inspire me with ideas for Oot Core 'from the bottom up'
to make porting Oot easier
to make the implementation of Oot Core easier to read by providing an additional layer decoupling the implementation of Oot Core constructs from platform-specific details

Oot Assembly will make porting Oot easier by:

providing a small, compartmentalized language; all a porter needs to do is (a) implement the interpreter loop, (b) implement each opcode, (c) execute the reference implementation of Oot Core (which will be compiled to Oot Assembly) on top of the VM they've built

Some properties we want Oot Assembly to have:

easy to implement
simple
portable, abstracted from implementation details
a small set of opcodes, to ease implementation and to keep it simple
preserve higher-level intent (expressiveness)
highly customizable
cross-platform preprocessing that must be done in any case should be done at a higher level above Oot Assembly; for example, Oot code must be parsed before Oot Assembly is generated
administrative automation such as garbage collection, lazy control flow, greenthreads, and copy-on-write should be done at a lower level beneath Oot Assembly, but should be exposed as primitives via Oot Assembly opcodes
parallelizable decoding (this is why we align to fixed-length chunks)
linear (but perhaps it should support more general graph shapes, too?)
support for annotations
efficient emulation
targeting a world of 'brain-like computers' with tens of thousands or more CPUs (or at least virtual threads or 'kernels'), each of which has a small amount of attached local memory

Some things that many traditional assembly languages do that we don't:

fixed length instructions
purely linear
purely imperative

What do we mean by 'preserve intent', and why do we want that?

What we mean is "don't map a specific operation S to a special case of a general operation G if, by looking at the result in the form of G, you cannot discover that this is actually S without doing non-local analysis".

Some examples:

we use 3-operand instructions, and memory-to-memory (rather than registers), in order to make the code more closely match its meaning (e.g. not obscure "c = a + b" by having MOVs in the middle of it). In particular, we want to make it so that, often, when there is one variable at the source code level, that translates into one particular memory location in the Oot Assembly level, rather than translating into a bunch of different memory locations with MOVs etc in between them (ie need for inessential register transfers).
having a way to explicitly state that an output value is 'discard'ed
forcing jumps to use specific instructions (as opposed to just writing to the PC)
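
To illustrate the first and third examples, here is a toy interpreter for 3-operand, memory-to-memory instructions with an explicit jump instruction. The opcode names ("add", "sub", "jmp", "halt") and the tuple encoding are hypothetical, picked only for this sketch; they are not the Oot Assembly instruction set:

```python
from typing import List, Tuple

# (op, dst, src1, src2); all operands are memory cell indices,
# except that "jmp" uses dst as the target instruction index.
Instr = Tuple[str, int, int, int]


def run(program: List[Instr], memory: List[int]) -> List[int]:
    """Interpreter loop: fetch, dispatch on opcode, update memory or pc."""
    pc = 0
    while pc < len(program):
        op, dst, a, b = program[pc]
        if op == "add":
            memory[dst] = memory[a] + memory[b]
            pc += 1
        elif op == "sub":
            memory[dst] = memory[a] - memory[b]
            pc += 1
        elif op == "jmp":
            # Control flow only changes through a dedicated instruction;
            # the pc is not an ordinary writable memory cell here.
            pc = dst
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# "c = a + b" with c in cell 0, a in cell 1, b in cell 2: one instruction,
# one memory location per source-level variable, no MOVs in between.
result = run([("add", 0, 1, 2)], [0, 2, 3])
```

Because each variable stays in one cell and jumps are syntactically distinct, a quick analysis can find all writes to c or all control transfers with a single linear scan.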

There are three reasons to 'preserve intent':

to defer platform-specific optimization choices (eg if many platforms provide a primitive that efficiently implements a high-level construct, it would be a shame to instead use a slow reimplementation in Oot Assembly code of that same construct; better to have that primitive be a single Oot Assembly bytecode)
to make it easier to write quick-and-dirty program analysis
elegance/readability

Efficiency: We want Oot Assembly to be reasonably efficient to interpret; however, efficiency should not be at the expense of preservation of intent or customizability. We expect that performance-minded platform-specific implementations might compile Oot Assembly into custom high-performance bytecode formats anyways.

Examples of the consequences of this choice:

use of a variable length encoding (in this case, customizability trumps efficiency)
mostly linear encoding (for efficiency)
separate opcodes for the same operation on different types, rather than polymorphic opcodes and a separate type field (for efficiency)
we don't precompute platform-specific addresses at compile-time, because we want to stay portable (but maybe at link-time?)

current proposal

note: as of 201602 this has been superseded by the proposal in ootAssemblyNotes7.

Design goals

Ease of implementation

(Massively) parallelizable decoding

supports linear instruction stream; also optionally supports tree structure

Message constituents with a hard upper bound on their length in bits (in this case, 32 bits is the upper bound of werds, and the payload is max length of 24 bits)

supports at least 12-bit addressing (in fact, we support 24-bit addressing, although using some addressing modes on operands of more than 12 bits requires two instructions)

Preservation of HLL intent

Extensibility

note: efficiency is NOT a primary design goal; as with the Dalvik encoding of JVM bytecode, or the LuaJIT bytecode, we expect that performance-minded implementations may create their own variant encodings. Eg the primary purpose of the 64-bit frame alignment is to support parallelized decoding (although efficiency is a secondary design goal).

Note that in this design, the operation might be indirectly specified via a reference to memory, rather than being immediately specified in the bytecode. This should be useful for encoding of calling first-class functions assigned to local variables.

Syntax

Note that the syntax of Oot Bytecode is a very general syntax that could be used for multiple languages (by varying the operations available, the modalities, the constraints, and the semantics of memory cells). Indeed, within Oot, this syntax is used for at least two 'languages'; one is Oot Instruction Bytecode and the other is Oot Graph Bytecode.

Sentences and phrases: Oot Bytecode is divided into variable-length sentences. Sentences consist of one or more variable-length phrases. Phrases consist of one or more werds.

Each werd is either 16 bits or 32 bits. A 16-bit werd consists of an 8-bit header and an 8-bit payload. A 32-bit werd consists of an 8-bit header and a 24-bit payload.

An aside on terminology: The natural language linguistic concept of 'word' is a good fit for what is here called a 'werd' because, like linguistic words, Oot Bytecode werds are composed of a 'root morpheme' (the payload) and other syntactical and modifier morphemes (the header). However, in computing, the term 'word' is already in use to refer to architecture-specific fixed-length 'words', so to avoid confusion, i changed the spelling slightly. If this annoys you, feel free to call them "words"; in fact, in most contexts, i use the spelling "word" myself, and i only use "werd" when i am particularly worried about confusion with architecture-defined words.

Werds are grouped into 64-bit frames. Phrases cannot span multiple frames unless the parentheses construct is used.

Each phrase has one of 8 'roles' within the sentence. Each werd has one of (the same 8) subroles within the phrase. It's possible for multiple phrases within a sentence, or for multiple werds within a phrase, to have the same role or subrole. The ordering of phrases or werds with the same role or subrole is significant. Otherwise, the ordering is insignificant (except that some roles have positional constraints, namely, the first werd of a phrase is always subrole 0, and phrase roles 6 and 7 can only appear as the first phrase of a 64-bit frame).

The bits in the 8-bit werd header are as follows:

1 bit: is this a 16-bit or 32-bit werd (ie, is the payload of this werd 8 bits or 24 bits?)?
1 bit: EOS or BOP
3 bits: role
3 bits: addressing mode
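
A sketch of packing and unpacking this header. The notes give the field widths but not the bit order, so this assumes the fields are laid out most-significant-bit first in the order listed, that a size bit of 1 means a 32-bit werd, and that the header occupies the top 8 bits of the werd. The names Header, pack_header, unpack_header and pack_werd are hypothetical:

```python
from typing import NamedTuple


class Header(NamedTuple):
    wide: bool  # True: 32-bit werd (24-bit payload); False: 16-bit werd
    flag: bool  # the EOS-or-BOP bit
    role: int   # 0..7
    mode: int   # addressing mode, 0..7


def pack_header(h: Header) -> int:
    """Encode a header as one byte (assumed layout: wide|flag|role|mode)."""
    if not (0 <= h.role < 8 and 0 <= h.mode < 8):
        raise ValueError("role and mode must each fit in 3 bits")
    return (int(h.wide) << 7) | (int(h.flag) << 6) | (h.role << 3) | h.mode


def unpack_header(byte: int) -> Header:
    """Decode one header byte back into its four fields."""
    return Header(bool((byte >> 7) & 1), bool((byte >> 6) & 1),
                  (byte >> 3) & 7, byte & 7)


def pack_werd(h: Header, payload: int) -> int:
    """Build a whole werd: 8-bit header followed by an 8- or 24-bit payload."""
    bits = 24 if h.wide else 8
    if not 0 <= payload < (1 << bits):
        raise ValueError("payload does not fit")
    return (pack_header(h) << bits) | payload
```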

EOS/BOP: if this is the first werd in the 64-bit frame, then it's an EOS. Otherwise it's a BOP. EOS means that this 64-bit frame is the last 64-bit frame in the current sentence. BOP means that this werd begins a new phrase.
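
A sketch of walking one 64-bit frame werd by werd and interpreting the shared bit as EOS or BOP by position. It assumes the header is the top byte of each werd with the layout size|EOS-or-BOP|role|mode (most significant bit first), that a size bit of 1 means 32 bits, and that werds fill the frame exactly with no padding; none of that is pinned down in these notes. split_frame is a hypothetical name:

```python
from typing import List, Tuple

# (width_in_bits, flag_name, flag_value, role, mode, payload)
Werd = Tuple[int, str, bool, int, int, int]


def split_frame(frame: int) -> List[Werd]:
    """Split a 64-bit frame into werds, reading from the most significant end."""
    werds: List[Werd] = []
    remaining = 64
    while remaining > 0:
        wide = (frame >> (remaining - 1)) & 1  # size bit of the next werd
        width = 32 if wide else 16
        if width > remaining:
            raise ValueError("werd would span past the end of the frame")
        werd = (frame >> (remaining - width)) & ((1 << width) - 1)
        header = werd >> (width - 8)
        payload = werd & ((1 << (width - 8)) - 1)
        # The same bit means EOS for the first werd in the frame, BOP otherwise.
        name = "EOS" if remaining == 64 else "BOP"
        werds.append((width, name, bool((header >> 6) & 1),
                      (header >> 3) & 7, header & 7, payload))
        remaining -= width
    return werds
```

Since werd boundaries inside a frame depend only on bits within that frame, each frame can be split independently of its neighbours, which is the point of the 64-bit alignment.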

role: if this is a BOP, then this is the role of this phrase within the sentence, and the role of this werd within the phrase is role 0. Otherwise, this is the subrole of the werd within the phrase. The 8 roles are:

0: operation/head/verb; this selects what sort of construct or operation is represented by this phrase or sentence. Somewhat analogous to opcode.
1: input/rvalue/source: Somewhat analogous to input operands.
2: output/lvalue/destination: Somewhat analogous to destination operands.
3: modality: modifications to the way that the meaning is processed. For example, lazy vs strict.
4: constraints ("such that" or "where" clauses): For example, "orange" in "Pick up the orange ball."
5: conjunctions. For example, 'a and b and c' or 'a or b or c' or 'a xor b xor c'
6: alternate formats. For example, can be used to select an alternate, spartan 'packed' format for this frame.
7: grouping constructs that span multiple frames. For example, hints about how long a multi-frame sentence is; parentheses; quoting and antiquoting; annotations

todo: what about 'metadata'?

addressing modes:

0: direct
1: indirect
2: immediate
3: immediate, from constant table
4: split; constant[direct]
5: split; direct[immediate]
6: split; direct[constant]
7: split; direct[direct]

the 'split' modes split the payload in half (so an 8-bit payload is split into 2 4-bit payloads, and a 24-bit payload into 2 12-bit payloads), and apply the specified addressing mode to each half, then combine the two in a 'GET' (or index-into) operation. For example, mode 5 retrieves the contents of the memory cell indicated by the first half of the payload (the direct mode), and then finds the index within that data structure indicated by the second half of the payload; for example, if the first half of the payload is '3' and the second half is '5', and memory location 3 contains an array, then the effective address indicated would be the 5th element within this array.
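
A sketch of resolving the split modes. It assumes the 'first half' is the high-order half of the payload, that 'constant' means a lookup in the constant table, and that GET is plain 0-based indexing; GET is language-specific per these notes, so that last part in particular is only a stand-in. The names resolve_half and resolve_split are hypothetical:

```python
from typing import Any, Sequence

DIRECT, CONSTANT, IMMEDIATE = "direct", "constant", "immediate"

# (mode of first half, mode of second half) for split modes 4..7,
# read off the list above as container[index].
SPLIT_MODES = {
    4: (CONSTANT, DIRECT),
    5: (DIRECT, IMMEDIATE),
    6: (DIRECT, CONSTANT),
    7: (DIRECT, DIRECT),
}


def resolve_half(kind: str, value: int,
                 memory: Sequence[Any], constants: Sequence[Any]) -> Any:
    """Apply one simple addressing mode to one half of the payload."""
    if kind == DIRECT:
        return memory[value]
    if kind == CONSTANT:
        return constants[value]
    return value  # immediate


def resolve_split(mode: int, payload: int, payload_bits: int,
                  memory: Sequence[Any], constants: Sequence[Any]) -> Any:
    """Split the payload in half, resolve each half, then GET."""
    half = payload_bits // 2
    first = payload >> half
    second = payload & ((1 << half) - 1)
    container_kind, index_kind = SPLIT_MODES[mode]
    container = resolve_half(container_kind, first, memory, constants)
    index = resolve_half(index_kind, second, memory, constants)
    return container[index]  # GET modeled as 0-based indexing (assumption)


# The example from the text: mode 5, first half 3, second half 5.
memory = [0, 0, 0, [10, 11, 12, 13, 14, 15, 16], 0, 2]
value = resolve_split(5, 0x35, 8, memory, [])
```

Under the 0-based assumption, value is the element at index 5 of the array in memory cell 3.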

the semantics of memory cells, indirect addressing, and the GET operation are language-specific

more details on some of the roles:

role 0 phrase details

role 1 phrase details

in this role, addressing modes yield the value found at the effective address

werds with subrole 2 within this phrase can be used to pass 'named arguments'; eg the name of the argument would be in subrole 2, and the value of the argument would be in subrole 1 (or implicit in subrole 0)

role 2 phrase details

in this role, addressing modes yield an effective address

werds with subrole 2 within this phrase can be used to pass 'named return arguments'; eg the name of the return argument would be in subrole 2, and the lhs expression of the return argument would be in subrole 1 (or implicit in subrole 0)

role 3 phrase details

todo: there should be a way to include a single 8-bit modality payload, but also a way to include arbitrary settings of named modality fields to values.

role 4 phrase details

role 5 phrase details

role 6 phrase details

Role 6 is generically defined as anything that can be translated in a context-independent manner (that is, the translator is only allowed to look at a contiguous group of 64-bit frames at a time, and must be stateless in between groupings) into one or more sentences.

When the payload is 8-bits, the 8-bit payload is modality (that is, it is interpreted as if there were a single-werd phrase with role modality, where the werd had role 0, and this was the payload of the werd), and the rest of the 64-bit frame is as follows:

8-bit opcode + 3x4-bit operands
8-bit opcode (stack addressing assumed for all operands)
8-bit opcode + 3x4-bit operands
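
The three packed instructions add up to 20 + 8 + 20 = 48 bits, which is exactly what is left of the 64-bit frame after the 16-bit role-6 werd. A sketch of unpacking such a frame, assuming the role-6 werd occupies the top 16 bits and the three instructions follow in the order listed, most significant bits first (the bit order and the name unpack_packed_frame are assumptions):

```python
from typing import Tuple

Op3 = Tuple[int, Tuple[int, int, int]]  # (opcode, three 4-bit operands)


def _split20(x: int) -> Op3:
    """Split a 20-bit field into an 8-bit opcode and three 4-bit operands."""
    return (x >> 12) & 0xFF, ((x >> 8) & 0xF, (x >> 4) & 0xF, x & 0xF)


def unpack_packed_frame(frame: int) -> Tuple[int, Op3, int, Op3]:
    """Return (modality, first instruction, stack opcode, last instruction)."""
    modality = (frame >> 48) & 0xFF     # 8-bit payload of the role-6 werd
    body = frame & ((1 << 48) - 1)      # remaining 48 bits of the frame
    first = (body >> 28) & 0xFFFFF      # 8-bit opcode + 3x4-bit operands
    stack_op = (body >> 20) & 0xFF      # 8-bit opcode, stack addressing
    last = body & 0xFFFFF               # 8-bit opcode + 3x4-bit operands
    return modality, _split20(first), stack_op, _split20(last)
```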

When the payload is 24-bits, the first 8 bits select the number of 64-bit frames included in the packed segment, the next 8 bits specify a language-specific packed representation, and the last 8 bits are language-defined. So far this is not used by Oot.

role 7 phrase details

Role 7 reinterprets the addressing mode field to indicate what type of grouping construct is present. All role 7 constructs apply at the granularity of entire 64-bit frames.

0: sentence length; Mandatory at the beginning of multi-frame sentences (if it is not present, length is assumed to be 1). Sentence lengths 0 and 1 are reserved for future use.
1: left parens
2: right parens
3: region annotation begin
4: region annotation end
5: quasiquote begin
6: quasiquote end
7: antiquote

The payload of role 7 is always split; the first half of the payload is how many 64-bit frames are spanned by the construct, and the second half is the 'type' of the construct. Frame lengths 0 and 1 may have special meanings, todo. Note that grouping-ending constructs such as right parens have an identical payload to the matching left-parens; since this includes the number of frames spanned, this makes it efficient to jump from the right parens to the corresponding left parens (or vice versa). The most-significant-bit of the 'type' of the construct is reserved. For quasiquote, this means whether or not there is any antiquote within this quasiquote; the semantics of this bit for other modes is reserved for future use.

As an optional extension, a language may support arbitrary-length constructs. In this case, if the length of the payload is the maximum value (2^24 - 1), then this indicates that the construct is actually arbitrarily longer than 2^24 - 2, and at displacement 2^24 - 2 will be found another construct opening werd. Most languages don't support this arbitrary-length construct feature, in which case payload value 2^24 - 1 is illegal.

some of these constructs are used to embed things from a different 'language' within bytecode of a default language. todo figure out which ones?

todo: can parens only enclose single phrases, or can they enclose whole sentences?

todo: What about EOS bits within these constructs?

todo is region annotation of length 0 a point annotation?

todo: can 'sentence length' also specify a 'foreign language' sentence?

note: the difference between 'foreign languages' in role 7 vs language-specific packed representations is that a role 7 'foreign language' uses the same format/syntax as described here, but varies the semantics (eg list of operations, list of modalities, list of constraints, semantics of memory cells, semantics of indirect addressing mode and GET), whereas role 6 is an extension mechanism to contain segments of arbitrary format/syntax.

languages

a language using this syntax must define:

a list of operations
the semantics of memory cells, including any 'special' memory cells (eg is location 0 the PC?)

and must either define the semantics of or forbid the use of:

a list of modalities
indirect addressing
GET
constraints
conjunctions
the semantics of role 7 constructs

and may define:

limits on sentence length, phrase length, and role 7 construct length, beyond those inherent in the syntax
limits on language-level semantics resulting from length limits
language-specific role 6 packed representations
a list of 'foreign languages' that may be embedded using role 7 constructs

numerology

There are 2^24 accessible memory cells (24-bit addressing). The size of the constant table must be less than 2^24.

Because an 8-bit payload can be split into two 4-bit payloads in split addressing mode, memory cells 0-15 can be accessed somewhat more easily than others. Similarly, constant table entries 0-15 can be accessed more easily than others.

Similarly, the largest memory cell location that can be accessed using split mode addressing is 2^12 - 1 (using a 24-bit payload split into 2 12-bit payloads). Likewise, constant table entries up to 2^12 - 1 can be accessed more easily than others.

profiles

There are many aspects of this format which aid extensibility but which may make implementation more difficult. Therefore, although they are described above, the 'default profile' turns them off.

role-6 segments of length more than 1 frame (b/c this makes it hard to tell if you are looking at a role 6 chunk if you plop down at an arbitrary place in the middle of one, which may make massively parallel decoding harder)
arbitrary-length role-7 constructs

todo: let's have short construct length limits by default so that a default-profile interpreter doesn't have to reserve much memory

segment format

todo: need to specify constant table format, format for file containing both constant table and bytecode, etc? or is this implementation-dependent? (i'm leaning towards implementation-dependent, although in that case we should still specify it for the Oot language)

natural languages

some inspiration was taken from the syntax of natural languages (i took a course once in head-driven phrase structure grammar, so i wouldn't be surprised if this turned out to be particularly close to that). In addition, this syntax is probably sufficient for encoding most natural language sentences (assuming you are willing to do things like sometimes use multiple Oot Bytecode sentences to represent one natural language sentence).

Here's how you might encode some natural language constructs:

subject: role 2
object (direct or indirect): role 1
verb, verb phrase: role 0
prepositional phrase: either role 4 (constraint) (eg "flip the switch above the red light" is constraint; the switch such that it is above the red light) or role 3 (modality) (eg "with great urgency" is modality)
verb tense, aspect, mood, conjugation, modality etc: subrole 3 (modality) within role 0 phrase (note that our use of 'modality' is broader than its use in linguistics)
auxiliary verbs, multiple verbs: subrole 3 (modality) within role 0 phrase (eg "We are trying to understand the difference"; the root verb would be 'understand', and 'trying' would be a modality)
noun phrases: role 1 or 2 depending upon whether this is an object or subject
adjectives: something within a noun phrase; involves either subrole 0 (head), subrole 3 (modality) or subrole 4 (constraint) (eg 'Pick up the orange ball'; 'orange' is a subrole 4 (constraint))
adverbs: something within a verb phrase; involves either subrole 3 (modality) or subrole 4 (constraint)
pronoun: At the referent, make an assignment to a memory cell. Then when the pronoun is used, reference this memory cell. When there is a pronoun whose referent is in the same sentence, a separate Oot Bytecode sentence would probably be needed to assign the referent to a memory cell, which could then be referenced in the pronoun-using sentence
interjections: role 6
conjunctions: role 5 or role 7
clauses aside from the root clause: role 7
possessives: usually role 4 (constraint) (Sam's dog = a dog with the constraint that it belongs to Sam), but possibly GET addressing mode (Sam's dog = sams_possessions[dog])
plural, quantifiers: i'm not sure. conjunction? modality?

semantics: Oot Instruction Bytecode

indirect: todo; this probably will have something to do with aliasing or symlinking but it's not certain (see [[ootAssemblyNotes5]])