Bayle Shanks's website: proj-oot-bootReferenceOld201026

Boot (Oot Boot) reference

Version: unreleased (0.0.0-0)

Boot is a low-level 'assembly language'-style virtual machine (VM) that is easy to implement.

# Introduction

Boot is a target language that is easy to implement on a wide variety of platforms, even 'on top of'/within existing high-level languages such as Python.

Highlights:

3-operand fixed-length register machine
16-bit instruction encoding
signed 32-bit integers
integers and pointers both have implementation-dependent sizes in memory, and may be different sizes
opaque pointer representation
5 integer registers, 5 pointer registers, a zero register, and a null pointer register, plus a 5-integer fixed size stack and a 5-pointer fixed size stack
<=64 instructions
RISC-like; no addressing modes (or rather, each instruction has one fixed addressing mode), and only a few instructions access (non-register) memory
jump range of +- 2 GiB (via signed 32-bit jump offsets embedded in instruction stream)
all control flow is PC-relative
the BootX extension specifies additional instructions, functionality, and conventions (see bootx_reference.md)


# Architecture

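Each 16-bit instruction is laid out as follows, from most-significant bit to least-significant bit:

1 bit: always 0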
6 bits: opcode
3 bits: op0
3 bits: op1
3 bits: op2

Note that:

The first (most significant) bit of the first byte is always 0
the op0 field has 1 bit in the first byte (the most-significant-bit of the op0 field is the least-significant-bit of the first byte) and 2 bits in the second byte.
when implementing Boot on a little-endian system, recall that if you read or write both bytes of a Boot instruction at once as a single 16-bit integer, the second byte will be treated as the most-significant-byte and the first byte as the least-significant-byte, for the purposes of the value of that 16-bit integer. If this bullet point is confusing to you, then it will probably be easier for you to implement Boot by reading and writing each byte separately.
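
The field layout above can be sketched in Python by reading the two bytes separately, as the last note suggests. This is only an illustration: the function names `decode` and `encode` are hypothetical and not part of Boot.

```python
def decode(byte0, byte1):
    """Split a 16-bit Boot instruction, given as its two bytes, into fields."""
    if byte0 & 0x80:
        raise ValueError("most-significant bit of the first byte must be 0")
    word = (byte0 << 8) | byte1      # the first byte holds the high-order bits
    opcode = (word >> 9) & 0x3F      # 6 bits
    op0 = (word >> 6) & 0x7          # 1 bit from byte0, 2 bits from byte1
    op1 = (word >> 3) & 0x7
    op2 = word & 0x7
    return opcode, op0, op1, op2


def encode(opcode, op0, op1, op2):
    """Inverse of decode: pack the four fields into two bytes."""
    word = (opcode << 9) | (op0 << 6) | (op1 << 3) | op2
    return bytes([word >> 8, word & 0xFF])
```

Working byte by byte sidesteps the host's endianness entirely, at the cost of one extra shift.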

## Datatypes ##

two primary datatypes:

int32 (32-bit integers)
ptr (pointers)

ptr has two subtypes:

ptrd (data pointers)
ptrc (code pointers)

## Registers ##

Two banks of 8 registers each; one for int32, one for ptr.

The notation $n refers to the n-th int32 register, and &n refers to the n-th ptr register, for example the first and last registers in each bank are: $0, $7, &0, &7.

The zero-th register in the int32 bank, $0, is constant zero, and the zero-th register in the ptr bank, &0, is the constant null pointer; writes to these registers have no effect.

Registers $1 and &1 are called the 'smallstack' registers and have special behavior; writes to these registers push the written value onto a stack, and reads from these registers pop a value from a stack. There are two stacks, one for integers and one for pointers; these stacks are called "smallstacks". They each have a capacity of 5 items. At all times there must be at least 1 item on each smallstack; an attempt to pop the last item off the stack is illegal. If more than one operand specifies a smallstack register, they are applied in this order: op2, op1, op0.

Registers $2 and &2 are called the 'TOS registers' and have special behavior; they are aliased to the item on top of the respective smallstacks. That is to say, reading and writing to these registers read and write the most recently pushed item on the stack (without pushing or popping).

At the beginning of the program, the smallstacks have one item, and the value of this item and of every register is arbitrary.

For each instruction in the program, the depth of each smallstack at the end of executing that instruction must be the same in every possible execution of the program.
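
As a rough illustration of the register rules above, here is a toy Python model of the int32 bank. The class name `IntBank` and its methods are hypothetical. The sketch raises exceptions where Boot leaves the behavior illegal or undefined; a real implementation is not required to detect these cases.

```python
class IntBank:
    """Toy model of the int32 register bank and its smallstack."""
    CAPACITY = 5

    def __init__(self):
        self.regs = [0] * 8   # only slots 3..7 are used as ordinary registers
        self.stack = [0]      # starts with one item (its value is arbitrary)

    def read(self, n):
        if n == 0:            # $0: constant zero
            return 0
        if n == 1:            # $1: reading pops
            if len(self.stack) <= 1:
                raise RuntimeError("popping the last smallstack item is illegal")
            return self.stack.pop()
        if n == 2:            # $2: top of stack, no pop
            return self.stack[-1]
        return self.regs[n]

    def write(self, n, value):
        value &= 0xFFFFFFFF
        if n == 0:            # writes to $0 have no effect
            return
        if n == 1:            # $1: writing pushes
            if len(self.stack) >= self.CAPACITY:
                raise RuntimeError("smallstack is already at capacity")
            self.stack.append(value)
        elif n == 2:          # $2: overwrite top of stack, no push
            self.stack[-1] = value
        else:
            self.regs[n] = value
```

The ptr bank would follow the same shape, with a null pointer in place of zero.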

## Instructions ##

46 instructions


| Category | Instructions |
| --- | --- |
| annotation | ann |
| load constants | lm6 lm22 lm32 lpc6 lpc22 lpc32 |
| loads and stores and copies | l8 l8u l16 l16u l32 lp s8 s16 s32 sp cp cpp |
| arithmetic of ints | add32 sub32 mul32 add32m |
| bitwise arithmetic | and or xor shl shrs shru |
| adding ints to pointers | ap apm app ap32 ap16 appm ap32m ap16m |
| comparison control flow | bne blt bltu beq bnep beqp |
| other control flow | j9 j25 j32 jy |
| system library call | lib |
| misc | sinfo |

(Notation for the instruction tables below) #imm3, #imm6, #imm9, #imm32 are signed immediate constants, #imm3u, #imm6u, #imm9u are immediate constants interpreted as unsigned, $X is an integer register or its contents, &X is a pointer register or its contents, 0 is an unused argument that must always be 0 (other values are RESERVED for future use). All signed immediate constants are two's-complement.

From left to right, the arguments go into operands op0, op1, op2. Immediate operands are always on the right (the highest-numbered operand). When two immediate operands are combined into an #imm6 (as with instruction lm6), op1 is the high-order bits and op2 is the low-order bits (imm6 = (op1 << 3) + op2). Similarly for #imm9 (imm9 = (op0 << 6) + (op1 << 3) + op2).
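
The combination rules above, together with the two's-complement interpretation of signed immediates, can be sketched as follows. The helper names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def imm6u(op1, op2):
    """Unsigned 6-bit immediate: op1 is the high-order field."""
    return (op1 << 3) + op2


def imm6(op1, op2):
    """Signed 6-bit immediate."""
    return sign_extend(imm6u(op1, op2), 6)


def imm9(op0, op1, op2):
    """Signed 9-bit immediate spanning all three operand fields."""
    return sign_extend((op0 << 6) + (op1 << 3) + op2, 9)
```

For instance, `imm6(7, 7)` is -1 while `imm6u(7, 7)` is 63.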

Jump and branch immediate offsets are in units of bytes in the Boot code, where '0' indicates the following instruction location (in the case of j32, that means the location after the embedded 32-bit immediate offset). Jump and branch immediates may not jump into the middle of an instruction. Platforms which compile or represent Boot code in memory in ways such that one instruction spans more or fewer than 2 memory locations must adjust the jump/branch immediate offsets accordingly before executing them.

The 'm' in the mnemonics lm6, lm32, add32m, apm, appm, ap32m, ap16m stands for 'iMmediate' (although not all instructions with immediates have an 'm' in the mnemonic).

annotation:

ann #imm3 #imm3 #imm3: ANNotation; no effect on execution

load constants:

lm6 $dest #imm6: Load immediate signed int constant, 6-bit: $dest = #imm6
lm22 $dest #imm6; #imm16: $dest = concatenation of #imm6 and #imm16
lm32 $dest ; #imm32: $dest = #imm32
lpc6 &dest #imm6: Load Program Counter plus 6-bit signed immediate offset: &dest = PC + #imm6
lpc22 &dest #imm6; #imm16: Load Program Counter plus 22-bit signed immediate offset: &dest = PC + (concatenation of #imm6 and #imm16)
lpc32 &dest; #imm32: Load Program Counter plus 32-bit signed immediate offset: &dest = PC + #imm32

loads and stores and copies:

l32 $dest &addr #imm3: Load Word int32 from memory addr (&addr + #imm3*INT32_SIZE)
l16 $dest &addr #imm3: Load Halfword int16 from memory addr (&addr + #imm3*INT16_SIZE)
l16u $dest &addr #imm3: Load Unsigned int16 from memory addr (&addr + #imm3*INT16_SIZE)
l8 $dest &addr #imm3: Load Byte int8 from memory addr (&addr + #imm3)
l8u $dest &addr #imm3: Load unsigned int8 from memory addr (&addr + #imm3)
lp &dest &addr #imm3: Load Ptr from memory addr (&addr + #imm3*PTRD_SIZE)
s32 $src &addr #imm3: Store Word int32 to memory addr (&addr + #imm3*INT32_SIZE)
s16 $src &addr #imm3: Store Halfword int16 to memory addr (&addr + #imm3*INT16_SIZE)
s8 $src &addr #imm3: Store Byte int8 to memory addr (&addr + #imm3)
sp &src &addr #imm3: Store Ptr to memory addr (&addr + #imm3*PTRD_SIZE)
cp $dest $src $cond: CoPy int: if $cond == 0 then $dest = $src
cpp &dest &src $cond: CoPy Ptr: if $cond == 0 then &dest = &src

stack manipulation:

swap_int: swap the top two items on the int smallstack
swap_ptr: swap the top two items on the ptr smallstack
over_int: pushes a copy of the second item on the int smallstack onto the top of the int smallstack
over_ptr: pushes a copy of the second item on the ptr smallstack onto the top of the ptr smallstack

arithmetic of ints (result always defined and all results mod 2^32):

add32 $dest $src1 $src2: $dest = $src1 + $src2
add32m $dest $src1 #imm3: $dest = $src1 + #imm3
sub32 $dest $src1 $src2: $dest = $src1 - $src2
mul32 $dest $src1 $src2: $dest = $src1 * $src2

bitwise arithmetic:

shl $dest $src #imm3u: Shift Left (C's '<<' operator) by #imm3u bits
shru $dest $src #imm3u: Shift Right Unsigned (logical shift) by #imm3u bits
shrs $dest $src #imm3u: Shift Right Signed (arithmetic shift) by #imm3u bits
and $dest $src1 $src2
or $dest $src1 $src2
xor $dest $src1 $src2
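
On a host whose integers are unbounded (Python is used here as an example), the three shifts need explicit masking to 32 bits. A minimal sketch, assuming register contents are held as unsigned 32-bit bit patterns (the function names simply mirror the mnemonics):

```python
MASK32 = 0xFFFFFFFF


def shl(src, n):
    """Shift left; bits shifted past bit 31 are discarded."""
    return (src << n) & MASK32


def shru(src, n):
    """Logical shift right: zeros are shifted in from the left."""
    return (src & MASK32) >> n


def shrs(src, n):
    """Arithmetic shift right: the sign bit is replicated."""
    src &= MASK32
    signed = src - (1 << 32) if src & 0x80000000 else src
    return (signed >> n) & MASK32
```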

Adding ints to Pointers (only valid on data pointers, not code pointers):

ap &dest &src1 $src2: &dest = &src1 + $src2
apm &dest &src1 #imm3: &dest = &src1 + #imm3
app &dest &src1 $src2: &dest = &src1 + $src2*PTRD_SIZE
ap32 &dest &src1 $src2: &dest = &src1 + $src2*INT32_SIZE
ap16 &dest &src1 $src2: &dest = &src1 + $src2*INT16_SIZE
appm &dest &src1 #imm3: &dest = &src1 + #imm3*PTRD_SIZE
ap32m &dest &src1 #imm3: &dest = &src1 + #imm3*INT32_SIZE
ap16m &dest &src1 #imm3: &dest = &src1 + #imm3*INT16_SIZE

conditional branches:

beq $src0 $src1 #imm3: Branch-if-EQual
beqp &src0 &src1 #imm3: Branch-if-EQual on Ptrs
bne $src0 $src1 #imm3: Branch-if-Not-eQual
bnep &src0 &src1 #imm3: Branch-if-Not-Equal on Ptrs
blt $src0 $src1 #imm3: Branch-if-Less-Than
bltu $src0 $src1 #imm3: Branch-if-Less-Than-Unsigned

unconditional jumps and other control flow:

j9 #imm9: unconditional Jump (range~ +- 255)
j25 #imm9; #imm16: unconditional Jump (range~ +- 16 MiB)
j32; #imm32: unconditional Jump; 32-bit signed offset (range~ +- 2 GiB)
jy &target: Jump dYnamic (indirect)

system library call:

lib #imm9u: call LIBrary function number #imm9u

misc:

sinfo $dest #query1_imm6u: $dest = System INFOrmation query

### Notes on certain instructions ###

ann: implementations may ignore or strip ann instructions
ap, app, ap32, ap16: the int32 arguments $src2 are interpreted as signed, so although only addition is provided, subtraction can also be accomplished
app: cannot be used on ptrc's (codepointers)
blt: the int32 arguments (in $src0 and $src1) are signed
bltu: the int32 arguments (in $src0 and $src1) are unsigned
cp, cpp: to make the cp/cpp instructions unconditional, just use register $0 for $cond, since $0 is the always-zero register
cp: when $cond is 0, this is equivalent to add32m $dest $src 0
cpp: even when $cond is 0, this is not equivalent to app with a zero $src2, because cpp can be used on code pointers whereas app can only be used on 'ptrd's
j9, j25, j32: jumps target a signed offset relative to the next instruction.
j25: 16-bit constant is embedded in instruction stream immediately following instruction; 25-bit signed offset is concatenation of #imm9 in the instruction and the #imm16 constant following the instruction
j32, lm32, lpc32: 32-bit signed constant is embedded in instruction stream immediately following instruction
j32: relative to the next instruction after the embedded offset
jy: &target must be a code pointer provided at runtime (either by lpc6, lpc32, or by platform-specific or foreign code) and which did not have pointer arithmetic performed on it afterwards
l8: guaranteed to produce values between -128 and 127, inclusive (when interpreted as signed).
l8u: guaranteed to produce values between 0 and 255, inclusive (when interpreted as unsigned)
l16: guaranteed to produce values between -32768 and 32767, inclusive (when interpreted as signed).
l16u: guaranteed to produce values between 0 and 65535, inclusive (when interpreted as unsigned)
lm22, lpc22: 16-bit constant is embedded in instruction stream immediately following instruction; 22-bit immediate is concatenation of #imm6 in the instruction and the #imm16 constant following the instruction
lpc32: the &dest argument cannot be &0 (the instruction whose encoding is all-zero bits is illegal)
more64: placeholder instruction to encode 2-operand instruction opcodes 64 thru 72; add #imm3u to 64 to get the opcode of the 2-operand instruction being encoded. op0 and op1 are interpreted according to the indicated instruction.
s8, s16: stores the least-significant 8- and 16- bits, respectively, from the 32-bit $src register
sinfo: See below for defined queries
lib: See below for defined libfn numbers.
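
The notes on lm22, lpc22 and j25 describe the wide immediate as a concatenation. A minimal sketch of one reading, assuming the field inside the instruction supplies the high-order bits and the embedded 16-bit constant the low-order bits (the text does not state the order explicitly, and `concat_imm` is a hypothetical name):

```python
def concat_imm(high, high_bits, imm16):
    """Concatenate a `high_bits`-wide field with a 16-bit word, then sign-extend.

    Assumption: `high` holds the most-significant bits of the result.
    """
    total_bits = high_bits + 16
    raw = ((high & ((1 << high_bits) - 1)) << 16) | (imm16 & 0xFFFF)
    if raw & (1 << (total_bits - 1)):
        return raw - (1 << total_bits)
    return raw
```

With this reading, lm22 and lpc22 would use `concat_imm(imm6, 6, imm16)` and j25 would use `concat_imm(imm9, 9, imm16)`.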

## sinfo queries ##

when query = ..., this returns in $dest ...:

00: VERSION (currently always 0)
01: PTRD_SIZE, the number of memory locations per ptrd
02: INT16_SIZE, the number of memory locations per int16
03: INT32_SIZE, the number of memory locations per int32
04: RESERVED for BootX
05: UINTMAX_8 (= 255, 0xff hex)
06: UINTMAX_16 (= 65535, 0xffff hex)
07: UINTMAX_32 (as unsigned, = 4294967295, 0xffffffff hex; as 32-bit signed, -1)
08: RESERVED for BootX
09: INTMAX_8 (= 127 decimal, 0x7f hex)
0A: INTMAX_16 (= 32767 decimal, 0x7fff hex)
0B: INTMAX_32 (= 2147483647 decimal, 0x7fffffff hex)
0C: RESERVED for BootX
0D: INTMIN_8 (= -128 decimal, -0x80 hex as signed; as unsigned, 128 decimal, 0x80 hex)
0E: INTMIN_16 (= -32768 decimal, -0x8000 hex as signed; as unsigned, 32768 decimal, 0x8000 hex)
0F: INTMIN_32 (= -2147483648 decimal, -0x80000000 hex as signed; as unsigned, 2147483648 decimal, 0x80000000 hex)
10: RESERVED for BootX
11-38: RESERVED for extensions
39: implementation-defined

Note that the sinfo query results defined above (and possibly others) are static -- they must never change during the execution of a program.

## lib calls ##

The argument specifies which library function is called.

libfn numbers 0 through 1 are defined below, and 2 through 255 are RESERVED for extensions. libfn numbers 256 through 511 are implementation-defined and may be used to access linked libraries, if the implementation supports that.

### lib 0: halt(RESULT: $6) ###

Terminate program, with RESULT code passed in $6. The result code is interpreted in a platform-specific way (however, most typically, success is indicated with a result code of 0).

### lib 1: memcpy(DST: &5, SRC: &6, SIZE: $5) ###

Copy SIZE bytes starting at memory location SRC to memory starting at memory location DST.

The caller must assume that the values in registers $4, $5, $6, $7, and &4, &5, &6, &7 may be overwritten during the call.

## Table of opcodes ##

The following CSV-formatted table contains tuples of the form:

(opcode (as found in the opcode field of the instruction), reference_opcode (a number uniquely identifying the instruction), mnemonic, 1 if the instruction has an embedded 16-bit immediate following it and 0 otherwise, 1 if the instruction has an embedded 32-bit immediate following it and 0 otherwise, type of op0, type of op1, type of op2, then for each of op0, op1, op2 in turn: 1 if the instruction might write to the register specified by that operand, 1 if the instruction might read from the register specified by that operand; then 1 if the instruction might write to the PC, 1 if the instruction might read the PC, 1 if the instruction might write to memory, 1 if the instruction might read from memory)

Type identifiers in the following table:

i3, i6, i9: signed immediate of the specified bitwidth
u3, u6, u9: unsigned immediate of the specified bitwidth
ri, rp: register specifier for int32 bank, ptr bank, respectively
rf: register specifier for floating point bank (unused until BootX)
a number: must be that number
_: this operand does not exist / is concatenated with the previous one
?: this operand type is unspecified

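Note that in the instruction metadata table the opcode field is written in decimal notation, whereas in the instruction decode table the opcode field is written in hexadecimal.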

### Instruction decode table ###

The fields are: (opcode field value) (op0 field value) (op1 field value) (op2 field value): mnemonic (reference opcode)

     0 * * *: 0  j9
     1 * * *: 1  j25
     2 * * *: 2  lib
     3 * * *: 3  lpc6
     4 * * *: 4  lpc22
     5 * * *: 5  lm6
     6 * * *: 6  lm22
     7 * * *: 7  sinfo
     8 * * *: 8  ann
     9 * * *: 9  beq
     A * * *: 10 bne
     B * * *: 11 blt
     C * * *: 12 bltu
     D * * *: 13 beqp
     E * * *: 14 bnep
     F * * *: 15 s32
    10 * * *: 16 s16
    11 * * *: 17 s8
    12 * * *: 18 sp
    13 * * *: 19 l32
    14 * * *: 20 l16
    15 * * *: 21 l16u
    16 * * *: 22 l8
    17 * * *: 23 l8u
    18 * * *: 24 lp
    19 * * *: 25 add32m
    1A * * *: 26 apm
    1B * * *: 27 appm
    1C * * *: 28 ap32m
    1D * * *: 29 ap16m
    1E * * *: 30 shl
    1F * * *: 31 shru
    20 * * *: 32 shrs
    21 * * *: 33 add32
    22 * * *: 34 sub32
    23 * * *: 35 mul32
    24 * * *: 36 and
    25 * * *: 37 or
    26 * * *: 38 xor
    27 * * *: 39 cp
    28 * * *: 40 cpp
    29 * * *: 41 ap
    2A * * *: 42 app
    2B * * *: 43 ap32
    2C * * *: 44 ap16

    3F 7 0 *: 70 lm32
    3F 7 1 *: 71 lpc32
    3F 7 2 *: 73 jy
    
    3F 7 7 0: 80 j32
    3F 7 7 1: 81 swap_int
    3F 7 7 2: 82 swap_ptr
    3F 7 7 5: 83 over_int
    3F 7 7 6: 84 over_ptr

### Instruction metadata table ###

    opcode, reference_opcode, mnemonic, has_i16_data, has_i32_data, op0_type, op1_type, op2_type, op0_w, op0_r, op1_w, op1_r, op2_w, op2_r, PC_w, PC_r, mem_w, mem_r
    2,2,j9,0,0,i9,_,_,0,1,0,0,0,0,1,1,0,0
    3,3,j25,1,0,i9,_,_,0,1,0,0,0,0,1,1,0,0
    4,4,lib,0,0,u9,_,_,0,0,0,0,0,0,1,1,1,1
    5,5,lpc6,0,0,rp,i6,_,1,0,0,0,0,0,0,1,0,0
    6,6,lm6,0,0,ri,i6,_,1,0,0,0,0,0,0,0,0,0
    7,7,sinfo,0,0,ri,u6,_,1,0,0,0,0,0,0,0,0,0
    8,8,ann,0,0,?,?,?,0,0,0,0,0,0,0,0,0,0
    9,9,beq,0,0,ri,ri,i3,0,1,0,1,0,0,1,1,0,0
    10,10,bne,0,0,ri,ri,i3,0,1,0,1,0,0,1,1,0,0
    11,11,blt,0,0,ri,ri,i3,0,1,0,1,0,0,1,1,0,0
    12,12,bltu,0,0,ri,ri,i3,0,1,0,1,0,0,1,1,0,0
    13,13,beqp,0,0,rp,rp,i3,0,1,0,1,0,0,1,1,0,0
    14,14,bnep,0,0,rp,rp,i3,0,1,0,1,0,0,1,1,0,0
    15,15,s32,0,0,ri,rp,i3,0,1,0,1,0,0,0,0,1,0
    16,16,s16,0,0,ri,rp,i3,0,1,0,1,0,0,0,0,1,0
    17,17,s8,0,0,ri,rp,i3,0,1,0,1,0,0,0,0,1,0
    18,18,sp,0,0,rp,rp,i3,0,1,0,1,0,0,0,0,1,0
    19,19,l32,0,0,ri,rp,i3,1,0,0,1,0,0,0,0,0,1
    20,20,l16,0,0,ri,rp,i3,1,0,0,1,0,0,0,0,0,1
    21,21,l16u,0,0,ri,rp,i3,1,0,0,1,0,0,0,0,0,1
    22,22,l8,0,0,ri,rp,i3,1,0,0,1,0,0,0,0,0,1
    23,23,l8u,0,0,ri,rp,i3,1,0,0,1,0,0,0,0,0,1
    24,24,lp,0,0,rp,rp,i3,1,0,0,1,0,0,0,0,0,1
    25,25,add32m,0,0,ri,ri,i3,1,0,0,1,0,0,0,0,0,0
    26,26,apm,0,0,rp,rp,i3,1,0,0,1,0,0,0,0,0,0
    27,27,appm,0,0,rp,rp,i3,1,0,0,1,0,0,0,0,0,0
    28,28,ap32m,0,0,rp,rp,i3,1,0,0,1,0,0,0,0,0,0
    29,29,ap16m,0,0,rp,rp,i3,1,0,0,1,0,0,0,0,0,0
    30,30,shl,0,0,ri,ri,u3,1,0,0,1,0,0,0,0,0,0
    31,31,shru,0,0,ri,ri,u3,1,0,0,1,0,0,0,0,0,0
    32,32,shrs,0,0,ri,ri,u3,1,0,0,1,0,0,0,0,0,0
    33,33,add32,0,0,ri,ri,ri,1,0,0,1,0,1,0,0,0,0
    34,34,sub32,0,0,ri,ri,ri,1,0,0,1,0,1,0,0,0,0
    35,35,mul32,0,0,ri,ri,ri,1,0,0,1,0,1,0,0,0,0
    36,36,and,0,0,ri,ri,ri,1,0,0,1,0,1,0,0,0,0
    37,37,or,0,0,ri,ri,ri,1,0,0,1,0,1,0,0,0,0
    38,38,xor,0,0,ri,ri,ri,1,0,0,1,0,1,0,0,0,0
    39,39,cp,0,0,ri,ri,ri,1,0,0,1,0,1,0,0,0,0
    40,40,cpp,0,0,rp,rp,ri,1,0,0,1,0,1,0,0,0,0
    41,41,ap,0,0,rp,rp,ri,1,0,0,1,0,1,0,0,0,0
    42,42,app,0,0,rp,rp,ri,1,0,0,1,0,1,0,0,0,0
    43,43,ap32,0,0,rp,rp,ri,1,0,0,1,0,1,0,0,0,0
    44,44,ap16,0,0,rp,rp,ri,1,0,0,1,0,1,0,0,0,0
    
    
    
    
    63,63,,0,0,0,,,0,0,0,0,0,0,0,0,0,0
    63,64,,0,0,1,?,?,0,0,0,0,0,0,0,0,0,0
    63,65,,0,0,2,?,?,0,0,0,0,0,0,0,0,0,0
    63,66,,0,0,3,?,?,0,0,0,0,0,0,0,0,0,0
    63,67,,0,0,4,?,?,0,0,0,0,0,0,0,0,0,0
    63,68,,0,0,5,?,?,0,0,0,0,0,0,0,0,0,0
    63,69,,0,0,6,?,?,0,0,0,0,0,0,0,0,0,0
    
    63,70,lm32,0,1,7,0,ri,0,0,0,0,1,0,0,0,0,0
    63,71,lpc32,0,1,7,1,rp,0,0,0,0,1,0,0,1,0,0
    63,73,jy,0,0,7,2,rp,0,0,0,0,0,1,1,0,0,0
    
    
    63,80,j32,0,1,7,7,0,0,0,0,0,0,0,1,1,0,0
    63,81,swap_int,0,0,7,7,1,0,0,0,0,0,0,1,1,0,0
    63,82,swap_ptr,0,0,7,7,2,0,0,0,0,0,0,1,1,0,0
    63,83,over_int,0,0,7,7,3,0,0,0,0,0,0,1,1,0,0
    63,84,over_ptr,0,0,7,7,4,0,0,0,0,0,0,1,1,0,0

### Notes about opcodes ###

When the opcode field contains 63, the instruction is further dispatched based on the value of op0, interpreted as a u3. The reference_opcode field is used to identify each instruction with a unique integer, even though multiple instructions share an encoding with a 63 in the 'opcode' field. When the opcode field contains 63 and op0 is 7, the instruction is further dispatched based on the value of op1, and when the opcode field contains 63 and op0 is 7 and op1 is 7, the instruction is further dispatched based on the value of op2. In this way, an additional 6 two-operand, 6 one-operand, and 7 zero-operand instructions can be encoded.
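
The nested dispatch described above can be sketched as a function that returns the tuple of fields selecting an instruction. The names `dispatch_key` and `MNEMONICS` are hypothetical; the table holds only a few entries copied from the instruction decode table, for illustration.

```python
def dispatch_key(opcode, op0, op1, op2):
    """Return the fields that together identify which instruction this is."""
    if opcode != 63:
        return (opcode,)          # ordinary instruction: all ops are operands
    if op0 != 7:
        return (63, op0)          # two-operand slot: op1 and op2 are operands
    if op1 != 7:
        return (63, 7, op1)       # one-operand slot: op2 is the operand
    return (63, 7, 7, op2)        # zero-operand slot


# A few entries taken from the decode table; not a complete list.
# over_int / over_ptr are left out because the two tables list
# different op2 values for them.
MNEMONICS = {
    (63, 7, 0): "lm32",
    (63, 7, 1): "lpc32",
    (63, 7, 2): "jy",
    (63, 7, 7, 0): "j32",
    (63, 7, 7, 1): "swap_int",
    (63, 7, 7, 2): "swap_ptr",
}
```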

todo the following may need to have their numbers updated:

Note that the reference opcodes have the following properties:

reference opcodes between 70-73, inclusive, plus reference opcode 80, plus reference opcode 3, have a 32-bit immediate embedded in the instruction stream immediately following the instruction, and are the only such opcodes
opcodes between 2-4, inclusive, have 9-bit immediates spanning op0, op1, op2, and are the only such opcodes
opcodes between 5-7, inclusive, have a 6-bit immediate in op1 and op2, and are the only such opcodes
opcode 8 is ANNotate
opcodes between 8-29, inclusive, have a 3-bit signed immediate in op2, and are the only such opcodes
opcodes between 30-32, inclusive, have a 3-bit unsigned immediate in op2, and are the only such opcodes
opcodes between 33-45, inclusive, have no immediates, and are the only such opcodes

Note that later additions that assign instruction(s) to the RESERVED opcodes may break the 'only such' parts of these properties.

## Semantics details ##

For example, for multiplication, imagine if we had 3-bit integers instead of 32-bit integers. If we multiply the unsigned representations of 2*3, that is, 010*011, the result is 6, that is, 110. In two's complement, 010 represents 2 and 011 also represents 3, and 110 represents -2; and 2*3 = 6 = -2 mod 2^3. To give another example, if we multiply the unsigned representations of 6*2, that is, 110*010, the result is 12, and 12 mod 2^3 is 4, that is, 100 in binary. In two's complement, 110 represents -2 and 010 represents 2, and the result, 100, represents -4; and -2*2 = -4 mod 2^3 = 4. Do note that these are only correct modulo the bitwidth; for example, 2*3 = 6, but mul 010 011 = 110, which when viewed as two's complement yields 2*3 = -2, an incorrect result in ordinary arithmetic, but -2 is equivalent to 6 mod 8, so the result is correct in mod 8 arithmetic. In the examples in this paragraph we used mod 8 for 3-bit integers, but in reality, we are using mod 2^32 for 32-bit integers, not mod 8.

On many platforms, it may be easiest to implement add32, add32m, sub32, mul32 by viewing the int32s as unsigned integers and then applying unsigned arithmetic operations, because many platforms don't implement wrap-around signed numbers.
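
A minimal Python sketch of that approach: treat register contents as unsigned values, compute with ordinary integers, and reduce mod 2^32. `as_signed` is a hypothetical helper for viewing a result as two's complement.

```python
MASK32 = 0xFFFFFFFF  # reducing with this mask is the same as taking mod 2**32


def add32(a, b):
    return (a + b) & MASK32


def sub32(a, b):
    return (a - b) & MASK32


def mul32(a, b):
    return (a * b) & MASK32


def as_signed(x):
    """View a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x
```

For example, `as_signed(mul32(0xFFFFFFFE, 2))` is -4, mirroring the 3-bit example above where 110 * 010 yields 100.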

Note that many arithmetic operations are provided only for integers; the only arithmetic you can do to pointers is add signed integers to them.

### Integer bitwidths ###

The instructions l8, l8u, l16, l16u guarantee that the numbers read into registers are in certain ranges that fit in 8 bits (l8, l8u) and 16 bits (l16, l16u). l8 and l16 sign-extend the number read to 32 bits, and l8u and l16u zero-extend the number read to 32 bits.

However, l8 and l16 result in signed two's complement representations in the destination register; note that the bit pattern of a small negative number in a 32-bit register, when coded with signed two's complement, is equivalent to a number larger than 16 bits if interpreted as unsigned. For example, a -1 in a register, signed, would be viewed as (2^32 - 1 = 4294967295) unsigned.

Boot guarantees that bytes (8-bit integers) have a size 1 in memory (meaning that values that are stored with s8 occupy one memory location). INT8_SIZE is not defined because it would always be 1.
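
The difference between the sign-extending and zero-extending loads can be illustrated in Python. The function names below are hypothetical, and memory access itself is left out; each function only shows the 32-bit register pattern produced from a stored 8- or 16-bit value.

```python
MASK32 = 0xFFFFFFFF


def load8u(stored):
    """Zero-extend an 8-bit value: result is between 0 and 255."""
    return stored & 0xFF


def load8(stored):
    """Sign-extend an 8-bit value to a 32-bit register pattern."""
    value = stored & 0xFF
    return (value - 0x100) & MASK32 if value & 0x80 else value


def load16u(stored):
    """Zero-extend a 16-bit value: result is between 0 and 65535."""
    return stored & 0xFFFF


def load16(stored):
    """Sign-extend a 16-bit value to a 32-bit register pattern."""
    value = stored & 0xFFFF
    return (value - 0x10000) & MASK32 if value & 0x8000 else value
```

Here `load8(0xFF)` gives the pattern 0xFFFFFFFF, which is -1 when read as signed and 4294967295 when read as unsigned.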

#### (not) mixing integer bitwidths ####

When in registers and being operated upon, the internal representation of int32s is a defined sequence of bits; however, when in memory, the internal representation of integers is opaque. For example, if a memory location x contains a 32-bit integer, and you read it with l16 or l16u, the value that is read is unspecified other than that it's no larger than 16 bits. Similarly, if a memory location contains an 8-bit integer and you read it using l32, the value that is read is unspecified other than that it's some int32 (also, reading a byte using l32 near the edge of accessible memory will cause undefined behavior if there are fewer than INT32_SIZE memory locations in accessible memory, starting with the location read). You cannot write a sequence of bytes (8-bit integers) into memory and then usefully read it back using l32, and you cannot write a 32-bit integer into memory and then usefully read out its component bytes.

Furthermore there is no guarantee that 32-bit integers occupy more than one memory location, or that larger integer bitwidths occupy more memory locations than smaller; it's possible for both of INT16_SIZE, INT32_SIZE to be identically 1 (this can happen if the implementation chooses to make each single memory location large enough to store 32 bits of data). Larger integer bitwidths are guaranteed to occupy at least as many memory locations as smaller.

### Stack manipulation tips ###

Constants can be pushed onto the smallstacks by using the load-immediate instructions and writing to the special smallstack pseudoregister, 1. For example, to push constant 1 onto the integer stack, 'lm6 $1 1'.

The smallstacks can be pushed and popped from/to other registers by using cp/cpp and the special smallstack pseudoregister, 1. 'DUP' can be accomplished by pushing a copy of the top-of-stack register, 2, onto the smallstack using cp/cpp and writing to the smallstack pseudoregister, 1, e.g. 'cp $1 $2 $0'.

'DROP' can be accomplished by popping the smallstack pseudoregister, 1, and discarding the value by using cp/cpp and writing to the zero pseudoregister, e.g. 'cp $0 $1 $0'.

### Undefined behavior and arbitrary values ###

These lists are probably accidentally incomplete right now, but we hope to make them comprehensive as time goes on.

#### Undefined behavior ####

The following are undefined behaviors in Boot. Any program containing undefined behavior on any codepath has undefined behavior as a whole:

branching or jumping to a location outside the bounds of the program (unless the location was provided in an implementation-dependent manner for the purpose of being jumped to)
branching or jumping to a location in the middle of an instruction
creating a pointer to or accessing memory that was not provided to the Boot program by an external function, unless the platform permits this in an implementation-dependent way
performing addition (arithmetic) on a code pointer, or accessing or using a code pointer upon which arithmetic was performed
branching or jumping back to the same instruction
lpc32 with the &dest register operand set to &0
any opcode which is RESERVED
loading a pointer into an int32 register
note: to blindly copy memory, use the 'memcpy' lib function
loading a non-pointer into a ptr register
load from a memory location that is in the middle of an integer or pointer
any instruction that does not have a 0 for an operand listed with a 0 above
performing arithmetic on a pointer that would cause the result to point outside of memory to which the program has access
assuming that pointers 'wrap-around'; adding an integer to a pointer which is greater than the distance between that pointer and the top of accessible memory, or subtracting an integer from a pointer which is greater than the distance between that pointer and the bottom of accessible memory
accessing a pointer that points to memory to which the program doesn't have access
popping the last item in a smallstack
pushing an item to a smallstack that is already at capacity

#### Arbitrary values ####

The following do not cause undefined behavior and do not make the whole program invalid, but do not define the resulting values of certain operations:

loading part of an integer by using l16, l16u, l8, l8u on a larger-bitwidth integer (this is guaranteed to produce an int16 for l16 and l16u, and an int8 for l8 and l8u, but otherwise the value produced is not specified)
loading a larger bitwidth than was stored by using l32, l16, l16u on a smaller-bitwidth integer (this is guaranteed to produce an int32 for l32, and an int16 for l16 and l16u, but otherwise the value produced is not specified)

### Reserved and implementation-defined items ###

#### Reserved for extensions, vs. implementation-defined, vs. reserved for future use ####

There is a distinction between items that are reserved for extensions, vs. items that are implementation-defined. The former are expected to be defined in extension languages such as BootX. By contrast, implementation-defined items are allocated for the implementation to do what it wishes and are not expected to be used by extensions such as BootX.

Another category is items which are reserved for future use. These items are reserved for use in future versions of Boot itself, and should not be used by either extensions or by implementations.

Implementations must not define or use items which are reserved for extensions, or items which are reserved for future use; if they do so, they risk incompatibility with extensions or future Boot versions. Extension languages must not define or use items which are defined in Boot to be implementation-dependent.

#### Reserved instruction encoding space ####

The 0 in the most-significant bit is intended to allow Boot to be made a part of other instruction formats which use 0 in this bit to indicate a Boot instruction, a 1 to indicate something else (for example, instructions of different lengths). That is to say, instructions with a 1 in the most-significant bit are reserved for extensions.

## Boot Assembly ##

Boot Assembly is a plaintext syntax for Boot.

Boot Assembly is ASCII text. Each line is processed separately; lines are delimited by the newline character, '\n' (a byte with the value 10). Whitespace is defined as one of the characters: ' \t\n\r\f\v' (where \t indicates tab, \n indicates newline, etc). Lines which are all whitespace, or which begin with a semicolon, are skipped. Trailing whitespace on any line is ignored.

Some lines may begin with '.d' (a 'data line'). This is followed by a space and then a 16-bit unsigned hexadecimal number (4 characters which are each digits or letters within a-f). This may be followed by whitespace which may be followed by a semicolon. After a semicolon the rest of the line is a comment (all characters except newline are ignored), up to the first newline, which still terminates the line.

Therefore, data lines must match the following regular expression (regex): ^\.d [0-9a-f][0-9a-f][0-9a-f][0-9a-f]\s*(;.*)?$
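
As an illustration, the data-line regex above can be used directly with Python's `re` module. The function name `parse_data_line` is hypothetical.

```python
import re

# The data-line regex exactly as given above.
DATA_LINE = re.compile(r'^\.d [0-9a-f][0-9a-f][0-9a-f][0-9a-f]\s*(;.*)?$')


def parse_data_line(line):
    """Return the 16-bit value of a '.d' line, or None if the line doesn't match."""
    line = line.rstrip('\n')
    if not DATA_LINE.match(line):
        return None
    return int(line[3:7], 16)   # the four hex digits after '.d '
```

Note that uppercase hex digits are rejected, since the regex only admits a-f.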

Other lines (an 'instruction line') begin with an instruction mnemonic. Instruction mnemonics are at most 12 characters, and are all lowercase alphanumeric. Mnemonics begin with an alphabetic character and are followed by one or more alphanumeric characters. This is followed by a space, possibly followed by a hexadecimal number (a string of digits from 0 to 9 and lowercase letters from a to f, possibly prefixed by one of '-' or '+') denoting the first operand, op0. This may be followed by another space and a second number (op1), and maybe by another space and a third number (op2). After three operands the usable information in the line is exhausted and the assembler may skip to the next line. If the instruction is followed by a 32-bit embedded immediate constant, this must be included manually on the next line using '.d'. The usable information in the line may be followed by whitespace which may be followed by a semicolon. After a semicolon the rest of the line is a comment (all characters except newline are ignored), up to the first newline, which still terminates the line.

Operands are hexadecimal (base 16) integers and may be prefixed by '+' or '-' to indicate sign. Immediate operand ranges are (note that these ranges are written in hexadecimal):

#imm3: -4 to 3
#imm3u: 0 to 7
#imm6: -20 to 1f
#imm6u: 0 to 3f
#imm9: -100 to ff
#imm9u: 0 to 1ff

Operands of register type must be in the range 0 to 7, inclusive.

Therefore, instruction lines must match the following regular expression (regex): ^([a-z][a-z0-9]+)( ([-+]?[0-9a-f]+))?( ([-+]?[0-9a-f]+))?( ([-+]?[0-9a-f]+))?\s*(;.*)?$
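
A minimal sketch of parsing an instruction line with that regex in Python. The name `parse_instruction_line` is hypothetical, and the sketch assumes that blank lines, comment lines, and '.d' lines have already been filtered out; it does not check operand ranges or the 12-character mnemonic limit.

```python
import re

# The instruction-line regex exactly as given above.
INSTR_LINE = re.compile(
    r'^([a-z][a-z0-9]+)( ([-+]?[0-9a-f]+))?( ([-+]?[0-9a-f]+))?'
    r'( ([-+]?[0-9a-f]+))?\s*(;.*)?$'
)


def parse_instruction_line(line):
    """Return (mnemonic, [operands]) or None if the line doesn't match."""
    m = INSTR_LINE.match(line.rstrip('\n'))
    if m is None:
        return None
    # Groups 3, 5 and 7 hold op0, op1 and op2 (hexadecimal, optionally signed).
    operands = [int(g, 16) for g in (m.group(3), m.group(5), m.group(7))
                if g is not None]
    return m.group(1), operands
```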

The last line in the file must end in a newline, unless the last line is all whitespace.

## TODO ##

the above is up-to-date as of 201020!
reduce to 32 opcodes
yknow, in LOVM stack addr mode, we can already treat the TOS as a register via operand data 0. So maybe there's not a need for register 2 to be TOS? If we got rid of that, then the 16-bit encoding (Boot) would have one more register but it wouldn't be able to use TOS like a register. It could still mutate the TOS tho via SOMEOPERATION smallstack smallstack. Seems a little cleaner. Could still access TOS in 8-bit mode. Maybe do it?
todo finish the writeup in ootAssemblyNotes27
start copying in from = General architecture from other file
implement this new version
after implementing in assembly, see if having 6 stack items is realistic; by my count, this gives 5 hidden registers to spare for use by the implementation on 32-register architectures like AArch64 or RISC-V, which may or may not be enough; 2 of those will be needed to store the current smallstack depths, leaving 3 others; one of which i assume will be the PC, and one of which i assume will be a pointer to some structure in memory storing global interpreter state, leaving only 1 register to be used for temporary calculation by the implementation; don't we want at least 2 so that we can swap without using memory? for now, i set the stack capacity to 5 (this is mentioned in 2 places in the reference).