- Boot (Oot Boot) reference
Version: unreleased (0.0.0-0)
Boot is a low-level 'assembly language' virtual machine (VM) that is easy to implement.
- # Introduction
Boot is a target language that is easy to implement on a wide variety of platforms, even on very primitive bare metal, or 'on top of'/within an existing high-level language such as Python.
Highlights:
- 3-operand fixed-length register machine
- signed 32-bit integers
- integers and pointers both have implementation-dependent sizes in memory, and may be different sizes
- opaque pointer representation
- 7 integer registers, 7 pointer registers, 15 opaque registers to copy values of any type, and a zero register, and a null pointer register
- <=64 instructions
- RISC-like; no addressing modes (or rather, each instruction has one fixed addressing mode), and only a few instructions access (non-register) memory
- instructions for I/O, memory allocation, and calling to and from host platform
- the extension BootX (in progress) specifies more instructions and functionality (see bootx_reference.md)
[TOC]
- # Instruction encoding ## 4 bytes per instruction. The bytes are:
- op0 (operand 0)
- op1
- op2
- opcode
Op0 is restricted to a maximum value of 63 (when interpreted as unsigned), meaning that the 2 most-significant-bits are always zero.
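As a rough illustration, the 4-byte layout above could be packed and unpacked as follows in Python. The names `encode_instruction` and `decode_instruction` are hypothetical (they are not part of Boot), and the sketch assumes the four bytes are stored in the order listed, op0 first and opcode last.

```python
def encode_instruction(op0: int, op1: int, op2: int, opcode: int) -> bytes:
    """Pack one instruction into 4 bytes, in the order op0, op1, op2, opcode."""
    if not 0 <= op0 <= 63:
        raise ValueError("op0 must be in 0..63 (two most-significant bits zero)")
    for name, value in (("op1", op1), ("op2", op2), ("opcode", opcode)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must fit in one unsigned byte")
    return bytes([op0, op1, op2, opcode])


def decode_instruction(raw: bytes) -> tuple:
    """Unpack 4 bytes into (op0, op1, op2, opcode)."""
    if len(raw) != 4:
        raise ValueError("a Boot instruction is exactly 4 bytes")
    op0, op1, op2, opcode = raw
    if op0 > 63:
        raise ValueError("op0 > 63 is not valid Boot")
    return op0, op1, op2, opcode
```

For example, `encode_instruction(1, 2, 3, 53)` yields the bytes 1, 2, 3, 53, and `decode_instruction` reverses this.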
- # Datatypes ##
two primary datatypes:
- int32 (32-bit integers)
- ptr (pointers)
ptr has two subtypes:
- ptrd (data pointers)
- ptrc (code pointers)
- # Registers ## Two banks of 8 registers each: one for int32, one for ptr. The first register in the int32 bank is constant zero, and the first register in the ptr bank is the null pointer; writes to these registers have no effect. The notation $n refers to the n-th int32 register, and &n refers to the n-th ptr register; for example, the first and last registers in each bank are $0, $7, &0, &7.
- Instructions ##
47 instructions
| Category | Mnemonics |
|---|---|
| annotation | ann |
| load constants | l32m |
| loads and stores and copies | lb lbu lh lhu lp lw sb sh sp sw cp cpp |
| arithmetic of ints | add sub mul addm |
| bitwise arithmetic | and or xor not shl shrs shru |
| adding ints to pointers | app ap32 ap16 ap8 appm ap32m ap16m ap8m |
| comparison control flow | bne blt bltu beq bnep beqp |
| other control flow | jrls jrlm jrll jy lpc |
| I/O | in inp |
| interop | xlib |
| misc | break impl sinfo |
Notation for the instruction tables below:
- #imm6, #imm8, #imm16, #imm22 are immediate constants using two's complement encoding (#imm6 is 6 bits instead of 8 and #imm22 is 22 bits instead of 24 because op0 can only reach 63, as noted in 'Instruction encoding' above)
- #imm22u, #imm16u, #imm8u, #imm6u are immediate constants interpreted as unsigned
- $X is an integer register
- &X is a pointer register
- _ is an unused argument that must always be 0 in proper Boot programs (Boot implementations are free to make use of these locations however)
- X is an untyped argument
Immediate constants without a trailing 'u' are signed two's-complement ints.
From left to right, the arguments go into operands op0, op1, op2. Immediate operands are always on the right (the highest-numbered operand). When two immediate operands are combined into an #imm16 (as with instruction lm), op1 is the high-order bits and op2 is the low-order bits (imm16 = (op1 << 8) + op2). Similarly for #imm22 (imm22 = (op0 << 16) + (op1 << 8) + op2).
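The combination rule above can be sketched in Python as follows. The helper names are hypothetical; the sign-extension step is the usual two's-complement interpretation and is an assumption about how the signed variants (#imm16, #imm22) are meant to be read.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def imm16u(op1: int, op2: int) -> int:
    """Unsigned 16-bit immediate: op1 is the high byte, op2 the low byte."""
    return (op1 << 8) + op2


def imm22u(op0: int, op1: int, op2: int) -> int:
    """Unsigned 22-bit immediate: op0 (at most 63) supplies the top 6 bits."""
    return (op0 << 16) + (op1 << 8) + op2


def imm16(op1: int, op2: int) -> int:
    """Signed 16-bit immediate."""
    return sign_extend(imm16u(op1, op2), 16)


def imm22(op0: int, op1: int, op2: int) -> int:
    """Signed 22-bit immediate."""
    return sign_extend(imm22u(op0, op1, op2), 22)
```

With these definitions, operand bytes 63, 255, 255 read as 4194303 unsigned and as -1 signed.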
JREL and branch immediates are in units of bytes in the Boot code. JREL and branch immediates may not jump into the middle of an instruction. Platforms which compile or represent Boot code in memory in ways such that one instruction spans more or less than 4 memory locations must adjust the jrel and branch offsets accordingly before executing them.
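As one possible illustration of the adjustment described above, an interpreter that keeps decoded instructions in a list (one element per instruction) could convert a byte offset into a list index like this. The function name and the list-based representation are hypothetical, and the sketch assumes offsets are relative to the next instruction, as the jump descriptions in this document state.

```python
INSTRUCTION_BYTES = 4  # each Boot instruction occupies 4 bytes in Boot code


def jump_target_index(current_index: int, byte_offset: int, program_length: int) -> int:
    """Map a byte offset (relative to the next instruction) to an instruction index."""
    if byte_offset % INSTRUCTION_BYTES != 0:
        raise ValueError("offset would land in the middle of an instruction")
    target = current_index + 1 + byte_offset // INSTRUCTION_BYTES
    if not 0 <= target < program_length:
        raise ValueError("jump target is outside the program")
    return target
```

A compiler targeting a platform with a different instruction width would apply the same idea with its own scaling factor.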
Mnemonics with a trailing 'm' represent instructions involving an 'iMmediate' (although not all instructions with immediates have a trailing 'm' in the mnemonic).
annotation:
- ann ? ? ?: ANNotation; no effect on execution
load constants:
- l32m $dest $src #imm8u: Load 32-bit iMmediate int constant (embedded in instruction stream immediately following instruction), shift it left by #imm8u, add it to $src, then write it to $dest
register loads and stores and copies:
- lw $dest &addr #imm8: Load Word int32 from memory addr (&addr + #imm8*INT32_SIZE)
- lh $dest &addr #imm8: Load Halfword int16 from memory addr (&addr + #imm8*INT16_SIZE)
- lhu $dest &addr #imm8: Load Unsigned int16 from memory addr (&addr + #imm8*INT16_SIZE)
- lb $dest &addr #imm8: Load Byte int8 from memory addr (&addr + #imm8)
- lbu $dest &addr #imm8: Load unsigned int8 from memory addr (&addr + #imm8)
- lp &dest &addr #imm8: Load Ptr from memory addr (&addr + #imm8*PTRD_SIZE)
- sw &addr $src #imm8: Store Word int32 to memory addr (&addr + #imm8*INT32_SIZE)
- sh &addr $src #imm8: Store Halfword int16 to memory addr (&addr + #imm8*INT16_SIZE)
- sb &addr $src #imm8: Store Byte int8 to memory addr (&addr + #imm8)
- sp &addr &src #imm8: Store Ptr to memory addr (&addr + #imm8*PTRD_SIZE)
- cp $dest $src $cond: if $cond == 0, then CoPy int between registers
- cpp &dest &src $cond: if $cond == 0, then CoPy Pointer between registers
arithmetic of ints (result always defined and all results mod 2^32):
- add $dest $src1 $src2: $dest = $src1 + $src2
- addm $dest $src1 #imm8: $dest = $src1 + #imm8
- sub $dest $src1 $src2: $dest = $src1 - $src2
- mul $dest $src1 $src2: $dest = $src1 * $src2
bitwise arithmetic:
- shl $dest $src #imm8u: Shift Left (C's '<<' operator) by #imm8u bits
- shru $dest $src #imm8u: Shift Right Unsigned (logical shift) by #imm8u bits
- shrs $dest $src #imm8u: Shift Right Signed (arithmetic shift) by #imm8u bits
- and $dest $src1 $src2
- or $dest $src1 $src2
- xor $dest $src1 $src2
- not $dest $src1 $cond: if $cond == 0, then $dest = bitwise_NOT($src1)
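The difference between the three shifts can be sketched in Python, modelling a register as an unsigned 32-bit pattern. The function names mirror the mnemonics but the code is only an illustration; the behavior for shift amounts of 32 or more is not specified in the text above, and this sketch simply follows Python's semantics there.

```python
MASK32 = 0xFFFFFFFF


def shl(value: int, amount: int) -> int:
    """Shift left, discarding bits that leave the 32-bit register."""
    return (value << amount) & MASK32


def shru(value: int, amount: int) -> int:
    """Logical shift right: zeros are shifted in from the left."""
    return (value & MASK32) >> amount


def shrs(value: int, amount: int) -> int:
    """Arithmetic shift right: copies of the sign bit are shifted in."""
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> amount) & MASK32
```

For the pattern 0x80000000 shifted by 31, `shru` gives 1 while `shrs` gives 0xFFFFFFFF.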
Adding ints to Pointers (only valid on data pointers, not code pointers):
- app &dest &src1 $src2: &dest = &src1 + $src2*PTRD_SIZE
- ap32 &dest &src1 $src2: &dest = &src1 + $src2*INT32_SIZE
- ap16 &dest &src1 $src2: &dest = &src1 + $src2*INT16_SIZE
- ap &dest &src1 $src2: &dest = &src1 + $src2
- appm &dest &src1 #imm8: &dest = &src1 + #imm8*PTRD_SIZE
- ap32m &dest &src1 #imm8: &dest = &src1 + #imm8*INT32_SIZE
- ap16m &dest &src1 #imm8: &dest = &src1 + #imm8*INT16_SIZE
- apm &dest &src1 #imm8: &dest = &src1 + #imm8
conditional branches:
- beq $src0 $src1 #imm8: Branch-if-EQual
- beqp &src0 &src1 #imm8: Branch-if-EQual on Ptrs
- bne $src0 $src1 #imm8: Branch-if-Not-eQual
- bnep &src0 &src1 #imm8: Branch-if-Not-Equal on Ptrs
- blt $src0 $src1 #imm8: Branch-if-Less-Than
- bltu $src0 $src1 #imm8: Branch-if-Less-Than-Unsigned
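To illustrate how blt and bltu differ on the same register contents, here is a small Python sketch. It models registers as unsigned 32-bit patterns; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(bits: int) -> int:
    """Read a 32-bit pattern as a signed two's-complement integer."""
    bits &= MASK32
    return bits - 0x100000000 if bits & 0x80000000 else bits


def blt_taken(src0: int, src1: int) -> bool:
    """blt compares the operands as signed integers."""
    return to_signed32(src0) < to_signed32(src1)


def bltu_taken(src0: int, src1: int) -> bool:
    """bltu compares the operands as unsigned integers."""
    return (src0 & MASK32) < (src1 & MASK32)
```

With src0 holding 0xFFFFFFFF and src1 holding 1, blt would branch (since -1 < 1) but bltu would not (since 4294967295 > 1).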
unconditional jumps and other control flow:
- jr #imm9: unconditional Jump (Relative to the next instruction) (range +-255)
- jrls #imm9: unconditional Jump Relative (to the next instruction) Long; 9-bit signed immediate offset (two's complement of concatenation of all 3 operands)
- jrlm #imm24: unconditional Jump Relative (to the next instruction) Long; 24-bit signed immediate offset (two's complement of concatenation of all 3 operands)
- jrll #imm32: unconditional Jump Relative (to the next instruction) Long; 32-bit signed offset in next word in instruction stream
- jy &target _ _: Jump dYnamic (indirect)
- lpc &dest: Load Program Counter
I/O:
- in1 $dest $device #imm8u: read IN one int8 from device ($device + #imm8)
- out1 $device $src #imm8u: write OUT one int8 to device ($device + #imm8)
- in &dest &device $len: read IN $len int8s from device $device
- out $device &src $len: write OUT $len int8s to device $device
interop:
- xlib #libfn_imm8u #nargsi_imm6u #nargp_imm8u: call eXternal LIBrary function
misc:
- break ? ? ?: BREAKpoint (implementation-dependent debugging)
- impl ? ? ?: IMPLementation-dependent instruction
- sinfo $dest #query1_imm8u #query2_imm8u: $dest = System INFOrmation query
- Notes on certain instructions ###
- app, ap32, ap16, ap: the int32 argument $src2 is interpreted as signed, so although only addition is provided, subtraction can also be accomplished
- ann: implementations may ignore or strip ann instructions
- blt: the int32 arguments (in $src0 and $src1) are signed
- bltu: the int32 arguments (in $src0 and $src1) are unsigned
- cp, cpp, not, halt: to make these instructions unconditional, use register $0 for $cond, since $0 always holds 0
- cp: when the condition is true, this is equivalent to addm $dest $src 0
- cpp: this is not equivalent to app because this can be used on codeptr and app can only be used on 'ptrd's
- impl and break: Strictly speaking, any program containing either of these instructions is invalid Boot code, as it is not really a Boot program but rather is a program in some implementation-defined dialect/extension of the Boot language
- in, out, inp, outp: If successful, $3 is set to the number of bytes read or written (either 0 or 1 for IN; for IN, if nothing was available to be read or if EOF was reached, IN writes 0 to $3 and the output register holds an arbitrary value; for either IN or OUT, some platforms may return 0 in $3 rather than returning an error in some cases); in case of an error, a negative error code is written to $3. The device number is interpreted as unsigned.
- jmp: the #imm22 specifies a code location in terms of bytes from the start of the program
- jrel: JREL 0 (the instruction whose encoding is all-zero bits) is illegal
- jy: &target must be a code pointer provided at runtime (either by lpc, or by platform-specific or foreign code) and which did not have pointer arithmetic performed on it.
- lb: guaranteed to produce values between -128 and 127, inclusive (when interpreted as signed).
- lbu: guaranteed to produce values between 0 and 255, inclusive (when interpreted as unsigned)
- lh: guaranteed to produce values between -32768 and 32767, inclusive (when interpreted as signed).
- lhu: guaranteed to produce values between 0 and 65535, inclusive (when interpreted as unsigned)
- sb, sh: stores the least-significant 8- and 16- bits, respectively
- sinfo: the value returned in $dest depends on query_1:
  - 0: PTRD_SIZE, the number of memory locations per ptrd
  - 1: INT32_SIZE, the number of memory locations per int32
  - 2: INT16_SIZE, the number of memory locations per int16
  - 3: VERSION (currently 0)
  - 4: INTMAX_32
  - 5: INTMAX_16
  - 6: INTMAX_8
  - 247-254: implementation-defined
  - others: RESERVED for extensions
  - query_2 should always be 0 when query_1 is 0, 1, 2, or 3
- xlib: call external library function #libfn_imm8 with #nargsi_imm8 integer arguments and #nargsv_imm8 pointer arguments placed as per Boot calling convention. See below for defined libfn numbers.
- Arithmetic ## Int32 overflow on addition, subtraction, multiplication wraps around (that is, mathematically the operations are done mod 2^32). Note that the operations of add, addm, sub, mul give valid results whether you consider the int32 operands to be signed two's complement or unsigned, as long as you consider the result to be similarly unsigned or signed.
For example, for multiplication, imagine if we had 3-bit integers instead of 32-bit integers. If we multiply the unsigned representations of 2*3, that is, 010*011, the result is 6, that is, 110. In two's complement, 010 represents 2 and 011 also represents 3, and 110 represents -2; and 2*3 = 6 = -2 mod 2^3. To give another example, if we multiply the unsigned representations of 6*2, that is, 110*010, the result is 12, and 12 mod 2^3 is 4, that is, 100. In two's complement, 110 represents -2 and 010 represents 2, and the result, 100, represents -4; and -2*2 = -4 = 4 mod 2^3. Do note that these are only correct modulo the bitwidth; for example, 2*3 = 6, but mul 010 011 = 110, which when viewed as two's complement yields 2*3 = -2, an incorrect result in ordinary arithmetic, but -2 is equivalent to 6 mod 8, so the result is correct in mod 8 arithmetic. In the examples in this paragraph we used mod 8, but in reality, we are using mod 2^32, not mod 8.
On many platforms, it may be easiest to implement add, sub, addm by viewing the int32s as unsigned integers and then applying unsigned addition, subtraction, because many platforms don't implement wrap-around signed numbers.
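The wrap-around behavior, including the 3-bit examples above, can be checked with a short Python sketch. The helper names and the `bits` parameter are hypothetical conveniences for illustration; Boot itself always uses 32 bits.

```python
def wrap(value: int, bits: int = 32) -> int:
    """Reduce value mod 2**bits, giving the unsigned bit pattern."""
    return value & ((1 << bits) - 1)


def as_signed(pattern: int, bits: int = 32) -> int:
    """Read a bit pattern as a signed two's-complement integer."""
    pattern = wrap(pattern, bits)
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


def add(a: int, b: int, bits: int = 32) -> int:
    return wrap(a + b, bits)


def sub(a: int, b: int, bits: int = 32) -> int:
    return wrap(a - b, bits)


def mul(a: int, b: int, bits: int = 32) -> int:
    return wrap(a * b, bits)
```

Here `mul(2, 3, bits=3)` gives the pattern 110, which `as_signed` reads as -2, and `mul(6, 2, bits=3)` gives 100, which reads as -4. The same functions with the default width model the 32-bit case, for example `add(0xFFFFFFFF, 1)` is 0.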
Note that many arithmetic operations are provided only for integers; the only arithmetic you can do to pointers is add or subtract integers to/from them.
- # (Not) mixing integer bitwidths ##
When in registers and being operated upon, the internal representation of int32s is a defined sequence of bits, however, when in memory, the internal representation of integers is opaque. For example, if a memory location x contains a 32-bit integer, and you read it with lh or lhu, the value that is read is unspecified other than that it's no larger than 16 bits. Similarly, if a memory location contains an 8-bit integer and you read it using lw, the value that is read is unspecified other than that it's some int32 (also, reading a byte using lw near the edge of accessible memory will cause undefined behavior if there are less than INT32_SIZE memory locations in accessible memory, starting with the location read). You cannot write a sequence of bytes (8-bit integers) into memory and then usefully read it back using lw, and you cannot write a 32-bit integer into memory and then usefully read out its component bytes.
Furthermore there is no guarantee that 32-bit integers occupy more than one memory location, or that larger integer bitwidths occupy more memory locations than smaller; it's possible for both of INT16_SIZE, INT32_SIZE to be identically 1 (this can happen if the implementation chooses to make each single memory location large enough to store 32 bits of data).
The instructions lb, lbu, lh, lhu guarantee that the numbers read into registers are in certain ranges that fit in 8- and 16-bits, respectively. However, lb and lh result in signed two's complement representations in the destination register; note that the bit pattern of a small negative number in a 32-bit register, when coded with signed two's complement, is equivalent to a number larger than 16 bits if interpreted as unsigned. For example, a -1 in a register, signed, would be viewed as (2^32 - 1 = 4294967295) unsigned.
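A minimal Python sketch of the resulting register contents follows. It assumes, purely for illustration, that the stored value is available as a small unsigned integer `raw`; how values are actually represented in memory is opaque in Boot, and the function names are hypothetical.

```python
def load_signed(raw: int, bits: int) -> int:
    """32-bit register pattern after a sign-extending load (lb: bits=8, lh: bits=16)."""
    raw &= (1 << bits) - 1
    if raw & (1 << (bits - 1)):
        raw -= 1 << bits  # the value is negative in two's complement
    return raw & 0xFFFFFFFF


def load_unsigned(raw: int, bits: int) -> int:
    """32-bit register pattern after a zero-extending load (lbu: bits=8, lhu: bits=16)."""
    return raw & ((1 << bits) - 1)
```

Under these assumptions, a stored byte of all ones becomes the pattern 0xFFFFFFFF (4294967295 when viewed as unsigned) with `load_signed(0xFF, 8)`, but 255 with `load_unsigned(0xFF, 8)`.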
Boot guarantees that bytes (8-bit integers) have a size 1 in memory (meaning that values that are stored with sb occupy one memory location). INT8_SIZE is always 1.
- # I/O ## If standard console streams STDIN, STDOUT, exist on the platform and are supported by the implementation, they must be devices #0, #1, respectively, and device #2 must be STDERR if it exists, and otherwise should be an alias to STDOUT or may be a null device (one which never emits anything and to which writing has no effect).
An implementation does not have to support INP, OUTP.
- # xlib calls ## Number 0 is defined below and 3 thru 127 are RESERVED for extensions. libfn numbers 128 thru 254 are implementation-defined.
- xlib 0: halt(result: int32) ###
End program with result code 'result'. The result code is interpreted as signed.
- # The Boot Calling Convention ##
Up to 3 integer arguments and up to 3 pointer arguments are passed in registers.
Registers 1, 2, 4, 5 (all banks) are caller-saved. Registers 3, 6, 7 (all banks) are callee-saved.
Pointer register 3 is used as a memory stack pointer when applicable (TODO what does 'when applicable' mean?), otherwise it is callee-saved.
Pointer register 5 is used as a return address pointer/link register. When using xlib, there is no need to set this register; the instruction will set it if needed.
Registers 1, 2, 4 (both integer bank and pointer bank) are used to pass arguments and return values (from lower to higher number get arguments from left to right). Upon making a call, up to 3 integer arguments are in integer registers 1, 2, 4, and up to 3 pointer arguments are in pointer registers 1, 2, 4, and the return address is found in pointer register 5.
Registers 5,6,7 (both banks) are caller-saved scratch registers and may be overwritten and used for any purpose by the callee.
Upon returning from a call, up to 3 integer and up to 3 pointer return values will be found in registers 1,2,4 using the same convention as for calling.
- # Undefined behavior and arbitrary values ##
These lists are probably accidentally incomplete right now, but we hope to make them comprehensive as time goes on.
- ## Undefined behavior ###
The following are undefined behaviors in Boot. Any program containing undefined behavior on any codepath has undefined behavior as a whole:
- branching or jumping to a location outside the bounds of the program
- branching or jumping to a location in the middle of an instruction
- creating a pointer to or accessing memory that was neither malloc'd, nor provided to the Boot program by an external function, unless the platform permits this in an implementation-dependent way
- performing addition (arithmetic) on a code pointer, or accessing or using a code pointer upon which arithmetic was performed
- mfreeing malloc'd memory more than once
- mfreeing a pointer which was not previously returned by malloc
- branching or jumping back to the same instruction
- jrel 0
- an instruction with op0 > 63 (when interpreted as unsigned)
- any opcode which is RESERVED
- loading a non-integer into an int32 register
- loading a non-pointer into a ptr register
- load from a memory location that is in the middle of an integer or pointer
- any instruction that does not have a 0 for an operand listed with a '_' above (e.g. mallo in op2)
- Arbitrary values ###
The following do not cause undefined behavior and do not make the whole program invalid, but do not define the resulting values of certain operations:
- loading part of an integer by using lh, lhu, lb, lbu on a larger-bitwidth integer (this is guaranteed to produce an int16 for lh and lhu, and an int8 for lb and lbu, but otherwise the value produced is not specified)
- loading a larger bitwidth than was stored by using lw, lh, lhu on a smaller-bitwidth integer (this is guaranteed to produce an int32 for lw, and an int16 for lh and lhu, but otherwise the value produced is not specified)
- Boot Assembly ## Boot Assembly is a plaintext syntax for Boot.
Boot Assembly is ASCII text. Each line is processed separately; lines are delimited by the newline character, '\n' (a byte with the value 10). Whitespace is defined as one of the characters: ' \t\n\r\f\v' (where \t indicates tab, \n indicates newline, etc). Lines which are all whitespace, or which begin with a semicolon, are skipped. Trailing whitespace on any line is ignored.
Lines begin with an instruction mnemonic, which consists of lowercase letters and digits and is at most 5 characters long. This is followed by whitespace, followed by a number (a string of digits between 0 and 9, possibly prefixed by one of '-' or '+') denoting the first operand, op0. This may be followed by more whitespace and second number (op1), and maybe by more whitespace and a third number (op2). This may be followed by whitespace which may be followed by a semicolon. After a semicolon the rest of the line is a comment (all characters are ignored), up to the first newline, which still terminates the line.
Operands are integers in base 10 and may be prefixed by '+' or '-' to indicate sign. An instruction with exactly one unsigned immediate operand may have any value from 0 thru 4194303, inclusive, in that operand. An instruction with two operands, where these are unsigned immediates, may have any value from 0 thru 63, inclusive, in the first operand, and any value from 0 thru 65535, inclusive, in the second operand. An instruction with three operands, where these are unsigned immediates, may have any value from 0 thru 63, inclusive, in the first operand and any value from 0 thru 255, inclusive, in each other operand. When the immediate operand type for an instruction is signed, the unsigned range 0 to 63 is replaced by the signed range -32 to 31, 0 to 255 by -128 to 127, 0 to 65535 by -32768 to 32767, and 0 to 4194303 by -2097152 to 2097151. Operands of register type must be in the range 0 to 15, inclusive.
Therefore, instruction lines must match the following regular expression (regex): ^([a-z][a-z0-9]+)(\s+([-+]?[0-9]+))(\s+([-+]?[0-9]+))?(\s+([-+]?[0-9]+))?\s*(;.*)?$
The last line in the file must end in a newline, unless the last line is all whitespace.
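The line format described above can be exercised with a small Python sketch built on the regular expression from this section. The function name `parse_line` and its return shape are hypothetical. The sketch checks only syntax and the 5-character mnemonic limit; it does not validate operand ranges or that the mnemonic exists.

```python
import re

# The regex from the text; re.ASCII restricts \s to ' \t\n\r\f\v'.
LINE_RE = re.compile(
    r"^([a-z][a-z0-9]+)(\s+([-+]?[0-9]+))(\s+([-+]?[0-9]+))?(\s+([-+]?[0-9]+))?\s*(;.*)?$",
    re.ASCII,
)


def parse_line(line: str):
    """Return (mnemonic, operands), or None for a blank or comment-only line."""
    line = line.rstrip(" \t\n\r\f\v")  # trailing whitespace is ignored
    if not line or line.startswith(";"):
        return None
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a valid Boot Assembly line: {line!r}")
    mnemonic = match.group(1)
    if len(mnemonic) > 5:
        raise ValueError("mnemonic is longer than 5 characters")
    # Groups 3, 5, 7 hold op0, op1, op2; absent operands are None.
    operands = [int(text) for text in match.group(3, 5, 7) if text is not None]
    return mnemonic, operands
```

For instance, `parse_line("add 1 2 3 ; sum\n")` returns `("add", [1, 2, 3])`, while a line such as `"; comment"` returns `None`.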
- # Reserved and implementation-defined items ##
- Reserved for extensions, vs. implementation-defined, vs. reserved for future use ### There is a distinction between items that are reserved for extensions, vs. items that are implementation-defined. The former are expected to be defined in extension languages such as BootX. By contrast, implementation-defined items are allocated for the implementation to do what it wishes and are not expected to be used by extensions such as BootX.
Another category is items which are reserved for future use. These items are reserved for use in future versions of Boot itself, and should not be used by either extensions or by implementations.
Implementations must not define or use items which are reserved for extensions, or items which are reserved for future use; if they do so, they risk incompatibility with extensions or future Boot versions. Extension languages must not define or use items which are defined in Boot to be implementation-dependent.
- ## Reserved instruction encoding space ### The limitation of op0 to have zeros in the two most-significant bits is intended to allow Boot to be made a part of other instruction formats which use zero values in these bits to indicate a Boot instruction, and non-zero values to indicate something else (for example, instructions of different lengths). That is to say, instructions with a 1 in either of the two most-significant bits in op0 are reserved for extensions.
- Reserved instruction opcodes ### Boot only defines opcodes under 64; higher opcodes are reserved for extensions. Opcode 62 is implementation-defined. Opcode 63 is reserved for future use.
- Reserved sinfo query types ### Boot defines sinfo query numbers 0-3. Queries 247 thru 254 inclusive are implementation-defined. Other query numbers are reserved for extensions.
- Reserved xlib functions ### Boot defines no xlib functions, but reserves numbers 0 thru 127 for extensions. Numbers 128 thru 254 are implementation-defined.
- Misc ## Instruction mnemonics are at most 5 characters, and are all lowercase alphanumeric. Mnemonics begin with an alphabetic character and are followed by one or more alphanumeric characters.
Note that the opcodes (see table below) have the following properties:
- opcodes between 0-1, inclusive, have 22-bit immediates spanning op0, op1, op2, and are the only such opcodes
- opcodes between 2-4, inclusive, have 16-bit immediates spanning op1, op2, and are the only such opcodes
- opcodes between 5-44, inclusive, have an 8-bit immediate in op2, and are the only such opcodes (excluding mallo and mfree which have _ (forced 0) in op2)
- opcodes between 6-31, inclusive, have a signed 8-bit immediate in op2, and are the only such opcodes
- opcodes between 6-14, and 32-36, inclusive, are control flow, and are the only such opcodes
- opcodes between 37-40, inclusive, are I/O, and are the only such opcodes
- opcodes between 15-26, inclusive, are load/stores, and are the only opcodes that directly access memory (although there are others, such as mallo, which call system subroutines that probably access memory)
- opcodes between 45-48, inclusive, have a conditional in op2, and are the only such opcodes
- opcodes between 45-58, inclusive, have an integer register in op2, and are the only such opcodes
Note that later additions that assign instruction(s) to the RESERVED opcode may break the 'only such' parts of these properties.
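One way a decoder might exploit these ranges is sketched below in Python. The function `op2_kind` and its return strings are hypothetical; the ranges are taken from the property list above and would need updating if the opcode table changes.

```python
def op2_kind(opcode: int) -> str:
    """Classify what op2 holds, following the opcode ranges listed above."""
    if not 0 <= opcode <= 63:
        raise ValueError("Boot only defines opcodes under 64")
    if opcode <= 1:
        return "low byte of a 22-bit immediate"
    if opcode <= 4:
        return "low byte of a 16-bit immediate"
    if opcode <= 44:
        # Only opcodes 6-31 treat the 8-bit immediate as signed.
        return "signed 8-bit immediate" if 6 <= opcode <= 31 else "unsigned 8-bit immediate"
    if opcode <= 48:
        return "condition register"
    if opcode <= 58:
        return "integer register"
    return "other"
```

Ordering the checks by ascending range keeps the decoder to a handful of comparisons rather than a per-opcode lookup.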
The short descriptions under Instructions, above, have been kept to at most 80 characters per line.
- # Opcodes and argument types ##
Type identifiers in the following table:
- i22, i16, i8: signed immediate of the specified bitwidth
- u22, u16, u8: unsigned immediate of the specified bitwidth
- ri, rp, ra: register specifier for int32, ptr, anytype bank, respectively
- _: must be 0
Opcode and argument type table:
- 0: jrel ('i22',)
- 1: jmp ('u22',)
- 2: lentr ('rp', 'i16',)
- 3: lm ('ri', 'u16',)
- 4: sam ('ri', 'u16',)
- 5: ann ('u8', 'u8', 'u8',)
- 6: beq ('ri', 'ri', 'i8',)
- 7: beqp ('rp', 'rp', 'i8',)
- 8: bne ('ri', 'ri', 'i8',)
- 9: bnep ('rp', 'rp', 'i8',)
- 10: blt ('ri', 'ri', 'i8',)
- 11: bltu ('ri', 'ri', 'i8',)
- 12: halt ('ri', 'ri', 'i8',)
- 13: xreti ('rp', 'ri', 'i8',)
- 14: xretp ('rp', 'rp', 'i8',)
- 15: lw ('ri', 'rp', 'i8',)
- 16: lh ('ri', 'rp', 'i8',)
- 17: lhu ('ri', 'rp', 'i8',)
- 18: lb ('ri', 'rp', 'i8',)
- 19: lbu ('ri', 'rp', 'i8',)
- 20: lp ('rp', 'rp', 'i8',)
- 21: la ('ra', 'ra', 'i8',)
- 22: sw ('rp', 'ri', 'i8',)
- 23: sh ('rp', 'ri', 'i8',)
- 24: sb ('rp', 'ri', 'i8',)
- 25: sp ('rp', 'rp', 'i8',)
- 26: sa ('ra', 'ra', 'i8',)
- 27: addm ('ri', 'ri', 'i8',)
- 28: appm ('rp', 'rp', 'i8',)
- 29: ap32m ('rp', 'rp', 'i8',)
- 30: ap16m ('rp', 'rp', 'i8',)
- 31: apm ('rp', 'rp', 'i8',)
- 32: xcall ('rp', 'u8', 'u8',)
- 33: xentr ('_', 'u8', 'u8',)
- 34: xlib ('u6', 'u8', 'u8',)
- 35: xaftr ('_', 'u8', 'u8',)
- 36: xret0 ('rp', '_', '_',)
- 37: in ('ri', 'ri', 'u8',)
- 38: out ('ri', 'ri', 'u8',)
- 39: inp ('rp', 'ri', 'u8',)
- 40: outp ('ri', 'rp', 'u8',)
- 41: shl ('ri', 'ri', 'u8',)
- 42: shru ('ri', 'ri', 'u8',)
- 43: shrs ('ri', 'ri', 'u8',)
- 44: sinfo ('ri', 'u8', 'u8',)
- 45: cp ('ri', 'ri', 'ri',)
- 46: cpp ('rp', 'rp', 'ri',)
- 47: cpa ('ra', 'ra', 'ri',)
- 48: not ('ri', 'ri', 'ri',)
- 49: app ('rp', 'rp', 'ri',)
- 50: ap32 ('rp', 'rp', 'ri',)
- 51: ap16 ('rp', 'rp', 'ri',)
- 52: ap ('rp', 'rp', 'ri',)
- 53: add ('ri', 'ri', 'ri',)
- 54: sub ('ri', 'ri', 'ri',)
- 55: mul ('ri', 'ri', 'ri',)
- 56: and ('ri', 'ri', 'ri',)
- 57: or ('ri', 'ri', 'ri',)
- 58: xor ('ri', 'ri', 'ri',)
- 59: mallo ('rp', 'ri', '_',)
- 60: mfree ('rp', '_', '_',)
- 61: break ('u8', 'u8', 'u8',)
- 62: impl ('u8', 'u8', 'u8',)
- 63: RESERVED ()
- # TODO ##
- update Opcode and argument type table for recent changes
- rethink the choice of register numbers for the Boot calling convention (and rethink which of the 4 first registers should be the SMALLSTACK pointer in loot)
- todo finish the writeup in ootAssemblyNotes27
- make the immediates imm3, so that they'll fit the 16-bit encoding
- start copying in from = General architecture from other file
- search for TODOs above
- maybe xret0 could take an $ncond?
- mb get rid of inp, outp, ap16, ap16m?
- or.. mb add ina, outa? (no, i'd rather get rid of inp, outp)
- without a way to read the PC, can't manually push a return address onto a stack for calling; so reduced to only using xcall, which can only call xentr's. Is this an issue? Maybe... we wanted to preserve xcall for EXTERNAL platform calls (or call to externally visible entry points in Boot code), we want to allow the code to call itself normally. Otoh maybe it's fine for now. This makes this even less of an ASSEMBLY language however. Would like to add CALL and RET in this case but no room. Mb remove ap16, ap16m?
- if making CALL, RET, then need a stack for them. Make pointer register 3 unusable/reserved for system use? What about inside xcalls? Or specify that we are using register 3? How would CALL, RET be compiled to JVM -- would we make CALL-subroutines and translate them to methods, just like XCALLs, or would we make a gigantic switch statement, sort of like an interpreter?
- mb remove HALT and just have xret instead? and mb xret to 0 is halt, or something like that. Hmm, some platforms will have some cleanup stuff to do upon HALT, and we don't want to make xret less efficient by having to check if the return addr is zero every time; and if we started the program by 'calling' it from a virtual/sentinel address (which would be found in the link register, &4, at the beginning of the program), then letting the program xret to that address to halt, now we have to introduce the inefficiency of checking the xret address on each xret call. Not sure if that inefficiency is material, though.
- how do we do halfwords if the platform natively stores words in one storage location? Can halfwords still overlap with words? Is it possible to write to both the high and low halfword and then read the word (and each halfword) and see the expected results? I think not; I think whether there is overlap or not is implementation-dependent (and so is undefined behavior to rely upon? Nah, surely you are allowed to scan through memory at any stride -- it's just that the values you see in that case are not determined by what was written, in the overlap sense). Must document.
- todo: should you be able to load ints into ptr regs, or should you be able to load ptrs into int regs, when doing blind copying?
- ptr regs are bigger
- but it might be nice to check at load time if a memory location is definitely a ptr, before even allowing it into a ptr register
- however by that reasoning, maybe you would want to choose the pointer regs for blind copying anyways, b/c otherwise when you 'launder' pointers through the int32 regs during a blind copy, they would lose their 'pointer' attribute when written back out
- mb introduce yet another register type?!?
- actually i think this is the best idea. For systems with locations in terms of bytes, this wont even be much memory (16 bytes is only 4 int32s)
- rename anytype to byte? but right now they can hold other HLL constructs too..
- should i add the jy instruction, with the proviso that it must only be used intraprocedure? would it be hard to convert to a switch statement on JVM?
- probably, and probably also:
- rename lentr back to lcm; specify (if not already specified) that valid xcall targets must have an xentr
- specify that the target of a jy instruction must not be in a different 'compilation unit', which is implementation-defined; a jy, all lcms that might feed into it, and all of their targets, must all be available in one compilation unit at compile time
- reiterate that jy, like other jumps, must not jump between xcall subroutines
- mb, along with CALL, we should split codeptr/ptrc into two data types, 'Boot code ptr' and 'external code ptr', because mb external code ptrs are a different format than Boot code ptrs. Eg external code ptrs could be a native ptr (into a Harvard memory), and a Boot code ptr could just be an integer describing an offset from the start of code (that a Boot interpreter uses, as opposed to a compiler, where the Boot code pointers will probably just be external code ptrs).
- two questions:
- should we make a separate register bank for codeptrs? Ot1h, this makes it easier for implementations to verify that no arithmetic is done on codeptrs. Otoh, we do probably need a few of these, so it adds state; also, an implementation can get around the no-arithmetic-on-codeptrs thing just by writing the codeptr to memory and reading it back as something else (this is undefined, but so is arithmetic on codeptrs anyhow). An alternative would be to say that arithmetic can only be performed on the first 8 ptrs (note that later we want to put SMALLSTACK in there). However, this alternative doesn't help out Harvard architectures, like a new register bank would. The only really really safe thing is to not allow codeptrs to be read/written from general memory, e.g. Java has return addrs on the stack and that's it? But don't they allow method handles to be written as data? i guess java keeps track of the type of stuff in memory so it's not a problem either way. Todo check out what they do there
- should we define CALL-subroutines as code between a target of a CALL instruction, and the next such, and require these blocks of code to be like 'xentr-subroutines' in the sense that you can't jump between them except via CALL? This would allow CALL-subroutines to be compiled to functions/methods in e.g. JVM, rather than having a sort of switch statement and compiling return addresses on a return stack to constant integers, then dispatching through a SWITCH statement to actually go to them.
- if wacky control flow, then no need for x* interop instrucs here; move to BootX?; also move calling conventions there
- make section on BootX? noting that it defines interop, library and system calling, calling convention, file ops, and that ppl should use parts of that rather than reinventing the wheel
- should I/O really be spec'd as blocking? esp. input. For now i removed "I/O is blocking"
- should OUT be allowed to return 0 in $3, perhaps as part of non-blocking I/O? right now i say no; think about this more
- consider eliminating IN and OUT and just having read and write syscalls
- better define stuff like '32-bit offset in instruction stream' seen in jrl and l32m by defining #imm32 elsewhere
- consider moving all I/O into syscalls
- mb platform-specific int bitwidth after all (with bitwidth and intmax reported in sysinfo), and leave the result of overflow unspecified rather than specifying wraparound? With a minimum of 16 bits specified.
- maybe we should specify that one of our regs is designated 'stack pointer'
- l32m could use its operands to specify: which register to load into, the number of bytes that the immediate has, the displacement where the immediate is to be found (as a register operand). Having a number of bytes would allow only having to use 1 or 2 bytes for small immediates, and having a displacement which is a register operand would allow dynamic indexing into nearby arrays of immediates. The number of bytes could be zero or a power of 2, so 0, 1, 2, 4, 8 (the other 3 possibilities are unused). If the number of bytes is zero, then the displacement is instead a 3-bit immediate, and there is nothing else in the instruction stream. This also allows 'scaling up' to 64-bit immediates on systems that support that.
- one issue with 'immediate arrays' is that, in order to find the next instruction after the immediate pool in the instruction stream, we need to specify the length of the array. Maybe the unused three possibilities could specify this (immediate arrays with 8 items of length 1, or 4 items of length 4, or 8 items of length 8, or something like that).
- since l32m is already loading immediates in the instruction stream, maybe have an absolute jump that loads an immediate from the instruction stream?
- replace l32m by: 'l32m dest offset sz', meaning dest = load (sz bytes) from (PC + offset)?
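
The l32m proposals above can be sketched as a small decoder. This is only an illustration of the idea, not a settled encoding: the size-code table, the function name `load_immediate`, little-endian byte order, unsigned interpretation, and measuring the offset from the PC of the l32m instruction itself are all assumptions made for the sketch.

```python
# Hypothetical 3-bit size codes for the proposed l32m; codes 5-7 stay unused.
SIZE_BYTES = {0: 0, 1: 1, 2: 2, 3: 4, 4: 8}


def load_immediate(code: bytes, pc: int, offset: int, size_code: int) -> int:
    """Return the value that the proposed 'l32m dest offset sz' would put in dest."""
    if size_code not in SIZE_BYTES:
        raise ValueError(f"unused size code {size_code}")
    size = SIZE_BYTES[size_code]
    if size == 0:
        # Nothing follows in the instruction stream: the offset operand
        # is itself treated as a 3-bit immediate.
        return offset & 0b111
    start = pc + offset
    raw = code[start:start + size]
    if len(raw) != size:
        raise IndexError("immediate runs past the end of the code")
    # Assumption: immediates are stored little-endian and read as unsigned.
    return int.from_bytes(raw, "little", signed=False)
```

For example, with a 4-byte immediate stored right after a 4-byte instruction at PC 0, `load_immediate(code, 0, 4, 3)` reads those four bytes; using a size code of 1 or 2 instead reads a shorter immediate from the same kind of position.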
later todos (transfer to other file):
- unsigned integer arithmetic or conversions?
- verify that all instructions fit in the compact encoding with the given number of operands
- it's annoying that in Boot Assembly, lm and sam always interpret the immediate constant literal as unsigned; when you want to load a signed constant, you have to do the conversion to and from unsigned (when you are writing or reading Boot Assembly) yourself. But if we change this to allow both, it complicates implementation, because then there is no guarantee that you can roundtrip between a Boot assembler and a Boot disassembler. Otoh, maybe we don't care about roundtripping, because it isn't that hard to write a Boot assembler, and anyways, you can run the assembler/disassembler on another system; the only thing that the porter needs to implement on the target platform is the VM itself. On the third hand, it doesn't help much to allow both, because the disassembler still doesn't know which one is desired in any particular instruction, so it would only help for writing Boot assembly, not for reading it, and imo reading is where you'd really want it anyhow. The disassembler could always output comments on those lines showing the signed values.
- normalize the position of memory addresses and their offsets, eg in STORE, put the address in the same place it was in LOAD (in accordance with the 2nd bullet point in Ugly in [1]). Look to see if there are any other memory address calculation instructions in here.
- use 'int' instead of 'int32'; there is no crashing on overflow, size is at least 32 bits, and the lower 32 bits of the result are correct
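
The signed/unsigned conversion mentioned in the item on immediate literals, and the idea of a disassembler emitting the signed value as a comment, can be sketched as follows. The helper names, the `;` comment marker, and the output line layout are hypothetical; the sketch assumes 32-bit two's-complement values.

```python
MASK32 = 0xFFFFFFFF


def to_unsigned32(value: int) -> int:
    """Map a signed 32-bit value to the unsigned literal an assembler would accept."""
    return value & MASK32


def to_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def annotate(mnemonic: str, imm: int) -> str:
    """Hypothetical disassembler output: unsigned literal plus a signed-value comment."""
    return f"{mnemonic} {imm}  ; signed: {to_signed32(imm)}"
```

Because the literal itself stays unsigned and the signed value appears only in a comment, an assembler and disassembler built this way could still roundtrip.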