Linux ioctl is applied to open file descriptors, e.g.:
"Open the KVM device, kvmfd=open("/dev/kvm", O_RDWR | O_CLOEXEC)"
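A minimal sketch of the same idea in Python: open the device, then issue the ioctl on the resulting descriptor. The helper name kvm_api_version is made up for illustration, and the numeric request value assumes the common ioctl encoding used on x86 and ARM.

```python
import fcntl
import os

# KVM_GET_API_VERSION is _IO(KVMIO, 0x00) with KVMIO = 0xAE in <linux/kvm.h>.
# Assumption: the common ioctl number encoding (x86, ARM), giving 0xAE00.
KVM_GET_API_VERSION = 0xAE00


def kvm_api_version(path="/dev/kvm"):
    """Return the KVM API version, or None if the device cannot be used."""
    try:
        kvmfd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None  # no KVM on this machine, or no permission
    try:
        # The ioctl is issued on the open descriptor, not on the path.
        return fcntl.ioctl(kvmfd, KVM_GET_API_VERSION, 0)
    except OSError:
        return None
    finally:
        os.close(kvmfd)


if __name__ == "__main__":
    print(kvm_api_version())
```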
---
I think having the polymorphic 64-bit instructions is overengineering. We already have two stages of assembly-like instructions, 16-bit and 32-bit; having a third stage is just spending too much complexity on the assembly-like stuff. The 64-bit LONG instructions should just be the grammar-like, AST-like/data-structure-like flexible encoding.
---
Having the 8-bit shortcut instructions prevents 16-bit alignment, which is a pity, as 16 is a "more perfect" power of 2 than 8. Let's get rid of them and instead have alternate formats for 16-bit instructions. We still have 1 format bit for 16-bit instructions: when it has one value, we have our ordinary 16-bit instructions; when it has the other value, a second format bit follows. One value of that second bit means a 32-bit (or longer) instruction, and the other value means the new 16-bit format.
So the new 16-bit format has 14 bits left for us to work with. We can make the first two of these be more format bits. When they are 11, we have 12 bits left, which we can use to recover the 6-bit shortcuts (in packets of 2) by treating these 12 bits as two 6-bit shortcut instructions. That still leaves 00, 01, 10 to work with.
One thing we could do here is pull out of the ordinary 16-bit instructions all the ones that don't really fit the 3-operand model, in order to save a few opcodes over there. This could include JREL, LDI, ANNOTATE. Note, however, that each of these requires at least 11 bits of content (and it would be great if we could give them 12), so doing all three would probably use up our entire budget of 00, 01, 10; but we may want to save some of that for longer macro instructions, or for future expansion. The 6-bit shortcuts will probably already need a NOP, so we could also pull out NOP and represent it as two shortcut NOPs.
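A rough sketch of how a decoder might classify a 16-bit word under this scheme. The notes don't fix bit positions or polarities, so everything here is an assumption: format bits are taken from the most significant end, 0 in the first bit means "ordinary", 0 in the second means "32-bit or longer", and the names (classify, "alt16", etc.) are invented for illustration.

```python
def classify(word):
    """Classify a 16-bit instruction word (assumed layout, see lead-in)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit value")

    # First format bit (assumed: bit 15). 0 => ordinary 16-bit instruction.
    if (word >> 15) == 0:
        return ("ordinary16", word & 0x7FFF)

    # Second format bit (assumed: bit 14). 0 => start of a 32-bit-or-longer
    # instruction; 1 => the new 16-bit format with 14 bits left.
    if ((word >> 14) & 1) == 0:
        return ("long_prefix", word & 0x3FFF)

    # Two more format bits (assumed: bits 13-12), leaving a 12-bit payload.
    sub = (word >> 12) & 0b11
    payload = word & 0xFFF
    if sub == 0b11:
        # 12 bits treated as a packet of two 6-bit shortcut instructions.
        return ("shortcut_pair", (payload >> 6, payload & 0x3F))

    # 00, 01, 10: candidates for JREL / LDI / ANNOTATE or future expansion.
    return ("alt16", sub, payload)


if __name__ == "__main__":
    for w in (0x1234, 0x8001, 0xC123, 0xF041):
        print(hex(w), classify(w))
```

With this layout each of 00, 01, 10 gets a full 12-bit payload, which is why three such instructions would use up the whole budget.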
---
[1] says that SMS payloads, even though they are 160 characters long, are actually encoded into only 140 bytes (by using 7-bit characters). And then a 36 byte header is attached. This suggests:
32 or 64 bytes would be a good low-level message size (to fit a little less than this header, or this header + some content), or 128 or 256 bytes (to fit a little less than this content, or a little more than it).
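A quick check of the arithmetic, as a sketch. It works in bytes throughout (an assumption about the intended units, since that is where the 36 and 140 figures live), and the helper name bracket is made up for illustration.

```python
PAYLOAD_CHARS = 160   # characters per SMS, per [1]
BITS_PER_CHAR = 7     # 7-bit characters
HEADER_BYTES = 36     # header size, per [1]

payload_bits = PAYLOAD_CHARS * BITS_PER_CHAR   # 160 * 7 = 1120 bits
payload_bytes = payload_bits // 8              # 1120 / 8 = 140 bytes


def bracket(n):
    """Return (largest power of 2 <= n, smallest power of 2 >= n)."""
    lo = 1 << (n.bit_length() - 1)
    hi = lo if lo == n else lo << 1
    return lo, hi


if __name__ == "__main__":
    print("payload bytes:", payload_bytes)
    print("header brackets:", bracket(HEADER_BYTES))
    print("payload brackets:", bracket(payload_bytes))
```

So the header sits between 32 and 64 bytes, and the payload between 128 and 256 bytes.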
---
[2] speaks of:
a 512-bit unit
256-wide ports
"Each memory operation can be of any register size up to 512 bits"
"The L2 to L1 bandwidth in Skylake is the same as Haswell at 64 bytes per cycle in either direction"
"Each (DDR4) channel is 72-bit (64 bit and an 8-bit ECC)"
"Store is now 64B/cycle (from 32B/cycle)
Load is now 2x64B/cycle (from 2x32B/cycle)"
"L1I Cache:
32 KiB/core, 8-way set associative
64 sets, 64 B line size"
L1D, L2, L3 caches also each have 64 B line size
"DTLB
4 KiB page translations:
64 entries; 4-way set associative"
"Port 4 now performs 512b stores (from 256b)"
L1D cache:
"128 B/cycle load bandwidth
64 B/cycle store bandwidth"
the picture/section "Individual Core" [3] has some bandwidths of:
16 bytes/cycle
64B/cycle
512bit/cycle
and the Decoded Stream Buffer (DSB) has a "64 B window" and Instruction Fetch and PreDecode?
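A small consistency check on the quoted numbers, as a sketch; the function and variable names are mine, and the figures are just the ones quoted above.

```python
def cache_sets(size_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    return size_bytes // (ways * line_bytes)


LINE_BYTES = 64

# L1I: 32 KiB, 8-way, 64 B lines => 32768 / (8 * 64) sets.
l1i_sets = cache_sets(32 * 1024, 8, LINE_BYTES)

# A 512-bit unit, expressed in bytes, is exactly one cache line.
unit_bytes = 512 // 8

# DDR4 channel width: 64 data bits plus 8 ECC bits.
ddr4_channel_bits = 64 + 8

# DTLB for 4 KiB pages: 64 entries, 4-way => entries / ways sets.
dtlb_sets = 64 // 4

if __name__ == "__main__":
    print("L1I sets:", l1i_sets)
    print("512-bit unit in bytes:", unit_bytes)
    print("DDR4 channel bits:", ddr4_channel_bits)
    print("DTLB sets:", dtlb_sets)
```

The recurring figure is 64 bytes = 512 bits: the line size, the widest memory operation, and the per-cycle store and L2-to-L1 bandwidths all land on it.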