proj-oot-old-150618-ootAssemblyNotes5

how does our role system map to natural language parts of speech and phrasal categories?

so what does this suggest we add?

note: we also have parens and quoting in role 7; i guess these aren't considered natural language lexicals because they mainly appear in writing (but you can speak "Jane said, quote, i love him, end quote"), but they are parsed as tokens as if they were separate words.

so adding the first two of those, our roles currently are:

note on filtering constraint vs modality:

i mean modality more broadly than it is used in linguistics; tense, aspect, mood, adverbs, and prepositional phrases are all 'modality' modifiers here. In my scheme, tense, aspect and adverbs modify the verb, and mood modifies the whole sentence.

Adjectives are examples of filtering constraints modifying a noun, and prepositional phrases are filtering constraints modifying a whole sentence (or a noun; 'He gave the bone to the dog in cage #3').

But is there modality for nouns and filtering constraints for verbs? If not, maybe we don't really need two roles for these.


if we give packed repr its own role like this, and, like role 7, give it special meaning (its primary meaning) when it is used at the beginning of a 64-bit packet, now we can use the payload of the role 6 16-bit word for something. I propose we use it as 8 bits of modality. Then the 48 remaining bits are for (8-bit opcode + 3x4-bit operands) + (8-bit opcode) + (8-bit opcode + 3x4-bit operands), as before.

note: we may not want to actually define the packed reprs yet, because they are an optimization. Just define them as things that consist of a defined 8-bit role 6 header followed by an opaque 64-bit packet that can be statically peephole-translated into other sentence forms.


on addressing modes:

note that 'direct with offset' is an uncommon mode, the common one is 'indirect with offset':

"MIPS does not provide a lot of choice when it comes to addressing modes for load and store operations. None at all, really. The only addressing mode supported by MIPS is address register indirect with offset. This, of course, is the same as ARMs basic register indirect mode, although variations such as incrementing the register are not allowed." an example is given of (paraphrased) 'Load r2 from memory pointed at by (r3 + 4)'

so eg we dont need a split mode for 'direct with offset' eg 'register (3 + r1)' (could be written 'r(3 + r1)'); we only need 'the memory pointed at by (r1 + 3)' (and i guess if you have memory-mapped registers, these would be the same thing). And in our model, + is replaced by .__GET, so this becomes 'r1.__GET(3)' (equivalent to r1.3, r1[3], etc).

note that, to unify structs and hashtables, you should be able to address r3["string"], not just r3[3]. also, struct field access, like r3.field2, is replaced by structs with integer-enumerated fields, like r3.2; and further, this is unified with hashtables as a special case where the lookup key is an integer and the keys are statically known. the implementation's use of a struct instead of a hashtable is then just an optimization when the compiler detects this special case (note that recognizing that only a fixed set of integers are ever gotten from certain tables may require some dependent typing, but probably an easier-than-usual special case of it, because since this is just a shortcut for structs, it is much like nondependent typing)
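A minimal Python sketch (not Oot; the `Table` class and `is_struct_like` check are invented for illustration) of the unification above: one table type covers both hashtables and structs, and a "struct" is just the detectable special case where every key is a statically known integer.

```python
# Sketch: a single table type that unifies structs and hashtables.
# A "struct" is the special case where every key is a small, statically
# known integer, which a compiler could detect and flatten into a struct.

class Table:
    def __init__(self, entries):
        self.entries = dict(entries)

    def get(self, key):
        # stands in for .__GET, so r3[key], r3.key, r3 key are all this
        return self.entries[key]

    def is_struct_like(self):
        # the optimization trigger: a fixed set of integer keys
        return all(isinstance(k, int) for k in self.entries)

point = Table({0: 1.5, 1: -2.0})   # "struct" with integer-enumerated fields
env   = Table({"PATH": "/bin"})    # hashtable with string keys

assert point.get(1) == -2.0        # r3.1 / r3[1]
assert env.get("PATH") == "/bin"   # r3["string"]
assert point.is_struct_like() and not env.is_struct_like()
```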

now, what i mean by 'you should be able to address r3["string"]' is: you should be able to use some LEA-like instruction to get the 'effective address' of the expression r3["string"] and then store it and use it later in any place where an address is accepted, in a similar way to how in traditional/'real'/hardware assembly language you can get the address of r3[2] by taking the indirect index of the object in r3 with the offset 2.

this implies two things:

note that in the previous the notation was a little confusing; in hardware assembly, "r3 = &x[33]" makes sense because a pointer to x[33] is in r3, and operations on r3 would affect that pointer. But in Oot Assembly, the corresponding situation is for r3 to be thought of as containing x[33] itself, and operations on it would affect x[33], not a pointer to x[33]. So r3 = x[33]; this is a statement of equivalence (==), not an assignment statement, but it is pointer equivalence, not structural (value) equivalence.

Furthermore, these semantics are confusing when it comes to unboxed values. If r3 = 9, and then we add 1 to r3, what does that mean? It doesn't mean that we alter the meaning of "9" systemwide. r3 = 9 means there is a single copy of the value 9 in r3, and mutations to it affect that copy, not some system-wide thing.

If we were to run "MOV r3 to r4", this would literally mean 'take what is in r3 out of r3 and place it into r4'. r3 then becomes undef (at least as far as the type system is concerned).

If we were to run "CPY r3 to r4", this would literally mean 'make a (non-aliased) (deep) copy of the value that is in r3 and place it into r4'. CPY would be implemented by the VM as COW (copy-on-write) under the covers. Further writes to r4 would not change the value in r3.

If we were to run "ALIAS r4 to r3", this would literally mean 'into r4, place a symlink that links to r3'. Further writes to r4 would change the value in r3.

So what if we say "ALIAS r3 to x[33]; MOV 3 to r3"; does this mean x[33] = 3? I think it does. If we want to look at the alias itself, we use a meta-command, like ALIAS or GETALIAS. Or maybe ALIAS is just SET with a meta-modifier (or 'alias' modifier), and GETALIAS is just GET with that same modifier.

So what if we say "ALIAS r3 to x[33]; ALIAS r4 to r3; ALIAS r3 to y[2]"; do changes to r4 now affect x[33], or y[2]? Todo.

So it might be easier to think in terms of two assignment operators, "=" for values and "=&" for aliases (symlinks), and two equality operators, '==' for structural equality and '==&' for pointer equality, in a higher level language.
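The MOV / CPY / ALIAS semantics above can be sketched in Python (not Oot; the cell representation, the `UNDEF` sentinel, and following alias chains by cell name are all assumptions of this sketch):

```python
# Sketch of MOV / CPY / ALIAS. Each cell has an alias level (None, or the
# name of another cell) and a value level; reads and writes follow the
# alias chain to the final target cell.
import copy

cells = {}            # name -> {"alias": target-or-None, "value": payload}
UNDEF = object()      # what MOV leaves behind in the source

def cell(name):
    return cells.setdefault(name, {"alias": None, "value": UNDEF})

def resolve(name):    # follow the symlink chain to the real cell
    c = cell(name)
    return resolve(c["alias"]) if c["alias"] else c

def write(name, v): resolve(name)["value"] = v
def read(name):     return resolve(name)["value"]

def MOV(src, dst):    # take the value out of src; src becomes undef
    write(dst, read(src)); write(src, UNDEF)

def CPY(src, dst):    # non-aliased deep copy (the COW trick is elided here)
    write(dst, copy.deepcopy(read(src)))

def ALIAS(dst, target):   # "ALIAS dst to target": put a symlink into dst
    cell(dst)["alias"] = target

write("r3", [1, 2])
CPY("r3", "r4"); read("r4").append(3)
assert read("r3") == [1, 2]    # copy: further writes to r4 don't touch r3

ALIAS("r4", "r3"); write("r4", 9)
assert read("r3") == 9         # alias: writes to r4 go through to r3
```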

tangentially, note that since we are also unifying syntax for not just structs and hashes/arrays, but also function calling, we can just write x[33] as x 33, which is the same as x.__get(33).
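In Python terms (illustrative only; the `Gettable` wrapper is invented), the unification of indexing and function application means one `__get` behind both syntaxes:

```python
# Sketch: unify function calling and indexing, so `x 33` = x[33] = x.__get(33).
import math

class Gettable:
    def __init__(self, get):
        self.__get = get
    def __getitem__(self, key):    # x[33]
        return self.__get(key)
    def __call__(self, key):       # x 33 (juxtaposition as application)
        return self.__get(key)

sin = Gettable(math.sin)
xs = Gettable({33: "hello"}.__getitem__)
assert sin(0.0) == sin[0.0] == 0.0       # a function is a gettable table
assert xs(33) == xs[33] == "hello"       # a table is a callable function
```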

now what are the possible split modes (where x y = x.__get(y))? taking every permutation of the four basic addr modes, we'd have 4*4 = 16 possibilities from 00 to 33: 00: i1 i2 01: i1 k2 02: i1 r2 03: i1 *r2 10: k1 i2 11: k1 k2 12: k1 r2 13: k1 *r2 20: r1 i2 21: r1 k2 22: r1 r2 23: r1 *r2 30: *r1 i2 31: *r1 k2 32: *r1 r2 33: *r1 *r2

recall however that our registers arent different from our memory. So are 'r' and '*r' identical? No, we can take the 'r's (direct addressing) to represent meta-addressing the aliases, and the '*r's to represent addressing the values

Now let's write all those in terms of array indexing to make their function clearer:

00: i1[i2] 01: i1[k2] 02: i1[r2] 03: i1[*r2] 10: k1[i2] 11: k1[k2] 12: k1[r2] 13: k1[*r2] 20: r1[i2] 21: r1[k2] 22: r1[r2] 23: r1[*r2] 30: *r1[i2] 31: *r1[k2] 32: *r1[r2] 33: *r1[*r2]

an aside on aliases:

The idea of 'alias' and 'value' addressing is that each memory cell has two 'levels': the 'alias' level, which might contain a pointer (or actually, maybe an entire 'alias record', which is a pointer with some attached metadata), or nothing if the cell is not a symlink but an actual value; and the 'value' level, which in the case of non-symlinked cells contains the contents of the cell, and which in the case of symlinked cells looks like it contains the contents of the symlink target. So when you use value addressing on a symlink cell (that is, one with a non-empty alias level), you are really manipulating the cell that is the symlink target; when you use alias addressing on this cell, you are reading or mutating its symlink record.

In order to CREATE an alias, a third addressing mode is needed, because you want to take the 'address' of a cell, and place it into the alias level of a different cell. Instead of 2 different alias-related addressing modes, we could have some special instructions. Note that an 'address-of' mode could not be written to, only read, so maybe that one should be omitted?

We also need a sentinel to represent an empty alias cell. We can use 0, which would mean that the PC cannot be aliased, which is no great loss.

Also, is this 'alias level' identical to the 'meta level' above a 'perspective' (view), or do we need yet another mode for that? i'm guessing it's a field within the metadata in the meta level. But mb the meta level is associated with the address-of the cell, not with the value contained at that address.

so, back to those split addressing modes. Which of these are the most useful? Regardless of whether we use one addressing mode to represent the address-of or to represent accesses to the 'alias level', it won't be very common to apply .__GET (to index into) these things; addresses themselves can't be indexed into (although we COULD allow straight-out pointer arithmetic addition when you try to do that :) ), and if you want to index into metadata attached to an alias record, you can always alias it to the value level of another cell first.

so let's eliminate all of the 'r's above. but, to make things easier to read and write, let's then also get rid of the *s, and just remember that we are dealing only with the value level here, not the address-of or alias levels (see below; i like address-of better).

so we have

00: i1[i2] 01: i1[k2] 02: i1[r2] 10: k1[i2] 11: k1[k2] 12: k1[r2] 20: r1[i2] 21: r1[k2] 22: r1[r2]

now, i1[i2], i1[k2], and i1[r2] don't make much sense, because integers cannot be indexed into. k1[i2] and k1[k2] are not necessary, because they could have been statically computed and added to the constant table.

So we only have 4 modes left:

k1[r2] r1[i2] r1[k2] r1[r2]

so these are the split addressing modes. A little thought shows examples when each of them would be useful:

k1[r2]: let k1 = ARGV; then k1[r2] is ARGV[r2].
r1[i2]: let r1 = x; then r1[i2] is x.3.
r1[k2]: let r1 = the sin fn and k2 = pi; then r1[k2] is sin(pi) (recall that f(x) = f[x] = f x).
r1[r2]: let r1 = x; then r1[r2] is x[r2].
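The four surviving split modes can be sketched as a tiny operand resolver (Python, not Oot; the mode names, constant table, and register contents here are all illustrative assumptions):

```python
# Sketch: resolving the four split addressing modes, where x[y] means
# x.__GET(y). `k` indexes the constant table, `r` a register/cell,
# `i` is an immediate.

consts = {0: ["progname", "--verbose"],   # k0 = ARGV, say
          1: 3}                           # k1 = a field index
regs   = {1: {1: "one", 3: "field3"},     # r1 = x, some indexable object
          2: 1}                           # r2 = an index

def GET(obj, key):                        # stands in for .__GET
    return obj[key]

def split(mode, a, b):
    if mode == "k[r]": return GET(consts[a], regs[b])    # ARGV[r2]
    if mode == "r[i]": return GET(regs[a], b)            # x.3
    if mode == "r[k]": return GET(regs[a], consts[b])    # sin(pi)-style
    if mode == "r[r]": return GET(regs[a], regs[b])      # x[r2]

assert split("k[r]", 0, 2) == "--verbose"   # consts[0][regs[2]]
assert split("r[i]", 1, 3) == "field3"      # regs[1][3]
assert split("r[k]", 1, 1) == "field3"      # regs[1][consts[1]]
assert split("r[r]", 1, 2) == "one"         # regs[1][regs[2]]
```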


if we used an 'alias' addressing mode, and a special instruction/opcode for GET_ADDRESS_IDENTIFIER and CREATE_ALIAS then:

If we wanted to look at or manipulate metainformation in the alias record, we would first move the alias record itself into an ordinary value cell, by issuing a CREATE_ALIAS instruction whose input had alias addressing (r) and whose destination had value addressing (*r); this creates an alias from the destination cell to the alias record which is controlling the input cell. If we wanted to alias a destination cell to the same symlink target as a source cell, we would instead do CREATE_ALIAS with value addressing in the input (the more typical case). (note: if this is what we're doing, tho, then 'alias' addressing in CREATE_ALIAS's output is useless, which seems like a waste.) Note: this is irregular in that the CREATE_ALIAS and GET_ADDRESS_IDENTIFIER opcodes would interpret value addressing differently from every other opcode.

A more regular solution would be to use an address-of addressing mode, and to have GETALIAS and SETALIAS operations. To create an alias from r4 to r3, you would SETALIAS(input= address-of(r3), destination= address-of(r4)). To create an alias from r4 to whatever r3 is directly aliased to, GETALIAS(input=address-of(r3), destination=r1); SETALIAS(input= r1, destination= address-of(r4)). Etc. I like this better.

A downside there is that address-of can't go in the destination operand. But i guess that's good, it means we have a spare 1/2 addressing mode. hmm, mb too clever, but we could use that in place of SETALIAS... presumably SETALIAS will be somewhat common, but GETALIAS will not. to clear a symlink, we can copy 0 in using this addressing mode

Note that SETALIAS is a special case of setting .__GET and .__SET.


Note that due to perspectives (views), if aaa is an array, even though most likely aaa is represented in memory as a struct containing two fields (a length field and a pointer to data in the heap), the standard view of an array would just be the data itself.

Eg if r3 'contained' an array, and you wanted the third element of the array, you wouldn't do:

r3.1[2]

you'd just do

r3[2]

if you wanted to access the length field directly, you'd use something more like

GETMETA(r3).VIEWS."whatever-the-name-is-of-a-view-with-a-length-field-and-a-data-field".0
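A Python sketch of this view behavior (all names here, including `GETMETA`, the view name, and the field layout, are illustrative assumptions, not a defined Oot API):

```python
# Sketch: an array cell that is really a struct {length, data} underneath,
# but whose standard view indexes the data directly.

class ArrayCell:
    def __init__(self, data):
        self._struct = (len(data), list(data))  # field 0: length, field 1: data

    def __getitem__(self, i):                   # standard view: r3[2]
        return self._struct[1][i]

def GETMETA(cell):                              # peek under the view
    return {"VIEWS": {"len+data": cell._struct}}

r3 = ArrayCell([10, 20, 30])
assert r3[2] == 30                              # not r3.1[2]
assert GETMETA(r3)["VIEWS"]["len+data"][0] == 3 # the hidden length field
```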

---

need 'load' and 'store' instrs to do dereferencing since we no longer have dereferencing built into the addressing mode (instead we have a C-like & operator)

we dont actually need load/store b/c we could always put an address that we have into a cell's alias level and then read/write to that cell's value

i guess thats what this system gives you; like python, you can have local variables representing structs that you address directly like 'x', not like '*x', but you can also alias two locals together, eg changing x can change y.

---

in oot, impure (including aliased, except for views) variables and fns would have an & to their left. vars without & are just values (from the point of view of this statespace mask). does this mean we have a non-cell address space, too? mb not, b/c of statespace masking. but otoh we dont want to have to do indirection upon every memory access, right, or worse, actually traverse a chain of symlinks?

this is why i wanted an addressing mode to override getters and setters, as opposed to having an address-of mode. so we have 3 contending modes: address-of, override/nonmeta, and alias level/meta level. i had conceived of the override and meta and alias levels as similar, the level at which you bypass the getters and setters (or assign to them), because there you can directly access the obj representation w/o going thru getters and setters. but i guess if you wanted to use that to touch the data itself, you'd have to do 'x.0' (where x refers to x's meta record b/c we're in meta addressing mode). so you have to use split mode/offset just to get to the data, so you can't also use the offset to index into the data.

this seems more common to me than address_of. so if this were the other addr mode, then the code would just say '3, direct mode' when it wants to directly access 3, and '3, indirect mode' when it wants to access a potential symlink; the compiler knows which address corresponds to which local varname, and doesnt need to look it up at runtime with address-of. or, address_of can be special opcodes. in an HLL addr-of could still be something like &, like in C (or &&, since we've already used & as a warning sigil for impurity or aliasing)

the question is, when r3 is a potential symlink, what does 'direct mode' yield? i am tempted to say it should yield the alias record.

also, instead of having a 'symlink or content (terminator)' flag in the record, the alias record could just be the payload with an addressing mode. if this mode is itself indirect, which it would always be if this is an actual and not just a potential symlink, then a chain of links could be followed (like the nested indirect addr mode on some old ISAs)

but now what if you want to access the META of an object to which you only have a symlink? if direct mode is for both aliasing and symlinks, you can't do this w/o implementing your own symlink chain walker, which seems silly.

another possibility is that the direct mode does access the meta node after going thru symlinks, but the meta node is dynamically constructed and has fields that tell you about each link on the chain you took to get here this time. (but see below: this would mean we wouldnt have an efficient unboxed mode).

but dont we want to make it fast for the runtime to do a=3; b=4; c = a+b without indirection? with that suggestion you still have to do c.0 = *(a.0) + *(b.0)

well mb. if a and b were known at compiletime, then a.0 and b.0 could be in the constant table. so we dont have to lookup 0 from a. and if a and b hold small unboxable values, and the meta for those nodes are never used, the compiler could secretly not have a meta node for those, and could just store their actual unboxed value on the stack or wherever it is stored.

also remember that even if the meta records can't all be dynamically constructed, we dont have to store the meta record as the local variable table value with the content in a field within this record. we can have a separate 'local var meta' table. we could also have a separate table for pure local vars, and one for local vars for which we need to store some meta records

so we're contemplating replacing 'direct registers' with 'direct memory', and 'indirect addressing mode' (memory via registers) with 'indirect' (follow symlinks, possibly nested; and/or use __get and __set). there are other kinds of indirection also: lazy thunks/promises, timeouts, etc

IMPORTANT: really, if you need to access meta or read an alias or something, you can use an extended sentence and have a modality modifier in your noun phrase. so we dont need to worry too much about these special cases; we effectively have lots of addr modes available. we probably just need one default indirect mode with our language defaults (all indirection on), and then also a fast, direct addr mode (unboxed, no symlinks, etc), so that it's possible to write portable, relatively efficient tight loops in the self-hosting reference implementation.

so, all indirect, and all direct. the all direct mode is NOT used for metaprogrammy things like directly accessing symlink chains, meta nodes, etc; it's used for straight out jvm-like unboxed arithmetic, with nary a pointer in sight (well i dunno about that, we might still have pointers to some arrays and even boxed structs, but we'll see).

IMPORTANT: we might not want to directly expose the 'internals' of the meta nodes, symlinks etc anyways; we should provide special commands for those, so that if we decide to return a virtual representation of these, we dont have to trap 'direct' memory accesses to create an illusion of this virtual representation in 'direct' memory.

so we still have the question of what a 'direct' access sees if it tries to access a boxed and/or symlinked value.

IMPORTANT: otoh we know what is boxed and unboxed at compile time.. could just never have direct accesses allowed to potentially boxed or symlinked things. hmm, i like that.

which means, yes, it is a separate addr space. but it doesnt correspond to lack of & identifier prefix sigil in oot. i'm thinking the compiler figures out what can be unboxed, with the possible aid of annotations (can this be done with precondition demands? eg the precondition contract says you can only pass in nonsymlinks with default getters and setters, and furthermore they must be of types that can be unboxed? or do we need to somehow put some constraints on how it can be used that are postcondition demands (which you would think would never be needed but perhaps there are some 'implicit postcondition promises' about how you can use the returned values that would need to be overridden)?)

hmm this might actually be close to something that almost corresponds to &; the part that i can think of that is distinct from & is laziness; because if something were 'strictifiable' (meaning no chance of an infinite loop when you strictify it) with no visible mutable state (effectively pure; the '&' sigil; note that 'effectively pure' excludes aliasing also, because something aliased can be 'mutated'), then it can just be strictified into an unboxed value at the time of assignment, even if you have to call a getter to do that.

potential syntax for 'strictifiable' vars ("unlifted" in the haskell terminology; types without "bottom"; operationally, types that cannot be thunks; haskell has unboxed, unlifted types, and also boxed, unlifted types, and also boxed, lifted types; there can't be any unboxed, lifted types because you would need a boxing struct to hold the thunk reference): not haskell's "data structs that automatically strictify" but rather a Maybe-like construct that forces you to explicitly do "strict assignment" (eg strictify then assign) when you assign to the var.

haskell seems to use a '#' postfix for unlifted types (see https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/TypeType ). let's use that ourselves. i dunno if it'll mean unboxed, unlifted, or possibly boxed, unlifted for us. this suggests =# (or mb #=) for strictifying assignment. we might also want to use # postfix as a strictification operator in other expressions. eg mb every unlifted type has a # at the end of its name, and mb in order to assign to such a thing from something else, you must do a =# assignment, to let the reader know that this is where strictification occurs

this also suggests using a# for 'strictify a'. in which case we dont need =#. btw, imo having the sender do this is easier to read than GHC's BangPatterns (see https://wiki.haskell.org/Seq ) or Haskell's "$!" (strict application) operator (but we should consider the syntax !x instead of x#)
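The strictifying-assignment idea can be sketched in Python (illustrative only; `Thunk`, `strictify`, and `UnliftedSlot` are invented stand-ins for the `a#` / `=#` syntax above):

```python
# Sketch: a Maybe-like "unlifted" slot that forces explicit strictification
# on assignment. `strictify` plays the role of the postfix `#` (or `!x`):
# it forces a thunk to a value, and passes plain values through.

class Thunk:
    def __init__(self, fn):
        self.fn = fn

def strictify(v):                  # a# : force a value; thunks get run
    return v.fn() if isinstance(v, Thunk) else v

class UnliftedSlot:                # a var of 'unlifted' type: never a thunk
    def strict_assign(self, v):    # the only way in: x =# v
        self.value = strictify(v)

x = UnliftedSlot()
x.strict_assign(Thunk(lambda: 2 + 2))
assert x.value == 4                # the thunk was forced at assignment time
```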

also do we need something similar when we 'freeze' a mutable object by doing a deepcopy of a &var into a purevar? after all a deepcopy might also take a long time, might strictify, etc. hmm think about i/o monads; how does that work? monadic bind. from within an impure IO monad fn you can eg read a keystroke and bind it to a local var, then pass it into a pure fn as if the value were pure.

since we already have &impurevars, i dont think we need further special notation for purevar = &impurevar; the concatenation of "= &" is easy enough to see when scanning/grepping

IMPORTANT: all those intermediate modes between full direct and full indirect are just examples of views

---

some terminology from HTTP2:

"HTTP/2 streams, messages and frames"

so perhaps we should call the 16- or 24-bit (or 64-bit) things 'frames' instead of 'words' to match this?

---

if we have regexp * (and regexp ?) in our stack maps, how do we verify that a program has consumed exactly those? one easy way would be to have 'stack markers' which are opaque constants that you push onto the stack (maybe labeled with integers)
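A Python sketch of the stack-marker idea (the marker representation and the `pop_to_marker` verification loop are assumptions for illustration):

```python
# Sketch: when a stack map allows a regexp-like `*`, push an opaque labeled
# marker first; to verify that exactly the starred items were consumed,
# pop until the matching marker reappears.

class Marker:
    def __init__(self, label):
        self.label = label

stack = []

def push_marker(label):
    stack.append(Marker(label))

def pop_to_marker(label):
    """Pop (and return) everything above the matching marker."""
    popped = []
    while not (isinstance(stack[-1], Marker) and stack[-1].label == label):
        popped.append(stack.pop())
    stack.pop()                    # remove the marker itself
    return popped

push_marker(7)
stack += [10, 20, 30]              # a '*'-matched run of operands
assert pop_to_marker(7) == [30, 20, 10]
assert stack == []                 # exactly the starred items were consumed
```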

---

our current proposal forces you to go thru the metaprogrammy GET (and SET) protocol to index into an array using the split addr modes. When the compiler knows for sure that there is no metaprogramming override and knows the array layout in memory, we want a way to directly index in without indirection, for high-speed numerics.

i guess one way to do this would be having special opcodes DIRECTGET and DIRECTSET. Another way would be just to have GET and SET with modalities, but this would reduce code density in one place we probably do care about it (numerics inner loops).
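The GET vs DIRECTGET contrast can be sketched as follows (Python, not Oot; the opcode names come from the proposal above, but the dispatch mechanics and `get_override` field are invented):

```python
# Sketch: the full GET protocol honors a metaprogramming override;
# DIRECTGET bypasses it and indexes the raw layout, for numerics loops
# where the compiler has proven no override exists.

class Obj:
    def __init__(self, data, get_override=None):
        self.data = data
        self.get_override = get_override

def GET(obj, key):                 # full protocol: honors __get overrides
    if obj.get_override is not None:
        return obj.get_override(obj, key)
    return obj.data[key]

def DIRECTGET(obj, key):           # fast path: straight indexing, no checks
    return obj.data[key]

plain  = Obj([1.0, 2.0, 3.0])
scaled = Obj([1.0, 2.0, 3.0], get_override=lambda o, k: o.data[k] * 100)

assert GET(plain, 1) == DIRECTGET(plain, 1) == 2.0
assert GET(scaled, 1) == 200.0     # override seen by GET...
assert DIRECTGET(scaled, 1) == 2.0 # ...but skipped by DIRECTGET
```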

---

thinking about upvariables

perhaps each scope must statically state how many new locals n it is using.

then the memory cells from 0-3 are reserved, and from 4 to (n+3) are the n new locals. Cells above that are the old locals, or lexically bound upvariables, in order. i suppose that if the closure doesn't capture all of the locals used in the enclosing scope, there is no reason that they have to take up 'space' here, eg only bound upvariables need be included, not necessarily all locals in the enclosing scope.
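A small sketch of that numbering (Python; the reserved-cell count and the name strings are illustrative):

```python
# Sketch: cells 0-3 reserved, then the scope's n new locals, then only
# the actually-bound upvariables from enclosing scopes, in order.
RESERVED = 4

def layout(n_new_locals, bound_upvars):
    cells = {i: f"reserved{i}" for i in range(RESERVED)}
    for i in range(n_new_locals):
        cells[RESERVED + i] = f"local{i}"
    for j, name in enumerate(bound_upvars):
        cells[RESERVED + n_new_locals + j] = name
    return cells

c = layout(2, ["outer_x", "outer_y"])
assert c[4] == "local0" and c[5] == "local1"   # new locals start at 4
assert c[6] == "outer_x"                       # bound upvars follow them
```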

behind the scenes, there is probably a saguaro tree of lexical scope frames, not a single block of all of them concatenated

since the number of bound upvariables in each enclosing lexical scope is known, and since the number of scopes in the tree is finite, we have a statically known bound on all variables of this type.

what about dynamically scoped variables? are these treated as module-level 'globals'? if so, do we then need getglobal/setglobal or are these assumed to be statically known, in which case we can place their locations after the lexical stuff? or do we require the dynamically scoped variables to ALSO be lexically bound, eg at the module level if not elsewhere?

---

hmm we might want to use

(namespace type, namespace, identifier)

for identifiers, instead of just single integers, because we have lexically scoped variables and dynamically scoped variables, and we have multiple enclosing lexical scopes.

Or maybe we use a separate addressing mode for each 'namespace type'? Or maybe we limit the number of namespaces of various types (eg 'lexical depth of at most 16') and then have (namespace type x namespace, identifier)?

This would also allow us to be clearer about 'non-archimedean addressing', eg just have a 'heap' namespace type and use 'namespace' to refer to regions of the heap, and then 'identifier' is an address within that region.

note that the way that eg Lua does it is to have separate instructions, eg LOADK, GETUPVAL, GETGLOBAL, instead of separate addressing modes or namespace types.

If we use either addressing modes or namespace types, and if we use namespace numbers, this certainly eats up a lot of bits for identifiers.

Eg with 2 namespace types and 2 namespaces per type, that's already all 4 bits that we had in split addressing mode with an 8-bit payload.

so i suppose that we'd say that with 8-bit payloads, we'd always implicitly assume namespace type 0, namespace 0, although we'd probably still map the PC and the stack pointer into that namespace (not sure tho?).

otoh dont we want to unify object attribute lookup, lexically scoped var lookup, dynamically scoped var lookup? in which case each object is its own namespace, so we have a ton of those -- and namespace lookup is just GET

but otoh we want to resolve lexically scoped lookup at compile time

---

hmm i feel like we're overduplicating the same concept on too many levels again... just like i feel like registers and memory locations and stack frame offsets are too many duplicates of the same thing...

what i was trying to do was remove the need for eg LOADK and GETUPVAL and GETGLOBAL, and to unify object GET and array GET and structure GET (and function calling)

but i guess you have to have some "pronouns", eg primitive 'registers', because when you use an expression to specify an effective address to which you want to put the result of something, you need to place that address in a 'register' if you are then going to use a fixed-length instruction to say to put something into that location (but.. i guess the alternative is to have an AST with no fixed-length 'instructions', only fixed-length opcodes?)

anyhow, we already have GET, and we have 'noun phrases', all these references to various kinds of namespaces should be done then, right? but they're not.. we have to resolve references to locals and upvals differently, right?

hmmm

i guess Lua's use of LOADK and GETUPVAL and GETGLOBAL do make some sense.. because they allow ordinary local lookups to be really fast (less indirection). Lua also has GETTABLE.

---

i guess we could have the first 16 constants be namespaces such as UPVARS, etc, and use the k[r] split addressing mode to access these quickly

hmm.. then we'd really prefer a k[i] split addressing mode, though, right? to address the first 16 elements of each of the 16 special namespaces?

---

i guess a nice thing about using a stack is that (a) there isnt a limit on how deep expressions can be, unlike our 2-level-deep limit above, and (b) you dont have to have lots of UNDEFs after each statement for one-time-use nouns

so want to encourage use of the data stack (especially) for 'subexpressions' that conceptually form part of one conceptual statement

---

does oot byc (bytecode; oob) really need 2 levels if we have the stack for singleuse pronouns? i guess so; 2 lvls gives us extensible modality and addr modes

--

Intel SGX security enclave instructions:

http://web.stanford.edu/class/ee380/Abstracts/150415-slides.pdf

EEXIT EGETKEY EREPORT EENTER ERESUME


"We present a new compression-friendly wire format for the LLVM IR. We compiled 3511 real-world programs from the Debian archive and observed that on average we only need 68% of the space used by gzipped bitcode. Our format achieves this by using two simple ideas. First, it separates similar information to different byte streams to make gzip's job easier. Second, it uses relative dominator-scoped indexing for value references to ensure that locally similar basic blocks stay similar even when they are inside completely different functions." -- https://aaltodoc.aalto.fi/handle/123456789/13468

azakai 5 hours ago

Of course, several of the people working on WebAssembly are actually from the PNaCl team. And others, like the Emscripten people, have lots of LLVM experience as well. WebAssembly is being designed while taking into account previous approaches in this field, including of course using LLVM bitcode for it.

LLVM bitcode is simply not a good fit for this:

Each of those can be tackled with a lot of effort. But overall, better results are possible using other approaches.


---

pro-big-endian:

drfuchs 1 day ago

Because big-endian matches how most humans have done it for most of history ("five hundred twenty one" is written "521" or "DXXI", not "125" or "IXXD"). Because the left-most bit in a byte is the high-order bit, so the left-most byte in a word should be the high-order byte. Because ordering two 8-character ascii strings can be done with a single 8-byte integer compare instruction (with the obvious generalizations). Because looking for 0x12345678 in a hex dump (visually or with an automatic tool) isn't a maddening task. Because manipulating 1-bit-per-pixel image data and frame buffers (shifting left and right, particularly) doesn't lead to despair. Because that's how any right-thinking person's brain works.

The one place I've seen little-endian actually be a help is that it tends to catch "forgot to malloc strlen PLUS ONE for the terminating NUL byte" bugs that go undetected for much longer on big-endian machines. Making such an error means the NUL gets written just past the end of the malloc, which may be the first byte of the next word in the heap, which (on many implementations of malloc) holds the length of the next item in the heap, which is typically a non-huge integer. Thus, on big-endian machines, you're overwriting a zero (high order byte of a non-huge integer) with a zero, so no harm done, and the bug is masked. On little-endian machines, though, you're very likely clobbering malloc's idea of the size of the next item, and eventually it will notice that its internal data structures have been corrupted and complain. I learned this lesson after we'd been shipping crash-free FrameMaker for years on 68000 and Sparc, and then ported to the short-lived Sun386i.


http://geekandpoke.typepad.com/geekandpoke/2011/09/simply-explained-1.html

pro-little-endian: daemin 1 day ago

Little endian is easier to deal with at the machine level as you don't need to adjust pointer offsets when referencing a smaller sized variable. A pointer to an 8, 16, 32, 64, 128 bit quantity will be the same. Big endian you will need to adjust the pointer up/down accordingly.
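Both quoted claims can be demonstrated with Python's struct module (a quick sanity check, not part of the original comments):

```python
# Demo of the two endianness claims above: big-endian byte order makes an
# 8-byte integer compare agree with string (memcmp) order, and little-endian
# lets the same offset serve reads of different widths.
import struct

a, b = b"AAAABBBB", b"AAAABBBC"
(ia,), (ib,) = struct.unpack(">Q", a), struct.unpack(">Q", b)
assert (a < b) == (ia < ib)         # big-endian: int compare == string compare

buf = struct.pack("<Q", 0x42)       # little-endian 64-bit 0x42 in memory
assert struct.unpack_from("<B", buf, 0)[0] == 0x42   # same offset 0 works
assert struct.unpack_from("<H", buf, 0)[0] == 0x42   # for 8- and 16-bit reads
```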
