ideas-computer-jasper-jasperNotes4

huh, this looks very general, clean, and like the keywordish relations that i've been looking for:

http://www.lshift.net/blog/2010/12/22/conditional-statements-the-lambda-calculus-and-earlylate-binding

" Conditionals in Smalltalk work like this: the class Boolean has two subclasses, True and False. Each has a well-known instance, true and false, respectively [2].

The equivalent of C’s

  if (condition) {
    onefunc();
  } else {
    otherfunc();
  }

is

  condition
    ifTrue: [ self oneMethod ]
    ifFalse: [ self anotherMethod ]

How does it work? Boolean defines a method #ifTrue:ifFalse: taking two blocks as parameters. (Remember, what Smalltalkers call a block other folks call a closure. A Lisper might call the parameterless blocks that #ifTrue:ifFalse: takes thunks.) True and False in turn implement #ifTrue:ifFalse: thusly:

  True>>ifTrue: trueBlock ifFalse: falseBlock
    ^ trueBlock value.

  False>>ifTrue: trueBlock ifFalse: falseBlock
    ^ falseBlock value.

So we evaluate condition to either true or false, send it the #ifTrue:ifFalse: message, and we’re done. "
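here's a rough Python sketch of the same trick (class and function names invented; Python's real booleans obviously aren't built this way): the conditional is just method dispatch on the receiver, and the two branches are passed in as thunks.

    class TrueClass:
        def if_true_if_false(self, true_block, false_block):
            return true_block()    # True runs the first thunk

    class FalseClass:
        def if_true_if_false(self, true_block, false_block):
            return false_block()   # False runs the second thunk

    true_, false_ = TrueClass(), FalseClass()

    def choose(condition):
        # condition is already one of the well-known instances true_ / false_
        return condition.if_true_if_false(lambda: "oneMethod", lambda: "anotherMethod")

    print(choose(true_))    # oneMethod
    print(choose(false_))   # anotherMethod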

---

Disciple exceptions: " The exception mechanism in the source language is based around try/catch, and doesn't have a proper semantics. We should probably just use the GHC approach where the 'error' function creates an exceptional value instead of causing a side effect.

    If we were going to treat exceptions as side effects, then perhaps we could do something like in Benton and Buchlovsky's "Semantics of an effect analysis for exceptions" paper. "
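a tiny Python sketch of the "error as a value" idea (the ExceptionalValue class and the names here are made up; GHC actually gets this via laziness and bottom values): constructing the bad value is pure, and the failure only surfaces if the value is ever forced.

    class ExceptionalValue:
        """A thunk that blows up only when forced, standing in for GHC-style 'error'."""
        def __init__(self, message):
            self.message = message
        def force(self):
            raise RuntimeError(self.message)

    def safe_div(x, y):
        # pure: always returns a value; nothing is thrown at the call site
        return ExceptionalValue("divide by zero") if y == 0 else x / y

    pair = (safe_div(6, 2), safe_div(1, 0))   # building the pair never fails
    print(pair[0])                            # 3.0 -- the bad half is never forced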

---

" The network is reliable.

    Latency is zero.
    Bandwidth is infinite.
    The network is secure.
    Topology doesn't change.
    There is one administrator.
    Transport cost is zero.
    The network is homogeneous."

--

"

    Structured programming (in a first-order language) limits a program’s control flow graph to being constructed using a particular grammar, which makes it easier to understand what are the potential control traces in a program.
    Data hiding (encapsulation) limits which modules may access the representation of an abstract type, which means that in other modules we can reason in terms of a simpler interface, and when inside the abstraction boundary we can assume the unprivileged modules rely only on the external interface.
    Immutability prevents objects from being modified by other modules or processes, which eases reasoning about concurrency and time, and allows us not to think about how another module might modify objects that we share with it.
    Purely functional programming goes further than immutability, eliminating non-local information flows, which means that we can understand expressions in terms of their text and lexical environment, and ignore time in extensional reasoning.
    Static types constrain which types of values can flow where, which means that there are fewer possibilities to think about in case analyses, and particular kinds of errors need not be considered."

---

[do you speak Jasper](query, desired type of response=bool)

---

the words "I" and "you"

hypotheticals

a way of making a request, that is, of indicating the desired response

a way to say what someone [i or you, or someone else] knows or what is unknown to them [and, branching off 'know', we then have more complex concepts such as 'understand']

a way to indicate an object, not by a noun phrase, but by specifying some properties about it, 'the thing which was here yesterday', for example

past and future tense?

plurality?

of course, there are different levels of interpretation; if text is quoted, 'i' and 'you' are unresolved, otherwise they are ('i' relative to 'speaker', 'you' relative to 'listener')

kant's correspondence between the table of judgements and the categories may be useful for uniform notation; e.g. coq's and haskell's definition of data types/constructors by cases == OR in the same way that disjunctive judgement == community (i think), and kant's hypothetical == causation link tells us that we can reuse the symbol for 'if' for 'right arrow', e.g. the Horn connective (possibly this is also the same as the symbol for subtyping, too, e.g. ontological logic? see what description logic does there)

the difference between 'i' and 'me' is like the different roles in a relation; the relation is like the verb (look at RDF), and subject/object are roles (positions in the relational operator)

--

as one task (but not the only task) of philosophy is to form new language, programming language creation is rather philosophical

--

the Curry-Howard Isomorphism is somewhat related to Kant's relationship between the table of judgements and the categories

for example, union sum types mean a type where you don't know exactly which type you have, but you know you have one of the ones on the list. this is like Kant's category of Community. union sum types are related to logical disjunction; this is like Kant relating them to Disjunctive judgements.

in other words, Kant's relationship between the table of judgements and the categories is like the relationship between the logic of types (e.g. 'disjunction' for a union type), and their meaning as types (e.g. 'you have one of these types but you don't know which'), which is like the relationship described at http://web.archive.org/web/20130702221947/http://www.haskell.org/haskellwiki/GADTs_for_dummies between normal Haskell code, and what typeclasses would look like if expressed in a uniform notation with normal Haskell code
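a small sketch of the disjunction == sum type reading, using Python dataclasses as tagged constructors (Circle/Square are invented examples): a Shape is one of the listed alternatives, and a case analysis is how the disjunction gets discharged.

    from dataclasses import dataclass
    from typing import Union

    @dataclass
    class Circle:
        radius: float

    @dataclass
    class Square:
        side: float

    Shape = Union[Circle, Square]   # "Circle OR Square": you have one, but you don't know which

    def area(s: Shape) -> float:
        if isinstance(s, Circle):   # each branch handles one disjunct
            return 3.14159 * s.radius ** 2
        else:
            return s.side ** 2

    print(area(Circle(1.0)), area(Square(2.0)))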

--

by this logic, an 'implication' type would indeed seem to be subtyping, e.g. 'Q -> P' means 'not(Q) OR P', which is equal to 'not(Q) OR (Q AND P)'. Consider the relationship between Q=Integer, P=Number. Either something is not an integer, or you may assume that it is both an integer and a number.
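a quick check of the Integer/Number example using Python's numbers tower: reading 'Q -> P' as 'every Integer is also a Number', the assertion below is just 'not(Q) OR (Q AND P)' spelled out.

    import numbers

    def check(x):
        q = isinstance(x, numbers.Integral)   # Q: it is an Integer
        p = isinstance(x, numbers.Real)       # P: it is a Number
        assert (not q) or (q and p)           # the implication, spelled as a disjunction
        return (q, p)

    print(check(3))      # (True, True): an integer, hence also a number
    print(check(2.5))    # (False, True): not an integer, so the implication holds vacuously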

todo why is subtyping hard when it interacts with polymorphism?

--

substance, e.g. essential vs. accidental properties:

(a) in class-based OOP systems we have this distinction, in prototype-based ones we don't

the idea that everything is of some single type of 'substance' is like single inheritance

(b) one might also relate it to what is statically known (although this is a slightly different idea than (a)); essential properties can be statically known, accidental ones cannot.

(c) how to apply this to EverythingIsAnInterface?? First, think of multiple inheritance. We can generalize the idea of essential properties as being those that are part of 'the' substance, to saying that an object (a piece of data) has multiple substances, each of which carry/imply various essential properties. Second, realize that really what we are doing here is making an epistemological distinction; if and only if the reason that we know that an object has a given attribute is due to it being a substance, then we call that attribute 'essential'. This lets us see that in this case interfaces are like classes with respect to methods specified in the interface.

In fact, this means that which attributes are essential and which are accidental is relative to what we consider to be substances. In one context (say, serializing an object), we might not know that the object under examination supports the 'dict' interface, all we know is that it has a function called 'keys' attached to it. In this context, 'keys' is an accidental property. In another context, we might know that a given variable must contain an object of type 'dict'; in this context, the 'keys' function is an essential property of that variable.

(b) is explained in the context of dynamic languages by saying that it's not whether something is STATICALLY known, it's whether it is a property of an object's TYPE.

in the structural typing context of Jasper, typing is (mostly? or totally? what about type labels and the .types meta attribute?) equivalent to pattern matching. so essential/accidental is not so much about compile-time/run-time but about whether you may assume something because you've required it to be matchable against a pattern.

this kind of thinking probably should inform what we decide about class vs. prototype inheritance

btw the wording that "a substance IMPLIES various essential properties" is further support for the implication-as-subtyping
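a sketch of the "essential vs. accidental is relative to what you treat as the substance" point, in Python (function names invented): the same 'keys' attribute is accidental when we only discover it structurally, and essential when we know the value nominally as a dict.

    def serialize_structural(obj):
        # structural context: we only discover 'keys' by looking -- an accidental property
        if hasattr(obj, "keys") and callable(obj.keys):
            return {k: obj[k] for k in obj.keys()}
        raise TypeError("don't know how to serialize this")

    def serialize_nominal(obj: dict):
        # nominal context: being a dict is the 'substance', so 'keys' is essential
        # and may be assumed without checking
        return {k: obj[k] for k in obj.keys()}

    print(serialize_structural({"a": 1}))
    print(serialize_nominal({"a": 1}))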

--

todo why is subtyping hard when it interacts with polymorphism?

" Subtyping and type inference

The type inference problem in the presence of parametric polymorphism and subtyping like in Java and C# is significantly harder than in Haskell (98) or SML. You're very unlikely to ever get anything like those FP languages' type inference. Scala is a good language to look at in this regard. It too uses (only) local type inference which is more or less as much as one can reasonably expect. That said, you can do way, way more with local type inference than C# is doing (as Scala demonstrates.) " -- http://lambda-the-ultimate.org/node/2862

--

" .. You ought to formalize the 'laws' of this particular graph theory, and explain it with node/edge diagrams and examples.

Definitely! I have one heck of a conference paper to write about this. It doesn't help that much of the existing type theory out there is based on term structure, not graph theory. However, I've done this before for Jiazzi, so I think its doable. It also turns out that any type theory that involves nominative subtyping is also based on some amount of graph theory. .. " -- http://lambda-the-ultimate.org/node/4764

"

inferred row polymorphism?

This seems to be an extension of the concept of row polymorphism to type/object/state inheritance and construction (read as a complement; I like extensions of excellent systems). That is, where row polymorphism accepts use site variance, YinYang? treats it as a point of assertion, constructing the structural definition from the implicit name-use constraints. YinYang? does for specification-definitional simplicity what RP does for use-site specification.

Then again, I've fallen for row polymorphism recently so I'm seeing its face in every crowd. Does somebody have a distinction between the systems I'm missing? By Chad Wellington at Wed, 2013-06-19 01:31

That is a quite useful comparison. I don't think YinYang? is very close to row polymorphism because it is completely nominative, but I'll try to learn more about the topic to be sure. By Sean McDirmid? at Wed, 2013-06-19 09:49

An explanation, sadly long-winded

Here's the analogy that I saw, with apologies in advance: (1) I'm very good at mixing up (mixing in?) concepts under names that are inappropriate, (2) I think in Lisp syntax so I'm not quite following your YinYang? syntactic conventions, and (3) this ended up a lot longer than I'd hoped.

From your YinYang? post: "what class an object extends depends on what it does" The analogous Row Polymorphism style: "the elements required of input types depend on names used"

I have seen the view of row polymorphism that I would characterize as "implementation-oriented", where RP is an indexing trick to expose subsets of a product type. I.e., the type (a * b * c) would carry around the map (a->1, b->2, c->3) so that a use-site requiring (a * c) could look up the appropriate indices in the type it is handed.

I'm here using RP in terms of its type-requirements inference rather than its implementation. That is, seeing a function use (a * c) tells us that it can use any type exposing "a" and "c". To repeat the example from the post:

  (def foo (a b) (+ a b))    ; for example
  (def +                     ; + must be a function of a, b
    (a b)
    (_add_ a.as_number b.as_number))
  ; so now we know that "a" and "b" need an "as_number" named property
  ; thus, we can infer that "foo" is polymorphic over any types with "as_number" property

We can apply this kind of reasoning to the ThicknessListener? function:

  (def ThicknessListener? (w)
    (if w.Pressed (assign w.Thickness 10) nil))
  ; "w" must have "Pressed" and "Thickness" properties
  ; deriving type (w : (product Pressed Thickness))

What I see YinYang? as extending this RP-like name-polymorphism inference is the capability to infer construction, and not merely inferring possible uses. Thus, if we use the ThicknessListener? as so:

I see YinYang? extending this in two ways

    namespaced product types through the trait extension hierarchy
    inferred type construction

We can see the first (namespace names) in the trait definitions:

  (defTrait Widget () ())
  (defTrait Bordered (Widget) (Thickness : double))
  (defTrait Button (Widget) (Pressed : bool))

These imply that the named-trait "Pressed" can only be inferred in some type already declared or inferred to be a "Widget". In RP, I might make a type (say, "Cylinder") with "Thickness:double" and "Pressed:bool" properties, referring to an actual physical object made of pressed metal rather than a graphical widget.

We can see the second (inferred construction) in this hacky use of the above:

  (let (w (new Widget))
    (assign w.Pressed false)
    (ThicknessListener? w) )

The ThicknessListener? invocation exports the need for a type that has Pressed and Thickness properties; this can be validated against the declaration of w as a Widget type. Then, this export informs the instantiating environment that it will have to construct a type with the given traits and their initialization/methods mixed in. A (merely) row-polymorphic world would need a full declaration:

 (let
    (w (Thickness Pressed))
    (assign w.Pressed false)
    (ThicknessListener w) )

And doesn't have any native way of delineating identically-named properties. By Chad Wellington at Mon, 2013-07-01 21:56

This is quite useful, thanks!

The fact that YinYang? traits are nominal allows us to infer construction, while the structural approach has no hope of doing that (it doesn't even make sense).

But structural typing also has a problem with large libraries and code completion in general. I mean, when all you care about is if some string of bytes is equivalent, what does that mean for a code completion menu? By Sean McDirmid? at Tue, 2013-07-02 00:08

Hmm, okay.

I guess I think of row polymorphism primarily as a shorthand for composition of application. No type inference required (or allowed!) in the usual case, because typing is something of a Gordian knot under row polymorphism. Every function in the row takes a function as its argument and returns a function as its return value. Further, every function they take (and return) also takes (and returns) functions as its operands (and return values). That has already put us beyond the power of many languages' type systems to express any difference in function types.

But in a language with an absolutely complete type system, we may still have some generics with superposed distinct type signatures, either as functions in the application or as arguments/returns along the application row, or both. In that case an inference process may deduce which type signatures among them can be used.

However, you need the information from the ending type of the row, backpropagated through an arbitrary number of intermediate functions, in order to statically determine what the type-signature of the function something ought to return (and therefore the choices of the type signatures available of the function returning it and the choice of possible function types all of those can take as arguments) needs to be; the type of the argument alone is not sufficient.

The unfortunate things about backpropagating type inference on generic functions under row polymorphism are first that it is hard to guarantee that the inference process gives a unique result, second hard to guarantee that it gives a result at all without actually performing it, and third that it may require exponential compute power relative to the length of the longest application row between any two monomorphic functions.

It simply had never occurred to me to think of type inference on generic functions as part of an implementation of row polymorphism, nor even really to consider generics with multiply-typed definitions to be compatible with row-polymorphism. Now that you have drawn it out I see how the type-inference process required would be related to the type-and-trait inference here.

I'm not sure that I've ever seen that type-inference process actually implemented on a row-polymorphic system anywhere, nor worked with any row-polymorphic system that allowed generic functions at all for that matter. I'm somewhat amazed that you were even able to conceive of it.

I guess I was having a blind spot due to my preconceptions about the limited expressiveness of most type systems and the incompatibility of these paradigms. Thanks for explaining.

Bear By Ray Dillinger at Tue, 2013-07-02 00:14


"

"

The Yin and Yang of Modular Object Type Inference

Better as a working title. Here is the working abstract to go with it:

    Type inference does not deal well with languages that support nominal subtyping: only local type inference often works reliably without costly global analyses. We present the YinYang language where all program term and object types are inferred with full support for encapsulated mutable fields, nominal subtyping, genericity, easy overriding, downcasting, along with local reasoning about type errors and other forms of semantic feedback. Inference supports these good OO features by forgoing typical unification that computes types, principal or otherwise, and instead by tracking type constraints in term assignment graphs that modularly deals with fields. Required mixin-like trait extensions are then propagated along these graphs towards assignees, providing a basis to form objects and provide semantic feedback; e.g. detect incompatibilities. This paper describes our design, trade-offs made for feasibility, and implementation. 

By Sean McDirmid? at Sat, 2013-06-29 01:45

"

see also

https://www.google.com/search?client=ubuntu&channel=fs&q=subtyping+polymorphism+%22lambda+the+ultimate%22&ie=utf-8&oe=utf-8

https://www.google.com/search?client=ubuntu&channel=fs&q=subtyping+polymorphism+%22lambda+the+ultimate%22&ie=utf-8&oe=utf-8#bav=on.2,or.r_cp.r_qf.&channel=fs&fp=be3d510ee858bb69&psj=1&q=type+constraint+graph+%22lambda+the+ultimate%22

https://www.google.com/search?client=ubuntu&channel=fs&q=subtyping+polymorphism+%22lambda+the+ultimate%22&ie=utf-8&oe=utf-8#bav=on.2,or.r_cp.r_qf.&channel=fs&fp=be3d510ee858bb69&psj=1&q=%27structural+typing%27+%22pattern+matching%22+constraint+graph+%22lambda+the+ultimate%22

http://lambda-the-ultimate.org/node/3144

http://lambda-the-ultimate.org/node/4776

--

" Isaac Schlueter Aug 13 There's been a lot of debates, theories, and requests about Node's core API patterns on this list lately. I'd like to clarify the actual plans of the project on these points.

Callbacks will remain the de facto way to implement asynchrony. Generators and Promises are interesting and will remain a userland option.

Streams are more consistent, faster, and 100% backwards compatible as of 0.10. The "compatibility mode" aka "old mode" is folded into the API more cleanly. You can `pause()` a flowing stream, and then start calling `read()` again safely. We'll keep exposing streams from core APIs, and advocating extending the stream base classes in userland programs.

Domains will be refactored to support more generic continuation-tracking systems, to enable alternative error-handling mechanisms in userland. Eventually the Domain module will be a thing that could be done in userland, but it will continue to be bundled with the Node binary.

There will be no changes to the module system. None. It's finished. It's been finished for over a year. Any proposed changes must come with a clear reproducible bug that cannot be worked around or addressed with documentation.

TypeScript? and CoffeeScript? will not be added to core. However, as the module system will remain locked, anything that works today will keep working.

A stable C API shim will be added such that binary addons can be code-stable (and perhaps, binary-stable) across stable branches of Node for the most common use-cases. Note that the V8 API changed

As new language features are added to V8, they'll make their way into Node. We have no plans to "auto-enable" any specific flags. Sniff for what you need, and throw with a helpful error message if it's not available.

The VM module is being refactored to bring the features of the "contextify" module into core, so that contexts work like everyone expects that they should. Multi-context support is being added to the VM module API as well.

Synchronous child process execution is being added, at long last.

v0.12 is nearing feature completion. Once we get there, we'll be trying to get everyone to use it. After v0.12, there will probably not be any API changes before v1.0. Between v0.12 and v1.0, we'll be focusing on performance, bug fixing, and stability.

We are done making breaking changes at the Node.js layer. If your program runs today, we're doing everything we can to make sure that it will run next year, albeit faster and more reliably.

This is not a democracy. However, there's plenty of room for everyone's opinion. If you want to make exciting dramatic breaking changes to node-core, and you can't find satisfaction by writing userland npm modules, then please fork joyent/node, give it a new name and logo, and go as completely crazy as you want.

OSS FTW. "

---

when we say "Bob's cat", this is like Bob.cat. The "'s" shorthand for "the cat of Bob" is useful for the same reason as the "." shorthand for field accessors.

--

zaphar 2 days ago

(full disclosure: I work at google and also like erlang)

Erlang has fantastic facilities for robustness and concurrency. What it does not have is type safety, and it's terrible at handling text in a performant fashion. So if you don't care about either of those things and only care about robustness and concurrency, then Erlang is great. There were internal discussions about Erlang here, but the upshot was: we had already basically duplicated Erlang's supervision model in our infrastructure, only we did it for all languages, and Erlang didn't offer any benefits in performance for us. Its only benefit would have been the concurrency model. That's much less benefit than Go gives.

Go gives you Erlang's concurrency model, a similar philosophy of robustness, type safety, all batteries included, and performance. Equating the two languages works in 1 or 2 dimensions but not on all the dimensions google cares about.

derefr 2 days ago

Interesting, thanks for that; it's pretty much what I guessed (especially the bit about the supervision tree and hot-code-upgrade advantages being mooted by your infrastructure.)

On a tangent, though:

> What it does not have is type safety

I've tried to work this out before (I'm designing a new language for Erlang's VM), but as far as I can tell, type safety is in practice incompatible with VM-supported hot code upgrade.

If you have two services, A and B, and you need to upgrade them both, but you can't "stop the world" to do an atomic upgrade of both A and B together (because you're running a distributed soft real-time system, after all), then you need to switch out A, and then switch out B.

So, at some point, on some nodes, A will be running a version with an ABI incompatible with B. In a strongly-typed system, the VM wouldn't allow A's new code to load, since it refers to functions in B with type signatures that don't exist.

On the other hand, in a system with pattern-matching and a "let it crash" philosophy, you just let A's new code start up and repeatedly try-and-fail to communicate with B for a while, until B's code gets upgraded as well--and now the types are compatible again.

It's an interesting problem.

laureny 2 days ago

> type safety is in practice incompatible with VM-supported hot code upgrade.

That's not true.

First, it's very easy to hot reload changes that have been made to the code that are backward compatible. The JVM spec describes in very specific details what that means (adding or removing a method is not backward compatible, modifying a body is, etc...).

This is how Hotswap works, the JVM has been using it for years.

As for changes that are backward incompatible, you can still manage them with application level techniques, such as rolling out servers or simply allow two different versions of the class to exist at the same time (JRebel does that, as do other a few other products in the JVM ecosystem).

Erlang doesn't really have any advantages over statically typed systems in the hot reload area, and its lack of static typing is a deal breaker for pretty much any serious production deployment.

rdtsc 1 day ago

> lack of static typing is a deal breaker for pretty much any serious production deployment.

Are you talking about Google only where they made it a mandate or in general? There are serious production deployments on Python, Ruby, Erlang and Javascript.

I will trade expressiveness and less lines of code with a strong but dynamically typed language + tests over more a static typed language with more lines of code all being equal.

Or put it another way, if strong typing is the main thing that protects against lack of faults and crashes in production, there is a serious issue that needs to be addressed (just my 2 cents).

derefr 2 days ago

> As for changes that are backward incompatible, you can still manage them with application level techniques, such as rolling out servers or simply allow two different versions of the class to exist at the same time (JRebel does that, as do other a few other products in the JVM ecosystem).

Neither of these allow for the whole reason Erlang has hot code upgrade in the first place: allowing to upgrade the code on one side of a TCP connection without dropping the connection to the other side. Tell me how to do that with a static type system :)

pmahoney 1 day ago

Tomcat (and other app servers) has support for doing hot reloads of Java web apps while not reloading the HTTP layer (and not dropping TCP connections).

http://www.javacodegeeks.com/2011/06/zero-downtime-deploymen...

I have implemented a similar system for JRuby apps running inside a Servlet container. There are many caveats. I don't actually recommend it because for a while you're using nearly twice the memory (and JRuby is particularly memory hungry). Also there are many ways to leak the old class definitions such that they are not GC'd (e.g. thread locals). But it's certainly possible.

I suspect that Erlang, Java, and all languages are in the same boat: some parts can be upgraded live in the VM while other parts require a full restart (maybe coordinating with multiple nodes and a load balancer to achieve zero-downtime).

banachtarski 21 hours ago

Erlang is not in that boat. Generally, you can upgrade the entire thing (except the lowest level libraries obviously) without too much fuss.

lenkite 1 day ago

Out of curiosity, where/why would such an exotic feature be needed in today's internet architectures where you always front a group of servers with a load balancer ?

banachtarski 21 hours ago

Not everything is a website! Also, not everything is stateless. Consider writing a chat application for the web for example and letting users on one page communicate with another one.

butterflyhug 1 day ago

Not all Internet protocols are HTTP. If you're running a service where long-lived connections are the norm, "simply fronting a bunch of servers with a load balancer" can require a pretty smart load balancer. E.g. IMAP connections often last hours or even days, and are required to maintain a degree of statefulness.

--

mihai_ionic 1 day ago

No, the ZeroMQ? author should be writing idiomatic code in whatever language he is using.

In his blog post, he complains that std::list<widget* > causes 2 allocations per insertion. Fine, because that's not idiomatic C++. The C++ way would be to use std::list<widget>, since there's no need for the extra indirection. Note that this would only be a good choice if you for some reason really need a list, std::vector is mostly preferred due to its better cache behaviour and requires even fewer allocations.

Secondly, if he needs to remove list items by their address, there's no reason he couldn't use an intrusive list in C++. The article I linked shows an implementation of one, and there's always boost::intrusive::list if you don't want to roll your own (and you shouldn't).

rumcajz 1 day ago

Extra indirection is needed when the objects are non-assignable, e.g. if there's a thread running inside the object, if it owns a fd or similar. std::vector has O(n) complexity for a lot of operations, so it's not an option.

mihai_ionic 1 day ago

If you can use C++11, this becomes a non-issue with move semantics.

--

malkia 1 day ago

You can't use longjmp/setjmp - http://en.cppreference.com/w/cpp/utility/program/longjmp

There are several popular "C" libraries that use them - jpeglib comes to mind, but there are others. Also certain language's runtime environment uses it. There is a way to handle it, but you need to take care.

mihai_ionic 1 day ago

Even in C, setjmp/longjmp leave all non-volatile automatic storage duration variables on the current stack frame in an indeterminate state if they are modified between the two calls.

To get defined behaviour with regards to destructors, you simply have to make sure that the function calling setjmp doesn't internally use RAII. Call a wrapper function to do the setjmp and you're safe.

Of course it's a whole other question whether using setjmp/longjmp to implement "exception handling in C" is a good idea in the first place.

--

http://lua-users.org/wiki/LuaFaq

http://lua-users.org/wiki/LuaDevelopmentModel

--

"

Instruction types

Examples of operations common to many instruction sets include: Data handling and Memory operations

    set a register to a fixed constant value
    move data from a memory location to a register, or vice versa. Used to store the contents of a register, result of a computation, or to retrieve stored data to perform a computation on it later.
    read and write data from hardware devices

Arithmetic and Logic operations

    add, subtract, multiply, or divide the values of two registers, placing the result in a register, possibly setting one or more condition codes in a status register
    perform bitwise operations, e.g., taking the conjunction and disjunction of corresponding bits in a pair of registers, taking the negation of each bit in a register
    compare two values in registers (for example, to see if one is less, or if they are equal)

Control flow operations

    branch to another location in the program and execute instructions there
    conditionally branch to another location if a certain condition holds
    indirectly branch to another location, while saving the location of the next instruction as a point to return to (a call)

Complex instructions

CISC processors include "complex" instructions in their instruction set. A single "complex" instruction does something that may take many instructions on other computers.[citation needed] Such instructions are typified by instructions that take multiple steps, control multiple functional units, or otherwise appear on a larger scale than the bulk of simple instructions implemented by the given processor. Some examples of "complex" instructions include:

    saving many registers on the stack at once
    moving large blocks of memory
    complex and/or floating-point arithmetic (sine, cosine, square root, etc.)[clarification needed]
    performing an atomic test-and-set instruction
    instructions that combine ALU with an operand from memory rather than a register

A complex instruction type that has become particularly popular recently[citation needed] is the SIMD or Single-Instruction Stream Multiple-Data Stream operation or vector instruction, that is an operation that performs the same arithmetic operation on multiple pieces of data at the same time. SIMD have the ability of manipulating large vectors and matrices in minimal time. SIMD instructions allow easy parallelization of algorithms commonly involved in sound, image, and video processing. Various SIMD implementations have been brought to market under trade names such as MMX, 3DNow! and AltiVec?.

Specialised processor types like GPUs for example also provide complex instruction sets. Nonetheless many of these specialised processor complex instruction sets do not have a publicly available native instruction set and native assembly language for proprietary hardware related reasons and are usually only accessible to software developers through standardized higher level languages and APIs. The OpenGL? virtual instruction set and virtual assembly language ARB assembly language and CUDA are examples of such hardware abstraction layers on top of the specialised processor native instruction set.

...

On systems with multiple processors, non-blocking synchronization algorithms are much easier to implement[citation needed] if the instruction set includes support for something such as "fetch-and-add", "load-link/store-conditional" (LL/SC), or "atomic compare and swap".

"

" Architecture

    Harvard
    Modified Harvard
    von Neumann
    Dataflow
    Comparison

"

"

An instruction set, or instruction set architecture (ISA), is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O.

...

    0-operand (zero-address machines), so called stack machines: All arithmetic operations take place using the top one or two positions on the stack: push a, push b, add, pop c. For stack machines, the terms "0-operand" and "zero-address" apply to arithmetic instructions, but not to all instructions, as 1-operand push and pop instructions are used to access memory.
    1-operand (one-address machines), so called accumulator machines, include early computers and many small microcontrollers: most instructions specify a single right operand (that is, constant, a register, or a memory location), with the implicit accumulator as the left operand (and the destination if there is one): load a, add b, store c. A related class is practical stack machines which often allow a single explicit operand in arithmetic instructions: push a, add b, pop c.
    2-operand — many CISC and RISC machines fall under this category:
        CISC — often load a,reg1; add reg1,b; store reg1,c on machines that are limited to one memory operand per instruction; this may be load and store at the same location
        CISC — move a->c; add c+=b.
        RISC — Requiring explicit memory loads, the instructions would be: load a,reg1; load b,reg2; add reg1,reg2; store reg2,c
    3-operand, allowing better reuse of data:[4]
        CISC — It becomes either a single instruction: add a,b,c, or more typically: move a,reg1; add reg1,b,c as most machines are limited to two memory operands.
        RISC — arithmetic instructions use registers only, so explicit 2-operand load/store instructions are needed: load a,reg1; load b,reg2; add reg1+reg2->reg3; store reg3,c; unlike 2-operand or 1-operand, this leaves all three values a, b, and c in registers available for further reuse.[4]
    more operands—some CISC machines permit a variety of addressing modes that allow more than 3 operands (registers or memory accesses), such as the VAX "POLY" polynomial evaluation instruction.

"

" http://en.wikipedia.org/wiki/Addressing_mode

...

Note that there is no generally accepted way of naming the various addressing modes. In particular, different authors and computer manufacturers may give different names to the same addressing mode, or the same names to different addressing modes. Furthermore, an addressing mode which, in one given architecture, is treated as a single addressing mode may represent functionality that, in another architecture, is covered by two or more addressing modes. For example, some complex instruction set computer (CISC) computer architectures, such as the Digital Equipment Corporation (DEC) VAX, treat registers and literal/immediate constants as just another addressing mode. Others, such as the IBM System/390 and most reduced instruction set computer (RISC) designs, encode this information within the instruction. Thus, the latter machines have three distinct instruction codes for copying one register to another, copying a literal constant into a register, and copying the contents of a memory location into a register, while the VAX has only a single "MOV" instruction.

...

Most RISC machines have only about five simple addressing modes, while CISC machines such as the DEC VAX supermini have over a dozen addressing modes, some of which are quite complicated. The IBM System/360 mainframe had only three addressing modes; a few more have been added for the System/390.

When there are only a few addressing modes, the particular addressing mode required is usually encoded within the instruction code (e.g. IBM System/390, most RISC). But when there are lots of addressing modes, a specific field is often set aside in the instruction to specify the addressing mode. The DEC VAX allowed multiple memory operands for almost all instructions, and so reserved the first few bits of each operand specifier to indicate the addressing mode for that particular operand. Keeping the addressing mode specifier bits separate from the opcode operation bits produces an orthogonal instruction set.

Even on a computer with many addressing modes, measurements of actual programs[5] indicate that the simple addressing modes listed below account for some 90% or more of all addressing modes used. Since most such measurements are based on code generated from high-level languages by compilers, this reflects to some extent the limitations of the compilers being used.[6]

...

Useful side effect

Some processors, such as Intel x86 and the IBM/390, have a Load effective address instruction.[7] This performs a calculation of the effective operand address, but instead of acting on that memory location, it loads the address that would have been accessed into a register. This can be useful when passing the address of an array element to a subroutine. It may also be a slightly sneaky way of doing more calculation than normal in one instruction; for example, using such an instruction with the addressing mode "base+index+offset" (detailed below) allows one to add two registers and a constant together in one instruction.

...

Simple addressing modes for code

    Absolute
    PC-relative
    Register indirect

...

...typical jumps are to nearby instructions (in a high-level language most if or while statements are reasonably short). Measurements of actual programs suggest that an 8 or 10 bit offset is large enough for some 90% of conditional jumps.[8]

...Andrew Tanenbaum showed that 98% of all the constants in a program would fit in 13 bits (see RISC design philosophy)....

...

Another advantage of (PC-relative) addressing is that the code may be position-independent, i.e. it can be loaded anywhere in memory without the need to adjust any addresses.

...

Some versions of (the PC-relative) addressing mode may be conditional referring to two registers ("jump if reg1=reg2"), one register ("jump unless reg1=0") or no registers, implicitly referring to some previously-set bit in the status register. See also conditional execution below.

...

Sequential addressing modes

    ordinary sequential execution
    conditional execution: "Some computer architectures (such as ARM) have conditional instructions (or conditional loads, such as x86) which can in some cases obviate the need for conditional branches and avoid flushing the instruction pipeline. An instruction such as a 'compare' is used to set a condition code, and subsequent instructions include a test on that condition code to see whether they are obeyed or ignored"

...

Simple addressing modes for data

    Register (or Register Direct): reg1 := reg2 * reg3
    Base plus offset, and variations: reg := RAM[base + offset] (offset is a constant)
    Immediate/literal: reg1 := reg2 + constant
    Implicit: no address or register is specified because it is hardcoded in the choice of instruction; e.g. "Implied addressing was quite common on older computers (up to mid-1970s). Such computers typically had only a single register in which arithmetic could be performed—the accumulator. ... Among the x86 instructions, some use implicit registers for one of the operands or results (multiplication, division, counting conditional jump)."

... Other addressing modes for code or data ...

    Base plus index: address = contents of specified base register + contents of specified index register
    Scaled: contents of specified base register + scaled contents of specified index register. The scale factor is normally restricted to being a power of two, so that shifting rather than multiplication can be used.
    Base plus index plus offset: address = offset + contents of specified base register + contents of specified index register. Here, too, the processor may scale the index register to allow for the size of each array element (when this is used to index into arrays).
    Register indirect
    Memory indirect
    PC-relative: e.g. the x86-64 architecture supports "RIP-relative" addressing, which uses the 64-bit instruction pointer RIP as a base register. This encourages position-independent code.

Obsolete addressing modes

    Multi-level memory indirect: the word referenced for memory-indirect addressing could itself have an indirect flag set to indicate another memory indirect cycle. Care is needed to ensure that a chain of indirect addresses does not refer to itself; if it did, one could get an infinite loop while trying to resolve an address.
    Memory-mapped registers: on some computers, the registers were regarded as occupying the first 8 or 16 words of memory (e.g. ICL 1900, DEC PDP-10). This meant that there was no need for a separate "Add register to register" instruction — you could just use the "add memory to register" instruction.
    Register autoincrement indirect (from other section): after determining the effective address, the value in the base register is incremented by the size of the data item that is to be accessed... This addressing mode has a side effect in that the base register is altered. If the subsequent memory access causes an error (e.g. page fault, bus error, address error) leading to an interrupt, then restarting the instruction becomes much more problematic.
    Memory indirect, autoincrement
    Zero page: processors with very few internal registers... Arithmetic and logical instructions were mostly performed against values in memory as opposed to internal registers. As a result, many instructions required a two byte (16-bit) location to memory. Given that opcodes on these processors were only one byte (8-bit) in length, memory addresses could make up a significant part of code size. Designers of these processors included a partial remedy known as "zero page" addressing. The initial 256 bytes of memory ($0000 - $00FF; a.k.a. page "0") could be accessed using a one byte absolute or indexed memory address. This reduced instruction execution time by one clock cycle and instruction length by one byte. As a result, the zero page was used similarly to a register file.
    Direct page: the zero page address mode was enhanced... The new mode, known as "direct page" addressing, added the ability to move the 256 byte zero page memory window from the start of memory (offset address $0000) to a new location within the first 64KB of memory.
    Scaled index with bounds checking: this is similar to scaled index addressing, except that the instruction has two extra operands (typically constants), and the hardware would check that the index value was between these bounds. Another variation uses vector descriptors to hold the bounds; this makes it easy to implement dynamically allocated arrays and still have full bounds checking.
    Register indirect to byte within word: the DEC PDP-10 computer used 36-bit words. It had a special addressing mode which allowed memory to be treated as a sequence of bytes (bytes could be any size from 1 bit to 36 bits). A one-word sequence descriptor held the current word address within the sequence, a bit position within a word, and the size of each byte. Instructions existed to load and store bytes via this descriptor, and to increment the descriptor to point at the next byte (bytes were not split across word boundaries).
    Index next instruction: the Elliott 503,[13] the Elliott 803,[13][14] and the Apollo Guidance Computer only used absolute addressing, and did not have any index registers. Thus, indirect jumps, or jumps through registers, were not supported in the instruction set. Instead, it could be instructed to add the contents of the current memory word to the next instruction. Adding a small value to the next instruction to be executed could, for example, change a JUMP 0 into a JUMP 20, thus creating the effect of an indexed jump. Note that the instruction is modified on-the-fly and remains unchanged in memory, i.e. it is not self-modifying code. If the value being added to the next instruction was large enough, it could modify the opcode of that instruction as well as or instead of the address.

...

Many computers (such as x86 and AVR) have one special-purpose register called the stack pointer which is implicitly incremented or decremented when pushing or popping data from the stack, and the source or destination effective address is (implicitly) the address stored in that stack pointer.

...

Many 32-bit computers (such as 68000, ARM, or PowerPC?) have more than one register which could be used as a stack pointer—and so use the "register autoincrement indirect" addressing mode to specify which of those registers should be used when pushing or popping data from a stack.

...

"

http://en.wikipedia.org/wiki/Flynn%27s_Taxonomy

"

The term "von Neumann bottleneck" was coined by John Backus in his 1977 ACM Turing Award lecture. According to Backus:

    Surely there must be a less primitive way of making big changes in the store than by pushing vast numbers of words back and forth through the von Neumann bottleneck. Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger conceptual units of the task at hand. Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant data itself, but where to find it.[22][23]"

hmm... makes me wonder.. if you had a more massively parallel architecture, but still not to the level of one CPU per data object, you could at least have the main CPU 'delegate tasks' to 'local CPUs', e.g. "ok, copy this memory from here to there"

--

a pedagogical instruction set for the 'little man computer':

" Instructions Numeric code Mnemonic code Instruction Description 1xx ADD ADD Add the value stored in mailbox xx to whatever value is currently on the accumulator (calculator).

    Note: the contents of the mailbox are not changed, and the actions of the accumulator (calculator) are not defined for add instructions that cause sums larger than 3 digits.

2xx SUB SUBTRACT Subtract the value stored in mailbox xx from whatever value is currently on the accumulator (calculator).

    Note: the contents of the mailbox are not changed, and the actions of the accumulator are not defined for subtract instructions that cause negative results - however, a negative flag will be set so that 8xx (BRP) can be used properly.

3xx STA STORE Store the contents of the accumulator in mailbox xx (destructive).

    Note: the contents of the accumulator (calculator) are not changed (non-destructive), but contents of mailbox are replaced regardless of what was in there (destructive)

5xx LDA LOAD Load the value from mailbox xx (non-destructive) and enter it in the accumulator (destructive). 6xx BRA BRANCH (unconditional) Set the program counter to the given address (value xx). That is, value xx will be the next instruction executed. 7xx BRZ BRANCH IF ZERO (conditional) If the accumulator (calculator) contains the value 000, set the program counter to the value xx. Otherwise, do nothing.

    Note: since the program is stored in memory, data and program instructions all have the same address/location format.

8xx BRP BRANCH IF POSITIVE (conditional) If the accumulator (calculator) is 0 or positive, set the program counter to the value xx. Otherwise, do nothing.

    Note: since the program is stored in memory, data and program instructions all have the same address/location format.

901 INP INPUT Go to the INBOX, fetch the value from the user, and put it in the accumulator (calculator)

    Note: this will overwrite whatever value was in the accumulator (destructive)

902 OUT OUTPUT Copy the value from the accumulator (calculator) to the OUTBOX.

    Note: the contents of the accumulator are not changed (non-destructive).

000 HLT/COB HALT/COFFEE BREAK Stop working. DAT DATA This is an assembler instruction which simply loads the value into the next available mailbox. DAT can also be used in conjunction with labels to declare variables. For example, DAT 984 will store the value 984 into a mailbox. "
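a minimal Python interpreter sketch for the Little Man Computer instruction set above (input handling and the negative-flag details are simplified):

    def run_lmc(mailboxes, inputs):
        acc, pc, out, negative = 0, 0, [], False
        inputs = list(inputs)
        while True:
            instr = mailboxes[pc]; pc += 1
            op, xx = instr // 100, instr % 100
            if instr == 0:      break                          # 000 HLT
            elif instr == 901:  acc = inputs.pop(0)            # INP
            elif instr == 902:  out.append(acc)                # OUT
            elif op == 1:       acc += mailboxes[xx]           # ADD
            elif op == 2:                                      # SUB (sets the negative flag)
                negative = mailboxes[xx] > acc
                acc -= mailboxes[xx]
            elif op == 3:       mailboxes[xx] = acc            # STA
            elif op == 5:       acc = mailboxes[xx]            # LDA
            elif op == 6:       pc = xx                        # BRA
            elif op == 7:       pc = xx if acc == 0 else pc            # BRZ
            elif op == 8:       pc = xx if not negative else pc        # BRP
        return out

    # add two inputs: INP; STA 10; INP; ADD 10; OUT; HLT
    program = [901, 310, 901, 110, 902, 0] + [0] * 94
    print(run_lmc(program, [7, 5]))   # [12]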

another pedagogical instruction set, CARDIAC:

" CARDIAC had a 10 instruction machine language. An instruction consisted of three decimal digits (the sign is ignored). The first digit was the op code (O), the second and third digits comprised an address (A). Addressing was one of accumulator to memory absolute, absolute memory to accumulator, input to absolute memory and absolute memory to output.

    OAA

High level languages were never developed for CARDIAC, since they would defeat one of the purposes of the device, to introduce concepts of assembly language programming.

Programs were hand assembled, then written, by pencil into the appropriate memory cells.

CARDIAC Instruction Set (opcode / mnemonic / instruction / description):

    0 INP (Input): take a number from the input card and put it in a specified memory cell.
    1 CLA (Clear and add): clear the accumulator and add the contents of a memory cell to the accumulator.
    2 ADD (Add): add the contents of a memory cell to the accumulator.
    3 TAC (Test accumulator contents): performs a sign test on the contents of the accumulator; if minus, jump to a specified memory cell.
    4 SFT (Shift): shifts the accumulator x places left, then y places right, where x is the upper address digit and y is the lower.
    5 OUT (Output): take a number from the specified memory cell and write it on the output card.
    6 STO (Store): copy the contents of the accumulator into a specified memory cell.
    7 SUB (Subtract): subtract the contents of a specified memory cell from the accumulator.
    8 JMP (Jump): jump to a specified memory cell. The current cell number is written in cell 99. This allows for one level of subroutines.
    9 HRS (Halt and reset): move bug to the specified cell, then stop program execution. "

http://en.wikipedia.org/wiki/Dataflow_architecture

" When all of the tagged operands of an instruction become available (that is, output from previous instructions and/or user input), the instruction is marked as ready for execution by an execution unit. "

--

"Unlike standard computer memory (random access memory or RAM) in which the user supplies a memory address and the RAM returns the data word stored at that address, a CAM is designed such that the user supplies a data word and the CAM searches its entire memory to see if that data word is stored anywhere in it. If the data word is found, the CAM returns a list of one or more storage addresses where the word was found (and in some architectures, it also returns the data word, or other associated pieces of data)."

" Ternary CAMs

Binary CAM is the simplest type of CAM which uses data search words consisting entirely of 1s and 0s. Ternary CAM (TCAM) allows a third matching state of "X" or "Don't Care" for one or more bits in the stored dataword, thus adding flexibility to the search. For example, a ternary CAM might have a stored word of "10XX0" which will match any of the four search words "10000", "10010", "10100", or "10110". "

" Content-addressable memory is often used in computer networking devices. For example, when a network switch receives a data frame from one of its ports, it updates an internal table with the frame's source MAC address and the port it was received on. It then looks up the destination MAC address in the table to determine what port the frame needs to be forwarded to, and sends it out on that port. The MAC address table is usually implemented with a binary CAM so the destination port can be found very quickly, reducing the switch's latency. "

--

" A systolic array is composed of matrix-like rows of data processing units called cells. Data processing units (DPUs) are similar to central processing units (CPU)s, (except for the usual lack of a program counter,[1] since operation is transport-triggered, i.e., by the arrival of a data object)...Systolic arrays are arrays of DPUs which are connected to a small number of nearest neighbour DPUs in a mesh-like topology. "

--

http://www.randomhacks.net/articles/2007/03/05/three-things-i-dont-understand-about-monads

http://research.microsoft.com/en-us/um/people/simonpj/papers/haskell-retrospective/index.htm

--

" GUI programs typically use an event queue to translate this multi-threaded system into a sequential one, at least from the programmer’s point of view. Each user action is handled one at a time, ignoring further user actions until the previous one is completely handled. The conversion from a multi-threaded process to a single-threaded one greatly simplifies the implementation of GUI programs.

Despite the programming convenience provided by a purely sequential event queue, certain situations require a less rigid dialog with the user:

    Nested event handling: In the process of handling an event, it may be necessary to obtain further information from the user. Usually, such information is obtained via a modal dialog; in whatever fashion the input is obtained, more user events must be received and handled before the original event is completely handled. To allow the further processing of events, the handler for the original event must explicitly yield to the system. Yielding causes events to be handled in a nested manner, rather than in a purely sequential manner.
    Asynchronous event handling: An application may consist of windows that represent independent dialogs with the user. For example, a drawing program might support multiple drawing windows, and a particularly time-consuming task in one window (e.g., a special filter effect on an image) should not prevent the user from working in a different window. Such an application needs sequential event handling for each individual window, but asynchronous (potentially parallel) event handling across windows. In other words, the application needs a separate event queue for each window, and a separate event-handling thread for each event queue.

An eventspace is a context for processing GUI events. Each eventspace maintains its own queue of events, and events in a single eventspace are dispatched sequentially by a designated handler thread. An event-handling procedure running in this handler thread can yield to the system by calling yield, in which case other event-handling procedures may be called in a nested (but single-threaded) manner within the same handler thread. Events from different eventspaces are dispatched asynchronously by separate handler threads. "
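a very small Python model of the eventspace idea (all names invented; plain queues and threads rather than any real GUI toolkit): one queue plus one handler thread per "window", so events within a window stay sequential while the two windows proceed independently.

    import queue, threading, time

    def make_eventspace(name):
        q = queue.Queue()
        def handler_loop():
            while True:
                event = q.get()           # events in this eventspace are handled one at a time
                print(name, "handling", event)
                time.sleep(0.01)          # pretend the handler does some work
        threading.Thread(target=handler_loop, daemon=True).start()
        return q

    drawing, preview = make_eventspace("drawing"), make_eventspace("preview")
    for i in range(3):
        drawing.put("click %d" % i)       # sequential within 'drawing'...
        preview.put("repaint %d" % i)     # ...but asynchronous relative to 'preview'
    time.sleep(0.2)                       # let the daemon handler threads drain the queues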

--

" I totally agree with keeping the compiler very simple

This puts quite a few constraints on how your language is designed. Optimizing Compilers for Structured Programming Languages is an excellent read. By Peter A Jonsson at Mon, 2006-06-05 08:02

"

--