proj-oot-duffyRelatedNotes

I read some blog posts by Joe Duffy, who worked on Singularity and Midori. Related notes are here. There are also some notes in plPartConcurrency (grep 'duffy'), particularly in the STM (transactions) section. A summary of my conclusions for Oot is in ootConcurrencyNotes4.

---

great writeup on the experimental Singularity OS:

http://read.seas.harvard.edu/cs261/2011/singularity.html

---

notes from

http://read.seas.harvard.edu/cs261/2011/singularity.html

" The Sing# programming language was extended in several directions to make certain programming errors simply impossible (it “eliminates many preventable defects”). The biggest example is inter-process communication, or contract-based channels. IPC is defined by state-machine-like contracts whose specifications are verified by the compiler. This ensures that every process has explicit code to handle every possible message ... Singularity also addresses robustness by seriously limiting what processes can do. Singularity processes are sealed. They cannot load libraries dynamically, modify their own code, or share memory with other processes. Some serious limitations: just-in-time compilation is impossible in sealed processes, for example. ... Singularity is flexible enough to evaluate many different levels of hardware isolation...

    “No runtime checks”: No bounds checks, null pointer checks, etc.
    “Physical memory”: The default Singularity mode (for which the OS was named); segmentation but effectively no paging
    “Add 4KB Pages”: Turn on paging.
        What are the alternatives to 4 KB pages on IA–32? (There are two, PTE_PS and PTE_G.) Why would these alternatives perform better than 4 KB pages?
    “Add Separate Domain”: Introduce a separate, duplicate page table for the application process, so that context switches require lcr3.
    “Add Ring 3”: Give the application process user privilege. Now context switches require protected control transfer (syscall/interrupts), which is expensive.
    “Full Microkernel”: All of the drivers are also in their own separate privilege domains.
        This is the most surprising result to me—I’m shocked that the additional control transfers required for “full microkernel” shift overhead only from 33% to 37.7%.

... Each Singularity process has its own page-disjoint heap. That is, no process can ever access objects in another process’s heap, and the heaps are disjoint at the level of pages, not objects. All of process A’s objects live on process A’s pages, which are disjoint from any other process’s pages ... Since normal heap data can’t be shared, a separate, explicitly-managed memory area called the exchange heap is used for message passing. Exchange heap objects must have an exchangeable type.
    “Exchangeable types encompass …. all values that can be sent from one process to another. They consist of scalars [e.g. numbers], rep structs (structs of exchangeable types), and pointers to exchangeable types. Pointers can either point to a single exchangeable value or to a vector of values.” [p179, 4] Channel endpoints are also exchangeable [p3, 1].

Exchangeable objects are thus relatively simple—think flat objects, or objects with pointers to simpler objects, such as a “packet” type that points to an array of bytes. (It is not clear from the papers whether recursive data structures are exchangeable, or more specifically, whether processes can construct circular structures in the exchange heap, since the exchange heap is reference counted [p6, 1].)

The kernel is ultimately responsible for managing the exchange heap’s memory; for example, it garbage collects the exchange heap to eliminate objects held by exited SIPs. But recall that for robustness, Singularity also prevents processes from simultaneously accessing objects in shared memory. Regular heaps are pagewise disjoint, but the exchange heap is explicitly designed for inter-process communication. How can Singularity prevent shared memory access in the exchange heap?

The answer: a fancy type system. Sing# was extended to support a linear type discipline for exchange heap objects. Linear types ensure that each process can have at most one pointer to an exchange heap object at a time. When a process sends a message, the type of the send “system call” forces the sending process to lose that sole pointer to the message. As a result, and because of memory safety, the process also loses the ability to modify the message, and each exchange heap object is accessible to at most one process at a time.

...

In contracts receive events are implicit—an endpoint must receive all previous messages in a contract line before sending a message itself "

on Sing# channel contracts, this is reminiscent of https://en.wikipedia.org/wiki/Axum_(programming_language) , another MS project that some of the same people worked on:

" A Channel contains input and output ports, which are queues which are used to send data to an agent or receive data from one. To co-ordinate the communication between agents, Axum allows each channel to have a user-defined protocol for communication. The protocol is defined as a state machine.[5] The data sent over a channel can be optionally restricted to conform to a certain pre-defined schema. The compiler and runtime will enforce the conformance with the schema.[6] Under the hood, a schema is translated into a serializable .NET class that contains only properties and side effect-free methods.[7] "

---

" Were we to replace all those try/catch/rethrow blocks of code with transactions, we could do away with this error prone spaghetti. We’d also eliminate try/filter exploits thanks to Windows/Win32 2-pass SEH. Sometimes I wish we focused on this simple step forward, forgot about concurrency-safety, and baby stepped our way forward. " -- http://joeduffyblog.com/2010/01/03/a-brief-retrospective-on-transactional-memory/

---

" I had always been a languages wonk. So, I was naturally fascinated by the decades of deep work in academia, including programming language and runtime symbiosis (especially Cilk and NESL), advanced type systems, and even specialized parallel hardware architectures (especially radical ones like the Connection Machine, and MIMD supercomputers, that innovated beyond our trustworthy pal, von Neumann). " -- http://joeduffyblog.com/2016/11/30/15-years-of-concurrency/

---

"In a Tale of Three Safeties, we discussed three kinds of safety: type, memory, and concurrency" -- http://joeduffyblog.com/2016/11/30/15-years-of-concurrency/

---

" STM could handle all of the decisions transparently like figuring out how coarse- or fine-grained synchronization to use, the contention policies around that synchronization, deadlock detection and prevention, and guarantee that you didn’t forget to lock when accessing a shared data structure. All behind a tantalizingly simple keyword, atomic.

STM also came with simple, more declarative, coordination mechanisms, like orElse. So, although the focus was on eliminating the need to manually manage locking, it also helped evolve synchronization between threads. "

-- http://joeduffyblog.com/2016/11/30/15-years-of-concurrency/

---

" Actors, message-passing, data parallel, auto-vectorization, …; the titles roll off the tongue, and yet none dominates and pervades " -- http://joeduffyblog.com/2010/01/03/a-brief-retrospective-on-transactional-memory/

---

" Contrast this elegant simplicity ((of STM))) with the many pitfalls of locks:

    Data races. Like forgetting to hold a lock when accessing a certain piece of data. And other flavors of data races, such as holding the wrong lock when accessing a certain piece of data. Not only do these issues not exist, but the solution is not to add countless annotations associating locks with the data they protect; instead, you declare the scope of atomicity, and the rest is automatic.
    Reentrancy. Locks don’t compose. Reentrancy and true recursive acquires are blurred together. If a locked region expects reentrancy, usually due to planned recursion, life is good; if it doesn’t, life is bad. This often manifests as virtual calls that reenter the calling subsystem while invariants remain broken due to a partial state transition. At that point, you’re hosed.
    Performance. The tension between fine-grained locking (better scalability) versus coarse-grained locking (simplicity and superior performance due to fewer lock acquire/release calls) is ever-present. This tension tugs on the cords of correctness, because if a lock is not held for long enough, other threads may be able to access data while invariants are still broken. Scalability pulls you to engage in a delicate tip-toe right up to the edge of the cliff.
    Deadlocks. This one needs no explanation." -- http://joeduffyblog.com/2010/01/03/a-brief-retrospective-on-transactional-memory/

---

this HN commentator apparently likes part of what i've been calling Milewski's type system:

sriram_malhar (Hacker News):

One of the things that this blog ought to have clearly mentioned is that all this business about static analysis, ownership types, read-write annotations, purity, immutability and all that is due to a simple conflation of two axes that ought to be orthogonal: mutation and aliasing.

This is what Rust gets right: you can have an object be mutable (in which case you have only one pointer to it), or you can multiply alias pointers, all of which can only read the object. The type system enforces a static RW locking scheme in a manner of speaking; it ensures that only one of those capabilities is available at any one point in time.

and so does Midori:

"They key insight driving the formalism here was that no two “threads” sharing an address space were permitted to see the same object as mutable at the same time" [1]

(the same author elsewhere says "side-effects and ownership of memory are very important concepts" [2])
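
my note: a tiny Rust sketch of that "static RW lock" (my example, not from the comment): any number of read-only aliases, or exactly one mutable reference, never both at once.

    fn main() {
        let mut v = vec![1, 2, 3];

        let r1 = &v;                  // many shared (read-only) aliases are fine...
        let r2 = &v;
        println!("{} {}", r1[0], r2[0]);

        let m = &mut v;               // ...or a single mutable reference,
        m.push(4);                    // but never both at the same time.
        // println!("{}", r1[0]);     // compile error: `v` is mutably borrowed
        println!("{:?}", v);
    }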

---

"Parallel Language Integrated Query (PLINQ) ...

The idea was pretty straightforward. Take existing LINQ queries, which already featured operations like maps, filters, and aggregations – things that were classically parallelizable in both languages and databases – and auto-parallelize them. Well, it couldn’t be implicit, because of side-effects. But all it took was a little AsParallel to enable ... This is essentially MapReduce, on a single machine, across many processors."
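
my note: the same shape exists in Rust via the third-party rayon crate (an analogy to PLINQ, not related to it): swap iter() for par_iter() and the filter/map/reduce pipeline runs across cores, with the borrow checker playing the role of the side-effect discipline.

    use rayon::prelude::*; // assumes the `rayon` crate is a dependency

    fn main() {
        let data: Vec<i64> = (0..1_000_000).collect();
        // Sequential: data.iter().filter(...).map(...).sum()
        // Parallel: switch to par_iter(), much like adding AsParallel().
        let sum: i64 = data.par_iter()
            .filter(|&&x| x % 2 == 0)
            .map(|&x| x * 3)
            .sum();
        println!("{sum}");
    }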

---

" I named the team “PFX”, initially short for “parallel frameworks”, although by the time we shipped marketing working its magic on us, renaming it to “Parallel Extensions to .NET.” This team’s initial deliverable encompassed PLINQ, task parallelism, and a new effort, Coordination Data Structures (CDS), meant to handle advanced synchronization efforts, like barrier-style synchronization, concurrent and lock-free collections derived from many great research papers, and more. ... The Task<T> abstraction was born, we rewrote PLINQ to use it, and created a suite of Parallel APIs for common patterns such as fork/join and parallel for and foreach loops.

Before shipping, we replaced the guts of the thread-pool with our new shiny work-stealing scheduler, delivering unified resource management within a process, so that multiple schedulers wouldn’t fight with one another. To this day, the code is almost identical to my early implementation in support of PLINQ (with many bug fixes and improvements, of course).

We really obsessed over the usability of a relatively small number of APIs for a long time. Although we made mistakes, I’m glad in hindsight that we did so. We had a hunch that Task<T> was going to be core to everything we did in the parallelism space but none of us predicted the widespread usage for asynchronous programming that really popularized it years later. Now-a-days, this stuff powers async and await and I can’t imagine life without Task<T>. "

---

" This was the first mainstream attempt I saw to raise the abstraction level beyond threads, locks, and events, to something more approachable: concurrent collections, fork/join, and more. "

---

Axum:

" The main idiom of programming in Axum is an Agent (or an Actor)...Agents do not share resources like memory...instead a message passing model is used....Axum allows each channel to have a user-defined protocol for communication. The protocol is defined as a state machine.[5] The data sent over a channel can be optionally restricted to conform to a certain pre-defined schema. The compiler and runtime will enforce the conformance with the schema.[6] Under the hood, a schema is translated into a serializable .NET class that contains only properties and side effect-free methods.[7] "

---

Midori:

"

    Isolation is paramount, and we will embrace it wherever possible.
    Message passing will connect many such isolated parties through strongly typed RPC interfaces.
    Namely, inside of a process, there exists a single message loop, and, by default, no extra parallelism.
    A “promises-like” programming model will be first class so that:
        Synchronous blocking is disallowed.
        All asynchronous activity in the system is explicit.
        Sophisticated coordination patterns are possible without resorting to locks and events."
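
my note: a toy Rust analogue of one such isolated party (my sketch; Midori's runtime was far richer): a single-threaded message loop whose only connection to the outside is a typed channel.

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel::<String>();

        // One "process": a single message loop, no internal parallelism,
        // all of its state private to the loop.
        let pump = thread::spawn(move || {
            for msg in rx {
                println!("handled: {msg}");
            }
        });

        tx.send("ping".to_string()).unwrap();
        tx.send("pong".to_string()).unwrap();
        drop(tx); // close the channel so the loop terminates
        pump.join().unwrap();
    }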

---

" (Since then, I should note, there have been some nice efforts, like Orleans https://github.com/dotnet/orleans – built in part by some ex-Midori members – TPL Dataflow https://msdn.microsoft.com/en-us/library/hh228603(v=vs.110).aspx , and Akka.NET http://getakka.net/ . If you want to do actors and/or message passing in .NET today, I recommend checking them out.) "


midori

" Midori, on the other hand, embraced numerous levels of isolation, beginning with processes themselves, which were even cheaper than Windows threads thanks to software isolation. Even coarser-grained isolation was available in the form of domains, adding added belts-and-suspenders hardware protection for hosting untrusted or logically separate code. In the early days, we certainly wanted to go finer-grained too – inspired by E’s concept of “vats” http://www.erights.org/elib/concurrency/vat.html , the abstraction we already began with for process message pumps – but weren’t sure how to do it safely. So we waited on this. But this gave us precisely what we needed for a robust, performant, and safe message passing foundation.

Important to the discussion of this architecture is the notion of shared nothing, something Midori leveraged as a core operating principle. Shared nothing architectures are great for reliability, eliminating single points of failure, however they are great for concurrency safety too. If you don’t share anything, there is no opportunity for race conditions! (This is a bit of a lie, and generally insufficient, as we shall see.)

It’s interesting to note that we were noodling on this around the same time Node.js was under development. The core idea of an asynchronous, non-blocking, single process-wide event loop, is remarkably similar. Perhaps something tasty was in the water during 2007-2009. In truth, many of these traits are common to event-loop concurrency https://en.wikipedia.org/wiki/Event_loop

This formed the canvas on top of which the entire concurrency model was painted. I’ve already discussed this in the asynchronous everything http://joeduffyblog.com/2015/11/19/asynchronous-everything/ article. But there was more… "

---

linear logic can be used for unique typing for prevention of shared memory:

" The kernel is ultimately responsible for managing the exchange heap’s memory; for example, it garbage collects the exchange heap to eliminate objects held by exited SIPs ((Software Isolated Processes)). But recall that for robustness, Singularity also prevents processes from simultaneously accessing objects in shared memory. Regular heaps are pagewise disjoint, but the exchange heap is explicitly designed for inter-process communication. How can Singularity prevent shared memory access in the exchange heap?

The answer: a fancy type system. Sing# was extended to support a linear type discipline for exchange heap objects. Linear types ensure that each process can have at most one pointer to an exchange heap object at a time. When a process sends a message, the type of the send “system call” forces the sending process to lose that sole pointer to the message. As a result, and because of memory safety, the process also loses the ability to modify the message, and each exchange heap object is accessible to at most one process at a time. " -- http://read.seas.harvard.edu/cs261/2011/singularity.html

more on this:

" In a nutshell, a linear type lets you prove that a variable is used exactly once. This is similar to isolated, however due to the aliasing properties of an imperative language like C# (especially for heap structures with possible cycles between them), the simple and elegant model of strict linearity is hard to make work.

Linear types themselves also aren’t very commonly seen in the wild, and are mostly useful for their mathematical and proof properties. If you go looking, you will find examples, however. More than real syntax in real languages, linear types have been hugely influential on subsequent innovations in type systems that also impacted us, such as affine and uniqueness types. " -- [3]
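
my note: Rust's move semantics are a mainstream descendant of this idea: sending a value down a channel consumes the sender's only handle to it, a loose analogue of the Sing# send "system call" (my example, not Sing#).

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel::<Vec<u8>>();
        let consumer = thread::spawn(move || rx.recv().unwrap());

        let packet = vec![1, 2, 3];
        tx.send(packet).unwrap();  // send() takes `packet` by value: ownership moves
        // packet.len();           // compile error: value was moved by the send

        println!("{:?}", consumer.join().unwrap());
    }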

---

midori's "asynchronous programming model looked a lot like C#’s async/await on the surface. That’s not a coincidence. "

---

more on midori from [5]

midori paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/msr-tr-2012-79.pdf

" We started with C#’s type system and added two key concepts: permission and ownership.

Permission

The first key concept was permission.

Any reference could have one and it governed what you could do with the referent object:

    mutable: The target object (graph) can be mutated through the usual ways.
    readonly: The target object (graph) can be read from but cannot be mutated.
    immutable: The target object (graph) can be read from and will never be mutated.

A subtyping relationship meant you could implicitly convert either mutable or immutable to readonly. In other words, mutable <: readonly and immutable <: readonly. ... The default, if unstated, was immutable for primitive types like int, string, etc., and mutable for all others... If we had decided to break from C# more radically – something in hindsight we should have done – then breaking with compatibility and choosing the safer default...the principle of least authority would lead you to choose readonly as the default ... These permissions could also appear on methods, to indicate how the this parameter got used:

class List<T> {
    void Add(T e);
    int IndexOf(T e) readonly;
    T this[int i] { readonly get; set; }
}

A caller would need a sufficient permission in order to invoke a method:

readonly List<Foo> foos = ...;
foos[0] = new Foo();   // error: cannot mutate a readonly object.

A similar thing could be stated using delegate types and lambdas. For example:

delegate void PureFunc<T>() immutable;

This meant that a lambda conforming to the PureFunc interface could only close over immutable state.

Notice how powerful this has suddenly become! This PureFunc is precisely what we would want for a parallel task. As we will see shortly, these simple concepts alone are enough to enable many of those PFX abstractions to become safe.

By default, permissions are “deep”, in that they apply transitively, to the entire object graph. This interacts with generics in the obvious way, however, so that you could, for example, have combinations of deep and shallow permissions:

readonly List<Foo> foos = ...;              // a readonly list of mutable Foos.
readonly List<readonly Foo> foos = ...;     // a readonly list of readonly Foos.
immutable List<Foo> foos = ...;             // an immutable list of mutable Foos.
immutable List<immutable Foo> foos = ...;   // an immutable list of immutable Foos.
// and so on...

There is an interesting twist...((i don't understand, in this paragraph, which way he is saying would be the best way; the paragraph talks about whether a readonly pointer indicates that just the pointer is readonly ('shallow'), whether everything reachable from it is ('deep'), and about how syntactically to override whichever of these is the default)) ... Ownership

The second key concept was ownership.

A reference could be given an ownership annotation, just as it could be given a permission:

    isolated: The target object (graph) forms an unaliased transitive closure of state.

For example:

isolated List<int> builder = new List<int>();

Unlike permissions, which indicate what operations are legal on a given reference, ownership annotations told us important aliasing properties about the given object graphs. An isolated graph has a single “in-reference” to the root object in the object graph, and no “out-references” (except for immutable object references, which are permitted). ... Given an isolated object, we can mutate it in-place...And/or destroy the original reference and transfer ownership to a new one: ... In fact, any expression consuming only isolated and/or immutable inputs and evaluating to a readonly type was implicitly upgradeable to immutable; and, a similar expression, evaluating to a mutable type, was upgradeable to isolated. This meant that making new isolated and immutable things was straightforward using ordinary expressions.

The safety of this also depends on the elimination of ambient authority and leaky construction. "

in the next section he says that what he means by 'ambient authority' appears to be allowing side-effects on external state within functions that are immutable with respect to their arguments, eg "static int x = 42; PureFunc<int> func = () => x++;". Therefore, "All side-effects need to be exposed to the type system". They explored adding an extra annotation type "to say “this function has mutable access to static variables”; however, the mutable permission is already our way of doing that".

They responded the way Haskell does; "As a result, we eliminated all ambient side-effectful operations, leveraging capability objects instead. This obviously covered I/O operations – all of which were asynchronous RPC in our system – but also even – somewhat radically – meant that even just getting the current time, or generating a random number, required a capability object. This let us model side-effects in a way the type-system could see, in addition to reaping the other benefits of capabilities.".

Also "all statics must be immutable. " (by statics i think he means 'globals', or rather, anything which is not an instance variable (?).

Because globals were always immutable, "...we were able to eliminate all class constructor accesses, replacing them with constant offsets, leading to more than a 10% reduction in code size..."

" No Leaky Construction

This brings us to the second “hole” that we need to patch up: leaky constructors.

A leaky constructor is any constructor that shares this before construction has finished. Even if it does so at the “very end” of its own constructor, due to inheritance and constructor chaining, this isn’t even guaranteed to be safe.

So, why are leaky constructors dangerous? Mainly because they expose other parties to partially constructed objects. Not only are such objects’ invariants suspect, particularly in the face of construction failure, however they pose a risk to immutability too.

In our particular case, how are we to know that after creating a new supposedly-immutable object, someone isn’t secretively holding on to a mutable reference? In that case, tagging the object with immutable is a type hole.

We banned leaky constructors altogether. The secret? A special permission, init, that meant the target object is undergoing initialization and does not obey the usual rules. For example, it meant fields weren’t yet guaranteed to be assigned to, non-nullability hadn’t yet been established, and that the reference could not convert to the so-called “top” permission, readonly. Any constructor got this permission by default and you couldn’t override it. We also automatically used init in select areas where it made the language work more seamlessly, like in object initializers.

This had one unfortunate consequence: by default, you couldn’t invoke other instance methods from inside a constructor. (To be honest, this was actually a plus in my opinion, since it meant you couldn’t suffer from partially constructed objects, couldn’t accidentally invoke virtuals from a constructor, and so on.) In most cases, this was trivially worked around. However, for those cases where you really needed to call an instance method from a constructor, we let you mark methods as init and they would take on that permission. "
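
my note: Rust reaches a similar place without an init permission, because there is no `this` during construction: a struct value does not exist until every field is initialized, so a partially constructed object cannot leak. a minimal sketch (my example):

    struct Connection {
        host: String,
        retries: u32,
    }

    impl Connection {
        fn new(host: &str) -> Connection {
            // No partially built `self` exists here to leak or to call
            // virtuals on; the value comes into being fully formed.
            Connection { host: host.to_string(), retries: 3 }
        }
    }

    fn main() {
        let c = Connection::new("example.org");
        println!("{} x{}", c.host, c.retries);
    }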

the lattice of types is:

isolated ----> immutable
    |              |
    v              v
mutable  ----> readonly

Note that the transitions away from 'isolated' are 'consume' transitions, because they must consume the only pointer to the isolated object. Recall that 'readonly' just means that YOU can't write to this object, whereas 'immutable' means there is a guarantee that NO ONE can ever write to the object.
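
my note: a rough Rust rendering of the consume transitions (my example): an owned, unaliased value plays the role of 'isolated', and freezing it behind an Arc consumes that sole handle, producing something effectively 'immutable' that may be shared across threads.

    use std::sync::Arc;
    use std::thread;

    fn main() {
        // Owned and unaliased: roughly `isolated`.
        let data: Vec<i32> = vec![1, 2, 3];

        // Consume transition isolated -> immutable: the sole handle is
        // given up in exchange for a freely shareable frozen graph.
        let frozen: Arc<Vec<i32>> = Arc::new(data);
        // data.len(); // compile error: `data` was moved into the Arc

        let f = Arc::clone(&frozen);
        thread::spawn(move || println!("sum = {}", f.iter().sum::<i32>()))
            .join()
            .unwrap();
    }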

my note: i guess 'isolated' could be renamed 'unique' or 'unaliased' or 'noalias' (unfortunately 'noalias' has a different, more confusing meaning in another context [6]). C99 uses 'restrict'.

example:

" int[] arr = ...; int[] results = await Parallel.Fork( () => await arr.Reduce((x, y) => x+y), () => await arr.Reduce((x, y) => x*y) );

This code computes the sum and product of an array, in parallel, by merely reading from it. It is data race-free.

How so? This example Fork API uses permissions to enforce the required safety:

public static async T[] Fork<T>(params ForkFunc<T>[] funcs);

public async delegate T ForkFunc<T>() readonly; "

in other words, because each ForkFunc (the lambda doing the Reduce, which contains the inner lambdas doing the addition and multiplication in the example) must be readonly, the compiler can prove that applying them in parallel to 'arr' won't mutate arr and so will be safe; and because Fork is static, we know that it can't retain an alias to 'arr' after returning. Note: we also assume that "the mutator is temporally paused for the duration of this fork/join operation"; not sure what this means, but i'm guessing it just means that because the Parallel.Fork is enclosed by an await, we know this thread won't be asynchronously messing with the array in parallel with this stuff, and presumably we also need to guarantee that 'arr' isn't being concurrently accessed by any other threads.
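
my note: Rust's scoped threads (std::thread::scope) encode exactly this structured fork/join shape (my example): both closures may only borrow arr read-only, and the scope blocks until both forks finish, which is the "mutator is temporally paused" guarantee in type-system form.

    use std::thread;

    fn main() {
        let arr = vec![1, 2, 3, 4];

        // Fork two read-only computations; both are joined before the scope exits.
        let (sum, product) = thread::scope(|s| {
            let t1 = s.spawn(|| arr.iter().sum::<i32>());
            let t2 = s.spawn(|| arr.iter().product::<i32>());
            (t1.join().unwrap(), t2.join().unwrap())
        });

        // arr was only ever shared immutably; no data race was possible.
        println!("sum = {sum}, product = {product}");
    }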

"we can now introduce automatic parallelism! There are a few ways to go about this. One way is to never offer LINQ operators that aren’t protected by readonly annotations...my preferred approach"

" Since tasks do not have structured lifetime like our fork/join and data parallel friends above, we cannot permit capture of even readonly state. Remember, the trick that made the above examples work is that the mutator was temporarily paused, something we cannot guarantee here with unstructured task parallelism.

So, what if a task needs to party on mutable state? For that, we have isolated! There are various ways to encode this, however, we had a way to mark delegates to indicate they could capture isolated state too (which had the side-effect of making the delegate itself isolated):

public static Task<T> Run<T>(TaskFunc<T> func);

public async delegate T TaskFunc<T>() immutable isolated;

Now we can linearly hand off entire object graphs to a task, either permanently or temporarily:

isolated int[] data = ...;
Task<int> t = Task.Run([consume data]() => {
    // in here, we own `data`.
}); "
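
my note: the [consume data] capture has a direct Rust counterpart in `move` closures (my example): the task takes ownership of the whole graph, and the spawning code can no longer touch it.

    use std::thread;

    fn main() {
        let data: Vec<i32> = (1..=10).collect(); // plays the role of `isolated int[] data`

        // `move` ~ [consume data]: ownership is linearly handed to the task.
        let t = thread::spawn(move || data.iter().sum::<i32>());
        // data.len(); // compile error: `data` was moved into the task

        println!("{}", t.join().unwrap());
    }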

" Because of the rules around isolation production, mutable objects produced by tasks could become isolated, and readonly object could be frozen to become immutable. "

" The final major programming model enhancement this enabled was fine-grained “actors”, a sort of mini-process inside of a process. I mentioned the vat concept earlier, but that we didn’t know how to make it safe. Finally we had found the missing clue: a vat was really just an isolated bubble of state. Now that we had this concept in the type system, we could permit “marshaling” of immutable and isolated objects as part of the message passing protocol without marshaling of any sort – they could be shared safely by-reference! "

" There was another unanticipated consequence of safe concurrency that I quite liked: sequential consistency (SC). "

" At the bottom of all of the above safe concurrency abstractions, there was indeed unsafe code...this is the only code in the system – plus the 1st party kernel code – that had to deal with concurrency at the threads, locks, events, and lock-free level of abstraction"

" no struct tearing was visible in the 3rd party programming model. Everything was “atomic”, again for free.

This allowed us to use multi-word slice and interface representations, just like Go does, but without the type-safety-threatening races. It turns out, the risk of struct tearing is one of the major factors preventing us from having a great Go-like slice type in C# and .NET. In Midori, slices were safe, efficient, and everywhere. "

"The most worrisome outcome from these kind of races ((higher-level races between message-passing agents)) is time of check time of use (TOCTOU), one of the more common kinds of races that can lead to security vulnerabilities. (Midori’s type- and memory-safety of course helps to avoid this particular symptom, however reliability problems are very real also.)"

" Although people used to hate it when I compared this situation to COM STAs, for those familiar with them, an analogy is apt. If you need to block a thread inside of a COM STA, you must decide: Do I pump the message loop, or do I not pump the message loop? If you choose to pump the message loop, you can suffer reentrancy, and that reentrancy might be witness to broken invariants, or even mutate state out from underneath the blocking call, must to its dismay after it reawakens. If you choose not to pump the message loop, you can suffer deadlock, as calls pile up, possibly ones that are required to unblock the thread.

In Midori’s system, we did not give this choice to the developer. Instead, every await was an opportunity to pump the underlying message loop. Just as with a COM STA, these pumps possibly dispatched work that might interact with shared state. "

" Evolution

Before moving on, a brief word on the evolution of the system. As I mentioned earlier, I presented the system in its final form. In reality, we went through five major phases of evolution. I won’t bore you with exhaustive details on each one, although I will note the major mistakes and lessons learned in each phase.

In the first phase, I tried to build the system entirely out of annotations that were “outside” of the type system. As I’ve already said, that failed spectacularly. At this point, I hope you can appreciate how deeply integrated into the compiler and its type system these concepts need to be for them to work and for the result to be usable.

Next, I tried a variant of this with just readonly. Except I called it readable (a name that would stick until the very tail end of the project), and it was always deep. There was no immutable and there was no isolated. The concept of mutable was called writable, although I was delusional, and thought you’d never need to state it. I was very confused about the role generics played here, and ended up coding myself up into a corner trying to make it work.

After that, I recognized at least that readable and writable were related to one another, and recognized the subtyping relationship of (writable <: readable). And, largely based on conversations with colleagues in MSR, I decided to toss out everything I had done on generics and redo it. It was at that time I recognized that each generic type variable, despite looking like a naked type, actually carried both a permission and a type. That helped.

I then came up with immutable, however it wasn’t what you see today. Instead, it had the slightly confusing meaning of being a “view” over just the immutable subset of data in the target object graph. (This was at first limited to only readonly fields (in the classical C# sense) that were of a primitive type.) If you tried reading a non-immutable part from this view, you’d get a compiler error. Bizarrely, this meant you could have an immutable List<T> that wasn’t actually immutable. In hindsight, this was pretty wonky, but it got us thinking about and discussing immutability.

Somewhere in here, we recognized the need for generic parameterization over permissions, and so we added that. Unfortunately, I originally picked the % character to indicate that a generic type was a permission, which was quite odd; e.g., G<%P> versus G<T>. We renamed this to permission; e.g., G<permission P> versus G<T>.

There was one problem. Generic permissions were needed in way more places than we expected, like most property getters. We experimented with various “shortcuts” in an attempt to avoid developers needing to know about generic permissions. This hatched the readable+ annotation, which was a shortcut for “flow the this parameter’s permission.” This concept never really left the system, although (as we will see shortly), we fixed generics and eventually this concept became much easier to swallow, syntax-wise (especially with smart defaults like auto-properties).

We lived with this system for some time and this was the first version deployed at-scale into Midori.

And then a huge breakthrough happened: we discovered the concepts necessary for isolated and, as a result, an immutable annotation that truly meant that an object (graph) was immutable.

I can’t claim credit for this one. That was the beauty of getting to this stage: after developing and prototyping the initial ideas, and then deploying them at-scale, we suddenly had our best and brightest obsessing over the design of this thing, because it was right under their noses. This was getting an initial idea out in front of “customers” early-and-often at its finest, and, despite some growing pains, worked precisely as designed.

We then wallowed in the system for another year and 1/2 and, frankly, I think lost our way a little bit. It turns out deepness was a great default, but sometimes wasn’t what you wanted. List<T> is a perfect example; sometimes you want the List to be readonly but the elements to be mutable. In the above examples, we took this capability for granted, but it wasn’t always the case. The outer readonly would infect the inner Ts.

Our initial whack at this was to come up with shallow variants of all the permissions. This yielded keywords that became a never-ending source of jokes in our hallways: shreadable, shimmutable, and – our favorite – shisolated (which sounds like a German swear word when said aloud). Our original justification for such nonsense was that in C#, the signed and unsigned versions of some types used abbreviations (sbyte, uint, etc.), and shallow sure would make them quite lengthy, so we were therefore justified in our shortening into a sh prefix. How wrong we were.

From there, we ditched the special permissions and recognized that objects had “layers”, and that outer and inner layers might have differing permissions. This was the right idea, but like most ideas of this nature, we let the system get inordinately more complex, before recognizing the inner beauty and collapsing it back down to its essence.

At the tail end of our project, we were working to integrate our ideas back into C# and .NET proper. That’s when I was adamant that we unify the concept of readable with readonly, leading to several keyword renames. Ironically, despite me having left .NET to pursue this project several years earlier, I was the most optimistic out of anybody that this could be done tastefully. Sadly, it turned out I was wrong, and the project barely got off the ground before getting axed, however the introductory overview above is my best approximation of what it would have looked like. "

" I should note that I experimented with the attribute approach before Roslyn analyzers came on the scene, so I was left with post-build IL analysis. The diagnostics for this were pretty terrible: a different tool, error messages at too low a semantic level, ..., all the usual problems, most of which analyzers solve.

There are places you simply can't place attributes today. It's annoying that this is the limiting factor because it's so "easy" to fix conceptually. However, it's more subtle than it appears, due to the absence of attribute "slots" in various places in the MSIL metadata. (Local variables and generic instantiations are the ones you are left wanting most.) There's a proposal up on GitHub to solve this: https://github.com/dotnet/rosl.... If that was done, we could get much closer. "

" "init" was ... funny. I usually leave it out of the discussion of lattices and top types and whatnot, just because it muddies up the otherwise simple conceptual model, and confuses people. `init` was indeed treated like `mutable` in the sense that you could write to it, however you could not convert one to `readonly`. There were other restrictions to prevent you from storing `this` into it, reading it out, and then aliasing *that*, for example. "

" Effect Types

Effect typing, primarily in the ML community, was also influential in the early days. An effect type propagates information at compile-time describing the dynamic effect(s) executing said code is expected to bring about. This can be useful for checking many properties.

For example, I always thought of await and throws annotations as special kinds of effects that indicate a method might block or throw an exception, respectively. Thanks to the additive and subtractive nature of effect types, they propagate naturally, and are even amenable to parametric polymorphism.

It turns out that permissions can be seen as a kind of effect, particularly when annotating an instance method. In a sense, a mutable instance method, when invoked, has the “effect” of mutating the receiving object. This realization was instrumental in pushing me towards leveraging subtyping for modeling the relationship between permissions.

Related to this, the various ownership systems over the years were also top-of-mind, particularly given Midori’s heritage with Singularity, which used the Spec# language. This language featured ownership annotations. "

" Regions, despite classically being used mostly for deterministic and efficient memory management, were incredibly interesting towards the days of figuring out isolated. They aren’t identical for several reasons, however.

The first reason is that isolated object graphs in our system weren’t as strictly partitioned as regions, due to immutable in- and out- references. Regions are traditionally used to collect memory efficiently and hence dangling references like this wouldn’t be permitted (and the reachability analysis to detect them would basically devolve into garbage collection).

The second reason is that we wanted to avoid the syntactic burden of having regions in the language. A good example of this in action is Deterministic Parallel Java, which requires explicit region annotations on objects using a very generics-like syntax (e.g., Foo<region R>). Some amount of this can be hidden from the developer through more sophisticated compiler analysis – much like Cyclone did – however, we worried that in some very common cases, regions would rear their ugly heads and then the developer would be left confused and dismayed.

All that said, given our challenges with garbage collection, in addition to our sub-process actor model, we often pondered whether some beautiful unification of isolated object graphs and regions awaited our discovery. "

" Separation Logic

Particularly in the search for formalisms to prove the soundness of the system we built, separation logic turned out to be instrumental, especially the concurrent form. This is a formal technique for proving the disjointness of different parts of the heap "

" Uniqueness types are a more recent invention, derived from some of the early linear type systems which so fascinated me early on. For a period of time, we actually had a unique keyword in the language. Eventually we folded that back into the concept of isolated (it was essentially a “shallow” isolated). But there is no denying that all of this was heavily inspired by what we saw with uniqueness types, especially in languages like Clean, the experimental work to bring uniqueness to Scala, and, now, Rust. "

" Model Checking

Finally, I would be remiss if I didn’t at least mention model checking. It’s easy to confuse this with static analysis, however, model checking is far more powerful and complete, in that it goes beyond heuristics and therefore statistics. MSR’s Zing and, although we used it to verify the correctness of certain aspects of our implementation, I don’t think we sufficiently considered how model checking might impact the way safety was attained. This was top-of-mind as we faced intra-process interleaving race conditions. Especially as we look to the future with more distributed-style concurrency than intra-process parallelism, where state machine verification is critical, many key ideas in here are relevant. "

" (Modern) C++

I already mentioned const and its influence.

It is also interesting to note the similarities between isolated and C++11’s std::unique_ptr. Although born in different times, and in very different worlds, they both clearly deliver a similar perspective on ownership. Noted differences include deepness – C++’s approach is “deep” insofar as you leverage RAII faithfully in your data structures – and motivations – C++’s motivation being primarily memory management, and neither safe concurrency nor immutability.

The concept of constexpr has obvious similarities to both isolated and immutable, particularly the compile-time evaluation and freezing of the results. The continued evolution of constexpr in C++14 and C++17 is taking the basic building blocks to new frontiers that I had always wanted to do with our system, but never had time, like arbitrary compile-time evaluation of expressions, and freezing/memoization of the results. "

" D

The system we came up with has obvious comparisons to D’s take on const and immutable; just as D’s const is a view over mutable or immutable data, so too is our readonly. And just as D added deepness to the concept of const, so did we in our permissions model generally. This is perhaps the closest analogy in any existing systems. I am frankly surprised it doesn’t get used an order of magnitude more than it does"

---

on C const:

" That said, const is best known for its unsoundness, thanks to the pervasive use of const_cast. This is commonly used at the seams of libraries with different views on const correctness, although it’s also often used to cheat; this is often for laziness, but also due to some compositional short-comings. The lack of parameterization over const, for example, forces one to duplicate code; faced with that, many developers would rather just cast it away. " -- [7]

---

"sub-process parallelism" including "task and data parallelism"

---

my summary of [8]:

he's talking about replacing 'volatile' with:

1) "Eliminate memory ordering from 99% of developers’ purviews. This is already the case with single-threaded programs, because code motion in compilers and processors is limited to what only affects concurrent observations. So the answer is pretty clear: developers must move towards single-threaded programming models connected through message passing, optionally with provably race-free fine-grained parallelism inside of those single-threaded worlds."

2) For the 1% who want to think about memory ordering, use something like C++0x atomics. He recommends that instead of attaching a 'volatile' annotation to stuff, you have a standard library of 'volatile' classes for basic types; the 'volatile' classes would have methods like ReadAcquireFence(), ReadFullFence(), ReadCompilerOnlyFence(), ReadUnfenced(), WriteReleaseFence(), WriteFullFence(), etc., and AtomicCompareExchange(), AtomicExchange(), AtomicAdd(x), AtomicIncrement(). Now the fences occur explicitly in code.
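
my note: Rust's std::sync::atomic ended up close to what he describes: ordinary fields carry no fences at all, and the few racy variables go through a library type where every read and write names its fence explicitly (my example, using the standard orderings rather than his hypothetical method names).

    use std::sync::atomic::{AtomicUsize, Ordering};

    static COUNTER: AtomicUsize = AtomicUsize::new(0);

    fn main() {
        COUNTER.store(1, Ordering::Release);              // write-release fence
        let v = COUNTER.load(Ordering::Acquire);          // read-acquire fence
        let old = COUNTER.fetch_add(1, Ordering::SeqCst); // atomic add, full fence
        let _ = COUNTER.compare_exchange(
            old + 1, 0, Ordering::SeqCst, Ordering::Relaxed); // CAS, explicit orderings
        println!("v = {v}, old = {old}");
    }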

Tangentially, he also explains some types of bugs that higher-level languages can avoid for you:

" My understanding is that C++0x has decided, at least for now, not to offer type-safety in the face of multithreading. That means you might publish an object and, if stores occur out-of-order, the reader could see an object partially initialized with an invalid vtable pointer. In C# and Java, the language designers have thankfully decided to shield programmers from this. The need for fences also extends to unsafe code like strings, where – were it possible for a thread to read the non-zero length before the char* pointer was valid – writes to random memory could occur and hence threaten type-safety. Thankfully, again, C# and Java protect developers from this, mostly due to the automatic zero’ing of memory as the GC allocates and hands out new segments. "

---

"...locks are not declarative. Not even close. They are not associated with the data protected by those locks, but rather the code that accesses said data... Sure, you can achieve atomicity and isolation, but only by intimately reasoning about your code by understanding the way they are implemented. And if you care about performance, you are also going to need to think about hardware esoterica such as CMPXCHG, spin waiting, cache contention, optimistic techniques with version counters and memory models, ABA, and so on."

---

" (Transactions unfortunately do not address one other issue, which turns out to be the most fundamental of all: sharing. Indeed, TM is insufficient – indeed, even dangerous – on its own is because it makes it very easy to share data and access it from multiple threads; first class isolation is far more important to achieve than faux isolation. "

---

" Reams of code “out there” contain brittle, untested, and, therefore, incorrect error handling code. We have seen such code lead to problems ranging in severity: reliability issues leading to data loss, security exploits, etc. Were we to replace all those try/catch/rethrow blocks of code with transactions, we could do away with this error prone spaghetti. We’d also eliminate try/filter exploits thanks to Windows/Win32 2-pass SEH. Sometimes I wish we focused on this simple step forward, forgot about concurrency-safety, and baby stepped our way forward. Likely it wouldn’t have been enough, but I still wonder to this day. " -- [9]

---

Joe Duffy's concluding notes on transactions vs message-passing and concurrency support from a side-effect-tracking type system:

" I eventually shifted focus to enforcing coarse-grained isolation through message-passing, and fine-grained isolation through type system support a la Haskell’s state monad. This would help programmers to realize where they accidentally had sharing, I thought, rather than merely masking this sharing and making it all work (albeit inefficiently).

I took this path not because I thought TM had no place in the concurrency ecosystem. But rather because I believed it did have a place, but that several steps would be needed before getting there.

I suspected that, just like with Argus, you’d want transactions around the boundaries. And that you’d probably want something like open nesting for fine-grained scalable data structures, like shared caches. These are often choke points in a coarse-grained locking system, and often cannot be fully isolated, at least in the small. Ironically I am just now arriving there. In the system I work on I see these issues actually staring us in the face. " -- [10]

He also says in a followup blog post: " I remain a big fan of transactions. Atomicity, consistency, and isolation, and coming up with strategies for achieving all three in tandem, are part and parcel to architecting software. ... manipulation of state can be done safely and even in parallel thanks to transactions.

The requirement is that messages are atomic and commute. Transactions, it turns out, are a convenient way of implementing this requirement. ... Why aren’t transactions sufficient, then, given that the I in ACID stands for Isolation? You wouldn’t model a database as one flat table in which each row is a single byte, however, would you? ... The Argus and Haskell approaches are conducive to large-scale concurrency, but it is primarily because of the natural isolation those models provide; the addition of transactions address problems that remain after taking that step. But without that first step, they would have gotten nowhere." -- [11]

and elsewhere he says, "My brief summary is that it’s more important to encourage good concurrency architecture than it is to make poor ones “just work”, although I have written more details here and here. It was this higher level architecture that we should focus on solving first and foremost and, after the dust settled, see what gaps remained. It wasn’t even clear that STM would be the correct tool for the job once we got to that point. (In hindsight, I do think it’s one of the very many reasonable tools in the toolbelt, although with more distributed application architectures on the rise, it’s a dangerous thing to give to people.)"

After reading some of his other stuff, I think what he's saying here is just that isolation is the first thing you need -- and he suggests providing it via message-passing and via the type system in a Haskell-y and Rust-y way (side effect typing; immutable types, unique types; etc). And after you have isolation, then you want transactions, as a way to share memory in addition to that. But if you have to choose, isolation is more important; isolation without transactions is still useful, but transactions without isolation is unworkable. So after deciding this, he's personally been working on isolation stuff first, and then he'll get to transactions later.

---

Joe Duffy is a big fan of Liskov's Argus, and suggests:

http://dl.acm.org/citation.cfm?id=42399

he talks about it a lot when he talks about transactions, " Transactions are another such mechanism, sure, but more than that, transactions are an all-encompassing way of thinking about how programs behave at their most fundamental core. Transaction is a religion.

Not everybody believes this, and of course why would they: it is an immensely subjective and qualitative statement. Some will claim that models like message passing entirely avoid the likes of “race conditions,” and such, but this is clearly false: state transitions are made, complicated state invariants are erected amongst a sea of smaller isolated states, and care must be taken, just as in shared memory. Even Argus, a beautiful early incarnation of message-passing (via promises) demands that messages are atomic in nature. This property is not checked and, if done improperly, leads to “races in the large.” Even Argus introduced the notion of transactions and persistence in the form of guardians. " "I suspected that, just like with Argus, you’d want transactions around the boundaries. "

---