proj-plbook-plPartConcurrency3


continued from [1] and [2]

https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-mcsherry.pdf notes that if you only measure performance by looking at 'scalability', you miss that an optimized single-threaded implementation often performs better

Fearless concurrency: how Clojure, Rust, Pony, Erlang and Dart let you achieve that. discussion: https://news.ycombinator.com/item?id=19241427

---

Eiffel's SCOOP system:

http://cme.ethz.ch/scoop/ https://en.wikipedia.org/wiki/SCOOP_(software)

---

great slides:

Why Threads Are A Bad Idea (for most purposes) by John Ousterhout

recommends using a single-threaded event-loop unless CPU concurrency actually needed

---

https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/

kinda looks like optimistic transactions

" I really want to like rseq but I’ve never found a case where it actually helped. It’s a poor-man’s transactional memory and it requires you to structure your code such that you have a set of reads followed by a write that can fail, where you’re then notified of the failure. The problem is that this is exactly the structure that load-linked / store-conditional give you on load-store architectures and that you can fake with a 16-byte compare-and-exchange on x86 (compare-and-swap has an intrinsic ABA problem so fudge it by using a 64-bit free-running counter of updates to cause the compare to fail in the ABA case). In theory you can do multiple writes in an rseq but if you do then other code has to be able to observe and handle partial updates from you, which probably costs more. Anything I’ve been able to shoehorn into the rseq structure, I could write with CPU primitives more efficiently. " [3]

---

"colorless async"

https://news.ycombinator.com/item?id=29049881 :

https://brunosutic.com/blog/async-ruby

"

sizediterable:

> any blocking operation (a method where Ruby interpreter waits) is compatible with Async and will work asynchronously within Async code block with Ruby 3.0 and later.

That's pretty magical. Does this mean it would be possible to implement structured concurrency [0], but without the function coloring problem?

Regardless, I think I prefer how this looks from a code read/writeability perspective compared to Zig's or Swift's (and potentially future Rust's [2]) approaches.

[0] https://vorpus.org/blog/notes-on-structured-concurrency-or-g...

[1] https://kristoff.it/blog/zig-colorblind-async-await/

[2] https://blog.yoshuawuyts.com/async-overloading/

brunosutic:

Async Ruby is colorless!

It's obvious from the examples provided in the post. For example, you can use the method `URI.open` both synchronously and asynchronously.

It's something I didn't want to mention in the article because it's a relatively advanced async concept, but yea - Async Ruby is already colorless and it's great.

gpderetta:

IIRC ruby has first class continuations, so it doesn't have to deal with any async/await nonsense.

brunosutic:

That's right - Async Ruby uses fibers (stackful coroutines) as a concurrency primitive. There's a lot to say about this, but the end result is that we get to write simple synchronous code and it can run asynchronously.

"

"

bradgessler:

What is “colorless” suppose to mean? This is the first time I’ve seen it used.

justsomeuser:

It references the function return type:

Colored = Promise<T>

Colorless = T

In "colorless" async/await, async functions are implicit (no async keyword), and when called, the compiler auto-unwraps the promise, so everything looks like "sync" code but any function stack could "save and resume" at an "async point".

vmarquet:

I think it's a reference to this blog post https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...

auno:

Which in turn is making a reference to an even older post at https://ansuz.sooke.bc.ca/entry/23.


"

"

DangitBobby:

As someone who operates mainly in Python, I am so jealous. As far as I'm aware [1], you have to re-write your tooling to take advantage of async in Python. Does anyone have any insight into why Python async doesn't work the same way? Does it come down to fundamental language differences?

1. https://stackoverflow.com/a/63179518/4728007

brunosutic:

> Does it come down to fundamental language differences?

No, I don't think there's anything fundamentally different.

Ruby 3.0 implements a "fiber scheduler" feature that enables "colorless Async". Fiber scheduler is an obscure Ruby feature, but the end result is brilliant. It was also a huge amount of work.

Side note: Fiber scheduler was implemented by the same guy who created Async Ruby - Samuel Williams. This guy is the mastermind (and master-coder) behind this project.

pansa2:

> Does anyone have any insight into why Python async doesn't work the same way?

Ruby async seems to be implemented using stackful coroutines. IIRC Guido has been opposed to adding these to core Python, preferring stack-less coroutines because they require every yield-point to be explicit (i.e. marked with `await`).

There are libraries for Python that support stackful coroutines, such as gevent.

reply "

"

azth:

The first reference is the same route that Java is taking with its green thread implementation by means of Project Loom[0].

[0] https://cr.openjdk.java.net/~rpressler/loom/loom/sol1_part2.... https://cr.openjdk.java.net/~rpressler/loom/loom/sol1_part2.html#structured-concurrency

reply "

---

example code for efficient concurrency in various languages, and a benchmark: https://github.com/athas/raytracers

---

http://kilby.stanford.edu/~rvg/models.html http://kilby.stanford.edu/~rvg/352.html

---

http://www.cs.tau.ac.il/~maon/teaching/2014-2015/seminar/seminar1415a-lec7-conc-perl-ppt.pdf

---

https://greenteapress.com/semaphores/LittleBookOfSemaphores.pdf

---

blackboard systems

javaspaces

"...the basics of the JavaSpaces? API: write deposits an object, called an entry, into a space; read makes a copy of an entry in a space (but leaves the object there); and take removes an entry from a space. Let's see how this simple API incorporates the basics of space-based synchronization ... Any number of processes can read the entry in the space at a given time. But suppose a process wants to update the entry: the process must first remove the entry from the space, make the changes, and then write the modified entry back to the space:

Message result = (Message)space.take(template, null, Long.MAX_VALUE); result.content = "Goodbye!"; space.write(result, null, Lease.FOREVER);

It's important to note that a process needs to obtain exclusive access to an entry that resides in the space before modifying it. If multiple processes are trying to update the same entry using this code, only one obtains the entry at a time. The others wait during the take operation until the entry has been written back to the space, and then one of the waiting processes takes the entry from the space while the others continue to wait.

The basics of space-based programming synchronization can be quickly summarized: Multiple processes can read an entry in a space at any time. But when a process wants to update an entry, it first has to remove it from the space and thereby gain sole access to it. " -- [4]

---

The Problem with Threads by Edward A. Lee (2006)

---

" Of course, there are tantalizingly simple rules for avoiding deadlock. For example, always acquire locks in the same order [32]. However, this rule is very difficult to apply in practice because no method signature in any widely used programming language indicates what locks the method 6See http://ptolemy.eecs.berkeley.edu 8 acquires. You need to examine the source code of all methods that you call, and all methods that those methods call, in order to confidently invoke a method. Even if we fix this language problem by making locks part of the method signature, this rule makes it extremely difficult to implement symmetric accesses (where interactions can originate from either end). And no such fix gets around the problem that reasoning about mutual exclusion locks is extremely difficult. " -- The Problem with Threads by Edward A. Lee (2006)

---

some concurrency techniques with nondeterminacy:

" The synchronized keyword in Java implements mutual exclusion, turning instances...into monitors, preventing any two threads from being in these methods simultaneously. When a synchronized method is called, the calling thread attempts to acquire an exclusive lock on the object. If any other thread holds that lock, then the calling thread stalls until the lock is released. ...can lead to deadlock. " -- The Problem with Threads by Edward A. Lee (2006), section 4, "How Bad is it In Practice?", page 6).

The Problem with Threads by Edward A. Lee goes on to give an example of deadlock using the Java 'synchronized' keyword technique (within section 4, "How Bad is it In Practice?", starting on page 5).
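here's a minimal sketch of that lock-ordering deadlock, written with Rust mutexes rather than Java's synchronized (same shape: two threads take the same two locks in opposite order and end up waiting on each other forever):

  use std::sync::{Arc, Mutex};
  use std::thread;
  use std::time::Duration;

  fn main() {
      let a = Arc::new(Mutex::new(0));
      let b = Arc::new(Mutex::new(0));

      let (a1, b1) = (Arc::clone(&a), Arc::clone(&b));
      let t1 = thread::spawn(move || {
          let _ga = a1.lock().unwrap();             // holds A...
          thread::sleep(Duration::from_millis(50)); // widen the race window
          let _gb = b1.lock().unwrap();             // ...then waits for B
      });

      let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
      let t2 = thread::spawn(move || {
          let _gb = b2.lock().unwrap();             // holds B...
          thread::sleep(Duration::from_millis(50)); // widen the race window
          let _ga = a2.lock().unwrap();             // ...then waits for A
      });

      // with the sleeps above, these joins almost always hang: deadlock
      t1.join().unwrap();
      t2.join().unwrap();
  }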

" vetted design patterns for concurrent computation, as in (D. Lea. Concurrent Programming in Java: Design Principles and Patterns. Addison-Wesley, Reading MA, 1997) and (C. Schmidt, M. Stal, H. Rohnert, and F. Buschmann. Pattern-Oriented Software Architecture - Patterns for Concurrent and Networked Objects. Wiley, 2000). Indeed, these are an enormous help when the programmer’s task identifiably matches one of the patterns. However, there are two difficulties. One is that implementation of the patterns, even with careful instructions, is still subtle and tricky. Programmers will make errors, and there are no scalable tech- niques for automatically checking compliance of implementations to patterns. More importantly, the patterns can be difficult to combine. Their properties are not typically composable, and hence nontrivial programs that require use of more than one pattern are unlikely to be understandable.

...

Transactions support speculative unsynchronized computation on a copy of the data followed by a commit or abort. A commit occurs when it can be shown that no conflicts have occurred... Transactions eliminate unintended deadlocks, but despite recent extensions for composability (Harris, S. Marlow, S. P. Jones, and M. Herlihy. Composable memory transactions. In ACM Conference on Principles and Practice of Parallel Programming (PPoPP), Chicago, IL, 2005), remain a highly nondeterministic interaction mechanism. They are well-suited to intrinsically nondeterminate situations, where for example multiple actors compete nondeterministically for resources. But they are not well-suited for building determinate concurrent interactions.

A particularly interesting use of patterns is MapReduce, as reported by Dean and Ghemawat (Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. In Sixth Symposium on Operating System Design and Implementation (OSDI), San Francisco, CA, 2004). This pattern has been used for large scale distributed processing of huge data sets by Google. Whereas most patterns provide fine-grain shared data structures with synchronization, MapReduce provides a framework for the construction of large distributed programs. The pattern is inspired by the higher-order functions found in Lisp and other functional languages. The parameters to the pattern are pieces of functionality represented as code rather than pieces of data.

Patterns may be encapsulated into libraries by experts, as has been done with MapReduce at Google, the concurrent data structures in Java 5.0, and STAPL in C++ (An, A. Jula, S. Rus, S. Saunders, T. Smith, G. Tanase, N. Thomas, N. Amato, and L. Rauchwerger. STAPL: An adaptive, generic parallel C++ library. In Wkshp. on Lang. and Comp. for Par. Comp. (LCPC), pages 193–208, Cumberland Falls, Kentucky, 2001). This greatly improves the reliability of implementations, but requires some programmer discipline to constrain all concurrent interactions to occur via these libraries. Folding the capabilities of these libraries into languages where syntax and semantics enforce these constraints may eventually lead to much more easily constructed concurrent programs.

Higher-order patterns such as MapReduce offer some particularly interesting challenges and opportunities for language designers. These patterns function at the level of coordination languages...

...

A common compromise is to extend established programming languages with annotations or a few selected keywords to support concurrent computation. This compromise admits the re-use of significant portions of legacy code when concurrency is not an issue, but requires rewriting to expose concurrency. This strategy is followed for example by Split-C (D. E. Culler, A. Dusseau, S. C. Goldstein, A. Krishnamurthy, S. Lumetta, T. v. Eicken, and K. Yelick. Parallel programming in Split-C. In Supercomputing, Portland, OR, 1993) and Cilk (D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall, and Y. Zhou. Cilk: an efficient multithreaded runtime system. In ACM SIGPLAN symposium on Principles and Practice of Parallel Programming (PPoPP), ACM SIGPLAN Notices, 1995), which are both C-like languages supporting multithreading. In Cilk, a programmer can insert keywords “cilk” “spawn” and “sync” into what are essentially C programs. The objective of Cilk is to support dynamic multithreading in a shared-memory context, where overhead is incurred only when parallelism is actually exploited (that is, on one processor, the overhead is low).

A related approach combines language extensions with constraints that limit expressiveness of established languages in order to get more consistent and predictable behavior. For example, the Guava language [5] constrains Java so that unsynchronized objects cannot be accessed from multiple threads. It further makes explicit the distinction between locks that ensure the integrity of read data (read locks) and locks that enable safe modification of the data (write locks). These language changes prune away considerable nondeterminacy without sacrificing much performance, but they still have deadlock risk.

Another approach that puts more emphasis on the avoidance of deadlock is promises, as realized for example by Mark Miller in the E programming language ( http://www.erights.org/ ). These are also called futures, and are originally attributable to Baker and Hewitt (J. Henry G. Baker and C. Hewitt. The incremental garbage collection of processes. In Proceedings of the Symposium on AI and Programming Languages, volume 12 of ACM SIGPLAN Notices, pages 55–59, 1977). Here, instead of blocking to access shared data, programs proceed with a proxy of the data that they expect to eventually get, using the proxy as if it were the data itself.

Yet another approach leaves the programming languages and the mechanisms for expressing concurrency unchanged, and instead introduces formal program analysis to identify potential concurrency bugs in multithreaded programs. This is done for example in Blast (T. A. Henzinger, R. Jhala, R. Majumdar, and S. Qadeer. Thread-modular abstraction refinement. In 15th International Conference on Computer-Aided Verification (CAV), volume 2725 of Lecture Notes in Computer Science, pages 262–274. Springer-Verlag, 2003) and the Intel thread checker ( http://developer.intel.com/software/products/threading/tcwin ). This approach can help considerably by revealing program behaviors that are difficult for a human to spot. Less formal techniques, such as performance debuggers like Valgrind ( http://valgrind.org/ ), can also help in a similar way, making it easier for programmers to sort through the vast nondeterminacy of program behaviors. Although promising, both the formal and informal techniques still require considerable expertise to apply and suffer from scalability limitations.

All of the above techniques prune away some of the nondeterminacy of threads. However, they all still result in highly nondeterministic programs. For applications with intrinsic nondeterminacy, such as servers, concurrent database accesses, or competition for resources, this is appropriate. But achieving deterministic aims through nondeterministic means remains difficult. " -- The Problem with Threads by Edward A. Lee (2006), section 5 "Fixing Threads by More Aggressive Pruning"

---

The Problem with Threads by Edward A. Lee (2006), section 6 "Alternatives to Threads", page 11, presents a coordination language with a visual syntax and an example. I can't put a picture here so here's my textual translation of the diagram in Figure 3 with pseudo-code:

  concurrency_type: Rendezvous Director
  merge1 = Merge(inputs: [Value Producer 1, Value Producer 2], output: tee1)
  tee1 = Tee(input: merge1, outputs: [Value Consumer, Observer])

Here's what the paper has to say about Figure 3:

" a rendezvous-based coordination language with a visual syntax...the “Rendezvous domain” of Ptolemy II (. Eker, J. W. Janneck, E. A. Lee, J. Liu, X. Liu, J. Ludvig, S. Neuendorffer, S. Sachs, and Y. Xiong. Taming heterogeneity—the Ptolemy approach. Proceedings of the IEEE, 91(2), 2003). The box at the upper left labeled “Rendezvous Director” is an annotation spec- ifying that this diagram represents a CSP-like (C. A. R. Hoare. Communicating sequential processes. Communications of the ACM, 21(8), 1978) concurrent program, where each component (represented by an icon) is a process, and communication is by rendezvous. The processes them- selves are specified using ordinary Java code, so this framework is properly viewed as a coordination language (that happens to have a visual syntax). ... In the diagram, the Merge block specifies that either of the two Value Producers can rendezvous with both the Value Consumer and the Observer. That is, there two possible three-way rendezvous interactions that can occur. These interactions can occur repeatedly in nondeterministic order. First, note that once you understand what the icons mean, the diagram very clearly expresses the observer pattern. Second, notice that everything about the program is deterministic except the explicitly nondeterministic interaction specified by the Merge block. Were that block absent, then the program would specify deterministic interactions between deterministic processes. Third, note that deadlock is provably absent (in this case, the lack of cycles in the diagram ensures no deadlock). Fourth, note that the multiway rendezvous ensures that the Value Consumer and Observer see new values in the same order. The observer pattern becomes trivial, as it should be. " -- The Problem with Threads by Edward A. Lee (2006), section 6 "Alternatives to Threads", page 11

Alternatives to Threads that "start with deterministic, composable mechanisms, and introduce nondeterminism only where needed" according to The Problem with Threads by Edward A. Lee (2006):

" a rendezvous-based coordination language with a visual syntax...the “Rendezvous domain” of Ptolemy II (. Eker, J. W. Janneck, E. A. Lee, J. Liu, X. Liu, J. Ludvig, S. Neuendorffer, S. Sachs, and Y. Xiong. Taming heterogeneity—the Ptolemy approach. Proceedings of the IEEE, 91(2), 2003). The box at the upper left labeled “Rendezvous Director” is an annotation spec- ifying that this diagram represents a CSP-like (C. A. R. Hoare. Communicating sequential processes. Communications of the ACM, 21(8), 1978) concurrent program, where each component (represented by an icon) is a process, and communication is by rendezvous. The processes them- selves are specified using ordinary Java code, so this framework is properly viewed as a coordination language (that happens to have a visual syntax). ...deadlock..." (can be) "...provably absent (in this case, the lack of cycles in the diagram ensures no deadlock)...

...

the Kahn process networks (PN) model of concurrency (G. Kahn and D. B. MacQueen. Coroutines and networks of parallel processes. In B. Gilchrist, editor, Information Processing. North-Holland Publishing Co., 1977). ...the processes communicate via message passing with conceptually unbounded FIFO queues and blocking reads (streams). In the original PN model due to Kahn and MacQueen, the blocking reads ensure that every network defines a deterministic computation. In this case, the PN model has been augmented with a primitive called “NondeterministicMerge” that explicitly merges streams nondeterministically. ...has all the cited advantages of the one in figure 3 with the added property that the Observer need not keep up with the Value Consumer. Notifications can be queued for later processing.

...

A third implementation would elaborate on the nature of the nondeterminism represented by the nondeterministic merge. The principles of synchronous languages (A. Benveniste and G. Berry. The synchronous approach to reactive and real-time systems. Proceedings of the IEEE, 79(9):1270–1282, 1991) could be used to ensure fairness. In Ptolemy II, the same model can be implemented with an “SR director” (synchronous/reactive), which implements a synchronous model related to Esterel (G. Berry and G. Gonthier. The esterel synchronous programming language: Design, semantics, implementation. Science of Computer Programming, 19(2):87–152, 1992), SIGNAL (A. Benveniste and P. L. Guernic. Hybrid dynamical systems theory and the signal language. IEEE Tr. on Automatic Control, 35(5):525–546, 1990), and Lustre (N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The synchronous data flow programming language lustre. Proceedings of the IEEE, 79(9):1305–1319, 1991). The last of these has been used successfully to design highly concurrent, safety critical software for aircraft control applications (G. Berry. The effectiveness of synchronous languages for the development of safety-critical systems. White paper, Esterel Technologies, 2003). Using threads for such applications would not be wise.

A fourth implementation would focus on the timing of nondeterministic events. In Ptolemy II, a similar model using the “DE director” (discrete events), would provide a timed specification with rigorous semantics (E. A. Lee. Modeling concurrent real-time processes using discrete events. Annals of Software Engineering, 7:25–45, 1999) related to that of hardware description languages such as VHDL and Verilog and to network modeling frameworks such as Opnet Modeler.

In all four cases (rendezvous, PN, SR, and DE), we have started with a fundamentally deterministic (and concurrent) interaction mechanism (deterministic in the sense of the computation being performed, though in the first three cases not deterministic in the sense of timing). Nondeterminism has been judiciously introduced exactly where needed. This style of design is very different from the threads style, which starts with a wildly nondeterministic interaction mechanism and attempts to prune away undesired nondeterminism.

The implementations shown in figures 3 and 4 (my note: figures 3 and 4 omitted here) both use Java threads. However, the programmer’s model is not one of threads at all. Compared to all the techniques described in the previous section, this is closest to MapReduce, which has a similar flavor of streaming data through computations. But unlike MapReduce, it is supported in a rigorous coordination language that is sufficiently expressive to describe a wide range of interactions. In fact, two distinct coordination languages are given here, one with rendezvous semantics, and the other with PN semantics.

This style of concurrency is, of course, not new. Component architectures where data flows through components (rather than control) have been called “actor-oriented” (E. A. Lee and S. Neuendorffer. Classes and subclasses in actor-oriented design. In Conference on Formal Methods and Models for Codesign (MEMOCODE), San Diego, CA, USA, 2004). These can take many forms. Unix pipes resemble PN, although they are more limited in that they do not support cyclic graphs. Message passing packages like MPI and OpenMP include facilities for implementing rendezvous and PN, but in a less structured context that emphasizes expressiveness rather than determinacy. A naive user of such packages can easily be bitten by unexpected nondeterminacy.

Languages such as Erlang (J. Armstrong, R. Virding, C. Wikstrom, and M. Williams. Concurrent programming in Erlang. Prentice Hall, second edition, 1996) make message passing concurrency an integral part of a general-purpose language. Languages such as Ada make rendezvous an integral part. Functional languages (P. Hudak. Conception, evolution, and application of functional programming languages. ACM Computing Surveys, 21(3), 1989) and single-assignment languages also emphasize deterministic computations, but they are less explicitly concurrent, so controlling and exploiting concurrency can be more challenging. Data parallel languages also emphasize determinate interactions, but they require low-level rewrites of software. " -- The Problem with Threads by Edward A. Lee (2006), section 6 "Alternatives to Threads", page 11
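here's a minimal Kahn-process-network-flavored sketch in plain Rust (my own illustration, not from the paper): processes communicate only over FIFO channels with blocking reads, so the result is deterministic regardless of how the threads are scheduled; nondeterminism would only appear if streams were explicitly merged:

  use std::sync::mpsc;
  use std::thread;

  fn main() {
      let (tx_produce, rx_produce) = mpsc::channel::<u32>();
      let (tx_double, rx_double) = mpsc::channel::<u32>();

      // producer process: emits a fixed stream of values
      let producer = thread::spawn(move || {
          for v in 0..5 {
              tx_produce.send(v).unwrap();
          }
      });

      // transformer process: blocking read, compute, emit
      let doubler = thread::spawn(move || {
          for v in rx_produce {
              tx_double.send(v * 2).unwrap();
          }
      });

      // consumer process: blocking read; always observes 0, 2, 4, 6, 8
      for v in rx_double {
          println!("{v}");
      }

      producer.join().unwrap();
      doubler.join().unwrap();
  }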

" The Commutator block performs a rendezvous with each of its inputs in top-to-bottom order, and thus accomplishes the same interleaving...in alternating order " -- The Problem with Threads by Edward A. Lee (2006), section 7 "Challenges and Opportunities", page 14

---

" A more challenging, long-term opportunity is to adapt the theory of computation to provide better foundations for concurrent computations. Although considerable progress has been made in this direction, I believe there is much more to be done. In section 3, sequential computation is modeled as functions mapping bit sequences into bit sequences. A corresponding concurrent model has been explored in (E. A. Lee and A. Sangiovanni-Vincentelli. A framework for comparing models of computation. IEEE Transactions on CAD, 17(12), 1998), where instead of a function

f : B → B

concurrent computation is given as a function

f : (T → B) → (T → B).

In the above, T is a partially or totally ordered set of tags, where the ordering can represent time, causality, or more abstract dependency relations. A computation viewed in this way maps an evolving bit pattern into an evolving bit pattern. This basic formulation has been shown adaptable to many concurrent models of computation (A. Benveniste, L. Carloni, P. Caspi, and A. Sangiovanni-Vincentelli. Heterogeneous reactive systems modeling and correct-by-construction deployment. In EMSOFT. Springer, 2003), (J. R. Burch, R. Passerone, and A. L. Sangiovanni-Vincentelli. Notes on agent algebras. Technical Report UCB/ERL M03/38, University of California, November 2003), (X. Liu. Semantic foundation of the tagged signal model. Phd thesis, EECS Department, University of California, December 20 2005). " -- The Problem with Threads by Edward A. Lee (2006), section 7 "Challenges and Opportunities", page 14

---

" Concurrent programming models can be constructed that are much more predictable and understandable than threads. They are based on a very simple principle: deterministic ends should be accomplished with deterministic means. Nondeterminism should be judiciously and carefully introduced where needed, and should be explicit in programs. " -- The Problem with Threads by Edward A. Lee (2006), section 8 "Conclusion", page 15

---

atomics

https://sabrinajewson.org/rust-nomicon/atomics/atomics.html
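a tiny example of the kind of thing covered there, release/acquire message passing (my sketch of the standard pattern, not code from that page): the Release store synchronizes-with the Acquire load, so once the reader sees READY it is guaranteed to see DATA = 42:

  use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
  use std::thread;

  static DATA: AtomicU32 = AtomicU32::new(0);
  static READY: AtomicBool = AtomicBool::new(false);

  fn main() {
      let writer = thread::spawn(|| {
          DATA.store(42, Ordering::Relaxed);
          READY.store(true, Ordering::Release); // publish
      });

      let reader = thread::spawn(|| {
          while !READY.load(Ordering::Acquire) {} // spin until published
          assert_eq!(DATA.load(Ordering::Relaxed), 42);
      });

      writer.join().unwrap();
      reader.join().unwrap();
  }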

---

lamport clocks and similar

https://www.exhypothesi.com/clocks-and-causality/
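the core rule of a Lamport clock is small enough to sketch (my illustration, not from the linked article): tick on every local event or send, and on receive take max(local, received) + 1, so the clock order respects causality:

  #[derive(Default)]
  struct LamportClock {
      time: u64,
  }

  impl LamportClock {
      // local event or send: advance the clock and timestamp the event
      fn tick(&mut self) -> u64 {
          self.time += 1;
          self.time
      }

      // receive a message stamped with the sender's clock
      fn observe(&mut self, received: u64) -> u64 {
          self.time = self.time.max(received) + 1;
          self.time
      }
  }

  fn main() {
      let (mut a, mut b) = (LamportClock::default(), LamportClock::default());
      let send_ts = a.tick();           // A does some work and sends a message
      let recv_ts = b.observe(send_ts); // B receives it
      assert!(recv_ts > send_ts);       // causal order shows up in the clocks
  }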

---

How Much Memory Do You Need to Run 1 Million Concurrent Tasks?

compares Rust, Go, Java, C#, Python, Node.js and Elixir.

" matklad 15 minutes ago

link flag

There are three classes of programs here:

    OS threads (Rust, Java) — these allocate at least a couple of pages of ram (plus a bunch of entries in the page table for the whole 8MiB of stack)
    Stackfull coroutines — these allocate some stack, but less than a page (Go, Erlang, Java)
    Stackless coroutines — these allocate just a single stack frame (Node, C#, Python, Rust, Kotlin)

These go from “more memory”, to “less memory”. So, yeah, C# is in the most memory efficient category.

The Stackless category is further subdivided into how frequently frames are allocated onto the heap. The choices here are:

    on every await (Promise-based APIs)
    on every call to an async function, modulo optimizations (most async/await solutions are in this category)
    once (Rust)

This subdivision is about CPU overhead, not memory usage; it won’t be visible in the sleep benchmark anyway, as there’s no calls at all.

" -- https://lobste.rs/s/m9xe97/how_much_memory_do_you_need_run_1_million#c_upeeye

---

a kind of higher-level async that guarantees that cancellation propagates, etc.:

[5] https://docs.rs/futures-concurrency/latest/futures_concurrency/

https://news.ycombinator.com/item?id=36559030 https://lobste.rs/s/u7bsm9/tree_structured_concurrency

---

https://cube-drone.com/103.gif

---

this article is mostly about Rust but it also has a good general intro to processes, threads, channels near the top:

[6]

here's a longer summary of some CAS, LL/SC, atomics, and memory-ordering concurrency stuff by the same author:

[7]

---

https://www.phoronix.com/news/Google-User-Thread-Futex-Swap

---

"There’s an important distinction between a future—which does nothing until awaited—and a task, which spawns work in the runtime’s thread pool… returning a future that marks its completion." -- https://bitbashing.io/async-rust.html

---

[8]

---

" Under a lock free algorithm, we promise that across our entire system, eventually some progress is made. I can't necessarily point at any specific thread of execution and say "This thread makes progress" but I can say that forward progress is made for the system as a whole.

Under a wait free algorithm, we promise that every individual thread eventually makes progress.

Suppose I have one really dumb thread which just goes to sleep for 10 seconds in a loop, as well as other threads which do some actual work. If we're a lock free algorithm, it is OK if the 10-second-sleep thread is always chosen to make progress, its "progress" consists of sleeping for 10 seconds, too bad, try again next time.

With a wait free algorithm, the threads which actually do useful work will also make at least some progress, eventually despite the 10 second sleeper. " -- tialaramex
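a small concrete illustration of the difference (my sketch, not from the quoted comment): both functions increment a shared counter and both are lock-free, but only the second is wait-free, because every call finishes in a bounded number of steps no matter what other threads do:

  use std::sync::atomic::{AtomicU64, Ordering};

  static COUNTER: AtomicU64 = AtomicU64::new(0);

  // lock-free but not wait-free: *some* thread's CAS always succeeds, so the
  // system as a whole makes progress, but an unlucky thread can in principle
  // keep losing the race and retrying forever
  fn increment_lock_free() -> u64 {
      loop {
          let old = COUNTER.load(Ordering::Relaxed);
          if COUNTER
              .compare_exchange_weak(old, old + 1, Ordering::Relaxed, Ordering::Relaxed)
              .is_ok()
          {
              return old + 1;
          }
      }
  }

  // wait-free (given a hardware fetch-and-add): every individual call
  // completes in a bounded number of steps
  fn increment_wait_free() -> u64 {
      COUNTER.fetch_add(1, Ordering::Relaxed) + 1
  }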

---

Rust Atomics and Locks: Low-Level Concurrency in Practice by Mara Bos

---

https://without.boats/blog/futures-and-segmented-stacks/

https://without.boats/blog/why-async-rust/

---

" Languages with no async/await let the caller decide how the callee should be executed/scheduled.

In Go, everything is a goroutine, so in a sense everything is already contaminated with async/await. Code that blocks is automatically an ‘await point’, unless you use the go keyword. " -- https://lobste.rs/s/cryfiu/async_rust_is_bad_language#c_srovqp

---

https://dl.acm.org/doi/pdf/10.1145/3622852 When Concurrency Matters: Behaviour-Oriented Concurrency

""" actor model programming has a serious pain point: updating multiple actors as a single atomic operation is a challenging task. We address this pain point by introducing a new concurrency paradigm: Behaviour-Oriented Concurrency (BoC). In BoC, we are revisiting the fundamental concept of a behaviour to provide a more transactional concurrency model. BoC enables asynchronously creating atomic and ordered units of work with exclusive access to a collection of independent resources.

...

We propose a programming model that we call behaviour-oriented concurrency (BoC). The BoC programming model relies on a decomposition of state that is akin to actors—a program’s state is a set of isolated resources (that we call cowns). Behaviours are asynchronous units of work that explicitly state their required set of resources...To construct those behaviours, we introduce a new keyword, when, which enumerates the set of necessary resources for the said behaviour and spawns this as an asynchronous unit of compute. So, a BoC program is a collection of behaviours that each acquires zero or more resources, performs...computation on them, which typically involves reading and updating them, and spawning new behaviours, before releasing the resources. When running, a behaviour has no other state than the resources it acquired. A behaviour can be run when it has acquired all of its resources, which is guaranteed to be deadlock free.

...

2.1 BoC in a Nutshell

BoC augments the underlying language with two fundamental concepts: the concurrent owner or cown, and the behaviour.

Cowns: a cown protects a piece of separated data, meaning it provides the only entry point to that data in the program. A cown is in one of two states: available, or acquired by a behaviour.

Behaviours are the unit of concurrent execution. They are spawned with a list of required cowns and a closure. We define happens before as an ordering relation which strengthens the “spawned before” relation by also requiring that the sets of cowns required by the two behaviours have a non-empty intersection:

Definition 2.1. A behaviour 𝑏 will happen before another behaviour 𝑏′ iff 𝑏 and 𝑏′ require overlapping sets of cowns, and 𝑏 is spawned before 𝑏′.

Once spawned, a behaviour can be run (i.e., its closure starts executing) only when all its required cowns are available, and all other behaviours which happen before it have been run. Once available, all the cowns are acquired by the behaviour atomically, and become unavailable to other behaviours. Throughout execution of the closure the behaviour retains exclusive access to its cowns, and to their data. Moreover, the behaviour cannot acquire more cowns, nor can it prematurely release any cown it is holding. Upon termination of the closure, the behaviour terminates, and all its cowns become available.

BoC is datarace-free and deadlock-free. BoC provides the principles to ensure data-race freedom: since the state associated with each cown is isolated, and since behaviours have exclusive access to the acquired cowns’ state, there is no way for two different behaviours to have concurrent mutable access to the same state. Furthermore, as behaviours acquire all their required cowns atomically, and because the happens before relation is acyclic, BoC is deadlock-free by construction.

We introduce BoC through examples expressed in an imperative, expression-oriented, strongly-typed pseudo-language with isolation. Two language features are of interest:

– The cown[T] type: it represents a cown with contents of type T, with a create constructor.

– The when expression: it consists of a set of cown identifiers, and a closure, and is used to spawn a behaviour that requires the specified cowns. It is the remit of the underlying language to ensure that each object in the heap is owned by exactly one entry point, and that any object accessed within the closure of a when is owned by one of the acquired cowns """

this is part of https://github.com/Microsoft/Verona

discussion: https://lobste.rs/s/64eywc/when_concurrency_matters_behaviour
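BoC itself needs language/runtime support (as in Verona), but the core discipline (acquire a fixed set of cowns atomically, run with exclusive access, release) can be crudely approximated in ordinary Rust. A hypothetical sketch (when2 and the account names are made up; it loses BoC's asynchrony and happens-before ordering, and assumes the two cowns are distinct):

  use std::sync::{Arc, Mutex};
  use std::thread;

  type Cown = Arc<Mutex<i64>>;

  // crude stand-in for BoC's `when`: block until both cowns are available,
  // taking their locks in a global (address) order so that two such
  // behaviours can never deadlock, then run the body with exclusive access
  fn when2(a: &Cown, b: &Cown, body: impl FnOnce(&mut i64, &mut i64)) {
      let a_first = Arc::as_ptr(a) <= Arc::as_ptr(b);
      let (first, second) = if a_first { (a, b) } else { (b, a) };
      let mut g1 = first.lock().unwrap();
      let mut g2 = second.lock().unwrap();
      if a_first {
          body(&mut *g1, &mut *g2);
      } else {
          body(&mut *g2, &mut *g1);
      }
  }

  fn main() {
      let alice: Cown = Arc::new(Mutex::new(100));
      let bob: Cown = Arc::new(Mutex::new(0));

      // a "transfer" behaviour needs exclusive access to both accounts at
      // once: exactly the multi-actor atomic update that plain actors make
      // awkward and that BoC's `when (alice, bob) { ... }` expresses directly
      let (a, b) = (Arc::clone(&alice), Arc::clone(&bob));
      let transfer = thread::spawn(move || {
          when2(&a, &b, |src, dst| {
              *src -= 10;
              *dst += 10;
          });
      });
      transfer.join().unwrap();

      assert_eq!(*alice.lock().unwrap(), 90);
      assert_eq!(*bob.lock().unwrap(), 10);
  }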

---