" Downey; The Little Book of Semaphores
Takes a topic that's normally one or two sections in an operating systems textbook and turns it into its own 300-page book. The book is a series of exercises, a bit like The Little Schemer, but with more exposition. It starts by explaining what a semaphore is, and then has a series of exercises that build up higher-level concurrency primitives.
This book was very helpful when I first started to write threading/concurrency code. I subscribe to the Butler Lampson school of concurrency, which is to say that I prefer to have all the concurrency-related code stuffed into a black box that someone else writes. But sometimes you're stuck writing the black box, and if so, this book has a nice introduction to the style of thinking required to write maybe possibly not totally wrong concurrent code.
I wish someone would write a book in this style, but both lower level and higher level. I'd love to see exercises like this, but starting with instruction-level primitives for a couple different architectures with different memory models (say, x86 and Alpha) instead of semaphores. If I'm writing grungy low-level threading code today, I'm overwhelmingly likely to be using C++11 threading primitives, so I'd like something that uses those instead of semaphores, which I might have used if I was writing threading code against the Win32 API. But since that book doesn't exist, this seems like the next best thing.
I've heard that Doug Lea's Concurrent Programming in Java is also quite good, but I've only taken a quick look at it. " -- http://danluu.com/programming-books
---
http://lucumr.pocoo.org/2016/10/30/i-dont-understand-asyncio/ says that Python asyncio is too complicated but then also says:
"What landed in 3.5 (the actual new coroutine objects) is great. In particular with the changes that will come up there is a sensible base that I wish would have been in earlier versions. The entire mess with overloading generators to be coroutines was a mistake in my mind."
https://news.ycombinator.com/item?id=12831989 concurs that Python 3.5's concurrency is better than 3.4's
---
some comments in https://news.ycombinator.com/item?id=12829759 say that greenlets/greenthreads are better than asyncio. But https://news.ycombinator.com/item?id=12831989 and children point out that (a) greenthreads (i.e. preemptive multitasking) give you no guarantees about when control will be switched out, so you have to use locks all over the place, in comparison to cooperative multitasking, where, because you know when control might switch, you only have to worry about protecting shared data at those points, and (b) it's easier to do promises and cooperative multitasking across interoperability boundaries to libraries in other languages, in comparison to greenthreads.
---
note: Python 3's asyncio event loops even have TCP facilities:
https://docs.python.org/3/library/asyncio-eventloop.html#creating-connections https://docs.python.org/3/library/asyncio-protocol.html#asyncio.Protocol
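For reference, a tiny sketch of those facilities, using an asyncio.Protocol subclass plus loop.create_connection() (the host, port, and the fixed 2-second wait are arbitrary choices for illustration, and it's written in the later asyncio.run() spelling):

import asyncio

class HttpHead(asyncio.Protocol):
    def connection_made(self, transport):
        # called once the TCP connection is up; send a minimal HTTP HEAD request
        transport.write(b"HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n")

    def data_received(self, data):
        print(data.decode(errors="replace"))

    def connection_lost(self, exc):
        print("connection closed")

async def main():
    loop = asyncio.get_running_loop()
    transport, protocol = await loop.create_connection(HttpHead, "example.com", 80)
    await asyncio.sleep(2)       # crude: give the response time to arrive
    transport.close()

asyncio.run(main())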
---
https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-asyncawait-world/ points out issues with callback-scheduling systems like asyncio's callback layer:
Namely, the problem is that it's more difficult to reason about your code: if the code says "f(); g();", but f() merely schedules its work for the future, then the code reads as if f() has completed before g(), but actually the final effects of f() are not yet present at the time when g() begins executing.
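A minimal sketch of that causality problem (f and g are hypothetical; f schedules its real work with loop.call_soon instead of doing it synchronously):

import asyncio

state = {"written": False}

def f(loop):
    # schedules the actual work for "later" instead of doing it now
    loop.call_soon(lambda: state.update(written=True))

def g():
    # reads as if it runs "after f()", but f()'s effect isn't visible yet
    print("g sees written =", state["written"])    # False

async def main():
    loop = asyncio.get_running_loop()
    f(loop)
    g()
    await asyncio.sleep(0)      # let the scheduled callback run
    print("later, written =", state["written"])    # True

asyncio.run(main())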
Instead, e recommends systems like this:
Furthermore, curio has the convention that stuff that doesn't do all of its work synchronously is always labeled with 'async', so that you know that calls into non-async curio library functions don't put off doing their work until later. By contrast, asyncio's stream-based network I/O functions, e.g. StreamWriter.write, aren't labeled 'async' yet don't complete all of their work immediately.
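A small illustration of that split in asyncio's stream API (the example.com HTTP HEAD request is arbitrary): write() is a plain call that only buffers the data, and you separately await drain() if you want to wait for the actual work.

import asyncio

async def main():
    reader, writer = await asyncio.open_connection("example.com", 80)
    writer.write(b"HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n")   # only buffers
    await writer.drain()          # explicitly wait for the buffer to be flushed
    print(await reader.readline())
    writer.close()

asyncio.run(main())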
Furthermore, "In asyncio, there are many different representations of logically concurrent threads of execution – loop.add_reader callbacks, asyncio.Task callstacks, Future callbacks, etc. In curio, there is only one kind of object that can represent a logical thread – curio.Task – and this allows us to handle them in a uniform way".
Furthermore, in curio you can configure a timeout using a context manager, and it will be propagated downwards.
Furthermore, e notes that whenever you have futures involved, you may want to cancel a task, but then what happens if that task is blocked awaiting some future -- do you cancel that future too? But if multiple tasks are using the same future and you cancel it, then by cancelling one task you might erroneously affect an unrelated task. E says that "...we can't avoid this by checking for multiple waiters when propagating cancellations from tasks->futures" but i didn't understand why not.
Furthermore, e notes that when callbacks are involved, frameworks can't (easily) statically project the future path of control across callbacks, so it's harder for a framework to assist with things like 'finally' cleanup actions.
Furthermore, implicit dynamic thread-local context is harder to do across callbacks, because even if a framework/language provides a way to do this ordinarily, that way might not work across callbacks.
( " In a curio-style framework, the problem is almost trivial, because all code runs in the context of a Task, so we can store our task-local data there and immediately cover all uses cases. And if we want to propagate context to sub-tasks, then as described above, sub-task spawning goes through a single bottleneck inside the curio library, so this is also easy. I actually started writing a simple example here of how to implement this on curio to show how easy it was... but then I decided that probably it made more sense as a pull request, so now I don't have to argue that curio could easily support task-local storage, because it actually does! It took ~15 lines of code for the core functionality, and the rest is tests, comments, and glue to present a convenient threading.Local-style API on top; there's a concrete example to give a sense of what it looks like in action.
I also recommend this interesting review of async context propagation mechanisms written by two developers at Google. A somewhat irreverent but (I think) fair summary would be (a) Dart baked a solution into the language, so that works great, (b) in Go, Google just forces everyone to pass around explicit context objects everywhere as part of their style guide, and they have enough leverage that everyone mostly goes along with it, (c) in C# they have the same system I implemented in curio (as I learned after implementing it!) and it works great because no-one uses callbacks, but (d) context propagation in Javascript is an ongoing disaster because Javascript uses callbacks, and no-one can get all the third-party libraries to agree on a single context-passing solution... partly because even the core packages like node.js can't decide on one. " -- [1] )
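(For what it's worth, later Python grew a standard mechanism for this: contextvars, from PEP 567, added in Python 3.7. A minimal sketch of task-local context with it; each asyncio Task runs in a copy of the current context, so the two handlers below each see only their own value:)

import asyncio
import contextvars

request_id = contextvars.ContextVar("request_id", default=None)

async def handler(rid):
    request_id.set(rid)        # visible only within this task's context
    await asyncio.sleep(0)
    print("task sees request_id =", request_id.get())

async def main():
    await asyncio.gather(handler("a"), handler("b"))   # each prints its own id

asyncio.run(main())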
" What makes a Python app "async/await-native"? Here's a first attempt at codifying them:
An async/await-native application consists of a set of cooperative threads (a.k.a. Tasks), each of which consists of some metadata plus an async callstack. Furthermore, this set is complete: all code must run on one of these threads.
These threads are supervised: it's guaranteed that every callstack will run to completion – either organically, or after the injection of a cancellation exception.
Thread spawning is always explicit, not implicit.
Each frame in our callstacks is a regular sync- or async-colored Python function, executing regular imperative code from top to bottom. This requires that both API primitives and higher-level functions *respect causality* whenever possible.
Errors, including cancellation and timeouts, are signaled via exceptions, which propagate through Python's regular callstack unwinding.
Resource cleanup and error-handling is managed via exception handlers (with or try)." -- [2]
Open questions:
" a common pattern I've run into is where I want to spawn several worker tasks that act like "part of" the parent task: if any of them raises an exception then all of them should be cancelled + the parent raise an exception; if the parent is cancelled then they should be cancelled too. We need ergonomic tools for handling these kinds of patterns robustly.
Fortunately, this is something that's easy to experiment with, and there's lots of inspiration we can draw from existing systems: Erlang certainly has some good ideas here. Or, curio makes much of the analogy between its event loop and an OS kernel; maybe there should be a way to let certain tasks sign up to act as PID 1 and catch failures in orphan tasks? " -- [3]
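(Later Python also grew a built-in tool for roughly this pattern: asyncio.TaskGroup, added in Python 3.11. A minimal sketch with made-up worker/parent names: if any child raises, the remaining children are cancelled and the parent re-raises the failures as an ExceptionGroup; cancelling the parent cancels the children.)

import asyncio

async def worker(n):
    await asyncio.sleep(n)
    if n == 2:
        raise RuntimeError("worker 2 failed")
    print(f"worker {n} done")

async def parent():
    async with asyncio.TaskGroup() as tg:       # Python 3.11+
        for n in range(1, 4):
            tg.create_task(worker(n))
    # reached only if every worker finished without error

try:
    asyncio.run(parent())
except* RuntimeError as eg:
    print("parent saw failure:", eg.exceptions)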
" Cleanup in generators and async generators ... the __del__ method. If we have a generator with some sort of cleanup code...and we iterate it, but stop before reaching the end...then eventually that finally block will be executed by the generator's __del__ method (see PEP 342 for details).
And if we think about how __del__ works, we realize: it's another sneaky, non-causal implicit-threading API! __del__ does not get executed in the context of the callstack that's using the generator – it happens at some arbitrary time and place...in a special context where exceptions are discarded ... Note: What about __del__ methods on other objects, besides generators? In theory they have the same problems, but (a) for most objects, like ints or whatever, we don't care when the object is collected, and (b) objects that do have non-trivial cleanup associated with them are mostly obvious "resources" like files or sockets or thread-pools, so it's easy to remember to stick them in a with block. Plus, when we write a class with a __del__ method we're usually very aware of what we're doing. Generators are special because they're just as easy to write as regular functions, and in some programming styles just as common. It's very very easy to throw a with or try inside some generator code and suddenly you've defined a __del__ method without even realizing it, and it feels like a function call, not the creation of a new resource type that needs managing. ... This one worries me, because it's basically the one remaining hole in the lovely interlocking set of rules described above – and here it's the Python language itself that's fighting us. For now, the only solution seems to be to make sure that you never, ever call a generator without explicitly pinning its lifetime with a with block...PEP 533 is one possible proposal for fixing this at the language level, by adding an explicit __iterclose__ method to the iterator protocol and adapting Python's iteration constructs like for accordingly."
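A small demo of that non-causal cleanup (CPython): the finally block runs when the generator object is collected, not at the point where iteration stopped.

import gc

def gen():
    try:
        yield 1
        yield 2
    finally:
        print("cleanup ran")

g = gen()
print(next(g))                  # prints 1; we stop iterating early
print("not cleaned up yet")     # the finally block still hasn't run
del g                           # cleanup happens here, via __del__ / the GC,
gc.collect()                    #   in whatever context the collector runs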
For now, the only solution seems to be to make sure that you never, ever call a generator without explicitly pinning its lifetime with a with block. For synchronous generators, this looks like:
def some_sync_generator(path):
    with open(path) as ...:
        yield ...
And for async generators, this looks like:
async def some_async_generator(hostname, port):
    async with open_connection(hostname, port) as ...:
        yield ...
If the generator's caller is the one who should control cleanup, e sketches an aclosing async context manager that pins the generator's lifetime from the outside:

class aclosing:
    def __init__(self, agen):
        self._agen = agen

    async def __aenter__(self):
        return self._agen

    async def __aexit__(self, *args):
        await self._agen.aclose()

async with aclosing(some_async_generator(hostname, port)) as tmp:
    async for obj in tmp:
        ...
It might be possible for curio to subvert the PEP 525 __del__ hooks to at least catch cases where async generators are accidentally used without with blocks and signal some kind of error.
---
timeouts with dynamic context managers in Python curio:
"
more ideas for timeout features:
https://github.com/dabeaz/curio/issues/82#issuecomment-257078638
---
" The fundamental problem here is that Futures often have a unique consumer but might have arbitrarily many, and that Futures are stuck half-way between being an abstraction representing communication and being an abstraction representing computation. The end result is that when a task is blocked on a Future, Task.cancel simply has no way to know whether that future should be considered to be "part of" the task. So it has to guess, and inevitably its guess will sometimes be wrong. (An interesting case where this could arise in real code would be two asyncio.Tasks that both call await writer.drain() on the same StreamWriter?; under the covers, they end up blocked on the same Future.) In curio, there are no Futures or callback chains, so this ambiguity never arises in the first place. "
---
" asyncio's global event loop fetching API is going to be reworked in 3.6 and backported to 3.5.3. If I understand correctly (which is not 100% certain, and I don't think the actual code has been written yet [edit2: here it is]), the new system will be: asyncio.get_event_loop(), instead of directly calling the currently-registered AbstractEventLoopPolicy?'s get_event_loop() method, will first check some thread-local global to see if a Task is currently executing, and if so it will immediately return the event loop associated with that Task (and otherwise it will continue to fall back on the AbstractEventLoopPolicy?. This means that inside async functions it should now be guaranteed (via somewhat indirect means) that asyncio.get_event_loop() gives you the same event loop that you'd get by doing an await. And, more importantly, since asyncio.get_event_loop() is what the callback-level APIs use to pick a default event loop when one isn't specified, this also means that async/await code should be able to safely use callback-layer functions without explicitly specifying an event loop, which is a neat improvement over my suggestion above. " -- [4]
---
steveklabnik 8 days ago [-]
Have you seen the tokio stuff in Rust? It's also an interesting take on the "spin up a zillion threads" problem that's not exactly green threads.
reply
---
"The best of all worlds is to write threaded code in languages that structurally eliminate the majority of issues threads have. The downside is that the "languages that structurally tame threads" is a pretty short list, even now; Rust, Haskell, Erlang/Elixir,"
---
quotemstr 9 days ago [-]
The idea espoused in this blog post, that
> if you have N logical threads concurrently executing a routine with Y yield points, then there are NY possible execution orders that you have to hold in your head
is actively harmful to software maintainability. Concurrency problems don't disappear when you make your yield points explicit.
Look: in traditional multi-threaded programs, we protect shared data using locks. If you avoid explicit locks and instead rely on complete knowledge of all yield points (i.e., all possible execution orders) to ensure that data races do not happen, then you've just created a ticking time-bomb: as soon as you add a new yield point, you invalidate your safety assumptions.
Traditional lock-based preemptive multi-threaded code isn't susceptible to this problem: it already embeds maximally pessimistic assumptions about execution order, so adding a new preemption point cannot hurt anything.
Of course, you can use mutexes with explicit yield points too, but nobody does: the perception is that cooperative multitasking (or promises or whatever) frees you from having to worry about all that hard, nasty multi-threaded stuff you hated in your CS classes. But you haven't really escaped. Those dining philosophers are still there, and now they're angry.
The article claims that yield-based programming is easier because the fewer the total number of yield points, the less mental state a programmer needs to maintain. I don't think this argument is correct: in lock-based programming, we need to keep _zero_ preemption points in mind, because we assume every instruction is a yield point. Instead of thinking about NY program interleavings, we think about how many locks we hold. I bet we have fewer locks than you have yields.
To put it another way, the composition properties of locks are much saner than the composition properties of safety-through-controlling-yield.
I believe that we got multithreaded programming basically right a long time ago, and that improvement now rests on approaches like reducing mutable shared state, automated thread-safety analysis, and software transactional memory. Encouraging developers to sprinkle "async" and "await" everywhere is a step backward in performance, readability, and robustness.
reply
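To make the "new yield point" hazard above concrete, a minimal sketch (withdraw/balance are made up): the check-then-update is safe under cooperative multitasking only until someone later adds an await between the check and the update.

import asyncio

balance = 100

async def withdraw(amount):
    global balance
    if balance >= amount:
        # await asyncio.sleep(0)   # <- adding this yield point later lets another
        #                          #    withdraw() interleave between check and update
        balance -= amount
        return True
    return False

async def main():
    results = await asyncio.gather(withdraw(80), withdraw(80))
    print(results, "balance =", balance)   # [True, False], balance = 20; with the
                                           # await uncommented, both succeed and
                                           # balance goes to -60

asyncio.run(main())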
...
quotemstr 9 days ago [-]
Those options aren't as distinct as you might imagine. Would calling it fiber-per-request make you happy?
(By the way: most of the time, a plain-old-boring thread-per-request is just fine, because most of the time, you're not writing high-scale software. If you have at most two dozen concurrent tasks, you're wasting your time worrying about the overhead of plain old pthread_t.)
I'm using a much more expansive definition of "thread" than you are. Sure, in the right situation, maybe M:N threading, or full green threads, or whatever is the right implementation strategy. There's no reason that green threading has to involve the use of explicit "async" and "await" keywords, and it's these keywords that I consider silly.
reply
vomjom 9 days ago [-]
(I agree that thread-per-request works just fine in the majority of cases, but it's still worthwhile to write about the cases where it doesn't work.)
Responding to your original post: you argue that async/await intends to solve the problem of data races. That's not why people use it, nor does it tackle that problem at all (you still need locks around shared data).
It only tries to solve the issue of highly-concurrent servers, where requests are bound by some resource that request-handling threads have to wait for the result of (typically I/O).
Coroutines/fibers are not an alternative to async servers, because they need primitives that are either baked into the language or the OS itself to work well.
reply
(bayle: i don't understand the next 3 posts very well:)
gpderetta 8 days ago [-]
Coroutines/fibers are completely orthogonal to async anything. The OP is arguing against poor-man coroutines, aka stackless coroutines aka top-level yield only, which are significantly less expressive and composable than proper stackful coroutines (i.e. first class one shot continuations).
An alleged benefit of stackless coroutines is that yield points are explicit, so you know when your state can change. The OP is arguing that this is not really a benefit because it leads to fragile code. I happen to strongly agree.
reply
barrkel 8 days ago [-]
Green threads / coroutines / fibers are isomorphic with async keyword transparently implemented as a continuation passing style transform, which is how async callbacks usually work. Actual CPU-style stacks in a green thread scenario are nested closure activation records in an explicit continuation passing style scenario, and are implicit closure activation records (but look like stacks) when using an 'async' compiler-implemented CPS.
Properly composed awaits (where each function entered is entered via an await) build a linked list of activation records in the continuations as they drill down. This linked list is the same as the stack (i.e. serves the same purpose and contains the same data in slightly different layout) in a green threads scenario.
What makes all these things different is how much they expose the underlying mechanics, and the metaphors they use in that exposition. But they're not orthogonal.
(If you meant 'async' as in async IO explicitly, rather than the async / await keyword with CPS transform as implemented in C#, Python, Javascript, etc., then apologies.)
reply
gpderetta 8 days ago [-]
I do mean async as in generic async IO.
As you said, you can of course recover stackful behaviour by using yield/await/async/whatever at every level of the call stack, but in addition to being a performance pitfall (you are in practice heap allocating each frame separately and yield is now O(N): your interpreter/compiler/jit will need to work hard to remove the abstraction overhead), it leads to the green/red function problem.
reply
cderwin 9 days ago [-]
Please correct me if I'm wrong, but doesn't asyncio in the form of async/await (or any other ways to explicitly denote context switches) solve the problem of data races in that per-thread data structures can be operated on atomically by different coroutines? My understanding is that unless data structures are shared with another thread, you don't usually need locks for shared data.
reply
omribahumi 9 days ago [-]
I think that the biggest argument against it is code changes. Think about a code change that adds an additional yield point without proper locking.
Has any language tackled this with lazy locking? i.e. lock only on yield. Maybe this could even be done in compile time
reply
lmm 9 days ago [-]
> Look: in traditional multi-threaded programs, we protect shared data using locks. If you avoid explicit locks and instead rely on complete knowledge of all yield points (i.e., all possible execution orders) to ensure that data races do not happen, then you've just created a ticking time-bomb: as soon as you add a new yield point, you invalidate your safety assumptions. > Traditional lock-based preemptive multi-threaded code isn't susceptible to this problem: it already embeds maximally pessimistic assumptions about execution order, so adding a new preemption point cannot hurt anything.
You get an equal and opposite problem: whenever you add one more lock, you invalidate your liveness assumptions.
hueving 8 days ago [-]
> as soon as you add a new yield point, you invalidate your safety assumptions.
While true, locks aren't free from this problem. They have the inverse. If someone adds code that accesses a data structure that should be protected by a lock and they forget to add the lock, you also lose all of your safety assumptions.
reply
---
Animats 8 days ago [-]
Part of the problem is that object-oriented programming is now out of fashion. If objects only allow one active thread inside the object at a time, you have a conceptual model of how to deal with concurrency. Rust takes this route, and Java has "synchronized". It's done formally, with object invariants, in Spec#. Objects in C++ are often used this way in multi-thread programs.
If you don't have some organized way of managing concurrency, you're going to have problems. Without OOP, what? "Critical sections" lock relative to the code, not the data. "Which lock covers what data?" is a big issue, and the cause of many race conditions.
(The dislike of OOP seems to stem from the problems of getting objects into and out of databases in web services. One anti-OOP article suggests stored procedures as an alternative. Many database-oriented programs effectively use the database as their concurrency management tool. Nothing wrong with that, but it doesn't help if your problem isn't database driven.)
Python has the threading model of C - no language constructs for threads. It's all done in libraries. There's no protection against race conditions in user code. The underlying memory model is protected, by making operations that could break the memory model atomic, but that's all. CPython also has some major thread performance problems due to the Global Interpreter Lock. Having more CPUs doesn't speed things up; it makes programs slower, due to lock contention inefficiencies. So the use of real threads is discouraged in Python.
There's a suggested workaround with the "multiprocessing" module. This creates ultra-heavyweight threads, with a process for each thread, and talks to them with inefficient message passing. It's used mostly to run other programs from Python programs, and doesn't scale well.
So Python needed something to be competitive. There are armies of Javascript programmers with no experience in locking, but familiarity with a callback model. This seems to be the source of the push to put it in Python. Like many language retrofits, it's painful.
Does this imply that the major libraries will all have to be overhauled to make them async-compatible?
reply
zzzeek 7 days ago [-]
> Does this imply that the major libraries will all have to be overhauled to make them async-compatible?
well, "have to" implies that the community accepts this system as the One True Way to program. Which is why I like to point out that this is unwarranted (but yes, because the explicit async model is what I like to call "viral", in that anything that calls async IO must itself be async, so must the caller of that method be async, and turtles all the way out, it means an enormous amount of code has to be thrown out and written in the explicit async style which also adds significant function call / generator overhead to everything - it's basically a disaster).
It's very interesting that you refer to database driven programming as the reason OOP is out of fashion, since IMO one of the biggest misconceptions about async programming is that it is at all appropriate for communication with a locally available relational database. I wrote in depth on this topic here: http://techspot.zzzeek.org/2015/02/15/asynchronous-python-an... with the goal of the post being, this is the one time I'm going to have to talk about async :)
reply
---
so someday i gotta decide:
do we want async/await by default?
and, do we want syntax for async/await so that we aren't always typing 'async/await'?
---
tschellenbach 9 days ago [-]
Are there any languages that have really nailed this? I've used gevent, eventlet (both Python), promises, callbacks (node) and none of them come close to being as productive as synchronous code.
I'd like to try out Akka and Elixir in the future.
reply
ezyang 9 days ago [-]
I like to tell people that the killer app for Haskell is writing IO bound, asynchronous code. The secret weapon is do-notation, which lets you write code as if it were sequential, but have it desugar into what is (essentially) a series of chained callbacks.
I like to point at Facebook's use of Haskell as a good example of being successful in this space http://community.haskell.org/~simonmar/papers/haxl-icfp14.pd... It would be disingenuous to suggest that Haskell is good in all situations, but if there was one place where it should be used, this is it.
reply
...
ezyang 9 days ago [-]
...
The second is that do-notation is distinct from the IO monad. Even if Haskell didn't have green threads in the runtime, I could still write an async/callback library that looked just as natural as sequential code. Why? It has nothing to do with the IO monad: it has to do with the fact that "do x <- e; m" desugars to, in JavaScript notation, bind(e, function(x) { m }); it's been "callbackified automatically".
reply
...
marcosdumay 8 days ago [-]
Haskell does not even allow one to write sequential code.
The IO monad enforces sequence on IO operations, and when you fork it, you get a new, independent sequence of IO operations to play with, not a new thread.
Haskell is really great for concurrent programming. Not only because of green threads (the mainstream concept that is nearest to the IO monad), but because of the "everything is immutable" rule, and very powerful primitives available.
reply
...
runeks 8 days ago [-]
I would argue that the Haskell language itself, through lazy evaluation, basically has built-in async/await support. Due to lazy evaluation, everything is async/await - every time an expression is evaluated. In Python, you pass values around. In Haskell, you pass around descriptions of how to fetch a particular value, and then the runtime system makes sure it happens when/if it needs to.
It's a bit like Excel. Every cell is a variable that contains an expression, which defines what this cell evaluates to. With that description in hand, it's a simple matter of not evaluating cells that are not in view, and marking an exception in the evaluation with #######. If it were Python, each cell could contain code that modifies other cells, and it would be impossible to make sense of anything.
reply
retrogradeorbit 9 days ago [-]
Erlang (and by extension Elixir and LFE) has "nailed" it by making the actor pattern first class. Go's channels are great, but Go itself is quite low level. Also you should check out Clojure's core.async to see what improved channel constructs on top of a high level, lock-free, multithreaded language core looks like.
Part of the problem with the Python ecosystem is the insular mind set of its proponents. Python fanboys have no interest in going and seeing what's on the other side. So the platform has become a bit of an echo chamber, with Pythonistas declaring their clunky approaches the industry best.
You can see this by looking at how little love a CSP solution for Python gets [5] versus the enormous buy-in its more popular frameworks receive.
reply
venantius 8 days ago [-]
core.async is using locks under the hood - it's just hiding that from you as an implementation detail.
reply
retrogradeorbit 8 days ago [-]
How is it possible then that core.async works on the JavaScript platform, a platform that has no mutexes?
Maybe there is a lock to implement the thread macro (clojure only), but then that uses native threads. How would you propose to handle access to channels between native threads without locks?
As far as I know there is no locking performed in asynchronous code implemented using the go macro. The go macro is a macro that turns your code inside out into a state machine, is it not? Each <! and >! point becomes an entry/exit into that state machine. There are no locks here because the go macro can essentially "rewrite" your source code for you and there is only a single thread of execution through the interconnected state machines.
reply
Matthias247 9 days ago [-]
I've written quite a lot of concurrent code through the last years (network servers, protocol, ...) and overall I now like Go most.
The biggest reason for this is not necessarily that I think it has absolutely the best concurrency model, but that it's the most consistent one. Nearly all libraries are written for the model, which means they assume multithreaded access, blocking IO (reads/writes) and no callbacks. As a result most libraries are interoperable without problems.
Erlang/Elixir should have similar properties - however I haven't used it.
Javascript has a similar property because at least everything assumes the singlethreaded environment and concurrency through callbacks (or abstraction of them like promises and async/await on promises). I also like the interoperability and predictability here. But sometimes nested callbacks (even with promises) lead to quite a bit of ugly code. And calling "async methods" is not possible from "sync methods" without converting them to async first (which could mean some big refactoring). So I prefer the Go style in general.
The worst thing from my point of view are all the languages that do not have a standard concurrency model, e.g. C++, Java, C#, and according to this article also Python. Most of them have several libraries for (async) IO which can be beautiful by themselves but won't integrate into remaining parts of the application without lots of glue code. E.g. boost asio is nice, but you need a thread with an EventLoop. If your main thread is already built around QT/gtk you now need another thread and then have 2 eventloops which need to interact. Same question for Java frameworks, e.g. integrating a Netty EventLoop in another environment (Android, ...). In these languages we then often get libraries which are not generic for the whole language but specific to a parent IO library (works with asio, works with asyncio, ...) and thereby some fragmented ecosystems.
A standard question that also always arises in these "mixed-threaded" languages when you have an API which takes a callback is: from which thread will this callback be invoked? And if I cancel the operation from a thread, will it guarantee that the callback is not invoked? If you don't think about these you are often already in bug/race-condition land.
reply
quotemstr 9 days ago [-]
C++? Java? Python? The traditional thread model isn't bad merely because it's traditional. I much prefer it to promise hell and to async-everything. About the only thing that beats it is CSP, which you can also represent sequentially without funky new keywords and which you can implement as a library for C++, Java, or Python.
I never understood why people tout Go's goroutine feature so much. You can have it in literally any systems language.
reply
reality_czech 9 days ago [-]
The whole point of Golang is that every library and every project that uses Go will support coroutines and channels. Sure you can write a toy project in a language like C that has these concepts, but your toy library will effectively be unusable with all of the other libraries that have ever been written for C. Any library that calls a blocking function will break your coroutine abstraction.
It's like saying that indoor plumbing is no big deal-- it's just liquid moving through a pipe. Well yes. Yes, it is. But if you don't have plumbing in your neighborhood, or a sewage treatment plant in your city, you can't fake it by fooling around in your garage. And frankly, it's not going to smell like a rose.
reply
btrask 9 days ago [-]
I wrote such a library in C[1] and in practice it's been no problem. Most libraries that do IO provide hooks (for example I made SQLite fully async[2], with no changes to their code). For cases where that isn't possible (or desirable), there's also an easy way to temporarily move the entire fiber to a thread pool.[3] That's actually much faster than moving back and forth for every call (which is what AIO emulation normally entails).
[1] https://github.com/btrask/libasync [2] https://github.com/btrask/stronglink/blob/master/res/async_s... [3] https://github.com/btrask/libasync/blob/master/src/async.h#L...
Disclaimer: not production ready, for most values of "production"
Edit: stacks don't grow dynamically, of course. But that's also a problem in Go if you want to efficiently call C libraries. If you really need efficiency, you can use raw callbacks for that particular section.
reply
int_19h 9 days ago [-]
> The whole point of Golang is that every library and every project that uses Go will support coroutines and channels.
Of course, this also means that Go is making it hard for its libraries to be used by other languages. So it's probably a bad candidate to write something like a cross-platform UI toolkit, if you hope for its wide use.
In contrast, threads and callbacks are both well-supported in existing languages; so if you write a library in C using either, pretty much any language will be able to consume it.
reply
LukeShu 9 days ago [-]
(I'm not terribly familiar with Python's threading, so I'm not going to talk about it)
> I never understood why people tout Go's goroutine feature so much. You can have it in literally any systems language.
There are two big reasons for it.
Firstly, goroutines are extremely lightweight. "Traditional" threading in C, C++, and Java means native OS threads, which are comparatively expensive. Sure, fiber/coroutine libraries exist for these languages, but they are far from common (and, the only fiber library for Java that I know of, Quasar, came after Go).
Secondly, Go's ecosystem encourages CSP-style message-passing, rather than "traditional" memory-sharing. This is channels, not goroutines, but they make working with goroutines very nice. This is less concrete than the first reason; you certainly can implement message-passing in any of the other languages' threading styles. But empirically, it doesn't happen as often. A factor in this is also that, unfortunately, many CS curricula don't discuss CSP, which means that Go's use of this is the first exposure many programmers have to it.
reply
quotemstr 9 days ago [-]
> But empirically, it doesn't happen as often.
It's sad that people use choice-of-language as a proxy for choice-of-execution-strategy (interpreted? JITed?), choice-of-allocation-strategy, choice-of-linking-strategy, choice-of-packaging, and so on. All of these factors should be orthogonal. By linking them, we create a lot of inefficiency by fragmenting our efforts.
AFAICT, C++ is the only language that's really been successful at being multi-paradigm.
reply
jerf 8 days ago [-]
C++ still drives you very strongly in certain directions for the things you mentioned.
Languages have to hand you very strong default choices for those things, because only the people with the hardest problems and the most time to solve them can afford to pick up a toolbox and build their own toolbox to solve a problem. Even the languages that arguably want to be that low of a level like Rust or D still have to offer a much more batteries-included standard library that will make more of those choices for you, and which will be for the vast majority of users the "real" version of that language.
reply
lmm 9 days ago [-]
I use Scala without Akka. Just straightforward Futures and for/yield. It's great: the distinction between "=" and "<-" is minimal overhead when writing, but enough to be visible when reading code. You have to learn the tools for gathering effects (e.g. "traverse"), but you only have to learn them once (and they're generic for any effect, rather than being specific to Future, you can use the exact same functions to do error handling, audit logs and the like).
reply
mi100hael 8 days ago [-]
After using Akka-HTTP, I never want to write a HTTP service with anything else.
reply
lmm 8 days ago [-]
akka-http is nice. akka-actor (i.e. the project that was originally called "akka") is awful. The name overlap is unfortunate.
reply
jscholes 7 days ago [-]
In your opinion, what's wrong with akka-actor?
reply
lmm 6 days ago [-]
It sacrifices type safety without offering enough value to be worth that - especially given that the model also eliminates useful stack traces. It forces non-distributed applications to pay the price of an API designed for distributed ones. Its FSM model doesn't offer the conciseness it should.
reply
jackweirdy 9 days ago [-]
How do you define "Productive"?
Aside from that, personally I've used both Akka and plain Scala with Futures, as well as node with Promises, bare callbacks and async (though I've not tried fibers). I find Promises and Futures are the perfect balance between simplicity of use and the benefits of using the Async model. There's no need to reason about threads, as they abstract away the actual async implementation, and the interface they expose is very easy to reason about.
reply
dhd415 8 days ago [-]
I'm surprised there aren't more mentions of Tasks in C# or F# on the .NET platform as examples of asynchronicity done well.
From the perspective of uniformity and availability, while C# provided asynchronicity via callbacks before the introduction of Tasks in the 4.5 release of the .NET Framework, all the core libraries that used callback-style async (as well as some that had been strictly synchronous-only) were updated with Task-based overloads, so there are no problems with Task-based async being inconsistently available. Additionally, adoption of Task-based async in third-party libraries has been high, so it's relatively uncommon to encounter code that does not support it.
From the perspective of code productivity, it's hard to get much better than simply adding the async and await keywords where necessary. As a very simple example, consider a typical server application that receives requests via HTTP, processes them via an HTTP call to another service as well as a database call, and then returns an HTTP response. The sync code (blocking with a thread-per-request model) might look something like this:
void handleRequest(HttpRequest request) {
    var serviceResult = makeServiceCallForRequest(request);
    var databaseResult = makeDatabaseCallForRequest(request);
    sendResponse(constructResponse(request, serviceResult, databaseResult));
}

In order to make that same process async (non-blocking with a dynamically-sized thread pool handling all requests), the code would look like this:

async Task handleRequestAsync(HttpRequest request) {
    var serviceResult = await makeServiceCallForRequestAsync(request);
    var databaseResult = await makeDatabaseCallForRequestAsync(request);
    await sendResponseAsync(constructResponse(request, serviceResult, databaseResult));
}

It could even be taken one step further to make the service request and database call concurrently if there were no dependencies between the two which would reduce processing latency for individual requests:

async Task handleRequestAsync(HttpRequest request) {
    var serviceResultTask = makeServiceCallForRequestAsync(request);
    var databaseResultTask = makeDatabaseCallForRequestAsync(request);
    await sendResponseAsync(constructResponse(request, await serviceResultTask, await databaseResultTask));
}

I've added asynchronicity into a C# server application as above with substantial improvements in both individual request latency and overall scalability. I'm now working on a Java8 system and bemoaning the comparatively primitive and inconsistent async capabilities in Java8.
reply
HeyImAlex