good post: http://www.unlimitednovelty.com/2011/07/trouble-with-erlang-or-erlang-is-ghetto.html
--- the ingredients of Erlang: "
Fast process creation/destruction.
Ability to support » 10 000 concurrent processes with largely unchanged characteristics.
A programming model where processes are lightweight values -- and a good scheduler -- make concurrent programming much easier, in a similar way to garbage collection. It frees you from resource micro-management so you can spend more time reasoning about other things.
Fast asynchronous message passing.
This is what ZeroMQ gives you. But it gives you it in a form different to that of Erlang: in Erlang, processes are values and message-passing channels are anonymous; in ZeroMQ, channels are values and processes are anonymous. ZeroMQ is more like Go than Erlang. If you want the actor model (that Erlang is based on), you have to encode it in your language of choice, yourself.
Copying message-passing semantics (share-nothing concurrency).
Notably, Erlang enforces this. In other languages, shared memory and the trap of using it (usually unwittingly) doesn't go away.
Process monitoring.
Erlang comes with a substantial library, battle-tested over decades, for building highly concurrent, distributed, and fault-tolerant systems. Crucial to this is process monitoring -- notification of process termination. This allows sophisticated process management strategies; in particular, using supervisor hierarchies to firewall core parts of the system from more failure-prone parts of the system.
Selective message reception.
" http://www.rabbitmq.com/blog/2011/06/30/zeromq-erlang/ (from http://www.zeromq.org/whitepapers:multithreading-magic ) --
davidw 11 hours ago
| link |
I'm interested in tracking Go as a replacement for Erlang. Some things that should probably happen:
I think they'll get there eventually. Maybe not 100%, but 'good enough' sooner or later.
reply
jerf 7 hours ago
| link |
Right now, as near as I can tell, you basically can't implement an Erlang supervision tree in Go. You don't have anything like linking, which really has to be guaranteed by the runtime to work properly at scale, so bodging something together with some 'defer' really doesn't cut it. You also can't talk about "a goroutine", because you can't get a reference or a handle to one (no such thing), you can only get channels, but they aren't guaranteed to map in any particular way to goroutines.
I've done Erlang for years and Go for weeks, so I'm trying to withhold judgement, but I still feel like Erlang got it more right here; it's way easier to restrict yourself to a subset of Erlang that looks like a channel if that is desirable for some reason than to implement process features with channels in Go.
reply
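A minimal sketch of the kind of 'defer'-based bodge described above, assuming a hypothetical pair of helpers named monitor and supervise (neither is a standard library API). It reports a worker's termination over a channel and restarts it after a panic. Nothing here is guaranteed by the runtime: a goroutine started outside monitor is invisible to it, which is the gap between this and Erlang links.

```go
package main

import "fmt"

// exit describes how a monitored worker ended; err is nil for a normal return.
type exit struct {
	err error
}

// monitor (hypothetical helper) runs work in a new goroutine and reports its
// termination on the returned channel. It only sees workers started through it.
func monitor(work func()) <-chan exit {
	done := make(chan exit, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- exit{err: fmt.Errorf("worker panicked: %v", r)}
				return
			}
			done <- exit{}
		}()
		work()
	}()
	return done
}

// supervise (hypothetical helper) restarts work after each panic, up to
// maxRestarts times, and returns how many restarts it performed.
func supervise(work func(), maxRestarts int) (int, error) {
	restarts := 0
	for {
		e := <-monitor(work)
		if e.err == nil {
			return restarts, nil
		}
		if restarts == maxRestarts {
			return restarts, e.err
		}
		restarts++
	}
}

func main() {
	attempts := 0
	flaky := func() {
		attempts++
		if attempts < 3 {
			panic("boom") // fails on the first two attempts
		}
	}
	restarts, err := supervise(flaky, 5)
	fmt.Println(restarts, err)
}
```

The handle here is the channel returned by monitor, not the goroutine itself, so the supervisor and the worker have to agree on that convention by hand.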
JulianMorrison 8 hours ago
| link |
That issue is quite simply a misinterpretation of goroutines.
Erlang: perform a maximum number of reductions, then switch, or switch on IO. The number of reductions is cleverly adjusted so that a process which is swamping other processes will be throttled.
Go: switch on IO.
Go's design is much simpler, and closer to Ruby Fibers than to Erlang processes, except that goroutine scheduling can use multiple threads. To cooperatively switch without doing IO, call runtime.Gosched().
reply
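A small sketch of the cooperative yield mentioned above, using runtime.Gosched() from the standard library. The function name countWithYield is made up for illustration. The comment describes the scheduler as it was at the time; Go releases from 1.14 onward can also preempt goroutines asynchronously, so the explicit yield is a hint rather than a requirement there.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// countWithYield (hypothetical name) starts `workers` goroutines that each do
// `steps` units of CPU-only work and yield after every unit.
func countWithYield(workers, steps int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	total := 0
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < steps; i++ {
				mu.Lock()
				total++
				mu.Unlock()
				runtime.Gosched() // give up the processor without doing any I/O
			}
		}()
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(countWithYield(4, 1000))
}
```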
trailfox 10 hours ago
| link |
Akka is also a viable Erlang alternative.
reply
waffle_ss 6 hours ago
| link |
Looks like a nice library but I don't think it's a serious contender to replace Erlang because the JVM just isn't made for the level of concurrency that Erlang's VM is. Off the top of my head as an Erlang newbie:
[1]: http://doc.akka.io/docs/akka/snapshot/general/jmm.html
reply
An Erlang process has its own heap, so when it blows up, its state just goes away, leaving your remaining program's state untouched. With Go, there is no way to recover sanely; even if your goroutines are designed to copy memory, Go itself has a single heap.
Now, this is a very odd design decision for a language that claims it's designed for reliability. Perhaps Go's authors think it's better just for the entire program to die if a single goroutine falls over; well, that's one way, but it's a crude one. Erlang's design is simply better.
I wonder if Go can ever adopt per-goroutine heaps, or whether it's too late at this stage. I was happy to see that Rust has chosen to follow Erlang's design by having per-task heaps, even if all the pointer mechanics (three types of pointers, ownership transfer, reference lifecycles and so forth) result in some fairly intrusive and gnarly syntax.
reply
masklinn 1 day ago
| link |
> Go allows you to share memory between goroutines (i.e. concurrent code).
Go will share memory, by default, and special attention must be taken preventing or avoiding it. It's not an allowance.
> In fact, the Go team explicitly tells you not to do that
And yet they have refused to implement a correct model, even though they have no problem imposing their view when it fits them (and having the interpreter get special status in breaking them, see generics).
reply
burntsushi 1 day ago
| link |
> Go will share memory, by default, and special attention must be taken preventing or avoiding it.
Not really. If you use channels to communicate between goroutines, then the concurrency model is that of sequential processes, even if channels are implemented using shared memory under the hood.
That is, the default concurrency model militated by Go is not shared memory, but that of CSP. It's disingenuous to affix Go with the same kind of concurrency model used in C.
> And yet they have refused to implement a correct model
What is a correct model? Erlang's model isn't correct. It's just more safe.
> (and having the interpreter get special status in breaking them, see generics)
What's your point? Purity for purity's sake?
reply
masklinn 19 hours ago
| link |
> Not really. If you use channels to communicate between goroutines, then the concurrency model is that of sequential processes
Except since Go has support for neither immutable structures nor unique pointers, the objects passed through the channel can be mutable and keep being used by the sender. Go will not help you avoid this.
> That is, the default concurrency model militated by Go is not shared memory, but that of CSP. It's disingenuous to affix Go with the same kind of concurrency model used in C.
It's not: Go passes mutable objects over its channels and all goroutines share memory, so you get the exact same model by using queues in C.
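To make the aliasing point concrete, here is a small sketch with made-up names (order, sendAndKeepUsing, sendCopy). It uses a buffered channel inside a single goroutine so the outcome is deterministic; with a real second goroutine on the receiving end, the same mutation would be a data race.

```go
package main

import "fmt"

// order is a hypothetical message type with a mutable slice inside it.
type order struct {
	Items []string
}

// sendAndKeepUsing sends a pointer over a channel, then mutates the object it
// points to. It returns what the receiving side ends up seeing.
func sendAndKeepUsing() []string {
	ch := make(chan *order, 1)
	o := &order{Items: []string{"book"}}
	ch <- o                          // only the pointer is copied into the channel
	o.Items[0] = "changed by sender" // the sender still holds the same object
	received := <-ch
	return received.Items
}

// sendCopy copies the slice by hand before sending, which approximates
// Erlang's copying semantics but is a convention the compiler does not enforce.
func sendCopy() []string {
	ch := make(chan order, 1)
	o := order{Items: []string{"book"}}
	ch <- order{Items: append([]string(nil), o.Items...)}
	o.Items[0] = "changed by sender" // does not affect the copy in the channel
	received := <-ch
	return received.Items
}

func main() {
	fmt.Println(sendAndKeepUsing()) // [changed by sender]
	fmt.Println(sendCopy())         // [book]
}
```

Sending a struct by value is not enough on its own: slices, maps, and pointers inside it still alias the sender's data unless they are copied too.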
Erlang's core concept of concurrency seems like something that'd be better suited as a library and app server than a whole language and runtime.
I've yet to hear of any Erlang-specific magic that cannot be implemented inside another language.
reply
SoftwareMaven 1 day ago
| link |
How would you get per-actor heaps that cannot be violated by other actors? That is critical to Erlang's ability to recover from processes dying. I spent a lot of time doing Java and can't think how you could (you could in the JVM if you had language constructs for it, but then we are back to a new language).
There's a reason Stackless Python's actors aren't just a library on top of Python.
reply
reeses 11 hours ago
| link |
TLABs are a partial step in that direction at the JVM level. You could also use object pools/factories keyed to the thread.
Those are the first two "ghetto" hack solutions I can think of that wouldn't require significant code changes on a going-forward basis.
reply
SoftwareMaven 7 hours ago
| link |
But those hacks wouldn't provide the same guarantees that language-level changes provide. Sure, you can try not to impact other threads' heaps, but nothing is stopping me, which means a simple programming error has the potential to impact multiple threads. As a result, you can't just "reboot" that thread (a critical piece of what makes Erlang interesting), because you have no guarantees its errors didn't impact other threads. You also have no guarantees that the underlying libraries aren't mucking up all of your carefully crafted memory management.
It's like the kernel protecting memory so applications can't overwrite each other. Sure, applications could just write to their own memory, but nobody actually trusts that model[1]. Instead, they want something below that level enforcing good behavior.
reeses 11 hours ago
| link |
Obviously, you can do the same things in Java, as people have demonstrated with alternative languages that target the JVM.
It's the expressiveness at the language level that is really the "magic". For example, doing the equivalent of OO is not intuitive in Erlang, but completely possible (actually easy, but it looks...wrong) whereas it's supported by every Java tool. By the same token, pattern-matched message passing, lightweight green threads, and hot code deployment are primary concepts in Erlang.
reply
dumael 12 hours ago
| link |
You can at best implement a mimicry of Erlang's message passing in Java.
With sufficient effort, you can have the equivalent of no shared mutable data.
What you cannot have is completely separate heaps, so that if one thread crashes for whatever reason it doesn't take your application down.
Also, good luck trying to find a garbage collector that supports completely separate heaps that isn't a direct copy/near-identical implementation of the BEAM[1] VM's GC.
[1] The virtual machine that is the stock VM for Erlang. (In fact I don't know of any others but I have never looked.)
reply
reeses 11 hours ago
| link |
We've been doing it in Java since the nasty old days of RMI or "our own weird RPC over HTTP implementation". Lots of lightweight services with a supervisor to manage them and a directory to find them.
Now it's even easier with ESBs. Write a ten-line grails service to expose a bin-packing facility with Drools and then never touch it again.
reply
Evbn 1 day ago
| link |
Java's syntactic overhead stops most humans. "final" everywhere, big method names for every primitive message passing operation, etc
reply
---
seanmcdirmid 1 day ago
| link |
Erlang is not designed for parallel programming; it is designed for concurrent programming. These are two very different programming domains with different problems.
Every time someone conflates parallelism with concurrency...everyone gets very confused.
reply
unoti 23 hours ago
| link |
Isn't it really fair to say that it's designed for both? The way it uses immutable state and something-similar-to-s-expressions to express data makes it very straightforward (or even transparent) to distribute work between multiple processes and separate computers, in addition to how it makes it practical and simple to break work into small chunks that can be interleaved easily within the same thread. It's really designed for doing both very well, wouldn't you say?
reply
seanmcdirmid 23 hours ago
| link |
Not at all. Erlang isn't useful for modern parallel computing as we know it, which is usually done as some kind of SIMD program; say MapReduce or GPGPU using something like CUDA. The benefit doesn't just come from operating on data all at once, but these systems (or the programmer) also do a lot of work to optimize the I/O and cache characteristics of the computation.
Actor architectures are only useful for task parallelism which no one really knows how to get much out of; definitely not the close-to-linear performance benefits we can get from data parallelism. Task parallelism is much better for when you have to do multiple things at once (more efficient concurrency), not for when you want to make a sequential task faster.
Maybe this will help
http://jlouisramblings.blogspot.com/2011/07/erlangs-parallel...
and
https://news.ycombinator.com/item?id=2726661
reply
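A rough sketch of the data-parallel shape described above: the same operation applied to partitions of one data set, with the partial results combined at the end. The name parallelSumSquares is invented for illustration, and goroutines on a CPU are only a stand-in here; this is not SIMD, MapReduce, or GPGPU code, and it does none of the I/O or cache tuning the comment mentions.

```go
package main

import (
	"fmt"
	"sync"
)

// parallelSumSquares (hypothetical name) splits data into contiguous chunks,
// squares and sums each chunk in its own goroutine, then adds the partial sums.
func parallelSumSquares(data []int, chunks int) int {
	if chunks < 1 {
		chunks = 1
	}
	partial := make([]int, chunks) // one slot per chunk, so no locking is needed
	size := (len(data) + chunks - 1) / chunks
	var wg sync.WaitGroup
	for c := 0; c < chunks; c++ {
		lo := c * size
		if lo >= len(data) {
			break
		}
		hi := lo + size
		if hi > len(data) {
			hi = len(data)
		}
		wg.Add(1)
		go func(c, lo, hi int) {
			defer wg.Done()
			for _, v := range data[lo:hi] {
				partial[c] += v * v
			}
		}(c, lo, hi)
	}
	wg.Wait()
	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	data := make([]int, 1000)
	for i := range data {
		data[i] = i + 1
	}
	fmt.Println(parallelSumSquares(data, 4))
}
```

Every worker runs identical code on a different slice of the input, which is what distinguishes this from task parallelism, where each worker does a different job.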
reeses 7 hours ago
| link |
SIMD is a specialized form of parallelism. It is not the only definition of the term.
It should also be clear that task parallelism (or concurrency from your perspective) has not had the benefit of billions of engineer-hours focused on improving its performance. It is within recent memory that if you wanted 20+ CPUs at your disposal, you'd have to build a cluster with explicit job management, topologically-optimized communications, and a fair amount of physical redundancy.
As many of the applications requiring low-end clusters tended to involve random numbers or floating point calculations, we also had the annoyance of minor discrepancies such as clock drift affecting the final output. This would present, for example, in a proportional percentage of video frames with conspicuously different coloration.
seanmcdirmid 4 hours ago
| link |
Task parallelism was something we used to work on 20 years ago when we thought it was the solution to scaling. But then we found that the supercomputer people were right all along, that the only thing that really scales very well is data parallelism. So the focus in the last 5/10 years has been finding data parallel solutions to the problems we care about (say deep neural network training), and then mapping them to either a distributed pipeline (MapReduce) or GPU solution.
> It is within recent memory that if you wanted 20+ CPUs at your disposal, you'd have to build a cluster with explicit job management, topologically-optimized communications, and a fair amount of physical redundancy.
You are still thinking about concurrency, not parallelism. Yes, the cluster people had to think this way; they were interested in performance for processing many jobs. No, the HPC people who needed performance never thought like this; they were only interested in the performance of one job.
> As many of the applications requiring low-end clusters tended to involve random numbers or floating point calculations, we also had the annoyance of minor discrepancies such as clock drift affecting the final output.
Part of the problem, I think, is that we've been confused for a long time. Our PHBs saw problems (say massive video frame processing) and saw solutions that were completely inappropriate for it (cluster computing). It's only recently that we've realized there are often other/better options (like running MapReduce