(Simon Marlow on GHC:)
simonmar on Sept 4, 2010
Oh yes, immutability is crucial. Generational GC already makes you pay for mutation with a write barrier, and in our parallel GC we omit the locking when copying immutable objects, accepting that a few might get copied twice. In the new GC mutable objects become even more expensive. I don't think local GC is viable at all in a language with ubiquitous mutation.
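a tiny sketch of the write-barrier cost being described above: under a generational GC every pointer store into a mutable object has to be checked so that old-to-young edges get recorded, while immutable objects never pay this after construction. illustrative C only, not GHC's actual code:

#include <stddef.h>

typedef struct Obj Obj;
struct Obj {
    int generation;      /* 0 = young (nursery), 1 = old */
    Obj *fields[4];      /* payload pointers; size is arbitrary for the sketch */
};

/* the "remembered set": old objects that may now point at young ones.
   real collectors use card tables or per-object dirty lists instead of
   a fixed array. */
static Obj *remembered_set[1024];
static size_t remembered_count = 0;

static void remember(Obj *o) {
    if (remembered_count < 1024)
        remembered_set[remembered_count++] = o;
}

/* the write barrier: every mutating pointer store goes through this, so a
   minor GC can treat the remembered set as extra roots. immutable objects
   never run it after construction, which is why immutability is cheap here. */
static void write_field(Obj *dst, int i, Obj *value) {
    if (dst->generation > 0 && value != NULL && value->generation == 0)
        remember(dst);       /* old -> young edge: record it */
    dst->fields[i] = value;  /* the actual store */
}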
---
Roboprog on Sept 3, 2010
An interesting approach: giving each thread its own "young generation" sub-heap, so transient objects can be disposed of without coordination from other threads / CPUs and their cache pages.
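a sketch of the per-thread "young generation" idea: each thread bump-allocates out of its own nursery chunk, so the allocation fast path (and a minor collection of just that nursery) needs no coordination with other threads or their cache lines. names and sizes are invented for illustration:

#include <stdlib.h>
#include <stdint.h>
#include <stddef.h>

#define NURSERY_BYTES (512 * 1024)

typedef struct {
    uint8_t *start;   /* base of this thread's nursery */
    uint8_t *free;    /* next free byte (bump pointer) */
    uint8_t *limit;   /* end of the nursery */
} Nursery;

/* one nursery per thread; no other thread ever touches it */
static _Thread_local Nursery nursery;

static void nursery_init(void) {
    nursery.start = malloc(NURSERY_BYTES);
    nursery.free  = nursery.start;
    nursery.limit = nursery.start + NURSERY_BYTES;
}

/* fast-path allocation: a pointer bump and a bounds check, no locking */
static void *nursery_alloc(size_t bytes) {
    bytes = (bytes + 7) & ~(size_t)7;   /* keep 8-byte alignment */
    if (nursery.free + bytes > nursery.limit)
        return NULL;  /* nursery full: a real runtime would run a minor GC
                         of just this thread's nursery here */
    void *p = nursery.free;
    nursery.free += bytes;
    return p;
}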
scott_s on Sept 3, 2010
I worked on a memory allocator (as in malloc, not garbage collection) that takes a similar approach: http://people.cs.vt.edu/~scschnei/streamflow/
A group at Intel independently came up with a similar approach as well: http://portal.acm.org/citation.cfm?id=1133967
Roboprog on Sept 6, 2010
Cool, the Streamflow thing sounds interesting. Is there a top-level example or test-driver somewhere in the github project showing what typical use-cases are?
E.g. - I have a test-driver here: http://github.com/roboprog/buzzard/blob/master/test/src/main... (although I have barely started the library I was tinkering on)
Any example client program for your allocator? I'd like to see what use cases you are handling.
scott_s on Sept 6, 2010
The Larson and Recycle benchmarks are on github. You can read about them in the paper. Email me if you'd like to see an unpublished paper which has some more detail on the allocator's design.
larsberg on Sept 3, 2010
We've been doing this in Manticore since 2008 or so. We couldn't really get speedups past 12 cores without it (we have a "typical" parallel GC implemented as well to test against). Hopefully we'll get the paper on the GC and associated language trickery - we don't allow pointers between young generations - in somewhere soon :)
simonmar on Sept 3, 2010
Yes, the GHC design has certainly been influenced by Manticore (that was one of the "other designs" I referred to). Though in GHC we do have some different problems to solve, the worst of which is that we have to support a bunch of programming abstractions that use mutation.
---
interesting (unimplemented?) PEP on Python bytecode verification: http://legacy.python.org/dev/peps/pep-0330/
---
some things we might want to use machine registers for when building an OVM interpreter in assembly:
---
"
dyncall library
The dyncall library encapsulates architecture-, OS- and compiler-specific function call semantics in a virtual 'bind argument parameters from left to right and then call' interface allowing programmers to call C functions in a completely dynamic manner. In other words, instead of calling a function directly, the dyncall library provides a mechanism to push the function parameters manually and to issue the call afterwards. This means that a program can determine at runtime what function to call, and what parameters to pass to it. The library is written in C and assembly and provides a very simple C interface to program against. "
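roughly what using dyncall looks like (adapted from its documentation; double-check the exact names against the headers of whatever version you build against):

#include <stdio.h>
#include <dyncall.h>

static double scale(int n, double x) { return n * x; }

int main(void) {
    DCCallVM *vm = dcNewCallVM(4096);        /* scratch space for arguments */
    dcMode(vm, DC_CALL_C_DEFAULT);           /* plain C calling convention */

    dcReset(vm);                             /* start a fresh argument list */
    dcArgInt(vm, 3);                         /* bind parameters left to right */
    dcArgDouble(vm, 2.5);
    double r = dcCallDouble(vm, (DCpointer)&scale);  /* issue the call */

    printf("%f\n", r);                       /* prints 7.500000 */
    dcFree(vm);
    return 0;
}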
---
i think i already noted this somewhere but some of these comments are a good read:
https://news.ycombinator.com/item?id=10032295
---
"
There are a number of papers on different kinds of dispatch:
M. Anton Ertl and David Gregg, Optimizing Indirect Branch Prediction Accuracy in Virtual Machine Interpreters, in Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation (PLDI 03), pp. 278-288, San Diego, California, June 2003.
M. Anton Ertl and David Gregg, The behaviour of efficient virtual machine interpreters on modern architectures, in Proceedings of the 7th European Conference on Parallel Computing (Europar 2001), pp. 403-412, LNCS 2150, Manchester, August 2001.
An excellent summary is provided by Yunhe Shi in his PhD thesis.
Also, someone discovered a new technique a few years ago which is valid ANSI C.
"
---
Doug's C# parser framework:
https://github.com/drubino/Linq.Parsers
---
"...can use FASM, which is self-hosting, if I can't compile NASM or YASM. " [1]
---
" Context-free grammars. What this really means is the code should be parsable without having to look things up in a symbol table. C++ is famously not a context-free grammar. A context-free grammar, besides making things a lot simpler, means that IDEs can do syntax highlighting without integrating most of a compiler front end. As a result, third-party tools become much more likely to exist. Redundancy. Yes, the grammar should be redundant. You've all heard people say that statement terminating ; are not necessary because the compiler can figure it out. That's true — but such non-redundancy makes for incomprehensible error messages. Consider a syntax with no redundancy: Any random sequence of characters would then be a valid program. No error messages are even possible. A good syntax needs redundancy in order to diagnose errors and give good error messages. "
"The first tool that beginning compiler writers often reach for is regex. Regex is just the wrong tool for lexing and parsing. Rob Pike explains why reasonably well."
" The philosophies of error message handling are:
Print the first message and quit. This is, of course, the simplest approach, and it works surprisingly well. Most compilers' follow-on messages are so bad that the practical programmer ignores all but the first one anyway. The holy grail is to find all the actual errors in one compile pass, leading to:
Guess what the programmer intended, repair the syntax trees, and continue. This is an ever-popular approach. I've tried it indefatigably for decades, and it's just been a miserable failure. The compiler seems to always guess wrong, and subsequent messages with the "fixed" syntax trees are just ludicrously wrong.
The poisoning approach. This is much like how floating-point NaNs are handled. Any operation with a NaN operand silently results in a NaN. Applying this to error recovery, any construct that has a leaf for which an error occurred is itself considered erroneous (but no additional error messages are emitted for it). Hence, the compiler is able to detect multiple errors as long as the errors are in sections of code with no dependency between them. This is the approach we've been using in the D compiler, and are very pleased with the results."
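a sketch of the poisoning idea on a toy AST: a node with a poisoned child becomes poisoned silently, and only the node where an error is first detected reports it. all names invented:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <stddef.h>

typedef struct Node {
    const char *op;        /* "+", "*", "num", "badtoken", ... */
    bool erroneous;        /* the poison flag */
    struct Node *left, *right;
} Node;

static void check(Node *n) {
    if (n == NULL) return;
    check(n->left);
    check(n->right);

    /* propagate poison upward silently: no second message for this subtree */
    if ((n->left && n->left->erroneous) || (n->right && n->right->erroneous)) {
        n->erroneous = true;
        return;
    }
    /* a genuinely new error: report it once and poison the node */
    if (n->op && strcmp(n->op, "badtoken") == 0) {
        fprintf(stderr, "error: unexpected token\n");
        n->erroneous = true;
    }
}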
" Runtime Library
Rarely mentioned, but critical, is the need to write a runtime library. This is a major project. It will serve as a demonstration of how the language features work, so it had better be good. Some critical things to get right include:
I/O performance. Most programs spend a lot of time in I/O. Slow I/O will make the whole language look bad. The benchmark is C stdio. If the language has elegant, lovely I/O APIs, but runs at only half the speed of C I/O, then it just isn't going to be attractive.
Memory allocation. A high percentage of time in most programs is spent doing mundane memory allocation. Get this wrong at your peril.
Transcendental functions. OK, I lied. Nobody cares about the accuracy of transcendental functions, they only care about their speed. My proof comes from trying to port the D runtime library to different platforms, and discovering that the underlying C transcendental functions often fail the accuracy tests in the D library test suite. C library functions also often do a poor job handling the arcana of the IEEE floating-point bestiary — NaNs, infinities, subnormals, negative 0, etc. In D, we compensated by implementing the transcendental functions ourselves. Transcendental floating-point code is pretty tricky and arcane to write, so I'd recommend finding an existing library you can license and adapting that.
A common trap people fall into with standard libraries is filling them up with trivia. Trivia is sand clogging the gears and just dead weight that has to be carried around forever. My general rule is if the explanation for what the function does is more lines than the implementation code, then the function is likely trivia and should be booted out. "
---
" String I/O should be unicode-aware & support utf-8. Binary I/O should exist. Console I/O is nice, and you should support it if only for the sake of having a REPL with readline-like features. Basically all of this can be done by making your built-in functions wrappers around the appropriate safe I/O functions from whatever language you’re building on top of (even C, although I wouldn’t recommend it). It’s no longer acceptable to expect strings to be zero-terminated rather than length-prefixed. It’s no longer acceptable to have strings default to ascii encoding instead of unicode. In addition to supporting unicode strings, you should also probably support byte strings, something like a list or array (preferably with nesting), and dictionaries/associative arrays. It’s okay to make your list type do double-duty as your stack and queue types and to make dictionaries act as classes and objects. Good support for ranges/spans on lists and strings is very useful. If you expect your language to do string processing, built-in regex is important. If you provide support for parallelism that’s easier to manage than mutexes, your developers will thank you. While implicit parallelism can be hard to implement in imperative languages (much easier in functional or pure-OO languages), even providing support for thread pools, a parallel map/apply function, or piping data between independent threads (like in goroutines or the unix shell) would help lower the bar for parallelism support. Make sure you have good support for importing third party packages/modules, both in your language and in some other language. Compiled languages should make it easy to write extensions in C (and you’ll probably be writing most of your built-ins this way anyway). If you’re writing your interpreted language in another interpreted language (as I did with Mycroft) then make sure you expose some facility to add built-in functions in that language. For any interpreted language, a REPL with a good built-in online help system is a must. Users who can’t even try out your language without a lot of effort will be resistant to using it at all, whereas a simple built-in help system can turn exploration of a new language into an adventure. Any documentation you have written for core or built-in features (including documentation on internal behavior) should be assessible from the REPL. This is easy to implement (see Mycroft’s implementation of online help) and is at least as useful for the harried language developer as for the new user. "
---
ufo 7 days ago
Deoptimization is actually really hard to implement if you have an ahead-of-time compiler like Cannoli. You need to get all the stuff that is living in machine registers or in the C stack and then convert it back to whatever representation your generic interpreter uses.
I think this is actually one of the things that gets most in the way if you want to use traditional AOT compiler technology (like gcc or LLVM) to implement a JIT. In state-of-the-art JIT compilers this part is always a nest of highly complex and nonportable assembly language.
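a sketch of the bookkeeping being described: for each machine-code point where we might bail out, the compiler records where every interpreter-level local lives (register, stack slot, or constant) so an interpreter frame can be rebuilt. everything here is invented to illustrate the idea; real JITs need much more machinery, including the nonportable code that actually captures the registers:

#include <stdint.h>
#include <stdio.h>

typedef enum { LOC_REGISTER, LOC_STACK_SLOT, LOC_CONSTANT } LocKind;

typedef struct {
    LocKind kind;
    int     index;     /* register number, stack-slot offset, or constant value */
} ValueLocation;

typedef struct {
    uint32_t bytecode_pc;          /* where the interpreter should resume */
    int      nlocals;
    const ValueLocation *locals;   /* one entry per interpreter local */
} DeoptPoint;

/* a snapshot of machine state captured at the bail-out site */
typedef struct {
    int64_t regs[16];
    const int64_t *stack;          /* base of the C-stack area we saved */
} MachineState;

/* rebuild the interpreter's view of the locals from machine state */
static void materialize_frame(const DeoptPoint *dp, const MachineState *ms,
                              int64_t *interp_locals) {
    for (int i = 0; i < dp->nlocals; i++) {
        switch (dp->locals[i].kind) {
        case LOC_REGISTER:   interp_locals[i] = ms->regs[dp->locals[i].index];  break;
        case LOC_STACK_SLOT: interp_locals[i] = ms->stack[dp->locals[i].index]; break;
        case LOC_CONSTANT:   interp_locals[i] = dp->locals[i].index;            break;
        }
    }
    printf("resuming interpreter at bytecode pc %u\n", (unsigned)dp->bytecode_pc);
}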
---
tathougies 7 days ago
Compiling to machine code is not a panacea for optimization. An optimizing JIT compiler is going to blow an AOT compiler out of the water. Being smart about the machine code generated is significantly more important than generating machine code. In particular, PyPy makes several optimizations over Python code that a more direct implementation of CPython at the machine level probably wouldn't. For example, PyPy erases dictionary lookups for object member access if the object shape is statically known. Given how prevalent this kind of lookup is in Python code, it's possible that even an interpreter that made this optimization would be faster than a machine code version that used an actual hash table.
I think this compiler also makes this particular optimization, but this is just one of many, many optimizations PyPy does. I imagine that with sufficient work, this compiler could be brought up to speed with PyPy, but as it stands right now, PyPy
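a sketch of the "shape" / hidden-class idea mentioned above: objects with the same attribute layout share a shape mapping names to slot indices, and an inline cache at the access site turns repeated lookups into an array index. illustration only, not PyPy's implementation:

#include <stdio.h>
#include <string.h>

typedef struct {
    int nattrs;
    const char *names[8];    /* attribute name for each slot */
} Shape;

typedef struct {
    const Shape *shape;
    double slots[8];         /* attribute values, indexed by slot */
} Object;

/* slow path: search the shape for the name (the "dictionary lookup") */
static int lookup_slot(const Shape *shape, const char *name) {
    for (int i = 0; i < shape->nattrs; i++)
        if (strcmp(shape->names[i], name) == 0)
            return i;
    return -1;
}

/* an inline cache for one access site: remembers the last shape seen and
   the slot it resolved to, so repeated hits skip the search entirely */
typedef struct {
    const Shape *cached_shape;
    int cached_slot;
} InlineCache;

static double get_attr(Object *obj, const char *name, InlineCache *ic) {
    if (ic->cached_shape == obj->shape)          /* fast path: shape matches */
        return obj->slots[ic->cached_slot];
    int slot = lookup_slot(obj->shape, name);    /* slow path, then cache it */
    ic->cached_shape = obj->shape;
    ic->cached_slot = slot;
    return obj->slots[slot];
}

int main(void) {
    static const Shape point_shape = { 2, { "x", "y" } };
    Object p = { &point_shape, { 1.0, 2.0 } };
    InlineCache ic = { NULL, -1 };
    printf("%f %f\n", get_attr(&p, "y", &ic), get_attr(&p, "y", &ic));
    return 0;
}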