proj-oot-ootCompilerNotes1

" I use Vim plugin called vim-godef. It is sort of like ctags, but better. It lets you jump to definition within your codebase as well as Go source. So you are just 'gd' away from learning what your function takes and returns. "

--

" >Also, if you want to refactor an interface, you have to go find all of its (undeclared) implementations more or less by hand.

Or use gofmt -r: http://golang.org/cmd/gofmt/ " --

"As a stand-in for real "X implements Y" declarations, you can always add various cheap statements that cause a compile-time interface-satisfaction check if there weren't such statements already--different phrasings I was playing with last night: http://play.golang.org/p/lUZtDdP5ia "

--

" ekmett Says:

September 20, 2013 at 2:19 pm

Michael Moser,

You’d be surprised.

Due to strict control over side-effects GHC is very good at rejiggering code into a form that does few to no allocations for the kinds of things you usually turn into an inner loop.

Supercompilation (http://community.haskell.org/~ndm/downloads/paper-supero_making_haskell_faster-27_sep_2007.pdf) can even “beat C” in some cases.

Stream fusion (http://metagraph.org/papers/stream_fusion.pdf) and the worker-wrapper transformation (http://www.cs.nott.ac.uk/~gmh/wrapper.pdf) are used heavily in the compiler and core libraries to make it so idiomatic haskell doesn’t have to be slow.

The vector library uses stream fusion to generate rather impressive unrolled loops. This is getting better now with Geoffrey Mainland’s work (http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/haskell-beats-C.pdf). Now idiomatic Haskell code like:

dot v w = mfold’ (+) 0 (mzipWith (*) v w)

is starting to even beat hand-rolled C that uses SSE primitives!

The key to being able to do these rewrites on your code is the freedom from arbitrary side-effects. That lets the compiler move code around in ways that would be terrifying and unverifiable in a C/C++ compiler. "

--

hmm, seems like we really need pure fns by default then? or at least a way for there to be no side effects by default within loops?

--

" waps 1 day ago

link

But there are several things java insists on that are going to cost you in performance in java that are very, very difficult to fix.

1) UTF16 strings. Ever notice how sticking to byte[] arrays (which is a pain in the ass) can double performance in java ? C++ supports everything by default. Latin1, UTF-8, UTF-16, UTF-32, ... with sane defaults, and supports the full set of string operations on all of them. I have a program that caches a lot of string data. The java version is complete, but uses >10G of memory, where the C++ version storing the same data uses <3G.

2) Pointers everywhere. Pointers, pointers and yet more pointers, and more than that still. So datastructures in java will never match their equivalent in C++ in lookup speeds. Plus, in C++ you can do intrusive datastructures (not pretty, but works), which really wipe the floor with Java's structures. If you intend to store objects with lots of subobjects, this will bit you. As this wasn't bad enough java objects feel the need to store metadata, whereas C++ objects pretty much are what you declared them to be (the overhead comes from malloc, not from the language), unless you declared virtual member functions, in which case there's one pointer in there. In Java, it may (Sadly) be worth it to not have one object contain another, but rather copy all fields from the contained object into the parent object. You lose the benefits of typing (esp. since using an interface for this will eliminate your gains), but it does accelerate things by keeping both things together in memory.

3) Startup time. It's much improved in java 6, and again in java 7, but it's nowhere near C++ startup time.

4) Getting in and out of java is expensive. (Whereas in C++, jumping from one application into a .dll or a .so is about as expensive as a virtual method call)

5) Bounds checks. On every single non-primitive memory access at least one bounds check is done. This is insane. "int[5] a; a[3] = 2;" is 2 assembly instructions in C++, almost 20 in java. More importantly, it's one memory access in C++, it's 2 in java (and that's ignoring the fact that java writes type information into the object too, if that were counted, it'd be far worse). Java still hasn't picked up on Coq's tricks (you prove, mathematically, what the bounds of a loop variable are, then you try to prove the array is at least that big. If that succeeds -> no bounds checks).

6) Memory usage, in general. I believe this is mostly a consequence of 1) and 2), but in general java apps use a crapload more memory than their C++ equivalents (normal programs, written by normal programmers)

7) You can't do things like "mmap this file and return me an array of ComplicatedObject?[]" instances.

But yes, in raw number performance, avoiding all the above problems, java does match C++. There actually are (contrived) cases where java will beat C++. Normal C++ that is. In C++ you can write self-modifying code that can do the same optimizations a JIT can do, and can ignore safety (after proving to yourself what you're doing is actually safe, of course).

Of course java has the big advantage of having fewer surprises. But over time I tend to work on programs making this evolution : python/perl/matlab/mathematica -> java -> C++. Each transition will yield at least a factor 2 difference in performance, often more. Surprisingly the "java" phase tends to be the phase where new features are implemented, cause you can't beat Java's refactoring tools.

Pyton/Mathematica have the advantage that you can express many algorithms as an expression chain, which is really, really fast to change. "Get the results from database query X", get out fields x,y, and z, compare with other array this-and-that, sort the result, and get me the grouped counts of field b, and graph me a histogram of the result -> 1 or 2 lines of (not very readable) code. When designing a new program from scratch, you wouldn't believe how much time this saves. IPython notebook FTW !

reply

upvote

PaulHoule?