proj-oot-ootCompilerNotes2

https://wiki.alopex.li/LanguageCompilationSpeed

summary table of loc/sec (language, compiler, loc/sec):

language   compiler   loc/sec
C++        mrustc     240
rust       rustc      350
rust       mrustc     360
C++        clang      400
C/C++      gcc        5k
C/C++      clang      5k
C          lcc        5k
C          chibicc    7-15k
C          tcc        10-125k
pascal     fpc        45k
go         go         44k
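
To put these rates in wall-clock terms, a quick back-of-the-envelope calculation (the 100,000-line project size below is just an assumption for illustration):

    # rough wall-clock time to compile an assumed 100,000-line project,
    # single-threaded, at a few of the rates from the table above
    rates_loc_per_sec = {
        "rustc (rust)": 350,
        "gcc (C/C++)": 5_000,
        "fpc (pascal)": 45_000,
    }
    LINES = 100_000  # assumed project size, illustrative only
    for name, rate in rates_loc_per_sec.items():
        print(f"{name:>15}: {LINES / rate:7.1f} s")
    # -> rustc ~285.7 s, gcc ~20.0 s, fpc ~2.2 s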

conclusions:

---

matklad, quoting the comment he's replying to:

    rustc is also multithreaded. Not sure about go but C/C++ compilers aren’t, so the gap is even bigger.

That’s not super true I believe. rustc front-end is not parallel. There were “parallel compiler” efforts a couple of years ago, but they are stalled. What is parallel is LLVM-side code generation — rustc can split the LLVM IR for a crate into several chunks (codegen units) and let LLVM compile them in parallel. On a higher level, C++ builds tend to exhibit better parallelism than Rust builds: because of header files, C++ compilation is embarrassingly parallel, while Rust compilation is shaped as a DAG (although the recent pipelined builds helped with shortening the critical path significantly). This particular benchmark sets -j 1.
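
To make the "shape" argument concrete, here is a toy critical-path calculation; the crate/file names and timings are invented, and unlimited parallel workers are assumed:

    # toy illustration of how dependency shape limits build parallelism;
    # all file/crate names and timings below are invented
    from functools import cache

    # "C++ style": independent translation units (compile time in seconds)
    cpp_units = {"a.cpp": 3, "b.cpp": 2, "c.cpp": 4, "d.cpp": 1}

    # "Rust style": a crate DAG; a crate can only start after its deps finish
    rust_crates = {
        "serde": (3, []),
        "log":   (1, []),
        "mylib": (2, ["serde", "log"]),
        "myapp": (4, ["mylib"]),
    }

    @cache
    def finish_time(crate):
        # earliest finish = own compile time + longest dependency chain
        t, deps = rust_crates[crate]
        return t + max((finish_time(d) for d in deps), default=0)

    # with unlimited workers, independent units are bounded by the largest
    # unit, while the DAG is bounded by its critical path
    print("independent units:", max(cpp_units.values()))           # 4
    print("crate DAG:", max(finish_time(c) for c in rust_crates))  # 9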

---

viraptor on March 3, 2020 [–]

> when python code really fails, e.g. in a context with threading, you might as well burn the whole thing to the ground

This sounds weird. Why burn the whole thing to the ground? You've got frames from all threads available. Why would it take days to resolve the issue? Why do you think it's easier to resolve it in common lisp?

hyperion2010 on March 4, 2020 [–]

I think I need to unpack what I mean by 'really fails' to capture what I was trying to convey. I deal with Python programs running in a number of different environments, and there are some where literally all you have on hand are the libs you brought with you. Maybe that is an oversight on my part, but the reality is that in many cases this means that I am just going to restart the daemon and hope the problem goes away, I don't have the time to manually instrument the system to see what was going on. I shudder to imagine having to debug a failure from some insane pip freeze running on a Windows system with the runtime packaged along with it.

Worst case for CL means that at the very least I don't have to wonder if gdb is installed on the system. It provides a level of assurance and certainty that vastly simplifies the decision making around what to do when something goes wrong.

To be entirely fair, the introduction of breakpoint in 3.7 has simplified my life immensely -- unless I run into a system still on 3.6. Oops! I use pudb with that, and the number of uncovered, insane, and broken edge cases when using it on random systems running in different contexts is one of the reasons I am starting no new projects in Python. When I want to debug a problem that occurred in a subprocess (because the gil actually is a good thing) there is a certain perverse absurdity of watching your keyboard inputs go to a random stdin so that you can't even C-d out of your situation. Should I ever be in this situation? Well, the analogy is trying to use a hammer to pound in nailgun nails and discovering that doing such a thing opens a portal to the realm of eternal screaming -- a + b = pick your favorite extremely nonlinear unexpected process that is most definitely not addition. You can do lots of amazing things in Python, but you do them at your own peril. (Disclosure: see some of my old posts for similar rants.)
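
As an aside on the breakpoint() hook mentioned above (Python 3.7+, PEP 553): it calls whatever sys.breakpointhook points at, pdb by default; the parse_record function below is just a made-up example:

    # breakpoint() calls sys.breakpointhook, which is pdb.set_trace by
    # default; it can be redirected or disabled via the PYTHONBREAKPOINT
    # environment variable, e.g.
    #   PYTHONBREAKPOINT=pudb.set_trace python app.py   # use pudb instead
    #   PYTHONBREAKPOINT=0 python app.py                # ignore breakpoint()

    def parse_record(line):        # made-up example function
        fields = line.split(",")
        if len(fields) != 3:
            breakpoint()           # drop into the debugger right here
        return fields

    if __name__ == "__main__":
        parse_record("only,two")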

---

zig's compiler caching system:

https://ziglang.org/download/0.4.0/release-notes.html#Build-Artifact-Caching

also "I want to point out that this caching system is not some fluffy bloated feature - rather it is an absolutely critical component to making cross-compiling work in a usable manner. As we'll see below, other compilers ship with pre-compiled, target-specific binaries, while Zig ships with source code only and cross-compiles on-the-fly, caching the result."

---

how zig does cross-compilation: https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html section 'Under the Hood'

---

" You Can’t Reproduce Someone Else’s Rust Build

A final nit I have about Rust is that builds are not reproducible between different computers (they are at least reproducible between builds on the same machine if we disable the embedded timestamp that I put into Xous for $reasons).

I think this is primarily because Rust pulls in the full path to the source code as part of the panic and debug strings that are built into the binary. This has led to uncomfortable situations where we have had builds that worked on Windows, but failed under Linux, because our path names are very different lengths on the two and it would cause some memory objects to be shifted around in target memory. To be fair, those failures were all due to bugs we had in Xous, which have since been fixed. But, it just doesn’t feel good to know that we’re eventually going to have users who report bugs to us that we can’t reproduce because they have a different path on their build system compared to ours. It’s also a problem for users who want to audit our releases by building their own version and comparing the hashes against ours. " [2]

---

" Let's consider build systems. The classic build system is make. In a Makefile you define a set of targets with commands to build those targets and dependencies on other targets that must be built first. It is explicitly a directed, acyclic graph of targets.

In practice, we have a set of conventions for some named targets we expect to find in a Makefile, like 'all' and 'test' and 'clean.' But there's no guarantee that those exist, and even if they do, there's no guarantee that they do anything sensible. We would expect a test target in a Java project to compile the code then run JUnit. If it doesn't? Too bad.

It would make much more sense for all Java projects to share their basic behavior, to all have a 'compile' target that builds bytecode, a 'test' target that runs all unit tests, and a 'package' target that builds a jar.

Inevitably, though, we will have local differences, such as "Oh, we need to add this option when we run JUnit" or "There's an extra step here where we run a linter."

If we can specify common behavior plus local differences, that makes it much easier for someone coming into the project. A quick glance at the base behavior in use tells them the main targets they care about, and the local differences they need to understand are all clearly separated from the base behavior.

So the task is to specify a directed, acyclic graph and arbitrary patches on it. Just like with GOTOs, our patches must work correctly no matter what path we took through the graph to reach them. Plus, arbitrary diffs of graphs are a hard mathematical problem.

What happens if we prune these graphs? When we carry through our pruning, we find that having each target depend on one other target leads us to start thinking of targets as more abstract stages as opposed to handling particular tasks. Our build graph turns into a set of linear sequences of targets.

Handily, it turns out that diffs on a linear sequence are really easy. They are a combination of:

Since we are only going through linear sequences, there is only one path that led to any of our local differences, so we can easily reason about whether it will behave correctly or not.

We can specify our common behavior as a set of linear sequences, so that the targets 'compile' and 'test' and 'package' are all defined and all share the same basic behavior. And we can specify our local changes to them in a clear way.

This is what Maven did. Unfortunately, they hid this clarity behind a mess of XML and weird implementations in Java, so it was decidedly nontrivial to implement a set of sequences, and local differences were spread among XML and bits of implementation floating around in Java. That pretty much killed the idea in the Java community, and the next systems, like Gradle, all went back to directed, acyclic graphs of targets.

I don't think any other build system has tackled this idea again, which is a shame.

" -- https://madhadron-a-programmers-miscellany.simplecast.com/episodes/progress-by-pruning-trees/transcript

---

sounds like meson and ninja are getting a lot of love these days.

there are still a lot of fans of cmake

---