proj-oot-ootNotes21

" The ubiquitous understanding that templates and variadic templates will produce much faster assembly because the C++ code will automagically vanish during compilation cannot be corroborated by my personal experience in decades of HFT and gaming, by no reputable peer-reviewed publications nor by anyone with some knowledge of compiler optimization. This is an ongoing fallacy. In fact, it is the opposite: smaller bits of code can be compiled, debugged and optimized by looking at the generated assembly much more efficiently than with templates and variadics.

Compilation times

The next stumbling block refers to compiling time. Templates and variadics notoriously increase the compilation time by orders of magnitude due to recompilation of a much higher number of files involved. While a simple, traditional C++ class will only be recompiled if the header files directly included by it are changed, in a Modern C++ setup one simple change will very often trigger a global recompilation. It is not rare to see Modern C++ applications taking 10 minutes to compile. With traditional C++, this number is counted in low seconds for a simple change. "

jupp0r 1 day ago

In my experience, the opposite of what the author claims is true: modern C++ leads to code that's easier to understand, performs better and is easier to maintain.

As an example, replacing boost::bind with lambdas allowed the compiler to inline functor calls and avoided virtual function calls in a large code base I've been working with, improving performance.

Move semantics also boosted performance. Designing APIs with lambdas in mind allowed us to get rid of tons of callback interfaces, reducing boilerplate and code duplication.

I also found compilation times to be unaffected by using modern C++ features. The main problem is the preprocessor including hundreds of thousands of lines for a single compilation unit. This has been a problem in C and C++ forever and will only be resolved with C++ modules in C++2x (hopefully).

I encourage the author to try pasting some of his code into https://gcc.godbolt.org/ and to look at the generated assembly. Following the C++ core guidelines (http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines) is also a good way to avoid shooting yourself in the foot (which is surprisingly easy with C++, unfortunately).

reply
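
my note: a minimal C++ sketch of the lambdas-instead-of-callback-interfaces point above (made-up names, not from the comment): the template version accepts any callable, so a caller can pass a lambda without subclassing anything, and the compiler can inline it instead of going through a virtual call.

    #include <iostream>
    #include <vector>

    // old style: every caller must subclass an interface just to get notified,
    // and every notification is a virtual call.
    struct Listener {
        virtual ~Listener() {}
        virtual void on_value(int v) = 0;
    };

    // lambda-friendly style: accept any callable; a lambda passed at the call
    // site has a concrete type the compiler can inline.
    template <typename F>
    void for_each_value(const std::vector<int>& values, F&& callback) {
        for (int v : values) callback(v);
    }

    int main() {
        std::vector<int> values{1, 2, 3};
        int sum = 0;
        for_each_value(values, [&](int v) { sum += v; });
        std::cout << sum << "\n";   // prints 6
    }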

vinkelhake 1 day ago

Fully agree with what you're saying. Just a nitpick about bind (which I'm sure you're aware of, just for the sake of others).

The return type of {boost,std}::bind is unspecified (it's basically just a "callable"). This means that bind doesn't have to do type erasure. On the other hand, {boost,std}::function has to do type erasure, which can boil down to a virtual function call. But that's orthogonal to where the callable came from.

Another thing to keep in mind is that if you're writing a function like `copy_if` which takes a callback, but doesn't have to store it for later use, it's much better to take the callback as a template type rather than going through the type-erasing {boost,std}::function. Doing the latter makes the compiler's job a lot harder.

reply
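
my note: a hedged illustration of the std::function-vs-template-parameter point (hypothetical code, not from the thread): both functions behave identically, but the template version hands the optimizer a concrete callable type it can inline, while the std::function version usually costs an indirect call per element.

    #include <functional>
    #include <vector>

    // type-erased version: the predicate goes through std::function.
    std::vector<int> keep_if_erased(const std::vector<int>& xs,
                                    const std::function<bool(int)>& pred) {
        std::vector<int> out;
        for (int x : xs)
            if (pred(x)) out.push_back(x);
        return out;
    }

    // template version: the predicate's concrete type is visible to the compiler.
    template <typename Pred>
    std::vector<int> keep_if(const std::vector<int>& xs, Pred pred) {
        std::vector<int> out;
        for (int x : xs)
            if (pred(x)) out.push_back(x);
        return out;
    }

    int main() {
        std::vector<int> xs{1, 2, 3, 4};
        auto evens = keep_if(xs, [](int x) { return x % 2 == 0; });
        auto odds  = keep_if_erased(xs, [](int x) { return x % 2 != 0; });
        return (evens.size() == 2 && odds.size() == 2) ? 0 : 1;
    }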

plorkyeran 1 day ago

It's not type erasure per se, but it's somewhat functionally equivalent. When you pass a function pointer (or pointer to member function) to bind(), it has to store that pointer in a member variable and perform an indirect call in operator(). In theory this is something that a compiler should be able to optimize away, but in practice they very rarely actually do, so using bind() as the argument for a STL algorithm typically does not result in the whole thing getting inlined out of existence. Lambdas, OTOH, are super easy to inline out of existence and compilers actually do so most of the time.

reply

---

catnaroek 1 day ago [-]

I'm not so sure. In C++, most certainly a mainstream language, two copies of the complex number `2 + 3i` are equal to each other. In JavaScript, they're different. (Assuming you make two objects with fields `realPart` and `imaginaryPart`.) How can you do functional programming on a foundation where not even the most basic property of equality holds (every value is equal to itself)?

Remember that functions are mappings from values from a domain, into values from a codomain. A language whose treatment of compound values is as feeble as JavaScript's can't possibly constitute the right foundation for doing functional programming.

reply

---

catnaroek 13 hours ago [-]

Haskell's actual problem isn't the lack of a comprehensive standard library, but rather the presence of core language features that actively hinder large-scale modular programming. Type classes, type families, orphan instances and flexible instances all conspire to make it as difficult as possible to determine whether two modules can be safely linked. Making things worse, whenever two alternatives are available for achieving roughly the same thing (say, type families and functional dependencies), the Haskell community consistently picks the worse one (in this case, type families, because, you know, why not punch a big hole in parametricity and free theorems?).

Thanks to GHC's extensions, Haskell has become a ridiculously powerful language in exactly the same way C++ has: by sacrificing elegance. The principled approach would've been to admit that, while type classes are good for a few use cases, (say, overloading numeric literals, string literals and sequences), they have unacceptable limitations as a large-scale program structuring construct. And instead use an ML-style module system for that purpose. But it's already too late to do that.

reply

---

https://www.microsoft.com/en-us/research/wp-content/uploads/2007/01/appsem-tcs.pdf

describes how a theoretical framework called 'full abstraction' or 'fully abstract compilation' relates to security holes of the form where source language invariants can fail to hold if the attacker can write linked code in the object language. 6 examples are given in C# (where C# is the source language and the CLR IL is the object language):

the first three of these were corrected:

random note on the common implementation of booleans: "most logical operations on bools interpret zero as false and non-zero as true, and hence are not affected by the possibility of values other than 0 or 1"

suggestions are given for the other problems:

some other security flaws:

" Nonetheless, (aiming for) full abstraction is just a start. Languages inevitably contain weaknesses, C ] included, and these weaknesses lead to security holes.

For example, the mutability of arrays is a common cause of security bugs in libraries for both Java and C#. (Typically, a programmer marks an array-type field or property readonly but forgets that the elements of the array can be mutated.)

The ability to apply checked downcasts can lead to holes too; one naive ‘solution’ to the mutability of arrays is to pass the array at a supertype that prevents mutation ( System.IEnumerable ); this fails because the array type can be recovered through downcasting. As the semantics community knows, the right way to think about such issues is by studying observational equivalence.

...

The complexities of industrial programming languages and platforms can make questions such as full abstraction and even type safety hard to pin down. For example, .NET, like Java, has a reflection capability that destroys any sensible notion of contextual equivalence: programs can reflect on the number of methods in their implementation, or inspect the current call-stack. To be of any use at all, a definition of contextual equivalence has to ignore these capabilities (which, incidentally, are not available to untrusted components, i.e. the sort that we have been considering as potential ‘attacker’ contexts). "

---

((in a thread arguing about how cool Rust is:))

dikaiosune 8 minutes ago [-]

> But most likely - you simply do not need a language without a GC.

Absolutely. That doesn't mean I can't want predictable performance or deterministic destruction. I also think it's a shame that we waste so much electricity and rare earth minerals on keeping ourselves from screwing up (i.e. on the overhead of managed runtimes and GCs). Before, I'd have argued that it was just necessary. Having spent a bunch of time with Rust, I don't think so any more, and I'm really excited to see non-GC languages build on Rust's ideas in the future.

---

"Go says it can achieve 10ms max pause time using 20% of your CPU cores provided you give it 100% extra memory. In other words, memory utilisation must be kept below 50%."

[1]

---

" Note though that a lot of the problems with GC in crystal can be worked around by replacing classes with structs. The latter are passed by value and allocated on the stack. There is also access to pointers and manual allocation if that should be needed (though that will end up with roughly the same lack of memory safety as in C) to optimize a hotspot. " [2]

---

i guess i'm still feeling fairly eager to get-implementin' on an Oot Bytecode interpreter, even though there are many unanswered questions at the Oot Core level, and even though i haven't learned all of the other contender languages yet. What else is likely to make such a big difference to Oot Bytecode that beginning implementation now would be a waste of time? All i can think of is that i should:

---

 bluetomcat 16 hours ago | parent [-] | on: Crystal: Fast as C, Slick as Ruby

All of these "fast as C" claims about modern, high-level Python-like languages (be they statically typed and natively compiled) are missing the point. It is mostly the minimalistic and terse programming style that C encourages that makes C programs performant. You avoid allocations wherever possible, you write your own custom allocators and memory pools for frequently allocated objects, you avoid copying stuff as much as possible. You craft your own data structures suited for the problem at hand, rather than using the standard "one size fits all" ones. Compare that to the "new this, new that" style of programming that's prevalent today.

---

bluetomcat 16 days ago

parent [-]on: Memory management in C programs (2014)

> In C++, the use of RAII means that exceptions can be made to free any dynamically allocated objects whose owners went out of scope as a result of the exception. In C, we don't easily have that option available to us, and those objects are just going to stay allocated.

In GCC and Clang, there is the "cleanup" attribute which runs a user-supplied function once a variable goes out of scope, so you can have scope-based destructors:

    #include <stdlib.h>   /* for malloc/free */

    /* runs automatically when the annotated variable goes out of scope */
    static void do_cleanup(char **str) {
        free(*str);
    }
    void f(void) {
        char *s __attribute__((__cleanup__(do_cleanup))) = malloc(20);
        // s gets freed once the function exits
    }

pjmlp 16 days ago [-]

Which is inherently not portable across other C compilers.

---

bluetomcat 209 days ago

parent [-]on: Why I love Rust

The whole C++ language is indeed a monstrosity designed by a committee that resulted from years of piling up "requested features", but with "modern" C++14 they try to narrow-down the idiomatic constructs to a smaller, more manageable set.

---

bluetomcat 189 days ago

parent [-]on: Undefined Behavior in LLVM [pdf]

On some architectures, an uninitialized value may contain either a trap representation or a valid value, so enforcing any consistent implementation defined behavior seems impossible.

---

to achieve deterministic builds, must not generate 'build identifiers' with the build time. But then what is the build identifier? It must depend only on the actual content of the build, therefore it must be a hash, like with the 'content-addressable' stuff.
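
a toy sketch of the content-addressed idea (hypothetical file names; FNV-1a is used only to keep the example short, a real build would use a cryptographic hash over all inputs): the identifier is derived purely from the bytes that go into the build, never from the build time, so rebuilding identical content yields an identical identifier.

    #include <cstdint>
    #include <fstream>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <vector>

    // FNV-1a over a byte string; deterministic, depends only on content.
    uint64_t fnv1a(const std::string& data, uint64_t h) {
        for (unsigned char c : data) {
            h ^= c;
            h *= 1099511628211ULL;
        }
        return h;
    }

    int main() {
        // hypothetical list of build inputs; no timestamps anywhere.
        std::vector<std::string> inputs{"main.c", "util.c", "Makefile"};
        uint64_t build_id = 1469598103934665603ULL;   // FNV offset basis
        for (const auto& path : inputs) {
            std::ifstream f(path, std::ios::binary);
            std::string bytes((std::istreambuf_iterator<char>(f)),
                              std::istreambuf_iterator<char>());
            build_id = fnv1a(path + ":" + bytes, build_id);  // mix in name + content
        }
        std::cout << std::hex << build_id << "\n";
    }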

---

i've said before that Oot needs a sigil that means 'platform native' (and mb this should be mixed with the 'inexact' sigil, eg ~, eg when doing arithmetic, for saying 'just use the platform-native arithmetic, even if it doesn't quite conform to Oot's arithmetic semantics').

In Clojure, i think '.' is used to indicate Java classes/methods?

another thing we need is 'tell me which underlying platform object instance is being used to implement Oot object x'. For example, if an Oot thread is created on an Erlang platform, then the Oot program should be able to do some 'reflection' on Oot threads in terms of asking the following questions of the implementation:

---

"You can learn TLA+ in a couple of weeks and start writing specifications in a month. There's a fairly slim tome called, Programming in the 1990s that demonstrates a practical technique for proofs that is digestible by programmers."

---

alanning 11 hours ago [-]

Thanks for sharing this. A few questions from someone interested in learning how to use BEAM-based systems in production:

I know Erlang in Anger [1] is kind of written just for this but it feels pretty intense to me as a newbie. I don't know enough to tell whether the issues there are only things I'd need to worry about at larger scale but I found it pretty intimidating, to the point where I have delayed actually designing/deploying a Erlang solution. Designing for Scalability with Erlang/OTP [2] has a chapter on monitoring that I'm looking forward to reading. Wondering if there's some resources you could recommend to get started running a prod Erlang system re: debugging, monitoring, etc.

1. https://www.erlang-in-anger.com/

2. http://shop.oreilly.com/product/0636920024149.do

reply

rdtsc 11 hours ago [-]

Good question. So first I noticed in metrics dashboard (so it is important to have metrics) the receiver never seemed to have gotten the expected number of messages.

Then noticed the count of messages was being reset. Suspected something restarted the node. Browsed through metrics, noticed both memory usage was spiking too high and node was indeed restarting (uptime kept going up and down).

Focused on memory usage. We have a small function which returns the top N memory-hungry processes. Noticed a particular one. Noticed its mailbox was tens of gigabytes.

Then used recon_trace to trace that process to see what it was doing. recon_trace is written by author of Erlang In Anger. So did something like:

    recon_trace:calls({my_module, '_', '_'}, 100, [{scope, local}]).

dbg module is built-in and can use that, but it doesn't have rate limiting so it can kill a busy node if you trace the wrong thing (it floods you with messages).

So noticed what it was doing and noticed that a particular operation was taking way too long. It was because it was doing an O(n) operation instead of O(1) on each message. On a smaller scale it wasn't noticeable, but when it got to 1M-plus instances it was bringing everything down.

After solving the problem to test it, I compiled a .beam file locally, scp-ed to test machine, hot-patched it (code:load_abs("/tmp/my_module")) and noticed that everything was working fine.

On whether to pick a message queue vs regular processes. It depends. They are very different. Regular processes are easier, simpler and cheap. But they are not persistent. So perhaps if your messages are like "add $1M to my account" and the sender wants to just send the messages and not worry about acknowledging them, then you'd want something with very good persistence guarantees.

reply
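
my note: a contrived C++ sketch of the O(n)-per-message problem described above (nothing to do with the actual Erlang code, just the shape of it): O(n) work per message makes the total cost quadratic, which is invisible at small n and brings everything down at millions of messages.

    #include <chrono>
    #include <iostream>
    #include <vector>

    int main() {
        const int n = 50000;
        std::vector<int> processed;

        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < n; ++i)
            processed.push_back(i);                 // O(1) amortized per message
        auto t1 = std::chrono::steady_clock::now();

        processed.clear();
        for (int i = 0; i < n; ++i)
            processed.insert(processed.begin(), i); // O(n) per message -> O(n^2) total
        auto t2 = std::chrono::steady_clock::now();

        using ms = std::chrono::milliseconds;
        std::cout << "O(1) per message: "
                  << std::chrono::duration_cast<ms>(t1 - t0).count() << " ms\n"
                  << "O(n) per message: "
                  << std::chrono::duration_cast<ms>(t2 - t1).count() << " ms\n";
    }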

---

critium 12 hours ago [-]

I would add to the list;

EDIT: Side note. IMO, JMX extensions are one of the most under-appreciated things that java and jvm devs have but keep forgetting about but its so powerful.

reply

AdieuToLogic 10 hours ago [-]

  IMO, JMX extensions are one of the most
  under-appreciated things that java and
  jvm devs have but keep forgetting about
  but its so powerful.

This is an excellent point which I cannot agree with more. When doing distributed systems using JVM's, I almost always reach for the excellent Metrics[0] library. It provides substantial functionality "out of the box" (gauges, timers, histograms, etc.) as well as exposing each metric to JMX. It also integrates with external measuring servers, such as Graphite[1], though that's not germane to this post.

0 - http://metrics.dropwizard.io/3.1.0/

1 - http://graphiteapp.org/

reply

---

honkhonkpants 11 hours ago [-]

On backpressure: "Common versions include dropping new messages on the floor (and incrementing a metric) if the system’s resources are already over-scheduled."

What I would add here is that "and incrementing a metric" must use fewer resources than would have been spent serving the request. I once worked on a system that would latch up into an overload state because in overload it would log errors at a severity high enough to force the log file to be flushed on every line. This was counterproductive to say the least. From that I learned that a system's flow control / pushback mechanism must be dirt cheap.

reply
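
my note: a sketch of "dirt cheap" in that sense (hypothetical names): the drop path should cost roughly one relaxed atomic increment, not a formatted and flushed log line.

    #include <atomic>
    #include <cstdint>
    #include <cstdio>

    std::atomic<uint64_t> dropped_requests{0};   // read periodically by a metrics reporter

    bool try_enqueue(int queue_depth, int max_depth) {
        if (queue_depth >= max_depth) {
            // overload path: one relaxed increment, no formatting, no I/O, no flush
            dropped_requests.fetch_add(1, std::memory_order_relaxed);
            return false;
        }
        // ... normal enqueue would go here ...
        return true;
    }

    int main() {
        for (int i = 0; i < 1000; ++i)
            try_enqueue(/*queue_depth=*/i, /*max_depth=*/500);
        std::printf("dropped: %llu\n",
                    static_cast<unsigned long long>(dropped_requests.load()));
    }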

---

marcus_holmes 13 hours ago [-]

I still miss COM. It had its problems, sure, but it worked really well.

I haven't seen anything since that allowed such decoupled development

reply

mike_hearn 8 hours ago [-]

I think you have rose-tinted glasses on.

COM is/was a rat's nest of confusing and frequently duplicated APIs with insanely complicated rules that by the end really only Don Box understood. CoMarshalInterThreadInterfaceInStream was one of the simpler ones, iirc. COM attempted to abstract object language, location, thread safety, types, and then the layers on top tried to add serialisation and document embedding too, except that the separation wasn't really clean because document embedding had come first.

Even just implementing IUnknown was riddled with sharp edges and the total lack of any kind of tooling meant people frequently screwed it up:

https://blogs.msdn.microsoft.com/oldnewthing/20040326-00/?p=...

The modern equivalent of COM is the JVM and it works wildly better, even if you look at the messy neglected bits (like serialisation and RPC).

reply

asveikau 3 hours ago [-]

I think the good ideas from COM are: IUnknown, consistent error handling through HRESULT, the coding style that emerges from being clear about method inputs and outputs.

Some things done not as well as these core ideas: registration done globally in the registry, anything to do with threading, serialization, IDispatch.

I think in many situations you can take lessons from the good parts and try to avoid the bad.

I don't see how pointing out common bugs helps your argument though. You can write bugs in any paradigm.

reply

pjmlp 4 hours ago [-]

> The modern equivalent of COM is the JVM and it works wildly better, even if you look at the messy neglected bits (like serialisation and RPC).

Actually it is the WinRT introduced in Windows 8.

reply

flukus 12 hours ago [-]

It came back with a vengeance in windows 10.

reply

marcus_holmes 8 hours ago [-]

really? I must have missed that... I stopped doing windows dev around Vista, because Vista.

Any details?

reply

pjmlp 4 hours ago [-]

The UWP programming model, which was introduced in Windows 8 as WinRT, is COM.

Basically it is the original idea of .NET, which was called COM+ Runtime, before they decided to create the CLR.

WinRT is nothing more than COM+ Runtime, but with .NET metadata instead of COM type libraries.

Also since Vista, the majority of new Windows native APIs are COM-based, not plain C-like ones.

reply

douche 13 hours ago [-]

Do you remember DLL hell? Pepperidge Farms remembers /s

reply

asveikau 3 hours ago [-]

Library ABIs are hard. Same is true with shared libraries if you're not careful. I don't think it's necessarily the fault of the tooling.

reply

mcguire 2 hours ago [-]

The tooling didn't include version numbers.

reply

---

random somewhat interesting things, i already skimmed, no need to read them:

https://en.wikipedia.org/wiki/Restrict

https://devblogs.nvidia.com/parallelforall/six-ways-saxpy/

---

some random C and Fortran code with compiler pragmas:

" 2. OpenACC? SAXPY

If you have been following Parallel Forall, you are already familiar with OpenACC?. OpenACC? is an open standard that defines compiler directives for parallel computing on GPUs (see my previous posts on the subject). We can add a single line to the above example to produce an OpenACC? SAXPY in C.

void saxpy(int n, float a, float * restrict x, float * restrict y) {

  1. pragma acc kernels for (int i = 0; i < n; ++i) y[i] = a*x[i] + y[i]; }

... Perform SAXPY on 1M elements saxpy(1<<20, 2.0, x, y);

A Fortran OpenACC? SAXPY is very similar.

subroutine saxpy(n, a, x, y) real :: x(:), y(:), a integer :: n, i !$acc kernels do i=1,n y(i) = a*x(i)+y(i) enddo $!acc end kernels end subroutine saxpy

... ! Perform SAXPY on 1M elements call saxpy(220, 2.0, x_d, y_d) "

---

AceJohnny2 124 days ago [-]

    Simple Memory Model
    The RISC-V address space is byte addressed and little-endian. Even though most other ISAs such as x86 and ARM have several potentially complex addressing modes, RISC-V only uses base+offset addressing with a 12-bit immediate to simplify load-store units.

I remember reading [1] that one reason C became so popular was because it was easy to write a compiler for, not because it was easy to write a program in. Thus, compilers emerged for various platforms, making it accessible. At the opposite end, stuff like Smalltalk (I may be mistaking the language) was complicated to implement and the compilers/environments were expensive, limiting its reach.

Looks like the RISC-V is taking a page from this strategy book.

[1] was it Gabriel's Worse Is Better essay? https://www.dreamsongs.com/RiseOfWorseIsBetter.html

nickpsecurity 124 days ago [-]

Other commenter's claim is inaccurate. I have a detailed history, with references, of how C became what it was:

http://pastebin.com/UAQaWuWG

That its predecessor compiled on an EDSAC, it compiled on a PDP-11, and was the language of UNIX (its killer app) was why it spread everywhere. Network/legacy effects took over from there. There were languages like Modula-2 that were safer, efficient, easy to implement, and still close to the metal. Such languages kept being ignored and mostly are to this day for system programming. They even ignore the ones that are basically C with enhanced safety (eg Cyclone, Popcorn).

It's all social and economic reasons justifying C both at its creation, during its spread, and for its continuing use. The technical case against such a language has always been there and field-proven many times.

zik 124 days ago [-]

C was quite a few years before Modula-2. C was developed in 1972 and was heavily used in UNIX by 1973. Development of Modula-2 wasn't started until 1977 and it wasn't widely available until the 1980s.

cmrdporcupine 124 days ago [-]

But C on microcomputers did not come any earlier than Pascal or Modula-2 compilers.

My first compiler was Modula-2 on my Atari ST. But it was difficult to do much in it because so much of the OS documentation and example code was geared towards C. Also compiling on a floppy based system (couldn't afford a hard disk) was terrible.

SixSigma 124 days ago [-]

The irony too, that Ritchie, Thompson, Pike et al at the Unix labs were then enamoured by Modula-2 & Oberon and used the ideas to build plan9 but in a new version of C.

nickpsecurity 124 days ago [-]

The wikipedia article says that, when designing Google's language, all three of them had to agree on every single feature so no "extraneous garbage" crept in. The C developers dream language was basically an updated Oberon. That's them dropping the QED on C supporters for me. :)

Funny thing is, Oberon family was used to build both apps and whole operating systems. Whereas, Thompson et al's version is merely an application language lambasted for comparisons with system languages. I don't know if they built Oberon up thanks to tooling and such or if they've dropped back down a notch from Component Pascal since it's less versatile. Just can't decide.

Note: Imagine where we'd be if they figured that shit out early on like the others did. We'd be arguing why an ALGOL68 language wasn't good enough vs ML, Eiffel DbC?, LISP's macros, and so on. The eventual compromise would've been better than C, Go, or ALGOL68. Maybe.

scythe 124 days ago [-]

Have you ever actually tried to write code in Modula-2?

Has there ever been an efficient, portable implementation of Cyclone or Popcorn? Does anyone seriously consider either to be easy to implement?

And, if you just randomly picked three examples, why are they all bad? What does that mean about the other examples, statistically?

nickpsecurity 124 days ago [-]

"Have you ever actually tried to write code in Modula-2?"

Three people wrote a consistent, safer platform in it from OS to compiler to apps in 2 years. Amateurs repeatedly did stuff like that with it and its successors like Oberon for another decade or two. How long did the first UNIX take to get written and reliable in C?

"Has there ever been an efficient, portable implementation of Cyclone or Popcorn? "

I could've said that about C in its early years given it wasn't intended to be portable. Turns out that stripping almost every good feature out of a language until it's a few primitives makes it portable by default. ...

nickpsecurity 123 days ago [-]

As with scythe, you're ignoring the greater point to focus on tactics that it makes irrelevant. C started with stuff that was literally just what would compile on an EDSAC. It wasn't good on about any metric. They couldn't even write UNIX in it. You know, software that people wanted to use. So, seeing good features of easy compilation and raw performance, they decided to invest in that language to improve its deficiencies until it could get the job done. Now, what would UNIX look like if they subsetted and streamlined a language like ALGOL68 or Modula-2 that was actually designed to get real shit done and robustly?

It's programming, science, all of it. You identify the key goals. A good chunk of it was known as Burroughs stayed implementing it in their OS's. Their sales numbers indicated people wanted to use those enough they paid millions for them. ;) Once you have goals, you derive the language, tools, whatever to achieve those goals. Thompson failed to do this unless he only cared about easy compilation and raw speed at the expense of everything else. Meanwhile, Burroughs, Wirth, Hansen, the Ada people, Eiffel later... all sorts of people did come up with decent, balanced solutions. Some, like the Oberons or Component Pascal, were very efficient, easy to compile, easy to read, stopped many problems, and allowed low-level stuff where needed. Came straight from strengths of design they imitated. A form of that would be easy to pull off on a PDP as Hansen showed in an extreme way.

C's problems, which contributed to UNIX Hater's Handbook and many data losses, came straight from the lack of design in its predecessors which solely existed to work on shit hardware. They tweaked that to work on other shit hardware. They wrote an OS in it. Hardware got better but the language's key problems remained. Whether we use it or not, we don't have to pretend those effects were necessary or good design decisions. Compare ALGOL68 or Oberon to BCPL to see which looks more thought out if you're still doubting.

...

A ton of systems were more secure or robust at language level before INFOSEC was a big consideration. A number of creations like QNX and MINIX 3 achieved low fault status fast while UNIX took forever due to bad architecture. Oberon Systems were more consistent, easier understanding, faster compilation, and eventually included a GC. NextStep & SGI taught it lessons for desktops and graphics. BeOS, like Concurrent Pascal before it, built into OS a consistent, good way of handling concurrency to have great performance in that area. System/38 was more future proof plus object-driven. VMS beat it for cross-language design, clustering, and right functions in OS (eg distributed locking). LISP machines were more hacker-friendly with easy modifications & inspections even to running software w/ same language from apps to OS. And so on.

AnimalMuppet 123 days ago [-] ... I'm not familiar enough with Modula or Oberon to comment intelligently on them. My reference point is Pascal, which I have actually used professionally for low-level work. I'm presuming that Modula and Oberon and that "type" of languages are similar (perhaps somewhat like you lumping BCPL and C together). But I found it miserable to use such a language. It can protect you from making mistakes, but it gets in your way even when you're not making mistakes. I would guess that I could write the same code 50% to 100% faster in C than in Pascal. (Also, the short-circuit logical operators in C were vastly superior to anything Pascal had).

...

nickpsecurity 122 days ago [-]

" Specifically, you said that C was initially so bad that they couldn't even write Unix in it. That statement is historically false "

Well, if you watched the Vimeo video, he looks at early references and compares side-by-side C with its ancestors. A lot of early C is about the same as BCPL & its squeezed version B. The first paper acted like they created C philosophy and design out of thin air based on B w/ no mention of BCPL. Already linked to it in another comment. Fortunately for you, I found my original source for the failed C attempt at UNIX which doesn't require a video & side-steps the BCPL/B issues:

https://www.bell-labs.com/usr/dmr/www/chist.html

You'll see in that description that the B -> Standard C transition took many intermediate forms. There were several versions of C before the final one. They were simultaneously writing UNIX in assembly, improving their BCPL variant, and trying to write UNIX in intermediate languages derived from it. They kept failing to do so. Ritchie specifically mentions an "embryonic" and "neonatal" C followed by this key statement:

"The language and compiler were strong enough to permit us to rewrite the Unix kernel for the PDP-11 in C during the summer of that year. (Thompson had made a brief attempt to produce a system coded in an early version of C—before structures—in 1972, but gave up the effort.)" (Ritchie)

So, it's a historical fact that there were several versions of C, Thompson failed to rewrite UNIX in at least one, and adding structs let them complete the rewrite. That's ignoring BCPL and B entirely. That they just produced a complete C magically from BCPL or B then wrote UNIX is part of C's proponents revisionist history. Reality is they iterated it with numerous failures. Which is normal for science/engineering and not one of my gripes with C. Just got to keep them honest. ;)

" I would guess that I could write the same code 50% to 100% faster in C than in Pascal. (Also, the short-circuit logical operators in C were vastly superior to anything Pascal had)."

Hmm. You may have hit sore spots in the language with your projects or maybe it was just Pascal. Ada would've been worse. ;) The languages like Modula-3, Component Pascal, and recently Go [but not Ada] are usually faster to code in than C. The reasons that keep turning up are straight forward: design to compile fast to maximize flow; default type-safety reduces hard-to-debug problems in modules; often less interface-level problems across modules or during integrations of 3rd party libraries. This is why what few empirical work I read comparing C, C++, and Ada kept showing C behind in productivity & with 2x the defects. Far as low level, the common trick was wrapping unsafe stuff in a module behind safe, simple interfaces. Then, use it as usual but be careful.

...

A number, from safety-checks to stronger typing to interface protections, had already been proven to prevent problems or provide benefits.

...

The main alternatives were ALGOL60, ALGOL68, and Pascal. They were each designed by people who knew what they were doing in balancing many goals. They, esp ALGOLS, achieved good chunk of most. A subset and/or modification of one with gradual progression for improved hardware would've led to better results than C with same amount of labor invested. On low end, Pascal ended up being ported to systems from 8-bitters to mainframes. On high end, Burroughs implemented their mainframe OS in an ALGOL with hardware that enforced its safety for arrays, stacks, and function calls.

In the 80's, things like Modula-2, Concurrent Pascal, Oberon, and Ada showed up. I'll understand if they avoided Ada given constraints at the time but the others were safer than C and quite efficient. More importantly, their authors could've used your argument back then as most people were doing but decided to build on prior work with better design. They got better results out of it, too, with many doing a lot of reliable code with very little manpower. Hansen topped it off by implementing a barebones, Wirth-like language and system called Edison on the same PDP that C was born on.

...

If you knew 1 and 2, would you still think C was a "well-designed" language for systems programming? Or a failed implementation of ALGOL60 whose designers were too limited by hardware? If No 1, we should emulate the heck out of C. If No 2, we should emulate ALGOL60-like language that balances readability, efficiency, safety, and programming in the large. Btw, all the modern languages people think are productive and safer lean more toward ALGOL more than C albeit sometimes using C-like syntax for uptake. Might be corroboration of what I'm saying.

....

Of course, don't just take my word for it: Thompson eventually tried to make a better UNIX with Plan 9, but more importantly a better language. He, Ritchie, and Pike all agreed every feature in their language. That language was Go: a clone of ALGOL68 and Oberon style with modern changes and updates. Took them a long time to learn the lessons of ALGOL that came years before them.

Gibbon1 122 days ago [-]

What I think is interesting is Intel is adding bounds checking registers to their processors. That should eliminate a lot of the issues people complain about. (Except your program's footprint will be larger due to needing to manage bounds information.)

https://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler

---

"That's not to mention it's [Pascal family] just poorly designed from a usability standpoint, with the language creators doing silly things like removing "for" loops from Oberon (a decision which both Clojure and Rust eventually admitted was bad).

---

FullyFunctional 124 days ago [-]

That has no basis in reality. Smalltalk is simple enough to implement that one implementation fits in the Smalltalk-80 book.

C got popular because it allowed low-level code to be implemented in something higher-level than assembler and moderately portably too (machines back then were a lot more dissimilar than today). It's hard to disassociate C from Unix, the former making the latter easy to port. A symbiotic relationship.

It's important to remember C was designed for the processors of the time, whereas processors of today (RISC-V included) are arguably primarily machines to run C. C has brought a lot of good but also a lot of bad that we are still dealing with: unchecked integer overflows, buffer under- and overflow, and more general memory corruption. No ISA since the SPARC even tries to offer support for non-C semantics.


hga 124 days ago [-]

unchecked integer overflows

This is my one big beef with the RISC-V ISA, after I went over it with a fine toothed comb, otherwise it's brilliant to my untutored eyes, and the ISA doc, a paper I read about the same time about how difficult it was to make ISAs super fast, etc. helped explain why, e.g. the VAX ISA took so long to make faster, and was probably doomed, while Intel got lucky? with x86.

Anyway, the large-integer-math (e.g. crypto) people are not happy about it: there is no support for anything other than being able to check for divide by zero.

---

kps 123 days ago [-]

The point was that those machines had the benefit of instruction sets co-designed with the language. (And Mesa had strong default type safety, but basically the same memory model as C, allowing pointer arithmetic, null pointers, and dangling pointers.)

pjmlp 123 days ago [-]

Of course Mesa had pointer arithmetic, null and dangling pointers.

Any systems programming language has them.

However there is a difference between having them as an addition to strong type features and them being the only way to program in the language.

For example, using arrays and strings didn't require pointer arithmetic.

In C all unsafe features are in the face of the programmer. There is no way to avoid them.

In Mesa and other similar languages, those unsafe features are there, but programmers only need to use them in a few cases.

C was designed with the PDP-11 instruction set in mind. For many years now the C machine model has no longer mapped to the hardware the way many think it does.

---

dbcurtis 124 days ago [-]

There are numerous barriers to optimization that mostly fall into the categories of a) too many operations break continuity of typing and data flow, and b) many small, independent compilation units.

For example: Pointers, casts, unions, casting pointers to different types, and doing all of this across procedure call boundaries all challenge/break data flow analysis. You can pass a pointer to a long into a procedure, and the compiler has no idea what other compilation unit you are linking to satisfy the external call. You could well be casting that pointer-to-long to something totally crazy. Or not even looking at it.

I was associated with a compiler group experimenting whole-program optimization -- doing optimization after a linking phase. This has huge benefits because now you can chase dataflow across procedure calls, which enables more aggressive procedure inlining, which enables constant folding across procedure calls on a per-call-site basis. You might also enable loop-jamming across procedure calls, etc. C really is an unholy mess as far as trying to implement modern optimizations. The compiler simply can't view enough of the program at once in order to propagate analysis sufficiently.
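
my note: a tiny illustration of that barrier (hypothetical code): with separate compilation the compiler cannot see through the call, so it must assume the pointee may be read, written, or escape; with whole-program optimization or LTO both bodies are visible, scale() can be inlined, and caller() folds to a constant.

    // a.cpp -- compiled on its own, only the declaration of scale() is visible.
    long scale(long* p);   // defined in some other translation unit

    long caller() {
        long x = 21;
        // without scale()'s body the compiler can't keep x in a register across
        // the call, and can't fold the result to a constant.
        return scale(&x) + x;
    }

    // b.cpp -- the definition lives here.
    long scale(long* p) { return *p * 2; }

    // with whole-program optimization / LTO, scale() gets inlined into caller()
    // and the whole thing collapses to `return 63;`.
    int main() { return caller() == 63 ? 0 : 1; }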

nickpsecurity 124 days ago [-]

Do you know if anybody in CompSci or industry has put together a paper that has guidelines on this topic? Specifically, what elements of language design make it easier or more difficult to apply various optimization strategies. Just a whole collection of these for language designers to factor into their design to make compiler writers' jobs as easy as possible.

dbcurtis 123 days ago [-]

Unfortunately, I do not, other than reading relevant conference proceedings, and sharing a cubicle with the right person. None of this is in the Dragon Book, that much is certainly true.

---

...

Nonetheless, I'll look into any benefits you think C brought us that alternatives wouldn't have done.

optimiz3 124 days ago [-]

Syntax that doesn't require you to type out "begin" and "end" for scoping and ":=" when you mean "=" for assignments are the first things that come to mind.

---

bogomipz 124 days ago [-]

What non-C semantics did SPARC chips offer? I wasn't aware of that.

FullyFunctional 124 days ago [-]

Tagged add, optionally with overflow trapping, see https://en.wikipedia.org/wiki/SPARC This was inspired by Smalltalk on a RISC and added explicitly for Lisp and Smalltalk. Also in support of this, trapping on unaligned loads.

" Tagged add and subtract instructions perform adds and subtracts on values checking that the bottom two bits of both operands are 0 and reporting overflow if they are not. This can be useful in the implementation of the run time for ML, Lisp, and similar languages that might use a tagged integer format. "

---

toread: " The program at the "Lisp in Lisp page" is the sketch of an interpreter, that I call Lisp-X0 in the following. Lisp-X0 is a "downward circular" interpreter for a Lisp dialect essentially defined by that interpreter itself. This dialect has lexical scope and follows in semantics Common Lisp use. It shall be called Lisp-X1.

The phrase "downward circular" is to be understood as follows:

    The interpreter is circular for Lisp-X1, in that it is described by means of a language it by itself provides.
    But it is "downward-circular" (and not "meta-circular"(!)) in the sense that it uses only part of the means it provides (as it is written in Lisp-X0). But this is not the whole story: also the semantics of Lisp-X0 is different from that of Lisp-X1, as Lisp-X0 need not have full closures and can use a stack regime in function calling. This allows functional arguments, but no functional returns.

To execute Lisp-X0 it is possible to regard it as a subset of Common Lisp. But the goal is to have a compiler (comp-0) translating Lisp-X0 into an intermediate language (called X0), which will then be mapped to a concrete, real processor (E0)

At first a E0 emulator in Common Lisp has to be written, but later the processor E0 will be implemented on an FPGA.

Consequently at this stage Lisp-X1 will be provided on that FPGA, but only in an interpreted way. " -- [3]

---

The seL4 microkernel, an open-source proven correct microkernel

implemented in C and in Haskell

https://news.ycombinator.com/item?id=8101500

deadgrey19 777 days ago

parent favorite on: Sel4: We’re going open source

Source: I have worked at NICTA, with the seL4 team, on the seL4 project, I've seen the seL4 source code and am (was?) a primary author of the user manual.

What they mean by this is that they have specified certain properties at a high level in a logical reasoning language they call HOL. These properties are things like the kernel will never reference a null pointer, or, the kernel will always run the next runable thread, or no application can access the memory of another application, or, a capability invocation will always terminate.

They then wrote a runable version of the kernel in Haskell (a purely functional language) and they have a mechanically checked mathematical proof that the Haskell code implements these features/properties. They then wrote a (nearly entirely) C implementation of the kernel and, under a relatively small set of preconditions, proved that the the C code exactly (no more, no less) implements the Haskell code, which implements these correctness and security properties.

A nasty side effect of this effort is that the C code is very strange, since it is more or less translated Haskell code, and the kernel code must necessarily be VERY small, about 10,000 lines of code, which is practically nothing for an operating system kernel (it is a micro-kernel in the truest sense of the word). Another side effect is that the API bears almost no resemblance to "previous" versions such as OKL4.

http://ts.data61.csiro.au/projects/TS/l4.verified/proof.pml

some limitations of C_sys [4]:

" Sequential execution is assumed. It is of course desirable to provide a concurrent semantics, as even non-preemptable systems code requires this formal support on SMP systems. However, this goes beyond the C standard and we consider this outside of the scope of the thesis ... Only the standard C control structures are supported. Code may not modify either itself or its execution stack other than through standard language features. We also disallow function pointers. In this thesis we are primarily interested in the verification of systems code responsible for memory management. As a result, we provide a simple and sound model of C that does not feature any extensions such as notifications, continuations or context switches. ... sys eschews some of the more troublesome features of C, such as non-deterministic ordering in expression evaluation. This is not a case of expressing the ordering explicitly, as this is unspecified and not even necessarily consistent in an implementation. Instead, expressions are restricted such that they remain within a syntactic subset of standard C yet have deterministic side effects. These restrictions are described in §2.5. ... where arrays appear as members of other objects or as automatic variables, they must have a constant size ... We can drop unions as first-class types, prohibit them as automatic variables and from being nested inside aggregate types ... Since no union types were present in our case study, they are currently unimplemented in our translation ... we do not describe union s, bit-fields or enum ... "

---

 _chris_ 733 days ago | parent | favorite | on: LowRISC: Open-source RISC-V SoC

RISC-V started off from a modified MIPS ISA, but we frankly ran out of opcode space. We needed 64b, IEEE floating point, and a ton of opcode space to explore new accelerator and vector ISA extensions.

Even the smallest changes to MIPS to clean up things like branch delay slots means it's a new ISA anyways, so you get zero benefit keeping it "mostly MIPS". You can read a bit more about this in the "history" section in the back of the user-level ISA manual.

---

random thing to mb (probably not) read someday:

https://hacks.mozilla.org/2013/05/compiling-to-javascript-and-debugging-with-source-maps/

---

what someone likes about old-school Smalltalk and Lisp compared to modern languages:

 eggy 361 days ago [-]

For one, operating on the running system and seeing immediate changes. I started in 1979 with a Commodore PET 2001, and although limited and not comparable to say a Lisp Machine (envy), it made me feel like I was the 'owner' and could jump into BASIC and program away at the resources in the machine - POKE, PEEK. I guess this gives new meaning to 'poking around' :)

---

eggy 361 days ago [-]

@mark_l_watson: I'm with you. I love Lisp/Scheme, but I find myself always playing with a Smalltalk environment despite my other forays into programming languages. Right now, I am comparing µO [1] to the Lisp-based openmusic [2].

µO is more code and interfaces with Csound, whereas openmusic is node- or patching-based, like PureData or Max. I am always blown away and get lost in the onion layers of Squeak.

I like the newer theme vs. the older, colorful gui theme.

   [1]http://www.zogotounga.net/comp/squeak/sqgeo.htm
   [2] http://repmus.ircam.fr/openmusic/home

---

One big difference between programming paradigms is how mutations are represented and encapsulated.

In an imperative language, a mutation is accomplished by giving the computer a command. One form of command is to (re)assign to a variable (and this variable may be aliased).

In a procedural language, these commands are encapsulated in procedures/functions, and you can cause a mutation indirectly by calling a function (the mutation is a 'side-effect' of the function).

In Smalltalk, a mutation is accomplished by sending a message to an object.

In Java, a mutation is accomplished by calling a function, like in procedural languages, but these functions can be encapsulated into methods on objects.

Note: after learning Java and then taking the Smalltalk Pharo tutorial, you can clearly see what Alan Kay is talking about when he says that to him, OOP meant sending messages to objects each of which were their own little computer, in contrast to modern OOP, which is about classes and instances. In Java, the objects do hold state, but in addition to method calls, you can also mutate state with assignment or by calling functions, which feels procedural even though those functions are attached to objects. In Java, a big part of the use of objects is to do polymorphic function dispatch via a syntax 'obj.method' in which the thing to the left of the '.' is a namespace value in which the method 'method' is found (alternately, 'obj' is the first argument to 'method', but it is treated as a special argument because it alone determines dispatch).

In Python, with 'properties', even an assignment to an object's field may actually resolve to be a method call; in fact, even a read might.

In Haskell, a mutation is accomplished indirectly, by giving a function which will be applied to a conceptual symbol representing the world and yielding a conceptual symbol representing the mutated world. Multiple functions of this sort are composed into the program.

---

Out of those, i like (a) the Smalltalk idea (mutation as a message), and (b) the imperative idea, best.

In Oot, i'd like objects to hold state, and i'd like mutations to take the form of messages to those objects, and i'd like object methods to be used only for encapsulated representation-dependent operations on the state held by an object, and i'd like to use non-methods (unattached functions and procedures) otherwise, and i'd like non-object functions to do pure computations unless clearly marked as impure commands (procedures).
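
a rough C++ sketch of that separation (purely illustrative, since Oot doesn't exist yet and these names are made up): state lives in an object and is mutated only through its methods ("messages"), representation-independent logic lives in free pure functions, and impure free procedures are kept separate and obviously named.

    #include <iostream>
    #include <vector>

    // state is held by the object; mutation happens only via its methods, which
    // are the only code that knows the representation.
    class Counter {
    public:
        void increment(int by) { ticks_.push_back(by); }   // mutation as a "message"
        int value() const {
            int total = 0;
            for (int t : ticks_) total += t;
            return total;
        }
    private:
        std::vector<int> ticks_;   // representation detail, hidden
    };

    // representation-independent and pure: result depends only on its inputs.
    int doubled(int x) { return 2 * x; }

    // clearly impure procedure (a command), kept outside the object.
    void print_counter(const Counter& c) { std::cout << c.value() << "\n"; }

    int main() {
        Counter c;
        c.increment(3);
        c.increment(4);
        print_counter(c);                        // 7
        std::cout << doubled(c.value()) << "\n"; // 14
    }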

The Smalltalk syntax is interesting in its suitability for message sending, i should consider it further.

Note also the (Erlang?) idea that messages can be async events and end up in a 'mailbox' (buffer).

In other words:

---

the Pharo smalltalk idea of being able to have a text file where you highlight some code or place the cursor to the right of that code and type cntl-d or cntl-p or cntl-i to run, run and print, or inspect, is neat (and reminds me of emacs elisp editing). It feels kind of like Hypercard.

---

already copied to plbook:

some complicated kinds of async function calling: the fn returns a future (and, do you start the fn right away, or lazily?) or a future's generalization, a stream; or the fn can take a callback; is the fn an introspectable AST or just a fn? partial failure: is it possible that the function never got the message? is it possible that the function completed successfully but you never got its completion message? (are exceptions possibly raised, or is there an error return argument, or are nullable types used?) could the function not terminate? is there a timeout?
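
a rough C++ sketch of two of those axes (illustrative only, nothing Oot-specific): std::async makes the eager-vs-lazy question explicit through its launch policy, and wait_for gives the caller a timeout instead of blocking forever.

    #include <chrono>
    #include <future>
    #include <iostream>
    #include <thread>

    int slow_add(int a, int b) {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        return a + b;
    }

    int main() {
        // eager: starts running on another thread right away.
        std::future<int> eager = std::async(std::launch::async, slow_add, 2, 3);

        // lazy: does not run until someone calls .get() / .wait() on the future.
        std::future<int> lazy = std::async(std::launch::deferred, slow_add, 4, 5);

        // timeout instead of blocking forever on the eager call.
        if (eager.wait_for(std::chrono::seconds(1)) == std::future_status::ready)
            std::cout << "eager: " << eager.get() << "\n";   // 5
        else
            std::cout << "eager call timed out\n";

        std::cout << "lazy: " << lazy.get() << "\n";          // 9, computed here
    }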

polymorphism dispatch based on first argument, or on multiple arguments? dispatch chosen at compiletime or runtime?
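
and a small C++ contrast for the dispatch question (again just an illustration): overload resolution dispatches at compile time on the static types of all arguments, while a virtual call dispatches at runtime on the dynamic type of the first (receiver) argument only.

    #include <iostream>

    struct Shape  { virtual ~Shape() {} virtual const char* name() const { return "shape"; } };
    struct Circle : Shape { const char* name() const override { return "circle"; } };

    // compile-time dispatch: chosen by the static types of both arguments.
    void describe(int, int)    { std::cout << "int,int\n"; }
    void describe(int, double) { std::cout << "int,double\n"; }

    int main() {
        describe(1, 2);     // int,int    -- resolved at compile time
        describe(1, 2.0);   // int,double

        Circle c;
        Shape& s = c;
        // runtime dispatch: chosen by the dynamic type of the receiver only.
        std::cout << s.name() << "\n";   // circle
    }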

copy of some stuff in syntaxNotes5:

smalltalk vs haskell syntax:

in Smalltalk, "Unary messages are executed first, then binary messages and finally keyword messages" and "Between messages of similar precedence, expressions are executed from left to right" so:

2 raisedTo: 3 + 2. == 2 raisedTo: (3 + 2).

and

-3 abs negated reciprocal. == ((-3 abs) negated) reciprocal.

compare to Haskell:

like Smalltalk, a b c == (a b) c, but the meaning here is reversed because in Smalltalk, the functions are b and c, not a.

in Smalltalk, the keyword messages are loosest binding, the opposite of tightest-binding function application in Haskell.

---

ams6110 1 day ago [-]

In OpenBSD you'll see stuff like:

warning: stpcpy() is dangerous GNU crap; don't use it

warning: strcpy() is almost always misused, please use strlcpy()

warning: strcat() is almost always misused, please use strlcat()

warning: sprintf() is often misused, please use snprintf()

---

some js (from http://www.cnblogs.com/htoooth/ ):

two ways to flatten (from http://www.cnblogs.com/htoooth/p/5528425.html ):

function flatten(array){ return array.reduce((acc,cur)=> acc.concat(cur),[]); }

var flatten = function (lol){ return [].concat.apply([],lol); }

my notes:

three ways to tr (from http://www.cnblogs.com/htoooth/p/5546646.html )

var str = "TRaduttore"

str.tr( "auioe", "4-103" ) -> "TR4d-tt0r3" str.tr( "aeiouR", ["A","E","I","O","U","r"] ) -> "TrAdUttOrE?"

var hi = "Hello Happy CodeWarriors?"

hi.tr( ["ello","Happy","Wa","ior"],["ola","Felices","Gue","ero"]) -> "Hola Felices CodeGuerreros?" hi.tr( ["ll","pp","rr"], "LPR" ) -> "HeLo? HaPy? CodeWaRiors?"

String.prototype.tr1 = function(fromList, toList) {
    fromList = fromList || [];
    toList = toList || [];

    var pairs = {};

    var a = Array.isArray(fromList) ? fromList : fromList.split("");
    var b = Array.isArray(toList) ? toList : toList.split("");

    a.forEach((x, i) => pairs[x] = b[i]);

    return a.reduce((acc, cur) => acc.replace(new RegExp(cur, "g"), function(x) {
        return pairs[x] + "" || "";
    }), this);
}

String.prototype.tr = function(from, to) {
    for (var s = this, i = 0; i < from.length; i++) {
        s = s.split(from[i]).join(to ? to[i] : '');
    }
    return s;
}

String.prototype.tr = function(from, to) {
    return (Array.isArray(from) ? from : from.split("")).reduce((acc, cur, i) => acc.split(cur).join(to ? to[i] : ''), this);
}

my notes:

the 'default_if_not_truthy' idiom (x || defaultValue) is used a lot

two ways to deleteNth (should be 'deleteAfterNth' if you ask me)

http://www.cnblogs.com/htoooth/p/5536984.html

function deleteNth(arr, x) {
    var pairs = {};
    return arr.filter(function(n) {
        if (pairs.hasOwnProperty(n)) {
            return ++pairs[n] < x ? true : false;
        } else {
            pairs[n] = 1;
            return true;
        }
    });
}

function deleteNth(arr, x) {
    var cache = {};
    return arr.filter(function(n) {
        cache[n] = (cache[n] || 0) + 1;   // ~~cache[n] + 1 would work too
        return cache[n] <= x;
    });
}

my notes:

the (cache[n] || 0) idiom; the return expressions can be unified because conceptually, what they are really trying to do is cache[n] < x anyhow.

Promise.all:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/all

Promise.all waits for all fulfillments (or the first rejection).

var p1 = Promise.resolve(3);
var p2 = 1337;
var p3 = new Promise((resolve, reject) => {
    setTimeout(resolve, 100, "foo");
});

Promise.all([p1, p2, p3]).then(values => {
    console.log(values); // [3, 1337, "foo"]
});

two ways to walk a file directory with promises:

http://www.cnblogs.com/htoooth/p/5528318.html

const fs = require("fs");
const path = require("path");

function walk(dir, ext, callback) {
    ext = ext.charAt(0) === "." ? ext : `.${ext}`;

    fs.readdir(dir, (err, files) => {
        files.forEach(f => {
            fs.lstat(path.join(dir, f), (err, st) => {
                if (st.isDirectory()) {
                    walk(path.join(dir, f), ext, callback);
                } else {
                    if (path.extname(f) === ext) {
                        callback(null, f);
                    }
                }
            })
        });
    });}

function walk(dir) {

    let readDirAsync = Promise.promisify(fs.readdir);
    let lstatAsync = Promise.promisify(fs.lstat);
 
    return readDirAsync(dir).then(files => {
 
        return Promise.all(files.map(f => {
            let file = path.join(dir, f);
 
            return lstatAsync(file).then(stat => {
                if (stat.isDirectory()) {
                    return walk(file);
                } else {
                    return [file];
                }
            });
        }));
    }).then(files => {
        return files.reduce((pre, cur) => pre.concat(cur));
    }); }

walk("~/home").then(x => console.log(x));

my notes:

http://www.cnblogs.com/htoooth/p/5528304.html

seven(times(five()));   must return 35
four(plus(nine()));     must return 13
eight(minus(three()));  must return 5
six(dividedBy(two()));  must return 3

function zero() {
    if (arguments.length === 0) {
        return 0;
    } else {
        var result = arguments[0];
        return result.op(0);
    }
}
...
function plus() {
    return {
        op: function(left) { return left + this.right; },
        right: arguments[0]
    };
}

var n = function(digit) {
    return function(op) { return op ? op(digit) : digit; };
};
var zero = n(0);
...
function plus(r) {
    return function(l) { return l + r; };
}

my notes:

---