proj-plbook-plPartThoughts

Table of Contents for Programming Languages: a survey

Part : Various people's random ideas on the nature and history and future of programming languages

history

Lambda the Ultimate discussion: "What Are The Resolved Debates in General Purpose Language Design?" http://lambda-the-ultimate.org/node/3166

The next big thing

"We all thought that the next level of programming language would be much more strategic and even policy-oriented and would have much more knowledge about what it was trying to do." -- Alan Kay, http://queue.acm.org/detail.cfm?id=1039523

http://www.st.cs.uni-saarland.de/edu/seminare/2005/advanced-fp/docs/sweeny.pdf

"The future of computing depends on parallelism (for efficiency), distribution (for scale), and verification (for quality)." -- http://existentialtype.wordpress.com/2011/04/09/persistence-of-memory/

" most successful languages have had some pretty serious corporate backing ... NBL is garbage collected. ... Rule #2: Dynamic typing with optional static types. Rule #3: Performance ... 'It turns out that eval() is one of the key things that gets in the way of performance optimizations. It can easily invalidate all sorts of otherwise reasonable assumptions about variable bindings, stack addresses and so on. It's also pretty important, so you can't just get rid of it. So NBL will have to do something clever about eval. ...

Rule #4: Tools ... Here's a short list of programming-language features that have become ad-hoc standards that everyone expects:

    Object-literal syntax for arrays and hashes
    Array slicing and other intelligent collection operators
    Perl 5 compatible regular expression literals
    Destructuring bind (e.g. x, y = returnTwoValues())
    Function literals and first-class, non-broken closures
    Standard OOP with classes, instances, interfaces, polymorphism, etc.
    Visibility quantifiers (public/private/protected)
    Iterators and generators
    List comprehensions
    Namespaces and packages
    Cross-platform GUI
    Operator overloading
    Keyword and rest parameters
    First-class parser and AST support
    Static typing and duck typing
    Type expressions and statically checkable semantics
    Solid string and collection libraries
    Strings and streams act like collections

Additionally, NBL will have first-class continuations and call/cc. I hear it may even (eventually) have a hygienic macro system, although not in any near-term release.

Not sure about threads. I tend to think you need them, although of course they can be simulated with call/cc. I've also noticed that languages with poor threading support tend to use multiprocessing, which makes them more scalable across machines, since by the time you've set up IPC, distributing across machines isn't much of an architectural change. But I think threads (or equivalent) are still useful. Hopefully NBL has a story here. ....a truly great language would support Erlang-style concurrency, would have a simpler syntax and a powerful macro system, and would probably have much better support for high-level declarative constructs, e.g. path expressions, structural dispatch (e.g. OCaml's match ... with statement) and query minilanguages. "

-- http://steve-yegge.blogspot.com/2007/02/next-big-language.html


"While I'd prefer something semantically and syntactically clean and beautiful like Ioke, my real concerns center around performance, libraries, expressivity, and community. If a language is too lacking in one of those, I'll look somewhere else." -- http://lebo.io/2015/03/02/steve-yegges-next-big-language-revisited.html

"...don't rely on a VM, statically typed and safe...The language that succeeds in striking the right balance between minimalism and features with an excellent support for concurrency will be the next big thing. " -- [1]


Big vs. small languages

" AK In a history of Smalltalk I wrote for ACM, I characterized one way of looking at languages in this way: a lot of them are either the agglutination of features or they’re a crystallization of style. Languages such as APL, Lisp, and Smalltalk are what you might call style languages, where there’s a real center and imputed style to how you’re supposed to do everything. Other languages such as PL/I and, indeed, languages that try to be additive without consolidation have often been more successful. I think the style languages appeal to people who have a certain mathematical laziness to them. Laziness actually pays off later on, because if you wind up spending a little extra time seeing that “oh, yes, this language is going to allow me to do this really, really nicely, and in a more general way than I could do it over here,” usually that comes back to help you when you’ve had a new idea a year down the road. The agglutinative languages, on the other hand, tend to produce agglutinations and they are very, very difficult to untangle when you’ve had that new idea.

Also, I think the style languages tend to be late-binding languages. The agglutinative languages are usually early-binding. That makes a huge difference in the whole approach. The kinds of bugs you have to deal with, and when you have to deal with them, is completely different." -- Alan Kay, http://queue.acm.org/detail.cfm?id=1039523

Type systems

" Some people are completely religious about type systems and as a mathematician I love the idea of type systems, but nobody has ever come up with one that has enough scope. If you combine Simula and Lisp—Lisp? didn’t have data structures, it had instances of objects—you would have a dynamic type system that would give you the range of expression you need.

It would allow you to think the kinds of thoughts you need to think without worrying about what type something is, because you have a much, much wider range of things. What you’re paying for is some of the checks that can be done at runtime, and, especially in the old days, you paid for it in some efficiencies. Now we get around the efficiency stuff the same way Barton did on the B5000: by just saying, “Screw it, we’re going to execute this important stuff as directly as we possibly can.” We’re not going to worry about whether we can compile it into a von Neumann computer or not, and we will make the microcode do whatever we need to get around these inefficiencies because a lot of the inefficiencies are just putting stuff on obsolete hardware architectures. " -- Alan Kay, http://queue.acm.org/detail.cfm?id=1039523

"I'm not against types, but I don't know of any type systems that aren't a complete pain, so I still like dynamic typing." -- Alan Kay, http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay_oop_en

Metaprogramming

" But the flip side of the coin was that even good programmers and language designers tended to do terrible extensions when they were in the heat of programming, because design is something that is best done slowly and carefully.

SF And late-night extensible programming is unsupportable.

AK Exactly. So Smalltalk actually went from something that was completely extensible to one where we picked a syntax that allowed for a variety of forms of what was fixed, and concentrated on the extensibility of meaning in it.

This is not completely satisfactory. One of the things that I think should be done today is to have a fence that you have to hop to forcibly remind you that you’re now in a meta-area—that you are now tinkering with the currency system itself, you are not just speculating. But it should allow you to do it without any other overhead once you’ve crossed this fence, because when you want to do it, you want to do it. "


Alexey Radul's Propagation Networks

" I suggest that we can build general-purpose computation on propagation of information through networks of stateful cells interconnected with stateless autonomous asynchronous computing elements. ... a cell should not be seen as storing a value, but as accumulating information about a value. The cells should never forget information ...except when the system can prove that particular things will never matter again. This is analogous to garbage collection in modern memory-managed systems.... -- such monotonicity prevents race conditions in the behavior of the network. Monotonicity of information need not be a severe restriction: for example, carrying reasons for believing each thing makes it possible to explore but thenpossibly reject tentative hypotheses, thus appearing to undo something, while maintaining monotonicity. ... The key idea of propagating mergeable, partial information allows propagation to be used for general-purpose computation. ... high-level languages follow the expression paradigm, assembly languages follow another paradigm. The assembly language paradigm is a loop that executes instructions in sequence (and some instructions interrupt the sequence and cause the executor to jump somewhere else)...the only difference between instructions and expressions is that instructions are all atomic...expressions generalize instructions...propagation subsumes evaluation (of expressions) the same way that evaluation subsumes execution (of instructions). " -- http://dspace.mit.edu/handle/1721.1/49525

The "stateless autonomous asynchronous computing elements" are called "propagators".

Each individual cell updates atomically, but only in the sense that, if inconsistent states are physically possible, each cell has a local lock that prevents any connected propagator from observing it in an inconsistent state.

Propagators run until steady-state

Reminiscent of the Glitch concurrency framework (todo; i think): each run of a propagator means running its program as many times as necessary, until it stops mutating its connected cells.

Note that this means an implementation of Radul's system is probably explicitly notified by each cell upon each state change.

Cells Accumulate Information

"The unquestioned assumption of previous generations of propagation systems has been that cells hold values...I propose, in contrast, that we should think of a cell as a thing that accumulates information about a value. The information can perhaps be incomplete: some propagator may tell a cell something that is on the one hand useful but on the other hand does not determine that cell’s value unconditionally...Accumulating information is a generalization of the idea of holding values, because a value can always be interpreted as the information that says “I know exactly what this value is, and it is x;” and the absence of a value can be interpreted as information that says “I know absolutely nothing about what this value is.”

Radul's framework is generic: it does not specify exactly what sort of partial information each cell holds; each particular instantiation of the framework must make that choice.

"each cell must be able to accept and merge all the contributions of all the propagators that might write to it."

Note that 'accumulating information' does not necessarily mean 'accumulating a list of every message sent to the cell by every propagator' (it depends on what sort of partial information cells hold in a particular instantiation). For example, if the cell contains the information "my value might be 1, 2, or 5", and a propagator sends the message "your value is not 3", the receipt of that message does not alter the content of the cell (because it is redundant with what the cell already knows).
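
A minimal toy sketch of these ideas (my own illustration, not Radul's code): cells hold partial information as a set of still-possible values, merging is set intersection (so it is monotone, and redundant messages change nothing), and a scheduler re-runs alerted propagators until the network reaches a steady state.

    class Scheduler:
        def __init__(self):
            self.queue = []
        def alert(self, propagators):
            self.queue.extend(propagators)
        def run(self):                        # run until quiescence (steady state)
            while self.queue:
                self.queue.pop()()

    class Cell:
        def __init__(self, scheduler, possible):
            self.scheduler = scheduler
            self.possible = set(possible)     # partial information about the value
            self.propagators = []
        def add_content(self, possible):
            merged = self.possible & set(possible)    # merge = intersection
            if merged != self.possible:               # redundant info is a no-op
                self.possible = merged
                self.scheduler.alert(self.propagators)

    # Example: one propagator maintaining the relation b = a + 1.
    sched = Scheduler()
    a = Cell(sched, range(1, 6))
    b = Cell(sched, range(1, 6))

    def a_plus_one():
        b.add_content({x + 1 for x in a.possible})
        a.add_content({x - 1 for x in b.possible})

    a.propagators.append(a_plus_one)
    b.propagators.append(a_plus_one)
    sched.alert([a_plus_one])
    sched.run()

    a.add_content({3, 4})                     # new partial information arrives
    sched.run()
    print(a.possible, b.possible)             # {3, 4} {4, 5}

The only thing a propagator can do to a cell is add information, which is what makes re-running it an arbitrary number of times safe.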

related probabilistic programming systems

As of this writing, Radul currently works on Venture:

http://probcomp.csail.mit.edu/venture/

I don't know if Venture is closely related to Radul's propagator networks, or if this is just where Radul happens to be currently employed.

related links


On probabilistic programming as a paradigm


---

https://lukeplant.me.uk/blog/posts/less-powerful-languages/

"Every increase in expressiveness brings an increased burden on all who care to understand the message." -- [2] via [3]

"The problem with this kind of freedom is that every bit of power you insist on having when writing in the language corresponds to power you must give up at other points of the process — when ‘consuming’ what you have written...Different players who might ‘consume’ the message are software maintainers, compilers and other development tools..."

"In my years of software development, I’ve found that clients and users often ask for “free text” fields, often for “notes”. A free text field is maximally powerfully as far as the end user is concerned — they can put whatever they like in. In this sense, this is the “most useful” field — you can use it for anything.

But precisely because of this, it is also the least useful, because it is the least structured. Even search doesn’t work reliably because of typos and alternative ways of expressing the same thing. "

" In terms of database technologies, the same point can be made. Databases that are “schema-less” give you great flexibility and power when putting data in, and are extremely unhelpful when getting it out. A key-value store is a more technical version of “free text”, with the same drawbacks — it is pretty unhelpful when you want to extract info or do anything with the data, since you cannot guarantee that any specific keys will be there. "

"HTML

The success of the web has been partly due to the fact that some of the core technologies, HTML and CSS, have been deliberately limited in power. Indeed, you probably wouldn’t call them programming languages, but markup languages. "

"the less powerful the language, the more you can do with the data stored in that language. If you write it in a simple declarative form, anyone can write a program to analyze it in many ways." [4]

" This is has become a W3C principle:

    Good Practice: Use the least powerful language suitable for expressing information, constraints or programs on the World Wide Web.

Note that this is almost exactly the opposite of Paul Graham’s advice (with the caveat that ‘power’ is often too informally defined to compare):

    if you have a choice of several languages, it is, all other things being equal, a mistake to program in anything but the most powerful one.
    "

" Python setup.py MANIFEST.in file

Moving up towards ‘proper’ programming language, I came across this example — the MANIFEST.in file format used by distutils/setuptools. If you have had to create a package for a Python library, you may well have used it.

The file format is essentially a very small language for defining what files should be included in your Python package (relative to the MANIFEST.in file, which we’ll call the working directory from now on). It might look something like this:

include README.rst
recursive-include foo *.py
recursive-include tests *
global-exclude *~
global-exclude *.pyc
prune .DS_Store

There are two types of directive: include type directives (include, recursive-include, global-include and graft), and exclude type directives (exclude, recursive-exclude, global-exclude and prune).

There comes a question — how are these directives to be interpreted (i.e. what are the semantics)?

You could interpret them in this way:

    A file from the working directory (or sub-directories) should be included in the package if it matches at least one include type directive, and does not match any exclude type directive.

This would make it a declarative language.

Unfortunately, that is not how the language is defined. The distutils docs for MANIFEST.in are specific about this — the directives are to be understood as follows (my paraphrase):

    Start with an empty list of files to include in the package (or technically, a default list of files).
    Go down the directives in the MANIFEST.in in order.
    For every include type directive, copy any matching files from the working directory to the list for the package.
    For every exclude type directive, remove any matching files from the list for the package.

As you can see, this interpretation defines a language that is imperative in nature — each line of MANIFEST.in is a command that implies an action with side effects.

The point to note is that this makes the language more powerful than my speculative declarative version above. For example, consider the following:

recursive-include foo *
recursive-exclude foo/bar *
recursive-include foo *.png

The end result of the above commands is that .png files that are below foo/bar are included, but all other files below foo/bar are not.

...

because the imperative language is more powerful, there is a temptation to prefer that one. However, the imperative version comes with significant drawbacks:

    It is much harder to optimise.

...

harder to understand ... "
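
A toy sketch of the two interpretations described above (my own code: the directive names are borrowed from MANIFEST.in, but matching is simplified to fnmatch over relative paths, where * also matches across /, which is not exactly what distutils does):

    from fnmatch import fnmatch

    all_files = ["foo/a.py", "foo/bar/b.py", "foo/bar/c.png", "foo/bar/d.txt"]

    directives = [
        ("include", "foo/*"),        # stand-in for: recursive-include foo *
        ("exclude", "foo/bar/*"),    # stand-in for: recursive-exclude foo/bar *
        ("include", "foo/*.png"),    # stand-in for: recursive-include foo *.png
    ]

    def imperative(files, directives):
        selected = []                              # "start with an empty list"
        for action, pattern in directives:         # process directives in order
            if action == "include":
                selected += [f for f in files if fnmatch(f, pattern) and f not in selected]
            else:
                selected = [f for f in selected if not fnmatch(f, pattern)]
        return selected

    def declarative(files, directives):
        includes = [p for a, p in directives if a == "include"]
        excludes = [p for a, p in directives if a == "exclude"]
        return [f for f in files
                if any(fnmatch(f, p) for p in includes)
                and not any(fnmatch(f, p) for p in excludes)]

    print(imperative(all_files, directives))    # ['foo/a.py', 'foo/bar/c.png']
    print(declarative(all_files, directives))   # ['foo/a.py']

Under the imperative reading the later include re-adds the .png file that the exclude removed; under the declarative reading any matching exclude wins, so the two readings disagree.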

" URL reversing

One core piece of the Django web framework is URL routing. ... In Django, this is done using regular expressions. ...

Now, as well as being able to route URLs to specific functions, web apps almost always need to generate URLs. For example, the kitten list page will need to include links to the individual kitten page i.e. show_kitten. Obviously we would like to do this in a DRY way, re-using the URL routing configuration.

However, we would be using the URL routing configuration in the opposite direction. ... In the very early days Django did not include this facility, but it was found that with most URLs, it was possible to ‘reverse’ the URL pattern. ... this is only possible at all because the language being used to define URL routes is a limited one — regular expressions. We could easily have defined URL routes using a more powerful language. ... The downside, however, is that URL reversing would be entirely impossible. For general, Turing complete languages, you cannot ask “given this output, what is the input?”. We could potentially inspect the source code of the function and look for known patterns, but it quickly becomes totally impractical. ... With regular expressions, however, the limited nature of the language gives us more options. In general, URL configuration based on regexes is not reversible — a regex as simple as . cannot be reversed uniquely. ... But as long as wild cards of any sort are only found within capture groups (and possibly some other constraints), the regex can be reversed. "
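
A sketch of why the restriction makes reversing possible (my own toy, not Django's urlresolvers code; the route pattern is hypothetical): if wildcards appear only inside named capture groups, everything outside the groups is literal text, so reversing is just substituting arguments for the groups.

    import re

    pattern = r"^kitten/(?P<id>\d+)/$"    # hypothetical route

    def match(path):
        m = re.match(pattern, path)
        return m.groupdict() if m else None

    def reverse(**kwargs):
        # Replace each named group with the supplied argument; what remains is
        # literal text, which is well-defined only because of the restriction
        # above. (The group-body regex below only handles simple, unnested groups.)
        filled = re.sub(r"\(\?P<(\w+)>[^)]*\)",
                        lambda m: str(kwargs[m.group(1)]),
                        pattern)
        return filled.strip("^$")

    print(match("kitten/42/"))    # {'id': '42'}
    print(reverse(id=42))         # kitten/42/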

" Additionally, in Python defining mini-languages for this kind of thing is quite hard, and requires a fair amount of boilerplate and verbosity both for implementation and usage — much more than when using a string based language like regexes. In languages like Haskell, relatively simple features like easy definitions of algebraic data types and pattern matching make these things much easier. "

" Django templates vs Jinja templates

The Jinja template engine was inspired by the Django template language, but with some differences in philosophy and syntax.

One major advantage of Jinja2 over Django is that of performance. Jinja2 has an implementation strategy which is to compile to Python code, rather than run an interpreter written in Python, which is how Django works, and this results in a big performance increase — often 5 to 20 times. (YMMV etc.)

Armin Ronacher, the author of Jinja, attempted to use the same strategy to speed up Django template rendering. There were problems, however.

The first he knew about when he proposed the project — namely that the extension API in Django makes the approach taken in Jinja very difficult. Django allows custom template tags that have almost complete control over the compilation and rendering steps. This allows some powerful custom template tags like addtoblock in django-sekizai that seems impossible at first glance. However, if a slower fallback was provided for these less common situations, a fast implementation might still have been useful.

However, there is another key difference that affects a lot of templates, which is that the context object that is passed in (which holds the data needed by the template) is writable within the template rendering process in Django. Template tags are able to assign to the context, and in fact some built-in template tags like url do just that.

The result of this is that a key part of the compilation to Python that happens in Jinja is impossible in Django.

Notice that in both of these, it is the power of Django’s template engine that is the problem — it allows code authors to do things that are not possible in Jinja2. However, the result is that a very large obstacle is placed in the way of attempts to compile to fast code.

This is not a theoretical consideration. At some point, performance of template rendering becomes an issue for many projects, and a number have been forced to switch to Jinja because of that. This is far from an optimal situation! "
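
A toy illustration of the two implementation strategies being contrasted (my own sketch, not Jinja's or Django's actual API): interpreting a template at render time versus compiling it once to Python source. The compiled version is only equivalent because this toy context is read-only during rendering, which is exactly the property the quote says Django's template language lacks.

    TEMPLATE = "Hello {{ name }}, you have {{ count }} kittens"

    def interpret(template, context):
        # Django-style: walk the template every time you render.
        out = []
        for chunk in template.split("{{"):
            if "}}" in chunk:
                var, _, text = chunk.partition("}}")
                out.append(str(context[var.strip()]))
                out.append(text)
            else:
                out.append(chunk)
        return "".join(out)

    def compile_template(template):
        # Jinja-style: generate Python source once, then render by calling a
        # plain Python function.
        parts = []
        for chunk in template.split("{{"):
            if "}}" in chunk:
                var, _, text = chunk.partition("}}")
                parts.append(f"str(context[{var.strip()!r}])")
                parts.append(repr(text))
            else:
                parts.append(repr(chunk))
        src = "def render(context): return " + " + ".join(parts)
        namespace = {}
        exec(src, namespace)
        return namespace["render"]

    render = compile_template(TEMPLATE)
    ctx = {"name": "Ada", "count": 3}
    assert interpret(TEMPLATE, ctx) == render(ctx) == "Hello Ada, you have 3 kittens"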

" Python

There are many ways that we could think about the power of the Python language, and how it makes life hard for every person and program that wants to make sense of Python code.

Compilation and performance of Python is an obvious one. The unrestricted effects that are possible at any point, including writable classes and modules etc., not only allow authors to do some very useful things, they make it extremely difficult to execute Python code quickly. ... However, rather than focus on the performance problems of Python, I’m going to talk about refactoring and maintenance. ... So, for example, in Python, and with typical VCS tools (Git or Mercurial, for instance), if you re-order functions in a module e.g. move a 10 line function to a different place, you get a 20 line diff, despite the fact that nothing changed in terms of the meaning of the program. And if something did change (the function was both moved and modified), it’s going to be very difficult to spot.

This happened to me recently, and set me off thinking just how ridiculously bad our toolsets are. Why on earth are we treating our highly structured code as a bunch of lines of text? I can’t believe that we are still programming like this, it’s insane!

At first, you might think that this could be solved with a more intelligent diff tool. But the problem is that in Python, the order in which functions are defined can in fact change the meaning of a program (i.e. change what happens when you execute it).

Here are a few examples:

Using a previously defined function as a default argument:

def foo(): pass

def bar(a, callback=foo): pass

These functions can’t be re-ordered or you’ll get a NameError for foo in the definition of bar.

Using a decorator:

@decorateit
def foo(): pass

@decorateit
def bar(): pass

Due to unrestricted effects that are possible in @decorateit, you can’t safely re-order these functions and be sure the program will do the same thing afterwards. Similarly, calling some code in the function argument list:

def foo(x=Something()): pass

def bar(x=Something()): pass

Similarly, class level attributes can’t be re-ordered safely:

class Foo():
    a = Bar()
    b = Bar()

Due to unrestricted effects possible inside the Bar constructor, the definitions of a and b cannot be re-ordered safely. (This might seem theoretical, but Django, for instance, actually uses this ability inside Model and Form definitions to provide a default order for the fields, using a cunning class level counter inside the base Field constructor).

Ultimately, you have to accept that a sequence of function statements in Python is a sequence of actions in which objects (functions and default arguments) are created, possibly manipulated, etc. It is not a re-orderable set of function declarations as it might be in other languages.

This gives Python an amazing power when it comes to writing it, but imposes massive restrictions on what you can do in any automated way to manipulate Python source code.

Above I used the simple example of re-ordering two functions or class attributes. But every single type of refactoring that you might do in Python becomes virtually impossible to do safely because of the power of the language e.g. duck typing means you can’t do method renames, the possibility of reflection/dynamic attribute access (getattr and friends) means you can’t in fact do any kind of automated renames (safely).

So, if we are tempted to blame our crude VCS or refactoring tools, we actually have to blame the power of Python — despite the huge amount of structure in correct Python source code, there is very little that any software tool can do with it when it comes to manipulating it, and the line-based diffing that got me so mad is actually a reasonable approach.

Now, 99% of the time, we don’t write Python decorators which mean that the order of function definitions makes a difference, or silly things like that — we are responsible “adults”, as Guido put it, and this makes life easier for human consumers. But the fact remains that our tools are limited by what we do in the 0.01% of cases. "
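
Two small sketches of points from the quote above (hypothetical code, not Django's or any tool's actual implementation). First, the "cunning class level counter" trick: each Field records a creation order as a side effect, which a base class uses to recover declaration order, and which is why reordering class attributes is meaningful. Second, why getattr defeats automated renames: the call site constructs the method name at runtime, so no static tool can see it.

    import itertools

    class Field:
        _counter = itertools.count()
        def __init__(self):
            # side effect at class-body execution time: record declaration order
            self.creation_order = next(Field._counter)

    class Form:
        def __init_subclass__(cls):
            cls.field_order = sorted(
                (name for name, value in vars(cls).items() if isinstance(value, Field)),
                key=lambda name: getattr(cls, name).creation_order)

    class KittenForm(Form):
        name = Field()
        age = Field()

    print(KittenForm.field_order)     # ['name', 'age'], i.e. declaration order

    class Kitten:
        def meow(self):
            return "meow"

    method_name = "me" + "ow"                     # opaque to static analysis
    print(getattr(Kitten(), method_name)())       # a rename tool cannot see this call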

---

" chriswarbo 1 day ago [–]

> I tend to think that languages aren't that important.

> switching languages is no fast path towards solving hard problems

If you're talking about switching from one mainstream, general purpose language to another, then I mostly agree. Some tradeoffs can be definite wins for particular domains, e.g. for Web development Go is almost always a better choice than C (in terms of memory management, string handling, buffer overflows, etc.). Other choices can just shift problems around, e.g. in Python it's easy to get something running, but it requires a lot of testing to avoid errors.

However, when it comes to domain-specific, non-general-purpose languages I strongly disagree! There are certain problems that are incredibly difficult to solve (often undecidable) in the context of some generic language; which become trivial when the language is designed with that problem in mind.

For example:

There are a load more examples on sites like http://lambda-the-ultimate.org

Such languages certainly introduce a bunch of problems, like lack of libraries, but they might be the difference between something being difficult, or it being impossible.

Another thing to keep in mind with domain-specific languages is that it's often useful to "embed" them inside some other, general-purpose language. This lets a DSL avoid the need for its own parsing, tooling, etc. For example https://wiki.haskell.org/Embedded_domain_specific_language#E... https://beautifulracket.com/appendix/domain-specific-languag...


" -- https://news.ycombinator.com/item?id=25093788

---

" For over a decade now, I've been pondering a "perfect" language.

Of course, such a thing is impossible, because we're always learning, and we can always do better, but it leads to some interesting avenues of thought. We can claim that certain concepts are mandatory, which can be used to reject languages that can never meet our ultimate requirements.

For example, a common complaint is the "impedance mismatch" between, say, query languages such as SQL or MDX, and object oriented languages such as Java and C#. Similarly, JSON has become popular precisely because it has zero "mismatch" with JavaScript -- it is a nearly perfect subset.

This leads one to the obvious conclusion: An ideal language must have subsets, and probably quite a few. If it doesn't, some other special-purpose language would be forced on the developers, and then our ideal language isn't perfect any more!

The way I envision this is that the perfect language would have a pure "data" subset similar to JSON or SQL tables, with absolutely no special features of any type.

The next step up is data with references.

Then data with references and "expression" code that is guaranteed to terminate.

Then data with expressions and pure functions that may loop forever, but have no side-effects of any kind.

At some point, you'd want fully procedural code, but not unsafe code, and no user-controlled threads.

At the full procedural programming language level, I'd like to see a Rust-like ownership model that's optional, so that business glue code can have easy-to-use reference types as seen in C# and Java, but if a library function is called that uses ownership, the compiler takes care of things. You get the performance benefit where it matters, but you can be lazy when it doesn't matter.

Interestingly, C# is half-way there now with their Span and Memory types, except that this was tacked on to an existing language and virtual machine, so the abstractions are more than a bit leaky. Unlike Rust, where the compiler provides safety guarantees, there are a lot of warnings-to-be-heeded in the dotnet core doco.

TL;DR: We don't need separate simplified languages, we need powerful languages with many simpler subsets for specialist purposes. " -- jiggawatts


“The power of type theory arises from its strictures, not its affordances, in direct opposition to the ever-popular language design principle “first-class x” for all imaginable values of x.” -- Bob Harper


(replying to a comment quoting Bob Harper's remark above:)

colanderman on Nov 15, 2015

Completely agree. People think I'm weird that I think functions being first-class is a dumb idea.

But think about it: how often do you need to perform some complicated algorithm on a function, or store a dynamically generated function in a data structure? Almost always, first-class functions are simply used as a means of genericizing or parameterizing code. (e.g. as arguments to `map` or `fold`, or currying.) (Languages like Haskell and Coq that are deeply rooted in the lambda calculus are a notable exception to this; it's common to play interesting games with functions in these languages.)

You can get the same capabilities by making functions second-class objects, with a suitable simple (non-Turing-complete) language with which to manipulate them. That language can even be a subset of the language for first-class objects: the programmer is none-the-wiser unless he/she tries to do something "odd" with a function outside the bounds of its restricted language. Generally, there is a clearer, more efficient way to express whatever it is they are trying to do.

There is some precedent for this. In Haskell, typeclass parameters live alongside normal parameters, but aren't permitted to be used in normal value expressions. In OCaml, modules live a separate second-class life but can be manipulated in module expressions alongside normal code. In Coq, type expressions can be passed as arguments, but live at a different strata and have certain restrictions placed on them.

Unfortunately designing languages like this is hard. It's easy to just say "well, functions are a kind of value in the interpreter I just wrote; let's make them a kind of value in the language". This is the thinking behind highly dynamic languages like JavaScript, Python, and Elixir: the language is modeled after what is easy to do in the interpreter without further restriction. The end result is a language that is difficult to optimize and analyze.

It's a lot more work to plan out "well, I ought to stratify modules, types, functions, heterogenous compounds, homogeneous componds, and scalars, because it will permit optimizations someday". But these are the languages that move entire industries.

catnaroek on Nov 15, 2015

In general, language design requires a balance between first-class vs. second-class objects. First-class objects are important because they let you write the program you want. Second-class objects are also important because they let you write programs that can be reasoned about.

For instance, Haskell's type language (without compiler-specific extensions) has no beta reduction, because only type constructors can appear not fully applied in type expressions anyway. This reduces its “raw expressive power” (w.r.t. System F-ω, whose type level is the simply typed lambda calculus), but it makes inference decidable, which helps machines help humans. It also helps humans in a more direct fashion: Haskellers often infer a lot about their programs just from their type signatures, that is, without having to compute at all. It's no wonder that Haskellers love reasoning with types - it's literally less work than reasoning with values.

So I agree with you that “first-class everything” isn't a good approach to take. A language with first-class everything is a semantic minefield: Nothing is guaranteed, sometimes not even the runtime system's stability. (I'm looking at you, Lisp!)

---

But, on the specific topic of first-class functions, you'll pry them from my cold dead hands. Solving large problems by composing solutions to smaller, more tractable problems, is a style that's greatly facilitated by first-class functions, and I'm not ready to give it up.

---

replying to an argument for restricted/stratified/sub-Turing languages:

" This topic seems less about power and more about invariants. I could imagine a powerful, fully expressive language that does a better job of explicitly stating its invariants.

In the `urlpatterns` example,

  urlpatterns = [
      url([m('kitten/')], views.list_kittens, name='kittens_list_kittens'),
      url([m('kitten/'), c(int)], views.show_kitten, name="kittens_show_kitten"),
  ]

I could see `m` and `c` being used as a more-specific-than-regex DSL specifically for routes. Reversing routes would be even easier because of the inherent limitations of using the DSL - you have to pass a string to `m`, you have to pass a type (and a variable, ideally, so it would be `m('kitten/'), c("id", int)` or just `m('kitten/'), id()`).

The lesson I gain is "figure out the implicit invariants from your design decisions, make them explicit and try to keep as many of them as possible, only letting go once you have no other choice." " -- AlexeyMK
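
A sketch of the kind of route mini-language the comment imagines (the names m and c come from the comment; the implementation here is hypothetical): because routes are built only from literal segments and typed captures, reversing them is trivial.

    class m:                                   # literal path segment
        def __init__(self, text):
            self.text = text
        def reverse(self, args):
            return self.text

    class c:                                   # typed capture, e.g. c("id", int)
        def __init__(self, name, typ):
            self.name, self.typ = name, typ
        def reverse(self, args):
            return str(self.typ(args[self.name]))

    def reverse(route, **args):
        # every part knows how to render itself, so the whole route is reversible
        return "".join(part.reverse(args) for part in route)

    show_kitten = [m('kitten/'), c("id", int)]
    print(reverse(show_kitten, id=7))          # kitten/7

A matching direction could be added symmetrically, with each part consuming a prefix of the incoming path.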


" Rule of least expressiveness:

    When programming a component, the right computation model for the component is the least expressive model that results in a natural program." -- https://mitpress.mit.edu/books/concepts-techniques-and-models-computer-programming via [5]

---

" For 15 years or so I have been trying to think of how to write a compiler that really produces top quality code. For example, mostof the Mix programs in my books are considerably more efficient than any of today’s most visionary compiling schemes would be able to produce. I’ve tried to study the various techniques that a hand-coder like myself uses, and to fit them into some systematic and automatic system. A few years ago, several students and I looked at a typical sample of FORTRAN programs [51], and we all tried hard to see how a machine could produce code that would compete with our best hand-optimized object programs. We found ourselves always running up against the same problem: the compiler needs to be in a dialogwith the programmer; it needs to know properties of the data, and whether certain cases can arise,etc. And we couldn’t think of agood language in which to have such a dialog. For some reason we all (especially me) had a mental block about optimization, namely that wealways regarded it as a behind-the-scenes activity, to be done in the machine language, whichthe programmer isn’t supposedto know. This veil was first lifted from my eyes: : :when I ran across a remark by Hoare[42] that, ideally, a language should be designed so that anoptimizing compiler can describe its optimizations in the source language. Of course!: : : The time is clearly ripefor program-manipulation systems: : :The programmer using such a system will write his beautifully-structured, but possibly inefficient, program P; then he will interactively specify transformations that make it efficient. Such a system will be much more powerful and reliable than a completely automatic one.: : :As I say, this idea certainly isn’t my own; it is so exciting I hope that everyone soon becomes aware of its possibilities. " [6] -- The death of optimizing compilers, Daniel J. Bernstein

---


" In the integrated circuit world, there is Moore’s law, which says that the number of transistors on the circuit doubles every 18 months.

In the optimizing compiler world, people like to mention Proebsting’s law to show how hard it is to improve the performance of generated code. Proebsting’s law says that optimizing compiler developers improve generated code performance by two times every 18 years.

Actually, this “law” is too optimistic, and I never saw that anyone tested this. (It is hard to check because you need to wait 18 years.) So, recently I checked it on SPEC CPU2000. SPEC CPU is one of the most credible benchmark suites used by optimizing compiler developers. It contains a set of CPU-intensive applications from the real world; for example, a specific version of GCC is one of the benchmarks. There are many versions of SPEC CPU: 2000, 2006, and 2017. I used the 2000 version because it does not require days to run it. I used the 17-year-old GCC-3.1 compiler (the first GCC version supporting AMD64) and the recent version GCC-8 in their peak performance modes on the same Intel i7-4790K processor:

                            GCC 3.1 (-O3)    GCC 8 (-Ofast -flto -march=native)
    SPECInt2000 w/o eon     4498             5212 (+16%)

The actual performance improvement is only 16%, which is not close to the 100% Proebsting’s law states.

Looking at the progress of optimizing compilers, people believe we should not spend our time on their development at all. Dr. Bernstein, an author of the SipHash algorithm used in CRuby, expressed this point of view in one of his talks.

There is another point of view that we should still try to improve the performance of generated code. By some estimates, in 2013 (before active cryptocurrency mining) computers consumed 10% of all produced electricity. Computer electricity consumption in the year 2040 could easily equal the total electricity production from the year 2016.

A one percent compiler performance improvement in energy-proportional computing (an IT industry goal) means saving 25 terawatt-hours (TWh) out of the 25,000 TWh annual world electricity production.

Twenty-five terawatt-hours is equal to the average annual electricity production of six Hoover dams. " -- Vladimir Makarov, section "The reality of optimizing compiler performance"

The linked Daniel Bernstein (djb) talk argues that:

--

---

" I personally have found FP hard to debug. The "naked state" of imperative programming is actually helpful when debugging, because the context is laid out on a nice little e-desk to examine. With FP you mostly have to open and inspect one drawer at a time. " -- [7]

---

"the standard library is where good code goes to die" -- attributed to Russ Cox (but I can't find the source)

---

" Complaints about 1-based indexing for arrays

Mentioned in this comment. I often see this kind of complaint from people who think that the C way of doing things is the only valid way. It would be great if people spent more time thinking about why use a 0-based index instead. C uses such indexes because the value is actually a multiplier that you can use to find the memory offset for the location of the data. If you have an array, and thus a pointer to the starting position for that data in memory, and you know the size of the array elements, you can multiply the size by the index, sum with the pointer value, and get the data that corresponds to the element referenced by that index. That is it. Indexes are zero based to help you find stuff in memory when you have a pointer which is basically all that C gives you.

Using 1 as the initial position for an array makes sense when you’re talking like a human. When you and your friends are in a queue waiting for your hotdogs, you might say to them that you are first in line. I doubt you’ll say you are zeroth in line. We usually start to count from one. When a baby completes its first roundtrip around the Sun, they’re one year old, not zero years old. Also, 1-based indexing helps when you are doing iterators, you don’t need those hacks such as i-1 or counting til i<total. You can count from 1 to the total and be fine, like a normal human would do. If I remember correctly, this was a Pascal convention. Pascal is a language that deserves a ton of love IMHO. " -- [8]
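
A quick sketch of the address arithmetic being described (my own illustration): with 0-based indexes, element_address = base_address + index * element_size, so the first element sits at offset zero from the base pointer; a 1-based scheme has to subtract one (or waste a slot) somewhere.

    base_address = 0x1000
    element_size = 4                      # e.g. 32-bit integers
    for index in range(4):                # 0-based: first element at offset 0
        print(index, hex(base_address + index * element_size))
    # 0 0x1000
    # 1 0x1004
    # 2 0x1008
    # 3 0x100c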

"The PostgreSQL? docs even has a snarky comment on this by the way:

    The first century starts at 0001-01-01 00:00:00 AD, although they did not know it at the time. This definition applies to all Gregorian calendar countries. There is no century number 0, you go from -1 century to 1 century. If you disagree with this, please write your complaint to: Pope, Cathedral Saint-Peter of Roma, Vatican." -- [9] ---

---

"The best of intentions really don’t matter. If something can syntactically be entered incorrectly, it eventually will be. And that’s one of the reasons why I’ve gotten very big on the static analysis, I would like to be able to enable even more restrictive subsets of languages and restrict programmers even more because we make mistakes constantly." -- John Carmack, 2012

---

https://nibblestew.blogspot.com/2020/03/its-not-what-programming-languages-do.html by Jussi Pakkanen

Nibble Stew

It's not what programming languages do, it's what they shepherd you to

" Perl shepherds you into using regexps ... Make shepherds you into embedding shell pipelines in Makefiles ... C shepherds you into manipulating data via pointers rather than value objects. C++ shepherds you into providing dependencies as header-only libraries. Java does not shepherd you into using classes and objects, it pretty much mandates them. Turing complete configuration languages shepherd you into writing complex logic with them, even though they are usually not particularly good programming environments. "

---

Chapter : Neural architecture

for now see jasperBrain.txt, todo move most of that here

---

Chapter: Misc Advice

links

misc/todo

http://dreamsongs.com/Files/PatternsOfSoftware.pdf

"...a typical compiler IR is an SSA form, which it then maps into some fixed-number-of-registers encoding and the register rename engine then tries to re-infer an SSA representation. It feels as if there really ought to be a better intermediate between two SSA forms but so far every attempt that I’m aware of has failed." -- David Chisnall (he is talking about the register rename engine in the hardware CPU implementation)

" The rule of thumb for C/C++ code is that you get, on average, one branch per 7 instructions. This was one of the observations that motivated some of the RISC I and RISC II design and I’ve had students verify experimentally that it’s more or less still true today. This means that the most parallelism a compiler can easily extract on common workloads is among those 7 instructions. A lot of those have data dependencies, which makes it even harder. In contrast, the CPU sees predicted and real branch targets and so can try to extract parallelism from more instructions. nVidia’s Project Denver cores are descendants of TransMeta’s? designs and try to get the best of both worlds. They have a simple single-issue Arm decoder that emits one VLIW instruction per Arm instruction (it might do some trivial folding) and some software that watches the in instruction stream and generates optimised traces from hot code paths. Their VLIW design is also interesting because each instruction is offset by one cycle and so you could put a bunch of instructions with data dependencies between them into the same bundle. This makes it something more like a VLIW / EDGE hybrid but the software layer means that you can avoid all of the complicated static hyperblock formation problems that come from EDGE architectures and speculatively generate good structures for common paths that you then roll back if you hit a slow path. " -- David Chisnall

---

ways to improve safety:

" Memory safety...Garbage collection...Concurrency...Static analysis ... no false positives (eg RV-Match, Astree), ... false positives (eg PVS-Check, Coverity)...Test generators. Path-based, combinatorial, fuzzing… many methods " -- nickpsecurity

---

"As a baseline, a non optimizing compiler for a simple language should be able to do 1 million lines of code per second. Of course, most languages are not simple.

Incremental compilation is an essential operation. I probably do it more than 100 times a day. Just like editor responsiveness, it can almost not be fast enough, and if it takes too long it can bring me out of the flow. I would say that over 0.1 seconds any speed improvement is welcome. More than 3 seconds is definitely a nuisance. More than 15 seconds is extremely frustrating when dealing with certain kinds of code. " [10]

---

https://graydon2.dreamwidth.org/253769.html

---

Proebsting's Law: Compiler Advances Double Computing Power Every 18 Years

https://proebsting.cs.arizona.edu/law.html

"

Proebsting's Law: Compiler Advances Double Computing Power Every 18 Years

I claim the following simple experiment supports this depressing claim. Run your favorite set of benchmarks with your favorite state-of-the-art optimizing compiler. Run the benchmarks both with and without optimizations enabled. The ratio of those numbers represents the entirety of the contribution of compiler optimizations to speeding up those benchmarks. Let's assume that this ratio is about 4X for typical real-world applications, and let's further assume that compiler optimization work has been going on for about 36 years. These assumptions lead to the conclusion that compiler optimization advances double computing power every 18 years. QED.

This means that while hardware computing horsepower increases at roughly 60%/year, compiler optimizations contribute only 4%. Basically, compiler optimization work makes only marginal contributions.

Perhaps this means Programming Language Research should be concentrating on something other than optimizations. Perhaps programmer productivity is a more fruitful arena.

" -- [11]

---

https://www.haskellforall.com/2021/04/the-end-of-history-for-programming.html

---

https://buttondown.email/hillelwayne/archive/i-am-disappointed-by-dynamic-typing/

---

https://cs.wellesley.edu/~cs251/f17/assignments/antics/StefixHanenbergLanguageWars.pdf

---

discussion on the pros and cons of 'little languages':

https://lobste.rs/s/tsh7jd/little_languages_are_future_programming

---