proj-oot-ootMetaprogrammingNotes3



---

" Open implementations

Many older and more "dynamic" high-level languages (Lisps, Smalltalks, Forths) were designed around a kind of uniform programs-as-data model, and the presence / presupposition that the compiler would always be in-process with your program: programs thus more commonly invoked (and extended) the implementation significantly at runtime, did a lot of dynamic metaprogramming, reflection, and so forth. This was maybe a kinder, gentler time when "arbitrary code execution" wasn't quite so synonymous with "security nightmare"; but it also had a sort of internal logic, represents a design aesthetic that puts pressure on the language machinery itself to be programmable: pressure to keep the language, its syntax, its type system, its compilation model and so forth all simple, uniform, programmable.

We've since been through a few different eras of language sensibilities around this sort of thing, including some imaginary mobile-code stories like Telescript, Obliq, and eventually JVM/CLR. These latter were weird since they tried to be mobile (which rarely worked), and tried to have semi-open implementations (at least compilers-as-libraries and some access to the bytecode loaders) but didn't quite make it to the point where it was easy or obvious to do source-level metaprogramming (with the notable exceptions of F# quotations and F# type providers). But through all this, in the background there's been this somewhat unfortunate, competing "grown-up" model that tends to dominate mainstream languages (everything from FORTRAN to C++): a pretty complex grammar and AST, a pretty hairy compilation model, a very heavy batch-compiler that's definitely not part of the normal process runtime, and programs that seldom do any metaprogramming, even in cases where it'd be appropriate. Recent "compiled" languages have adopted this style, I suspect in part because LLVM is simply shaped that way, and I suspect also in part as a response to negative experiences with both JVM/CLR environments and overzealous use of metaprogramming in scripting languages.

I don't think, however, that the baby ought to be thrown out with the bathwater. I don't think a few bad open implementations invalidates the idea, any more than a few bad static type systems invalidates that idea. They can be done well. Julia for example has quite a nice static type system and compiler, but also a uniform syntax that's friendly to dynamic metaprogramming and JIT'ing. There are also several static metaprogramming and staged-programming systems: MetaOcaml, Template Haskell, ScalaMeta and so forth. So .. there's a spectrum, a design space.

I'm not sure exactly where to go with this topic, except to say I'm a bit dissatisfied with how hard it is to do tooling for current languages, how large the feedback cycle is between a language and its own (meta)programming tools, how distant the tools are from the users, and perhaps to point out that dynamic compilation is not entirely dead: we appear to be entering an era with a new high-integrity universal bytecode sandbox, designed for mobile code and dynamic JIT'ing, and with a lot of industrial support. It might be an interesting time to consider projects (even "static" ones) that take a slightly more nuanced view of the code/data relationship, the program/metaprogram/compiler relationship, and make the whole compilation model a little more .. pliant (yes that was a Pliant reference and if you remember what that was, congratulations you've been on the internet too long, here's your TUNES badge of merit). " "What next?" by Graydon Hoare

discussion at: http://lambda-the-ultimate.org/node/5466 https://news.ycombinator.com/item?id=15051645

---

bjoli 6 days ago [-]

One thing that makes racket shine is its macro facilities. Syntax case is nice and all that, but Jesus Christ in a chicken basket I wish scheme would have standardised on syntax-parse.

Syntax case vs. syntax parse isn't and will never be close to a fair comparison. Not only is it more powerful, it also provides the users of your macros with proper error messages. It blows both unhygienic and other hygienic macro systems out of the water for anything more complex than very basic macros.

reply

agumonkey 6 days ago [-]

Here's the doc for the curious http://docs.racket-lang.org/syntax/stxparse-intro.html

Interesting system indeed

reply

rkallos 6 days ago [-]

100% agreed. After using syntax-parse, it pains me to use anything else. It's a gem.

reply

---

" Compile-time AST Macros

Some variant of Lisp (or Scheme?) was probably one of the first implemented FP languages; and Lisps tend to have compile-time AST macros that allow you to transform sections of the program at compile-time.

But compile-time code-transformations are not unique to Lisp; apart from other FP languages that have them, like Template Haskell or Scala Macros, many languages have some sort of compile-time code transformation. From Java Annotation Processors, to my own MacroPy project in Python, it turns out that compile-time ast macros are just as feasible in imperative languages, doing imperative programming. "
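
a rough sketch of what a compile-time-style AST transform looks like in Python, using only the stdlib ast module (not MacroPy); the FoldPowersOfTwo transformer and the SIZE example are made up for illustration:

  import ast

  # Rewrite 2 ** <constant> into a precomputed constant before compiling.
  class FoldPowersOfTwo(ast.NodeTransformer):
      def visit_BinOp(self, node):
          self.generic_visit(node)  # transform children first
          if (isinstance(node.op, ast.Pow)
                  and isinstance(node.left, ast.Constant) and node.left.value == 2
                  and isinstance(node.right, ast.Constant)):
              return ast.copy_location(ast.Constant(2 ** node.right.value), node)
          return node

  tree = ast.parse("SIZE = 2 ** 20")
  tree = ast.fix_missing_locations(FoldPowersOfTwo().visit(tree))
  ns = {}
  exec(compile(tree, "<macro>", "exec"), ns)
  assert ns["SIZE"] == 1048576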

---

" Compile-time programming

Zig has pretty strong compile-time programming support. For example, its printf formatting capability is all written in userland code [1]. It doesn't at this moment support code-generation like D's mixins but I personally have not found this too problematic.

Generic functions can be written in a duck-typing fashion. With compile-time assertions the inputs can be limited to what they need pretty clearly and the errors during usage are pretty self-explanatory.

  error Overflow;
  pub fn absInt(x: var) -> %@typeOf(x) {
      const T = @typeOf(x);
      comptime assert(@typeId(T) == builtin.TypeId.Int); // must pass an integer to absInt
      comptime assert(T.is_signed); // must pass a signed integer to absInt
      if (x == @minValue(@typeOf(x))) {
          return error.Overflow;
      } else {
          @setDebugSafety(this, false);
          return if (x < 0) -x else x;
      }
  }

Zig doesn't have any form of macros. Everything is done in the language itself.

[1]: http://ziglang.org/documentation/#case-study-printf

"

---

should mb support extensions like this:

[1]

hyperion2010 111 days ago [-]

It looks like this (e.g. `#2dcond`) implements a way to directly embed other languages in a racket file [0] and avoids the problems encountered when trying to do it using the `#reader` syntax [1] in a source file. Essentially letting you have multiple readtables in a file (probably not nestable though). I could be wrong about this (need to look more carefully when I have more time), but nonetheless could allow direct embedding of completely alternate syntax with the right setup.

[0] https://github.com/racket/2d/blob/master/2d-lib/private/read... [1] https://docs.racket-lang.org/guide/hash-reader.html

gcr 111 days ago [-]

This seems similar to the way the at-exp language is implemented.

at-exp adds support for S-expressions based on braces, and is the foundation of the Scribble markup language.

---

this shows how Forth can be metaprogrammed to support a little FSM (finite state machine) DSL:

http://galileo.phys.virginia.edu/classes/551.jvn.fall01/fsm.html

---

int_19h 130 days ago [-]

R has some really crazy metaprogramming facilities. This might sound strange coming from Python, which is already very dynamic - but R adds arbitrary infix operators, code-as-data, and environments (as in, collections of bindings, as used by variables and closures) as first class objects.

On top of that, in R, argument passing in function calls is call-by-name-and-lazy-value - meaning that for every argument, the function can either just treat it as a simple value (same semantics as normal pass-by-value, except evaluation is deferred until the first use), or it can obtain the entire expression used at the point of the call, and try to creatively interpret it.

This all makes it possible to do really impressive things with syntax that are implemented as pure libraries, with no changes to the main language.
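
a small Python sketch of the call-by-name-and-lazy-value idea; Python can't capture the caller's expression automatically, so this hypothetical Promise helper is handed both a thunk and the expression text explicitly, just to illustrate the two ways a callee can use an argument:

  class Promise:
      """Hypothetical stand-in for an R promise: a deferred value plus its source text."""
      def __init__(self, thunk, source):
          self._thunk = thunk     # deferred computation
          self.source = source    # expression as written at the call site
          self._forced = False
          self._value = None

      def force(self):
          if not self._forced:    # evaluate on first use, then cache
              self._value = self._thunk()
              self._forced = True
          return self._value

  def plot_like(arg):
      print("value:", arg.force())   # treat it as a plain (lazily evaluated) value...
      print("label:", arg.source)    # ...or interpret the expression itself, e.g. as an axis label

  x = 10
  plot_like(Promise(lambda: x * 2, "x * 2"))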

--

" > I won’t pretend to remember Lisp inventor John McCarthy?'s exact words which is odd because there were only about ten but he simply asked if Python could gracefully manipulate Python code as data.

> ‘No, John, it can’t,’ said Peter and nothing more, graciously assenting to the professor’s critique, and McCarthy? said no more though Peter waited a moment to see if he would and in the silence a thousand words were said. "

---

(regarding the previous section)

kennytilton 8 hours ago [-]

Was McCarthy even thinking of macros when he locked onto code as data? Or was he thinking code generation in pursuit of AI?

dangerbird2 7 hours ago [-]

He was not, macro libraries like defmacro and define-syntax were later additions to lisp. Well before formal macro systems were implemented, code-as-data was particularly useful for implementing the eval function in Lisp (the Lisp 1.5 implementation takes up less than a page), which greatly simplified the process of implementing compilers and interpreters.

reply

---

dwarfylenain 16 hours ago [-]

And what about hylang.org ? I'm surprised it's not even mentioned here : nice lispy syntax with all the python packages = best of both worlds :)

reply

cat199 13 hours ago [-]

+1 - if you ever want macros or syntactic level manipulation in the python world, or just feel like doing something with s-expressions, hy is a great thing.

reply

---

[2]

" > ... if you started from scratch, how would you design a language differently so that it doesn't run into these issues?

I don't think there would be a need to start from scratch. Python 3 is a pretty good language! Also, it's not clear that it needs to be fast, it's just fun to think about how it might be fast.

That said, I think there is only one key feature missing to enable aggressive optimizations: An annotation on functions/methods that says that the function may be optimized even at the cost of some loss in reflective capabilities. In current Python any function's code can be inspected and changed at run time, as can its local variables by callees who can walk up the call stack. This means that every optimization that compiles Python code to machine code or even just changes the interpreter's handling of local variables can, in general, change the program's semantics.

I think it would suffice to add a @static decorator to disable these reflective features for individual functions. (And disabling dynamic lookup of globals would be good too.) The interpreter could then recognize that you want those optimized and could do whatever magic it likes without caring about boring corner cases like invalidating its optimizations if someone tries to patch the function's code at runtime.

This would not be a big thing to do, and there would be no real need to restart from scratch; Python 3 is a pretty good programming language!

Everything else would pretty much fall out automatically from that using known/already implemented techniques, especially with the type hints that would allow you to gradually move a fully dynamic, interpreted application to a statically typed, compiled one function by function. Such a Python would still not win the speed crown, but it could beat the current one by miles on a lot of applications. "
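
a sketch of what the proposed @static marker might look like (hypothetical, no such decorator exists in CPython today); the decorator just sets a flag that an optimizing interpreter or compiler could honor by dropping support for frame inspection and runtime code patching on that function:

  def static(fn):
      fn.__static__ = True   # hypothetical flag for the optimizer to check
      return fn

  @static
  def dot(xs, ys):
      total = 0.0
      for x, y in zip(xs, ys):
          total += x * y
      return total

  # an optimizer pass could then select only the marked functions:
  optimizable = [f for f in (dot,) if getattr(f, "__static__", False)]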

---

"Various features of Python like dynamic typing and object/class mutation (via del) preclude many static analysis techniques" [3]

" Cannoli supports two major optimizations that come as a result of applying restrictions to the language. Restrictions are placed on the Python features that provide the ability to delete or inject scope elements and the ability to mutate the structure of objects and classes at run time. The corresponding feature branches are scope-opts and class-opts. The optimizations are built on top of each other, therefore the class-opts branch is a superset of the scope-opts branch. In general, the class-opts branch yields a performance increase of over 50% from the master branch. " [4]

" Chapter 5 OPTIMIZATIONS The initial implementation of the Cannoli compiler supports an intentionally re- stricted subset of Python. In particular, the ability to augment an object’s structure (attributes) post-creation has been eliminated. These features were eliminated in an attempt to quantify the cost to support such features in a language. This cost is an artifact of the performance improvements that cannot be implemented when these features are supported. We chose features of the language that we hypothesize are used infrequently, that could be trivially rewritten in a more static manner, or that are clearly detrimental to static analysis. Previous work attempts to quantify the use of dynamic features supported by Python [10]. They concluded that a substantial portion of Python programs use dynamic features during start up, with a consider- able drop in use after. Another study reports that dynamic features are used more uniformly throughout the lifetime of Python applications [1]. Although, their exper- iments showed that object structure changing at run time was used less frequently than other dynamic features. The exception was adding an attribute to an object, which may be misleading since the structure of an object is typically determined by attribute assignments in the __init__ method which may be considered uses of this feature. Our goal is to restrict features of Python in order to show a considerable increase in performance. By doing so, we provide empirical data to language designers who may be considering a variety of features for a given language.

Restricting Dynamic Code

The first optimization concerns dynamic code. This is code that is constructed at run time from other source code. The Python functions exec and eval, along with the del keyword, provide this functionality.

...

This optimization eliminates the ability to support exec, eval, and del and also restricts the use of variadic functions. Python supports both a variadic positional parameter as well as a variadic keyword parameter. It is seemingly impossible to statically determine the contents of these variadic parameters so this functionality falls back to hash tables representing the current scope. Finally, all scope elements must be inferable at compile time, precluding the functionality supported by exec and del, in order to observe performance increases from this optimization.

5.2 Restricting Dynamic Objects Dynamic objects are those whose structure can be modified at run time. This includes, but is not limited to, adding attributes and methods, deleting attributes and methods, and modifying the inheritance of a class. Since Cannoli does not currently support inheritance we focus on the addition and deletion of attributes and methods.

...

The compiler determines the attributes of an object by scanning the definition for static variables and methods. If an __init__ function is provided, the compiler will locate all attribute assignments to the parameter representing the reference to the newly instantiated object. Scanning the __init__ function is not perfect. The __init__ function could be replaced during run time or the self value could be reassigned within the __init__ function. Both of these cases are restricted but are not currently enforced. Ideally there would be a more explicit way to define non-static class attributes but doing so would require additions to the language. "
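
(for reference, my own plain-Python illustration of the dynamic features being restricted above: attribute injection/deletion, class structure mutation, and exec/eval/del; not code from the thesis)

  class Point:
      def __init__(self, x, y):
          self.x = x            # attributes discovered by scanning __init__
          self.y = y

  p = Point(1, 2)
  p.z = 3                       # add an attribute post-creation
  del p.x                       # delete an attribute
  del Point.__init__            # mutate the class structure itself

  exec("w = 4")                 # inject a new name into the current scope
  print(eval("p.y + w"))        # run code constructed at run time -> 6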

---

a thought i had:

meta-programming in a programming language may not be as useful as you might hope, due to the 'curse of lisp' effect (in short, using metaprogramming makes your code harder for others to understand, because they have to learn your DSL/variant programming language first before they can read your code). However, merely learning, practicing, and using it makes the programmer who did so smarter; so having strong metaprogramming in your language has the benefit of giving the language a smarter community of programmers who use it, because it enlightens them a little (this effect of making programmers who use the language smarter is distinct from the related effect of attracting smarter programmers in the first place).

---

"

Examples speak louder than words, so let’s imagine we want to make a meta-html domain-specific language that looks like this:

(html (body (h1 "Hello World") (p "How's it going?")))

You’d need a bunch of functions that look like this:

(defn html [& inner] (str "<html>" (apply str (map eval inner)) "</html>"))

(defn body ...)

But, nobody likes to see code repeated like that. But we can’t use a function to define a function. So, why not automate it with a macro?

(defmacro deftag [name]
  (list 'defn name ['& 'inner]
    (list 'str "<" (str name) ">" '(apply str inner) "</" (str name) ">")))

(deftag html) (deftag body) (deftag h1) (deftag p)

See what I did there? The body of the macro actually returns lists of lists. Some of those lists are quoted ('), which causes them to be rendered literally instead of being evaluated. However, the calls to (str name) are left unquoted.

We can use macroexpand to examine the result:

(macroexpand-1 '(deftag blink))

returns
(clojure.core/defn blink [& inner] (clojure.core/str "<" "blink" ">" (clojure.core/apply clojure.core/str inner) "</" "blink" ">"))

Exactly what we expect, with some extra namespacing added. " [5]

---


http://willcrichton.net/notes/the-coming-age-of-the-polyglot-programmer/

---

" With Rust, I have been delighted by its support for hygienic macros. This not only solves the many safety problems with preprocessor-based macros, it allows them to be outrageously powerful: with access to the AST, macros are afforded an almost limitless expansion of the syntax — but invoked with an indicator (a trailing bang) that makes it clear to the programmer when they are using a macro. For example, one of the fully worked examples in Programming Rust is a json! macro that allows for JSON to be easy declared in Rust. This gets to the ergonomics of Rust, and there are many macros (e.g., format!, vec!, etc.) that make Rust more pleasant to use.

Another advantage of macros: they are so flexible and powerful that they allow for effective experimentation. For example, the propagation operator that I love so much actually started life as a try! macro; that this macro was being used ubiquitously (and successfully) allowed a language-based solution to be considered. Languages can be (and have been!) ruined by too much experimentation happening in the language rather than in how it’s used; through its rich macros, it seems that Rust can enable the core of the language to remain smaller — and to make sure that when it expands, it is for the right reasons and in the right way. " [7]

---

" Much of Clojure’s core functionality is written as macros that get expanded into final forms at compile-time, before evaluation. For example, another thing that delighted me was seeing that language constructs as fundamental as and/or are defined in clojure.core. For example, here’s a simplified version of the and macro:

(defmacro and [x & next] `(let [and# ~x] (if and# (and ~@next) and#)))

I’d never worked with a language in which you could peer into such fundamental constructs, and it opened my mind to what else was possible. Indeed you can read through clojure.core top-to-bottom and see much of the language bootstrap itself into existence.

F# has metaprogramming facilities but they’re comparatively clunky (and fragmented depending on what kind of syntax tree you want). Accomplishing similar metaprogramming in F# is much more involved.

I don’t often write new macros, and it’s not advised to write macros where functions work just as well, but they’re invaluable in certain contexts.

A few times I’ve been able to do trivial refactorings of code programmatically in the REPL. Of course .NET has Roslyn but it’s a totally separate facility; in Clojure this ability is inherited more or less by virtue of being a Lisp. " [8]

---

" Procedural macros on stable Rust

Macros in Rust have been around since before Rust 1.0. But with Rust 2018, we’ve made some big improvements, like introducing procedural macros.

With procedural macros, it’s kind of like you can add your own syntax to Rust.

Rust 2018 brings two kinds of procedural macros: Function-like macros

Function-like macros allow you to have things that look like regular function calls, but that are actually run during compilation. They take in some code and spit out different code, which the compiler then inserts into the binary.

They’ve been around for a while, but what you could do with them was limited. Your macro could only take the input code and run a match statement on it. It didn’t have access to look at all of the tokens in that input code.

But with procedural macros, you get the same input that a parser gets — a token stream. This means you can create much more powerful function-like macros. Attribute-like macros

If you’re familiar with decorators in languages like JavaScript, attribute macros are pretty similar. They allow you to annotate bits of code in Rust that should be preprocessed and turned into something else.

The derive macro does exactly this kind of thing. When you put derive above a struct, the compiler will take that struct in (after it has been parsed as a list of tokens) and fiddle with it. Specifically, it will add a basic implementation of functions from a trait. "

---

" As soon as you have something as apparently simple as named procedures, you're really writing a DSL, albeit very coarsely, for your business problem. Add named record types. Add textual macros. Add operator overloading. Add Lisp-style macros. At every point where the language allows a word on the screen to stand in for something larger, you're giving the programmer the power to design a language for their problem domain. "

---

" nabla9 1 day ago [-]

Guy's point was that the language should be designed to be grown by users, not by language designers.

> We need to put tools for language growth in the hands of the users.

Guy's previous work with Common Lisp is the epitome of this. There are three kinds of macros in the language: reader macros, macros and compiler macros. They give the tools for the user to extend the language. There are just 25 or so core primitives in the 'core' language and the runtime. The rest 900+ functions and symbols are basically the standard library (the fact that they are slapped into the same package and they extend the core in ways that other languages can't hides the simplicity of the language somewhat). "

---

ramchip 1 day ago [-]

Whether it’s mainstream is debatable, but I think Elixir is a really successful example of extensible language, and indeed it managed to do this without s-expressions. Various libraries extend the language for HTTP routing, parsing, DB queries, property testing, static analysis, etc. It makes possible a lot of experimentation by the wider community, and not just the group of core language developers.

The creator has repeated this philosophy a few times:

> There is very little reason for an Elixir v2.0 with breaking changes. The language was designed to be extensible and if we need to do major changes to the language to improve the language itself, then we failed at the foundation.

https://elixirforum.com/t/what-would-you-like-to-see-in-elix...

> A big language does not only fragment the community and makes it harder to learn but it is also harder to maintain. It is also why the language was designed to be extensible: so the community could build what is necessary without a push to make everything part of the language.

https://elixirforum.com/t/questions-about-property-testing-s...

> We also understand there is a limited amount of features we can add to the language before making the journey too long or the language too big. Adding something now means not including something else later. As an exercise, let’s see a counter example of when we didn’t add something to the language: GenStage. [...]

https://elixirforum.com/t/questions-about-property-testing-s...

reply

---

" Why is Racket best suited for the task of making languages? ... Racket is ideal for LOP because of its macro system. Macros are indispensable for making languages because they make compiler-style code transformations easy. Racket’s macro system is better than any other.

Why macros?

Because Racket DSLs compile to Racket, a language-oriented programmer using Racket needs to write some syntax transformers that convert the DSL notation into native Racket. These syntax transformers are known as macros. Indeed, macros can be characterized as extensions to the Racket compiler.

https://beautifulracket.com/appendix/glossary.html#macro

Racket’s macro system is vast, elegant, and undeniably its crown jewel. But a lot of this book is about the joy of Racket macros. That material is ready when you are. For now, the two standout features:

    Racket has a specialized data structure called a syntax object that is the medium of exchange among macros. Unlike a string, which can only contain raw code, a Racket syntax object packages the code in a way that preserves its hierarchical structure, plus metadata like lexical context and source locations, and arbitrary fields called syntax properties. This metadata stays attached to the code during its various transformations. (See syntax objects for the details.)

https://beautifulracket.com/appendix/glossary.html#syntax-object https://beautifulracket.com/appendix/glossary.html#lexical-context https://beautifulracket.com/appendix/glossary.html#source-location https://beautifulracket.com/appendix/glossary.html#syntax-property https://beautifulracket.com/explainer/syntax-objects.html

    Racket macros are hygienic, which means that by default, the code produced by a macro retains the lexical context from where the macro was defined. In practice, this eliminates a huge amount of the housekeeping that would ordinarily be required to make DSLs work. (See hygiene for the details.)

https://beautifulracket.com/appendix/glossary.html#hygienic https://beautifulracket.com/explainer/hygiene.html

...

Further reading

    Beautiful Racket by me, Matthew Butterick. The online book that you’re currently visiting. But you’re in the appendix. The main part of the book is a progressive series of LOP tutorials, all relying on Racket and its fantastic macro system.

https://beautifulracket.com/

    Racket School 2019. This summer, Racket School offers two LOP classes: the three-day Beautiful Racket Workshop (taught by me) and the five-day How to Design Languages (taught by the Racket architects).

https://school.racket-lang.org/ https://school.racket-lang.org/#brw https://school.racket-lang.org/#htdl

    Language-Oriented Programming in Racket: A Cultural Anthropology by Jesse Alama. An imposing title, but inside is a friendly and readable set of interviews that Jesse has conducted with numerous Racketeers (me included), with contrasting perspectives on how LOP fits into our work.

https://gumroad.com/l/lop-in-racket-cultural-anthro

    Creating Languages in Racket by Matthew Flatt. Matthew is the lead architect of Racket (and wrote the foreword for this book). This brief article nicely demonstrates the increasing levels of sophistication in DSL design, using a clever game-creation DSL.

http://queue.acm.org/detail.cfm?id=2068896 https://beautifulracket.com/foreword.html

    More examples of Racket-implemented DSLs, and even more.
    https://beautifulracket.com/appendix/domain-specific-languages.html#racket-implemented-dsls http://docs.racket-lang.org/search/index.html?q=H%3A

"

-- [9]

---

" Personally most of the DSLs I write in Clojure are "private" (i.e. I write them for myself to help me develop a library or an app) and thus they tend to be small. This is why I favor functions over data: it allows me to reuse common function combinators (e.g (fn or-fn [f g] (fn [& args] (or (apply f args) (apply g args))))) so that I do not have to reimplement them which is something you have to do if you approach the problem with a data-first mindset. Instead, if I really want to prettify such function-based DSLs by bringing forth the use of data to express the problem at hand, I write a DSL-as-data parser and implement it as one or multiple DSL-fns.

More @ https://gist.github.com/TristeFigure/20dd01b0d3415f34075cfc02a1918106 "
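
rough Python analogue of the combinator style being described (or-fn etc. composing predicate functions instead of representing the DSL as data); the names here are just for illustration:

  def or_fn(f, g):
      return lambda *args: f(*args) or g(*args)

  def and_fn(f, g):
      return lambda *args: f(*args) and g(*args)

  is_small_or_even = or_fn(lambda n: n < 10, lambda n: n % 2 == 0)
  assert is_small_or_even(3) and is_small_or_even(42)
  assert not is_small_or_even(13)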

---

this looks pretty awesome:

https://news.ycombinator.com/item?id=16014573

---

some types of metaprogramming:

" ($define! $f ($vau (x y z) e ($if (>=? (eval x e) 0) (eval y e) (eval z e)))) " -- [10]

---

like Lua, have 'load/compile' rather than 'eval'
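
e.g. in Python terms, the Lua-style approach is to compile a chunk to a code object first and only then run it against an explicitly chosen environment, instead of letting eval implicitly see the caller's scope:

  code = compile("x + 1", "<dsl>", "eval")          # parse/compile once, no evaluation yet
  env = {"x": 41}
  print(eval(code, {"__builtins__": {}}, env))      # run later against an explicit environment -> 42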

---


some kinds of tricky dynamicity:

" Dynamic constant references

Sorbet does not support resolving constants through expressions. For example, foo.bar::A is not supported—all constants must be resolvable without knowing what type an expression has. In most cases, it’s possible to rewrite this code to use method calls (like, foo.bar.get_A).

Dynamic includes

Sorbet cannot statically analyze a codebase that dynamically includes code. For example, code like this is impossible to statically analyze.

Dynamic includes must be rewritten so that all includes are constant literals.

Missing constants

For Sorbet to be effective, it has to know about every class, module, and constant that’s defined in a codebase, including all gems. Constants are central to understanding the inheritance hierarchy and thus knowing which types can be used in place of which other types. "

---

simplify 6 days ago [-]

A fascinating part about Sorbet is it did not have to introduce any additional syntax to Ruby (unlike TypeScript, for example). This really speaks to the expressiveness of Ruby. Very cool.

reply

SomeOldThrow 6 days ago [-]

The type signatures are pretty noisy to read, though, some syntax can definitely help. Maybe with Ruby 3?

reply

-- [12]

---

more notes on the dangers of C-style macros:

(thanks to [13])

---

https://en.wikipedia.org/wiki/Camlp4

---

" Another purpose for which particularly devious programmers can use C++ templates is "template metaprogramming," which means writing pieces of code that run while the main program gets compiled rather than when it runs. Here's an example of a program that computes the factorials of 4, 5, 6, and 7 (which are 24, 120, 720, and 5040) at compile-time:

  #include <stdio.h>

  template <int n>
  class Fact {
  public:
      static const int val = Fact<n-1>::val * n;
  };

  template <>
  class Fact<0> {
  public:
      static const int val = 1;
  };

  int main() {
      printf("fact 4 = %d\n", Fact<4>::val);
      printf("fact 5 = %d\n", Fact<5>::val);
      printf("fact 6 = %d\n", Fact<6>::val);
      printf("fact 7 = %d\n", Fact<7>::val);

      return 0;
  }

If you look at the assembly code g++ or any other reasonable compiler produces for this code, you'll see that the compiler has inserted 24, 120, 720, and 5040 as immediate values in the arguments to printf, so there's absolutely no runtime cost to the computation. (I really encourage you to do this if you never have before: save the code as template.cc and compile with g++ -S template.cc. Now template.s is assembly code you can look over.) As the example suggests, it turns out that you can get the compiler to solve any problem a Turing machine can solve by means of template metaprogramming.

This technique might sound like some strange abuse of C++ that's primarily useful for code obfuscation, but it turns out to have some practical applications. For one thing, you can improve the speed of your programs by doing extra work in the compile phases, as the example shows. In addition to that, it turns out that you can actually use the same technique to provide convenient syntax for complicated operations while allowing them to achieve high performance (matrix-manipulation libraries, for instance, can be written using templates). If you're clever, you can even get effects like changing the order in which C++ evaluates expressions for particular chunks of code to produce closures or lazy evaluation. " -- [14]

yuck! let's not support that. It seems like the thing that makes it complicated in this example is permitting <n-1> in a template -- i bet if only isolated variable accesses or literals were allowed in templates, you wouldn't get this complexity.

closures or lazy evaluation via template metaprogramming does sound interesting, though. I should look that up sometime.

---

i was talking about allowing all-caps to escape macro hygiene, should we do that? or should they be keywords?

---

in PEPSI/COLA,

" The only intrinsic runtime operation (in the sense that it is inaccessible to user-level programs) is the 'memoized' dynamic binding (of selectors to method implementations) that takes place entirely within the method cache. Every other runtime operation (prototype creation, cloning objects, method dictionary creation, message lookup, etc.) is achieved by sending messages to objects, is expressed in entirely in idst, and is therefore accessible, exposed and available for arbitrary modification by any user-level program. " -- [15]

---

in Pepsi external blocks, note that a PEPSI variable is accessible from C in external blocks, see 5. pragmatics :: a few things to note in http://piumarta.com/software/cola/pepsi.html

---

" Generating source code

... In C you can use the preprocessor and define your data structure in a macro or a header that you include multiple times with different #defines. In Go there are scripts like genny that make this code generation process easy. ... D string mixins

Some languages that implement generics in some other way also include a clean way of doing code generation to address more general metaprogramming use cases not covered by their generics system, like serialization. The clearest example of this is D’s string mixins which enable generating D code as strings using the full power of D during the middle of a compile.

Rust procedural macros

A similar example but with a representation only one step into the compiler is Rust’s procedural macros, which take token streams as input and output token streams, while providing utilities to convert token streams to and from strings. The advantage of this approach is that token streams can preserve source code location information. A macro can directly paste code the user wrote from input to output as tokens, then if the user’s code causes a compiler error in the macro output the error message the compiler prints will correctly point to the file, line and columns of the broken part of the user’s code, but if the macro generates broken code the error message will point to the macro invocation. For example if you use a macro that wraps a function in logging calls and make a mistake in the implementation of the wrapped function, the compiler error will point directly to the mistake in your file, rather than saying the error occurred in code generated by the macro.

Syntax tree macros

Some languages do take the step further and offer facilities for consuming and producing Abstract Syntax Tree (AST) types in macros written in the language. Examples of this include Template Haskell, Nim macros, OCaml PPX and nearly all Lisps.

One problem with AST macros is that you don’t want to require users to learn a bunch of functions for constructing AST types as well as the base languages. The Lisp family of languages address this by making the syntax and the AST structure very simple with a very direct correspondence, but constructing the structures can still be tedious. Thus, all the languages I mention have some form of “quote” primitive where you provide a fragment of code in the language and it returns the syntax tree. These quote primitives also have a way to splice syntax tree values in like string interpolation. Here’s an example in Template Haskell:

  -- using AST construction functions
  genFn :: Name -> Q Exp
  genFn f = do
    x <- newName "x"
    lamE [varP x] (appE (varE f) (varE x))

  -- using quotation with $() for splicing
  genFn' :: Name -> Q Exp
  genFn' f = [| \x -> $(varE f) x |]

One disadvantage of doing procedural macros at the syntax tree level instead of token level is that syntax tree types often change with the addition of new language features, while token types can remain compatible.

...

Templates

The next type of generics is just pushing the code generation a little further in the compiler. " -- [16]

---

things to annotate:

variables/values at a point
points in code statically
points in code dynamically?
regions of code lexically
regions of code dynamically

---

"Zig does this using the same language at both compile time and runtime, with functions split up based on parameters marked comptime. There’s another language that uses a separate but similar language at the meta level called Terra. Terra is a dialect of Lua that allows you to construct lower level C-like functions inline and then manipulate them at the meta level using Lua APIs as well as quoting and splicing primitives:

... Terra’s crazy level of metaprogramming power allows it to do things like implement optimizing compilers for domain specific languages as simple functions, or implement the interface and object systems of Java and Go in a library with a small amount of code. Then it can save out generated runtime-level code as dependency-free object files. "

http://terralang.org/

http://terralang.org/#compiling-a-language

https://github.com/zdevito/terra/blob/master/tests/lib/javalike.t

https://github.com/zdevito/terra/blob/master/tests/lib/golike.t

---

dom96 on June 7, 2018 [-]

> Nim macros are not quite at the Lisp level, but they are extremely powerful.

I'm not fully familiar with Lisp macros so I'm curious, what is Nim missing that Lisp has in terms of metaprogramming?

beagle3 on June 8, 2018 [-]

Two things in common use:

Lisp has reader macros that can alter lexical analysis and parsing; correct me if I am wrong, but I think that’s not possible in Nim. E.g. things like JSX are trivial to implement in Lisp.

Also, lisp macros let you e.g. write new control structures with multiple “body” parts - iirc, in nim only the last untyped macro arg can be a code body (you can put a block in parentheses, but that’s not as elegant)

I’m sure there’s other stuff that fexprs and other [a-z]exprs can do that nim can’t, but I’ve never seen them in use (or used them myself)

Also, personally I think Nim’s model is more practical; lisp’s power essentially requires lispy or REBOLy syntax to be usable. Nim is pascalythonesque, and though complex is not complicated; much like Python, and unlike C++, you can make use of the ecosystem without being afraid of obscure details biting you - but it has all the capabilities when you need them.

---

https://odin-lang.org/docs/overview/#using-statement

"using can used to bring entities declared in a scope/namespace into the current scope. This can be applied to import declarations, import names, struct fields, procedure fields, and struct values."

---

fmap 3 months ago [-]

C++ has bad error messages because of language design. Contemporary C++ compilers are very good at reporting clear error messages about common mistakes, but template heavy code still yields arcane error messages. Templates are untyped, so there is no way to give sensible error messages when defining or instantiating a template. Instead you have to typecheck after template expansion, at which point you are left with an error message about compiler generated code.

There are some proposals which address this (e.g., concepts), but none of them are part of the language standard yet. Concepts in particular made it into the C++20 draft, but they also made it into a draft of the C++17 standard and were ultimately rejected. Somewhat vexingly C++ concepts actually come with the same problems only at the level of concepts instead of at the level of templates.

---

crazypython 2 hours ago [–]

I don't see why people won't just take the step D and Lisp do-- allowing full use of the programming language at compile time.

You can execute an ordinary function at compile-time to read a DSL from a string or read attributes (reflective metaprogramming) on your program's classes. Take the string it outputs, use mixin(), and you have code. For example:

    // Sort a constant declaration at Compile-Time
    enum a = [ 3, 1, 2, 4, 0 ];
    static immutable b = sort(a);

"a" only appears in the compiler's memory. "sort" is a normal function that runs at compile-time. "allowing accessto the full language at compile-time" is similar to what dynamic languages such as Python and JavaScript? give you, except D is a static language with GCC and LLVM backends.

reply

vertex-four 1 hour ago [–]

Rust is getting that, gradually, with the work of replacing the ad-hoc "const evaluator" with miri (an interpreter for a Rust intermediate representation). Right now you have procedural macros, which are Rust code that operates on the syntax tree, including custom attributes.

Proper reflective metaprogramming would be a fairly big step though - right now, the macro systems happen well before the type system even gets a chance to look at the code, so the data to play with types in an interesting way isn't there at the right step.

reply

octo_t 5 minutes ago [–]

Another issue not mentioned is cross-compiling.

If the result of my `sort` function is dependent on `sizeof(void*)` being 8, when I compile from my x86 hardware for a 32-bit only architecture, assumptions go awry.

Note that this isn't likely with something like sort, but definitely *is* likely with precomputing values/structs etc.

reply

codr7 18 minutes ago [–]

I suspect a big reason is that you need an interpreter in addition to the compiler. From what I understand, the D gods spent quite some time and effort developing theirs.

For interpreted languages, there are no excuses; hooking into the interpreter at compile time is trivial. Full macros [0] are nice, but depends a lot on the syntax. A way to evaluate expressions at compile time [1] would go a long way.

[0] https://github.com/codr7/gfoo#macros [1] https://github.com/codr7/gfoo#bindings

reply

dependenttypes 12 minutes ago [–]

If you do not want to implement a separate interpreter you can do multiple compilation passes instead.

reply

dimtion 32 minutes ago [–]

One issue not yet mentioned with Turing complete language at compile is that it makes tooling and IDE integration much more difficult.

When you need to run an unbounded program each time you want to provide real time feedback, like type inference or in Rust case lifetime inference, you make the language tooling much less simple and accessible.

reply

dependenttypes 27 minutes ago [–]

> One issue not yet mentioned with Turing complete language at compile is that it makes tooling and IDE integration much more difficult

How so?

> When you need to run an unbounded program

How is a program that provably terminates but takes 2 years to finish any better for compile-time computation? You want timeouts in either case.

reply

zokier 1 hour ago [–]

Rust have procedural macros https://doc.rust-lang.org/reference/procedural-macros.html

reply

dilap 2 hours ago [–]

Zig takes this approach. I haven't done more than kick the tires curiously, but it seems very cool.

reply

---

some simple but interesting metaprogramming features here:

http://nineties.github.io/amber/feature.html

the language isn't much documented but it's in here: https://github.com/nineties/amber

also of great interest: https://speakerdeck.com/nineties/creating-a-language-using-only-assembly-language?slide=73

---

IRIS keyword, infix operator syntax defns:

" The if…then…else… handler’s glue definition, including custom operator syntax:

to ‘if’ {test: condition as boolean, then: action as expression, else: alternative_action as expression} returning anything requires { can_error: true use_scopes: #command swift_function: ifTest {condition, action, alternativeAction} operator: {[keyword “if”, expr “condition”, keyword “then”, expr “action”, optional sequence [keyword “else”, expr “alternative_action”]], precedence: 101} }

...

e.g. Consider the expression:

“HELLO, ” & uppercase my_name

Handler definitions:

to uppercase {text as string} returning string requires { }

to ‘&’ {left as string, right as string} returning string requires { can_error: true swift_function: joinValues operator: {form: #infix, precedence: 340} }

" -- https://github.com/hhas/iris-script

---

mb Define-syntax exists but can only operate on entire files. Or maybe also on regions within special reserved delimiters like syntaxname[[image: ?]] or something like that. Maybe the preprocessor stuff can't be altered within a block like that

---

(replying to a comment in Rust about proc macros introducing a Lisp Curse sort of situation:

Ar-Curunir 14 hours ago [–]

Overly powerful proc-macros aren't in common use; most common proc-macros are either ones that automatically Derive a trait, or ones that serve as attributes on methods or functions and perform some transformation of the source code (without introducing a DSL) )

---

section "routes to emacs for newbies" and "doom is not the answer" in https://org-roam.discourse.group/t/the-state-of-org-roam-for-actual-beginners/494 make an important point that overly configurable systems have the problem that it's hard for ppl to help each other b/c each pair of ppl probably has a very different configuration

---

https://adv-r.hadley.nz/metaprogramming.html

https://tidyeval.tidyverse.org/

https://rlang.r-lib.org/

---

"Once you have captured an expression, you can inspect and modify it. Complex expressions behave much like lists. That means you can modify them using [[ and $: f <- expr(f(x = 1, y = 2))

  1. Add a new argument f$z <- 3 f
  2. > f(x = 1, y = 2, z = 3)
  3. Or remove an argument: f[[2?]] <- NULL f
  4. > f(y = 2, z = 3) The first element of the call is the function to be called, which means the first argument is in the second position. You’ll learn the full details in Section 18.3." -- https://adv-r.hadley.nz/metaprogramming.html

---

not sure how important this is, but just saving it here for future reference:

"8. paste! This is a small detail, but one that took me a little while to find. As I described in my blog entry two years ago, I have historically made heavy use of the C preprocessor. One (arcane) example of this is the ## token concatenation operator, which I have needed only rarely — but found essential in those moments. (Here’s a concrete example.) As part of a macro that I was developing, I found that I needed the equivalent for Rust, and was delighted to find David Tolnay’s paste crate. paste! was exactly what I needed — and more testament to both the singular power of Rust’s macro system and David’s knack for build singularly useful things with it!" -- [17]

---

"One important difference from languages such as JavaScript? is that eval() does not have access to the current scope. This is crucial for optimizations as it means that local variables are protected from interference" -- Julia: dynamism and performance reconciled by design

---

"A good question to ask is how many language features you have to throw away to gain a useful feature. In Smalltalk, you send messages to objects. This is the equivalent of calling methods in a language such as C++ or Java. If the object doesn’t have an explicit handler for that message type, then the runtime system delivers this message and its arguments to the object’s doesNotUnderstand: method. This setup allows for a lot of flexibility in programming.

Consider Java’s RMI mechanism. Each class to be exposed through RMI must be run through a preprocessor that generates a proxy class, which passes the name and arguments of each method through to the RMI mechanism. In Smalltalk (or Objective-C, for that matter), you don’t need to do all this. You can just create a single proxy class that implements the doesNotUnderstand: method and passes the message to the remote class. This one class can be used as a proxy for any other class.

If you wanted to implement something comparable in C++, however, you would need to throw away the C++ method-call mechanism and replace it with your own custom message-passing system. Each C++ class would implement a single handleMessage() method, which would then call the "real" methods. By the time you’ve done this, you’ve thrown away a lot of the convenience of using C++ in the first place. "
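
a rough Python analogue of the doesNotUnderstand: trick (Python's __getattr__ plays a similar catch-all role): one generic proxy class that intercepts any method call; here it just logs and delegates to a wrapped object, where a real RMI-style proxy would marshal the name and arguments over the wire instead. My sketch, not from the quoted text:

  class Proxy:
      def __init__(self, target):
          self._target = target

      def __getattr__(self, name):              # catch-all, like doesNotUnderstand:
          def forward(*args, **kwargs):
              print("forwarding", name, args)
              return getattr(self._target, name)(*args, **kwargs)
          return forward

  p = Proxy([3, 1, 2])
  p.append(0)            # prints: forwarding append (0,)
  print(p._target)       # [3, 1, 2, 0]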

---

andyc edited 21 days ago

link

Related wish: I kinda want an application language with Zig-like metaprogramming, not a systems language. In other words, it has GC so it’s a safe language, and no pointers (or pointers are heavily de-emphasized).

Basically something with the abstraction level of Kotlin or OCaml, except OCaml’s metaprogramming is kinda messy and unstable.

(I’m sort of working on this, but it’s not likely to be finished any time soon.)

jamii 21 days ago

link

Julia has similar ideas. There is a bit more built in to the type-system eg multimethods have a fixed notion of type specificity, but experience with julia is what makes me think that zig’s model will work out well. Eg: https://scattered-thoughts.net/writing/zero-copy-deserialization-in-julia/ , https://scattered-thoughts.net/writing/julia-as-a-platform-for-language-development/

5 travv0 22 days ago

link

I find D’s approach to metaprogramming really interesting, might be worth checking out if you’re not familiar with it.

5 Moonchild 21 days ago

link

D’s compile-time function execution is quite similar. Most of the zig examples would work as-is if translated to d. The main difference being that in d, a function cannot return a type; but you can make a function be a type constructor for a voldemort type and produce very similar constructions.

    3
    andyc 21 days ago | link | 

Yeah I have come to appreciate D’s combination of features while writing Oil… and mentioned it here on the blog:

http://www.oilshell.org/blog/2020/07/blog-roadmap.html#how-to-rewrite-oil-in-nim-c-d-or-rust-or-c

Though algebraic data types are a crucial thing for Oil, which was the “application” I’m thinking about for this application language … So I’m not sure D would have been good, but I really like its builtin maps / arrays, with GC. That’s like 60% of what Oil is.

    2
    Moonchild 21 days ago | link | 
        D does have basic support for ADTs (though there’s another better package outside the standard library). Support is not great, compared with a proper ml; but its certainly no worse than the python/c++ that oil currently uses.

3 cmcaine 22 days ago

link

Julia sort of fits, depends on your applications. Metaprogramming is great and used moderately often throughout the language and ecosystem. And the language is fantastically expressive.

2 Sophistifunk 22 days ago

link

I want this too, got anything public like blog posts on your thoughts / direction?

    4
    andyc edited 21 days ago | link | 

Actually yes, against my better judgement I did bring it up a few days ago:

https://old.reddit.com/r/ProgrammingLanguages/comments/jb5i5m/help_i_keep_stealing_features_from_elixir_because/g8urxou/

tl;dr Someone asked for statically typed Python with sum types, and that’s what https://oilshell.org is written in :) The comment contains the short story of how I got there.

The reason I used Python was because extensive metaprogramming made the code 5-7x shorter than bash, and importantly (and surprisingly) it retains enough semantic information to be faster than bash.

So basically I used an application language for a systems level task (writing an interpreter), and it’s turned out well so far. (I still have yet to integrate the GC, but I wrote it and it seems doable.)

So basically the hypothetical “Tea language” is like statically typed Python with sum types and curly braces (which I’ve heard Kotlin described as!), and also with metaprogramming. Metaprogramming requires a compiler and interpreter for the same language, and if you squint we sorta have that already. (e.g. the Zig compiler has a Zig interpreter too, to support metaprogramming)

---

Kit's 'term rewriting' metaprogramming that can match on syntactic structure but also on types:

https://www.kitlang.org/examples.html#term-rewriting

---

" "There are rare occasions I see DSLs used to fulfill these roles, and it's to great benefit."

I agree. iMatix, REBOL/RED, and LISP community (esp Racket) have shown it consistently in terms of productivity and reliability with performance being better sometimes. True even for general-purpose, systems programming: see Ivory language from Galois.

"SMC rather than hand-coding stuff as C++ boilerplate. Same with using a process coordinator like Erlang's OTP (which isn't a DSL but is close enough)."

OTP as a coordinator DSL I haven't thought about. Might look into it to see if there's design wisdom to learn for a future DSL outside of Erlang. " [18]

---

regarding the desirability of restricted/stratified/sub-turing languages:

" norswap on Nov 14, 2015 [–]

Sometimes, yes. But sometimes, you're just going to chafe at the restrictions. And then you start adding to your restricted language and you've created a monster.

Case in point: every build system targeting the JVM. Most build systems in general. "

---

in reply to a comment (interpreted as) asking about good languages for DSLs:

 nickpsecurity on Nov 15, 2015 [–]

LISP, Forth, SNOBOL, REBOL/RED, PROLOG, Stratego Programming System, BNF, ASF+SDF...

---

sparkie on Nov 15, 2015 [–]

Kernel is ideal for creating small embedded languages, where you can selectively expose parts of the parent language to perform general purpose computation, without giving any access to sensitive code.

Kernel is like Scheme, except environments are first class objects which you can mutate and pass around. A typical use for such an environment is for the second argument of $remote-eval, to limit the bindings available to the evaluator. If you treat an eDSL as a set of symbols representing its vocabulary and grammar, then you bind them to a new environment with $bindings->environment, then passing a piece of code X to eval with this resulting argument as the second environment, will ensure X can only access those bindings (and built-in symbols which result from parsing Kernel, such as numbers), and nothing else.

There's a function make-kernel-standard-environment for easy self-hosting.

Trivial examples:

    ($define! x 1)
    ($remote-eval (+ x 2) (make-kernel-standard-environment))
    > error: unbound symbol: x
    ($define! y 2)
    ($remote-eval (+ x y) (make-environment ($bindings->environment (x 1))
                                            (make-kernel-standard-environment)))
    > error: unbound symbol: y
    ($remote-eval (+ x y) ($bindings->environment (x 1) (y 2) (+ +)))
    > 3
    ($define! x 1)
    ($define! y 2)
    ($remote-eval (+ x y) (get-current-environment))
    > 3

$remote-eval is a helper function in the standard library which evaluates the second argument in the dynamic environment to get the target environment, then evaluates the first argument in that. The purpose of this is to provide a blank static-environment for o to be evaluated, so no bindings from the current static environment where $remote-eval is called from are made accessible to the first argument.

   ($define! $remote-eval ($vau (o e) d (eval o (eval e d))))

If you're familiar with Scheme, you may notice the absence of quote.

And contrary to the opinions in the article, Kernel is the most expressively powerful language I know, and I'd recommend everyone learn it. You have a powerful but small and simple core language, from which you can derive your DSLs with as little or much power as you want them to have.

cpeterso on Nov 16, 2015 [–]

Thanks! I will check out Kernel.

The ability to restrict the environment is key. Creating a DSL in standard Lisp would create a more powerful environment: the DSL would include all Lisp features plus the domain features.

---

"...Tcl's flexible, to the degree of being able to define new language constructs or redefine built-ins (e.g. you can redefine the basic if-then if you want), which is great in some ways..." [19]

---

" Strict modules are a new Python module type marked with __strict__ = True at the top of the module, and implemented by leveraging many of the low-level extensibility mechanisms already provided by Python. A custom module loader parses the code using the ast module, performs abstract interpretation on the loaded code to analyze it, applies various transformations to the AST, and then compiles the modified AST back into Python byte code using the built-in compile function. " -- [20]

---


"

Using with-redefs for testing

Clojure provides the macro "with-redefs" that can redefine any function executed within the scope of that form, including on other threads. We have found this to be an invaluable feature for writing tests.

Sometimes we use with-redefs to mock specific behavior in the dependencies of what we’re testing so we can test that functionality in isolation. Other times we use it to inject failures to test fault-tolerance.

The most interesting usage of with-redefs in our codebase, and one of our most common, is using it alongside no-op functions we insert into our source code. These functions effectively provide a structured event log that can be dynamically tapped in an à la carte way depending on what a test is interested in.

Here’s one example (out of hundreds in our codebase) of how we use this pattern. One part of our system executes user-specified work in a distributed way and needs to: 1) retry the work if it fails, and 2) checkpoint its progress to a durable, replicated store after a threshold amount of work has succeeded. One of the tests for this injects a failure the first time work is attempted and then verifies the system retries the work.

The source function that executes the work is called "process-data!", and here is an excerpt from that function:

(when (and success? retry?)
  (retry-succeeded)
  (inform-of-progress! manager))

"retry-succeeded" is a no-op function defined as (defn retry-succeeded [] ).

In a totally separate function called "checkpoint-state!", the no-op function "durable-state-checkpointed" is called after it finishes replicating and writing to disk the progress information. In our test code, we have:

(deftest retry-user-work-simulated-integration-test
  (let [checkpoints (volatile! 0)
        retry-successes (volatile! 0)]
    (with-redefs [manager/durable-state-checkpointed
                  (fn [] (vswap! checkpoints inc))
                  manager/retry-succeeded
                  (fn [] (vswap! retry-successes inc))]
      ...
      )))

Then in the body of the test, we check the correct internal events happen at the correct moments.

Best of all, since this à la carte event log approach is based on no-op functions, it adds basically no overhead when the code runs in production. We have found this approach to be an incredibly powerful testing technique that utilizes Clojure’s design in a unique way.

" [22]

--

" Macro usage

We have about 400 macros defined through our codebase, 70% of which are part of source code and 30% of which are for test code only. We have found the common advice for macros, like don’t use a macro when you can use a function, to be wise guidance. That we have 400 macros doing things you can’t do with regular functions demonstrates the extent to which we make abstractions that go far beyond what you can do with a typical language that doesn’t have a powerful macro system.

About 100 of our macros are simple "with-" style macros which open a resource at the start and ensure the resource is cleaned up when the form exits. We use these macros for things like managing file lifecycles, managing log levels, scoping configurations, and managing complex system lifecycles.

About 60 of our macros define abstractions of our custom language. In all of these the interpretation of the forms within is different than vanilla Clojure.

Many of our macros are utility macros, like "letlocals" which lets us more easily mix variable binding with side effects. We use it heavily in test code like so:

(letlocals
  (bind a (mk-a-thing))
  (do-something! a)
  (bind b (mk-another-thing))
  (is (= (foo b) (bar a))))

This code expands to:

(let [a (mk-a-thing)
      _ (do-something! a)
      b (mk-another-thing)]
  (is (= (foo b) (bar a))))

The rest of the macros are a mix of internal abstractions, like a state machine DSL we built, and various idiosyncratic implementation details where the macro removes code duplication that can’t be removed otherwise.

Macros are a language feature that can be abused to produce terribly confusing code, or they can be leveraged to produce fantastically elegant code. Like anything else in software development, the result you end up with is determined by the skill of those using it. At Red Planet Labs we can’t imagine building software systems without macros in our toolbox.

" [23]