proj-plbook-plChMetaprogramming

Table of Contents for Programming Languages: a survey

Chapter: metaprogramming: hacking the call stack

first-class call stacks

Chapter: metaprogramming: hacking the ENV

hacking the ENV

ruby's method_missing, respond_to?, instance_eval

Chapter: metaprogramming: operations

monkeypatching

object protocols

Chapter: metaprogramming: hacking classes

Chapter: metaprogramming: syntax

custom operator precedence

syntax rules

OMeta:

Related:

Chapter: metaprogramming: eval (todo generalize)

eval

interpreter tower

up and down

Chapter: metaprogramming: misc

by model of computation

the models of computation also suggest extension mechanisms:

turing/imperative: gotos, self-modifying code, mutable state, call stack manipulation (including continuations)
lambda calc/functional: higher-order functions
combinatorial: reductions
grammar? concatenative? logic? mu-recursive? relational?

where do macros come from? grammar? combinatorial? none of these?

by evasion of constraints

see Martin's [1]. Martin defines programming paradigms by constraints, and it seems to me that by evading these constraints (e.g. by using GOTO) we get metaprogrammy stuff.

annotations/attributes

" There is a reasonable fear that attributes will be used to create language dialects. The recommendation is to use attributes to only control things that do not affect the meaning of a program but might help detect errors (e.g. [[noreturn?]]) or help optimizers (e.g. [[carries_dependency?]]) " -- http://www.stroustrup.com/C++11FAQ.html#attributes

reflection

todo: put this section elsewhere

Links:

DSL

https://en.wikipedia.org/wiki/Metalinguistic_abstraction

SELL

The SELL paper also criticizes metaprogramming through compiler options and pragmas (not flexible enough); preprocessed languages (distinguished from compilers by the criterion that with a true compiler you can never get an error from the target language's compiler; the problems are: (i) the preprocessor toolchain must be maintained/released in lockstep with the host language, (ii) it is difficult to remove undesirable features from the host language, (iii) there is an impedance mismatch with the host language, especially the type system, a symptom of which is often seen in trouble with error detection and reporting); and dialects (the problems are: (i) it is difficult to remove undesirable features from the host language; (ii) similar but lesser problems as making a new language w/r/t the expense of tooling implementation and maintenance, and the network effects of users).

links

todo

" Wyvern: at CMU, Jonathan Aldrich's group has been working on the Wyvern language, which incorporates a novel language feature called Type-Specific Languages (TSLs) developed by Cyrus Omar. TSLs are a means of extending the syntax of Wyvern by defining parsers as a first-class member of the language and allowing users to create their own mini-languages inside of Wyvern: This is similar to procedural macros in Rust which are essentially functions from TokenStream? → AST that define an alternate syntax for the language. TSLs, however, are notable in that they are unambiguous and composable. The general idea is that a TSL returns a value of a known type, and the compiler can do type inference to determine the expected type of a value, so it can resolve ambiguity between languages based on their types. Additionally, the TSLs can be used within each other. When you define a new TSL, you get interoperability with every other TSL defined in Wyvern for free! Imagine if you could freely mix C, OCaml, Javascript and HTML with type-safe interop. This style of composition is the future of front-end, or syntax level, interop.

Figure 3: HTML templating with SQL and CSS mixed in using Wyvern.

http://i.imgur.com/7c2oyxx.jpg

" -- http://notes.willcrichton.net/the-coming-age-of-the-polyglot-programmer/

3-Lisp

3-Lisp augments Lisp with 'reflective procedures', which can be thought of as 'the hook to end all hooks' -- they are called from the context of the interpreter, and are passed their own (unevaluated) arguments, the environment, and finally a continuation. They are supposed to compute their result and then return it by calling the continuation that they were passed (with the result as argument). They have access to the interpreter functions 'normalize' and 'reduce' as well as 'set' (or 'rebind'), and also 'up' and 'down'; 'down' is similar to unquote and can be used to refer to the 'program level' interpretation of a term, rather than the 'interpreter level' interpretation, and 'up' is similar to quoting (todo: not sure if i got that right, especially with 'up' and 'down').

Note that 3-Lisp executes the program as if it were at the highest level of an infinite tower of interpreters; e.g. you can have a 2nd-order reflective procedure, etc.

Note that in 3-Lisp fexprs can be implemented because a reflective procedure has control over when, if at all, each of its arguments is evaluated.

"(Apparently, 3-Lisp is quite similar to Kernel: every operative receives not only its operand tree and the lexical environment in which it is called, but also its continuation. I still don't understand it. Why would you pass the continuation to an operative, when it can easily obtain it using e.g. call/cc? Apparently because 3-Lisp considers the continuation to exist on the next meta-level, not the current one.)" -- http://axisofeval.blogspot.com/2013/06/a-week-of-lisp-in-madrid.html

Links:


deep vs. shallow DSL embedding: http://www.cs.ox.ac.uk/publications/publication7584-abstract.html

"As far as I'm aware, while you can use the strict definition (is there, in the metalanguage, the creation of an AST?) they're often discussed as more of a continuum. HOAS is a great example. There is the creation of an AST, but perhaps the most important and tricky part of any AST, the binding system, is left to the metalanguage. For that reason exactly I'm happy to say that HOAS is "shallower" than, say, a de Bruijn indexed binding system. "

https://alessandrovermeulen.me/2013/07/13/the-difference-between-shallow-and-deep-embedding/
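
A rough Python illustration of the distinction (an invented arithmetic DSL, not taken from the cited papers): in the shallow embedding a DSL term simply is its host-language meaning, while in the deep embedding the term is an AST that separate interpreters can evaluate, print, or analyze:

from dataclasses import dataclass

# Shallow embedding: a DSL term *is* its host-language meaning.
def lit_s(n):
    return n

def add_s(a, b):
    return a + b

shallow = add_s(lit_s(1), add_s(lit_s(2), lit_s(3)))   # 6, computed immediately

# Deep embedding: the same DSL builds an AST; meaning comes from a separate
# interpreter, and other analyses (pretty-printing, optimization) are possible too.
@dataclass
class Lit:
    n: int

@dataclass
class Add:
    left: object
    right: object

def evaluate(t):
    return t.n if isinstance(t, Lit) else evaluate(t.left) + evaluate(t.right)

def render(t):
    return str(t.n) if isinstance(t, Lit) else "(" + render(t.left) + " + " + render(t.right) + ")"

deep = Add(Lit(1), Add(Lit(2), Lit(3)))
print(shallow, evaluate(deep), render(deep))   # 6 6 (1 + (2 + 3))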

---

Fexprs / operative combiners

https://web.cs.wpi.edu/~jshutt/kernel.html

" Unfortunately, as traditionally realized, fexprsare even more badly behaved than traditional macros: by making it impossible todetermine the meanings of subexpressions at translation time, they destroy localityof information in the source-code — thus undermining not only encapsulation (as dotraditional macros), but most everything else in the language semantics as well. Soaround 1980, the Lisp community abandoned fexprs, turning its collective energiesinstead to mitigating the problems of macros. ... The fundamental distinction between fexprs and macros is one of explicit ver-sus implicit evaluation. (See§1.2.3.) If the fexpr wishes any of its operands to beevaluated, it must explicitly evaluate them

...

The fundamental issues surrounding fexprs were carefully and clearly laid out byKent M. Pitman in a 1980 paper, [Pi80]. After enumerating reasons for supportingconstructed operatives, he discussed strengths and weaknesses of macros and fexprsand, ultimately, recommended omitting fexprs from future dialects of Lisp:It has become clear that such programming constructs asNLAMBDA’s andFEXPR’s are undesirable for reasons which extend beyond mere questionsof aesthetics, for which they are forever under attack. ... Fexprs have an inherent clarity advantage over macros " -- https://web.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unrestricted/jshutt.pdf

---

Metaprogramming in R with rlang

https://adv-r.hadley.nz/metaprogramming.html

---

macros like __FILE__ and __LINE__
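
C's __FILE__ and __LINE__ expand textually at the point of use; languages without textual macros often recover the same information reflectively instead. A small Python sketch using the standard inspect module (the helper name 'here' is invented):

import inspect

def here():
    # roughly what __FILE__ / __LINE__ would have expanded to at the call site
    caller = inspect.stack()[1]
    return caller.filename, caller.lineno

print(here())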

---

"

Const-me 1 day ago [–]

> one is writing code that is abstracted over types, and another is a clever trick called template metaprogramming

I don't use either of these tricks much. Metaprogramming is especially bad, pretty much unreadable once written.

There's a third useful thing people do with templates - pass integer numbers to compiler. Unlike meta-programming, doesn't hurt readability.

There're instructions like _mm_shuffle_ps or _mm_srli_epi16 or _mm_extract_epi32 which encode an integer into machine codes. Instructions can encode memory references like `ptr [rax+32]`. And even without low-level trickery, it's sometimes useful to compile the same code multiple times, with changes somewhere. Templates is an easy way to express that in C++, here's an example: https://stackoverflow.com/a/59495197/126995. A nice related feature is `if constexpr` from C++/17, such branches are resolved at compile-time and therefore free.

reply " -- [2]

---

"C preprocessor macros present a number of hazards, including possible multiple evaluation of expressions with side effects and no type safety. If you are tempted to define a macro, consider creating an inline function instead. The code which results will be the same, but inline functions are easier to read, do not evaluate their arguments multiple times, and allow the compiler to perform type checking on the arguments and return value." -- [3]

---

http://www.erights.org/elang/kernel/auditors/

"Auditors" are pieces of code which examines various parts of the AST to see if they contain various properties. They (can? or must?) run at compile-time.

---

macros

Implementations in various languages:

Usage:

reader macros

---

continuations

Links:

---

non-standard evaluation

See also fexprs.

---

Metaprogramming links

Metaprogramming todos

http://venge.net/graydon/talks/CompilerTalk-2019.pdf section "Variation #3 Meta-languages"

---

Introduction to Term Rewriting with Meander https://jimmyhmiller.github.io/meander-rewriting

---

https://rust-analyzer.github.io//blog/2021/11/21/ides-and-macros.html

main.rs:

mod foo;
foo::declare_mod!(bar, "foo.rs");

foo.rs:

pub struct S;
use super::bar::S as S2;

macro_rules! _declare_mod {
    ($name:ident, $path:literal) => {
        #[path = $path]
        pub mod $name;
    }
}
pub(crate) use _declare_mod as declare_mod;

"Semantics like this are what prevents rust-analyzer to just process every file in isolation"

"

There is an alternative — design meta programming such that it can work “file at a time”, and can be plugged into an embarrassingly parallel indexing phase. This is the design that Sorbet, a (very) fast type checker for Ruby chooses: https://youtu.be/Gdx6by6tcvw?t=804. I really like the motivation there. It is a given that people would love to extend the language in some way. It is also given that extensions wouldn’t be as carefully optimized as the core compiler. So let’s make sure that the overall thing is still crazy fast, even if a particular extension is slow, by just removing extensions from the hot path. (Compare this with VS Code architecture with out-of-process extensions, which just can’t block the editor’s UI).

To flesh out this design bit:

    All macros used in a compilation unit must be known up-front. In particular, it’s not possible to define a macro in one file of a CU and use it in another.
    Macros follow simplified name resolution rules, which are intentionally different from the usual ones to allow recognizing and expanding macros before name resolution. For example, macro invocations could have a unique syntax, like name!, where name identifies a macro definition in the flat namespace of known-up-front macros.
    Macros don’t get to access anything outside of the file with the macro invocation. They can simulate name resolution for identifiers within the file, but can’t reach across files.

Here, limiting macros to local-only information is a conscious design choice. By limiting the power available to macros, we gain the properties we can use to make the tooling better. For example, a macro can’t know a type of the variable, but because it can’t do that, we know we can re-use macro expansion results when unrelated files change.

"

---

procedural macros vs declarative macros:

Procedural macros are functions which take an AST and return an AST.

Declarative macros are things like search-and-replace rules.
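
A toy Python sketch of the difference (not any real macro system): the procedural 'macro' is an arbitrary function from AST to AST, while the declarative one is a single pattern -> template rewrite rule:

import ast
import re

# Procedural style: an arbitrary function from AST to AST.
# This one rewrites every integer literal n into n + 1.
class IncrementLiterals(ast.NodeTransformer):
    def visit_Constant(self, node):
        if isinstance(node.value, int):
            return ast.copy_location(ast.Constant(node.value + 1), node)
        return node

tree = ast.parse("x = 1 + 2")
tree = ast.fix_missing_locations(IncrementLiterals().visit(tree))
ns = {}
exec(compile(tree, "<expanded>", "exec"), ns)
print(ns["x"])   # 5, because the literals 1 and 2 became 2 and 3

# Declarative style: a single search-and-replace rule over the program text,
# here "swap(a, b) -> (b, a)".
def expand_swap(src):
    return re.sub(r"swap\((\w+),\s*(\w+)\)", r"(\2, \1)", src)

print(expand_swap("a, b = swap(a, b)"))   # a, b = (b, a)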

---

rust macros

https://doc.rust-lang.org/book/ch19-06-macros.html

racket macros https://users.cs.northwestern.edu/~robby/pubs/papers/jfp2012-fcdf.pdf

---

racket (scheme) macros vs clojure macros (vs rhombus (racket variant) macros):

~ NoahTheDuke 2 hours ago

    A rather interesting observation to make here is that while I could have implemented a similar threading operator in Racket, I opted not to, because I felt that the extra syntactic overhead required to write the macro would outweigh its benefits, while in Rhombus, the lighter syntax actually made me more inclined to use macros3.

Funny he mentions that cuz over in Clojure, we have a much lighter macro system that trades raw power for ease and everyday usage. (Turns out, folks do write some macros but still mainly stay in functions as they compose better.)

To give an example, in Clojure, we have the thread-first (->) and thread-last (->>) macros (examples from ClojureDocs):

(-> person :employer :address :city)
;; is equivalent to
(:city (:address (:employer person)))

(->> (range)
     (map #(* % %))
     (filter even?)
     (take 10)
     (reduce +))
;; is equivalent to
(reduce + (take 10 (filter even? (map #(* % %) (range)))))

The implementations are pretty simple as seen in clojure.core. They take the forms, apply transformations, and return the quasi-quoted result.

Compare this with Racket’s threading library implementation. Maybe it’s my relative inexperience with Racket’s syntax-case, but I feel like the difference is vast.

    ~ technomancy 1 hour ago
    Compare this with Racket’s threading library implementation. Maybe it’s my relative inexperience with Racket’s syntax-case, but I feel like the difference is vast.

To understand what’s going on here you have to see that hygenic macros were specifically designed to avoid a specific problem with CL’s macro system: that of accidental symbol capture. Scheme’s macros go to great lengths to make symbol capture completely impossible, at the cost of giving up the conceptual simplicity of defmacro.

Clojure came along and solved the same problem with a much easier solution: ban backticked symbols from being used as identifiers unless they used auto-gensym. It’s still basically impossible to accidentally do symbol capture. You have to go out of your way to do it, and it’s really obvious when you do. (Almost always still a mistake, but at least you won’t get bitten by it unaware!) But the same bug was solved with a tiny fraction of the conceptual cost.

(This has nothing to do with Rhombus FWIW; just a general comparison between Schemes vs Clojure and other newer lisps which have learned from Clojure.) " -- [5]
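
The rewrite that -> performs can be sketched outside of Lisp too; here is a toy Python version over nested tuples standing in for forms (an illustration of the transformation, not Clojure's implementation):

# (-> x f (g a)) threads x in as the first argument of each successive form:
# (-> person :employer :address :city)  ==>  (:city (:address (:employer person)))
def thread_first(x, *forms):
    for form in forms:
        if isinstance(form, tuple):
            head, *rest = form
            x = (head, x, *rest)
        else:
            x = (form, x)
    return x

print(thread_first('person', ':employer', ':address', ':city'))
# (':city', (':address', (':employer', 'person')))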

---

racket (scheme) macros vs rhombus (racket variant) macros:

comments on https://gopiandcode.uk/logs/log-racket-and-rhombus-sexp.html :

" 5 algernon 6 hours ago

link flag

To me, this has been a pretty convincing argument to sexp.

The Rhombus codes were often shorter, yes, but I found the Racket versions of the code easier to follow, and the macros noticeably cleaner. The Rhombus macros have way too much stuff encoded in strings - no, thanks.

    ~ snej 1 hour ago

Those aren’t strings, they’re quoted forms. Double-quote characters delimit strings.

    ~ algernon 55 minutes ago
        ‘$exp |> $f $rest …’
    That sure does look like a string to me: starts with a ’, ends with an ‘, with code between that has different syntax than the rest of Rhombus. It may not be technically a string, but it’s encoding a different syntax within a block of text than the rest of the language.
    In contrast, in Racket, quoted or unquoted, it’s the same syntax.

~ NoahTheDuke 3 hours ago

    The Rhombus macros have way too much stuff encoded in strings - no, thanks.

That stood out to me as well. Seems like a big whiff to move all of the metaprogramming into strings. " -- [6]

---

Smalltalk context objects

"...contexts, the Smalltalk terminology for stack frames" -- [7]

"Smalltalk-80 provides a reification of execution state in the form of context objects which represent procedure activation records [Ingalls76]. This feature provides a portable abstraction of execution state which has several advantages, including

"-- [8]

---

some examples of metaprogramming in python: [9]
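
A few of the usual Python examples gathered in one place (not necessarily the ones in the linked post): attribute interception (the closest analogue to Ruby's method_missing), runtime class creation with type(), and a decorator that rewraps a function:

# Attribute interception, similar in spirit to Ruby's method_missing:
class Proxy:
    def __init__(self, target):
        self._target = target
    def __getattr__(self, name):        # only called when normal lookup fails
        print("intercepted", name)
        return getattr(self._target, name)

# Creating a class at runtime with the three-argument form of type():
Point = type("Point", (), {"x": 0, "y": 0, "norm2": lambda self: self.x**2 + self.y**2})

# A decorator is an ordinary function that returns a rewrapped function:
def traced(fn):
    def wrapper(*args, **kwargs):
        print("calling", fn.__name__, args)
        return fn(*args, **kwargs)
    return wrapper

@traced
def add(a, b):
    return a + b

print(Proxy([1, 2, 3]).count(2), Point().norm2(), add(1, 2))   # 1 0 3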

---

"^ is "the meta character" it tells the reader to add the symbol starting with ^ as metadata to the next symbol (provided it is something that implements IMetas)" -- https://stackoverflow.com/questions/8920137/clojure-caret-as-a-symbol

---

metaprogramming in tcl

examples:

" Concept 8: Procedures

Naturally, nothing stops a Tcl programmer from writing a procedure (that's a user defined command) in order to use math operators as commands. Like this:

proc + {a b} { expr {$a+$b} }

The proc command is used to create a procedure: its first argument is the procedure name, the second is the list of arguments the procedure takes as input, and finally the last argument is the body of the procedure. Note that the second argument, the arguments list, is a Tcl list. As you can see the return value of the last command in a procedure is used as return value of the procedure (unless the return command is used explicitly). But wait... Everything is a command in Tcl right? So we can create the procedures for +, -, *, ... in a simpler way instead of writing four different procedures:

set operators [list + - * /]
foreach o $operators {
    proc $o {a b} [list expr "\$a $o \$b"]
}

After this we can use [+ 1 2], [/ 10 2] and so on. Of course it's smarter to create these procedures as varargs like Scheme's procedures. In Tcl procedures can have the same names as built in commands, so you can redefine Tcl itself. For example, in order to write a macro system for Tcl I redefined proc. Redefining proc is also useful for writing profilers (Tcl profilers are developed in Tcl itself usually). After a built in command is redefined you can still call it if you renamed it to some other name prior to overwriting it with proc.

Concept 9: Eval and Uplevel

If you are reading this article you already know what Eval is. The command eval {puts hello} will of course evaluate the code passed as argument, as happens in many other programming languages. In Tcl there is another beast, a command called uplevel that can evaluate code in the context of the calling procedure, or for what it's worth, in the context of the caller of the caller (or directly at the top level). What this means is that what in Lisp are macros, in Tcl are just simple procedures. Example: in Tcl there is no "built-in" for a command repeat to be used like this:

repeat 5 { puts "Hello five times" }

But to write it is trivial.

proc repeat {n body} {
    set res ""
    while {$n} {
        incr n -1
        set res [uplevel $body]
    }
    return $res
}

Note that we take care to save the result of the last evaluation, so our repeat will (like most Tcl commands) return the last evaluated result. An example of usage:

set a 10
repeat 5 {incr a} ;# Repeat will return 15

As you can guess, the incr command is used to increment an integer var by one (if you omit its second argument). "incr a" is executed in the context of the calling procedure, (i.e. the previous stack frame).

...

Radical language modifications = DSL

If you define a procedure called unknown it is called with a Tcl list representing arguments of every command Tcl tried to execute, but failed because the command name was not defined. You can do what you like with it, and return a value, or raise an error. If you just return a value, the command will appear to work even if unknown to Tcl, and the return value returned by unknown will be used as return value of the not defined command. Add this to uplevel and upvar, and the language itself that's almost syntax free, and what you get is an impressive environment for Domain Specific Languages development. Tcl has almost no syntax, like Lisp and FORTH, but there are different ways to have no syntax. Tcl looks like a configuration file by default:

disable ssl
validUsers jim barbara carmelo
hostname foobar {
    allow from 2:00 to 8:00
}

The above is a valid Tcl program, once you define the commands used, disable, validUsers and hostname.

" -- http://antirez.com/articoli/tclmisunderstood.html

toread