Concatenative Languages
A concatenative programming language is a point-free language in which all expressions are functions, and the concatenation of two programs denotes the composition of the functions denoted by the two programs. -- paraphrasing https://en.wikipedia.org/wiki/Concatenative_programming_language and http://web.archive.org/web/20111007025556/http://www.latrobe.edu.au/phimvt/joy/j02maf.html
In this chapter, we may also include languages that are similar to concatenative languages but are not actually concatenative.
Forth
features
- "Forth builds on some fundamental properties that make it quite idiosyncratic and may appear quirky, due to its roots in simplicity and compactness, considering that it emerged in the microcomputer era, on machines that provided (for today's standards) severely limited capacity. Yet these attributes result in certain rather peculiar advantages. One such property is the use of reverse polish notation and the lack or eschewal of local variables. As argument-passing to subroutines and primitives takes place on the stack, factoring the code becomes nearly trivial - expressions can be taken out of context and moved into new definitions, just as in the "pointfree" style of functional programming. The emphasis on short definitions and (usually) a low subroutine call overhead additionally suggests isolating common code sequences. Local variables make this much more difficult which is why local variable facilities should be avoided." -- [1]
- "Another useful feature is the hyperstatic environment, where later redefinition of values does not change the meaning of previous uses. This greatly enhances the stability of already compiled code and reduces the pressure to find alternative names for existing definitions, if shadowing of already defined words does not present a problem for code defined later. Even though Forth provides a quite powerful namespacing system using vocabularies, it is not strictly necessary to isolate namespaces just to avoid polluting the environment, unless shadowing is an issue." -- [2]
local variables
Some Forth implementations have local variables.
" In some environments there may be more or fewer functions assigned to the call stack. In the Forth programming language, for example, ordinarily only the return address, counted loop parameters and indexes, and possibly local variables are stored on the call stack (which in that environment is named the return stack), although any data can be temporarily placed there using special return-stack handling code so long as the needs of calls and returns are respected; parameters are ordinarily stored on a separate data stack or parameter stack, typically called the stack in Forth terminology even though there is a call stack since it is usually accessed more explicitly. Some Forths also have a third stack for floating-point parameters. " -- [3]
"A Forth system is allowed to keep local variables on the return stack. This is reasonable, as local variables usually eliminate the need to use the return stack explicitly. So, if you want to produce a standard compliant program and you are using local variables in a word, forget about return stack manipulations in that word (refer to the standard document for the exact rules). " -- [4]
Adding a third stack to a Forth engine by VanNorman and Koopman says that the best use of a third stack in Forth would be for local variable (activation records, like C), but doesnt explain why this is better than just storing local variablens on the parameter stack or on the return stack
http://turboforth.net/resources/locals.html is an example of a third stack local variable implementation in Forth
https://forth-standard.org/standard/locals says "The storage resource ((for locals)) may be the return stack or may be implemented in other ways, such as in registers. The storage resource shall not be the data stack. Use of locals shall not restrict use of the data stack before or after the point of declaration. "
Stack Frames and Local Variables by George B Lyons (1985) or [5] is an example of a parameter stack implementation of local variables in Forth. I only skimmed it but it appears to be similar to (now)-standard activation records on a single stack.
Forth Dimension, volume XI, number 1 talks, amongst other thanks, about some techniques for adding local varibles to Forth. For example, on page 18, there is a description of an implementation that uses the return stack.
use of stacks
(I could be misunderstanding the following, but:) In Forth, there are typically two stacks, one called either just 'stack' or 'parameter stack', and the other called 'return stack'. The parameter stack tends to be much larger than the return stack. The return stack is the one organized with traditional 'stack frames', with return addresses, loop counters, and maybe local variables, so it looks the most like a C stack. Function calling parameters go on Forth's larger 'parameter stack'. The 'parameter stack' is manipulated much more frequently than the return stack.
retrospectives
opinions
- "per line, Forth is still a bit more error-prone than C, in my experience. In C you get enough static type checking to immediately detect something like 90% of your type errors; you get a warning if you forget to pass an argument to a function, or pass too many; the data flow of any subroutine is easily approximable without having to know the stack effect of every subroutine you call; and so on. " -- [6]
- "extensible assembly language" -- [7]
- "The good news is you can do anything with Forth, the bad news is you will probably have to do it yourself." -- David Sands
- [8]
- response: "I think the heart of the problem is that the rudimentary ideas of forth- a stack, some stack-shuffling words, some arithmetic, control flow, and that's about it- are appealingly simple, but not enough of the language...Forth can be small, and simple, but the real gems of the language are a bit subtle. The many implications of CREATE>/DOES. Abuses of manipulating the rstack. Striking a balance between parsing words, named locals, global variables, vocabularies, and other tools to minimize twisty stack shuffling. Bending the rules at the right times. " [9]
- "The main characteristic of Forth programs that separates Forth from most other languages is the high frequency of subroutine calls." -- [10]
- "Given a data stack, subroutine calling sequences vanish...The cost of a subroutine is merely the call and return. This allows many small subroutines and more intense factoring of a problem." [11]
- "Forth is a great language, simple syntax, (relatively) simple to implement, has a repl, easily extendable and fast. Chuck Moore was/is really onto something." [12]
- https://blog.information-superhighway.net/what-the-hell-is-forth
primitives
" Appendix B A Glossary of Forth Primitives
The Forth language is based on an extensible, interactive compiler that creates code for a virtual stack machine. The virtual machine has two stacks. The Data Stack is used for expression evaluation and subroutine parameter passing. The Return Stack is used for subroutine return address saving and for loop control variable storage. The source code for Forth directly reflects the underlying stack machine, and so uses Reverse Polish Notation (RPN) to perform all operations using.
Forth programs are built as a hierarchy of subroutines. Each subroutine is called a "word" in Forth terminology. A program consists of a single Forth word which calls several other Forth words, and so on, forming a tree-structured program. At the lowest level, the leaves of the tree are invocations of Forth primitive words that manipulate the stacks and perform arithmetic.
Below is a glossary of the Forth primitive words found on stack machines discussed in this book. Most of these primitives are actually applicable to any program written in any language on a stack machine (for example, addition of the top two stack elements or swapping the order of the top two stack elements). Forth nomenclature is used in discussions to maintain consistency with an existing standard vocabulary for stack machine operation.
Each Forth word is followed by a "stack picture" on the same line. This stack picture shows the input parameters and output parameters on the Forth Data Stack for the word being described. The values on the left of the "_" indicate the input parameters while those to the right of the "_" indicate output parameters. Each parameter list is ordered with the topmost stack element to the right. Notation in the stack lists is as follows: N1, N2, N3, etc. indicate single-precision integers. D1, D2, etc. indicate double-precision integers, which take up two elements on the data stack. ADDR indicates an address, which may be thought of as a pointer value. FLAG is an integer which is false if zero, true if non-zero. A more detailed glossary of Forth is Haydon's All About Forth (1983).
0 - 0 Push the integer 0 onto the stack.
0< N1 - FLAG Return a true FLAG if N1 is negative.
0= N1 - FLAG Return a true FLAG if N1 is zero.
0 N1 - FLAG Return a true FLAG if N1 is greater than zero.
0BRANCH N1 - If N1 is false (value is 0) perform a branch to the address in the next program cell, otherwise continue.
1+ N1 - N2 Add one to N1, returning N2.
1- N1 - N2 Subtract one from N1, returning N2.
2+ N1 - N2 Add two to N1, returning N2.
2* N1 - N2 Multiply N1 by two, returning N2.
2/ N1 - N2 Divide N1 by two, returning N2.
4+ N1 - N2 Add four to N1, returning N2.
< N1 N2 - FLAG Return a true FLAG if N1 is less than N2.
<> N1 N2 - FLAG Return a true FLAG if N1 is not equal to N2.
N1 N2 - FLAG
Return a true FLAG if N1 equals N2.
R N1 - Push N1 onto the return stack.
> N1 N2 - FLAG Return a true FLAG if N1 is greater than N2.
! N1 ADDR - Store N1 at location ADDR in program memory.
+ N1 N2 - N3 Add N1 and N2, giving sum N3.
+! N1 ADDR - Add N1 to the value pointed to by ADDR.
- N1 N2 - N3 Subtract N2 from N1, giving difference N3.
- - Define the start of a subroutine. The primitive [CALL] is compiled every time this subroutine is reference by other definitions.
; - Perform a subroutine return and end the definition of a subroutine. The primitive [EXIT] is compiled.
?DUP N1 - N1 N1 ( if N1 non-zero ) N1 - N1 ( if N1 is zero ) Conditionally duplicate the input N1 if it is non-zero.
@ ADDR - N1< Fetch the value at location ADDR in program memory, returning N1.
ABS N1 - N2 Take the absolute value of N1 and return the result N2.
AND N1 N2 - N3 Perform a bitwise AND on N1 and N2, giving result N3.
BRANCH - Perform an unconditional branch to the compiled in-line address.
D! D1 ADDR - Store the double-precision value D1 at the two memory words starting at ADDR.
D+ D1 D2 - D3 Return the double precision sum of D1 and D2 as D3.
D@ ADDR - D1 Fetch the double precision value D1 from memory starting at address ADDR.
DDROP D1 - Drop the double-precision integer D1.
DDUP D1 - D1 D1 Duplicate D1 on the stack.
DNEGATE D1 - D2 Return D2, which is the two's complement of D1.
DROP N1 - Drop N1 from the stack.
DSWAP D1 D2 - D2 D1 Swap the top two double-precision numbers on the stack.
DUP N1 - N1 N1 Duplicate N1, returning a second copy of it on the stack.
I - N1 Return the index of the currently active loop.
I' - N1 Return the limit of the currently active loop.
J - N1 Return the index of the outer loop in a nested loop structure.
LEAVE - Set the loop counter on the return stack equal to the loop limit to force an exit from the loop.
LIT - N1 Treat the compiled in-line value as an integer constant, and push it onto the stack as N1.
NEGATE N1 - N2 Return N2, which is the two's complement of N1 NOP - Do nothing.
NOT FLAG1 - FLAG2 Synonym for 0=. Takes the inverse of a flag value.
OR N1 N2 - N3 Perform a bitwise OR on N1 and N2, giving result N3.
OVER N1 N2 - N1 N2 N1 Push a copy of the second element on the stack, N1, onto the top of the stack.
PICK ... N1 - ... N2 Copy the N1'th element deep in the data stack to the top. In Forth-83, 0 PICK is equivalent to DUP , and 1 PICK is equivalent to OVER .
R> - N1 Pop the top element of the return stack, and push it onto the data stack as N1.
R@ - N1 Copy the top Return Stack word N1 onto the Data Stack.
ROLL ... N1 - ... N2 Pull the N1'th element deep in the data stack to the top, closing the hole left in the stack. In Forth-83, 1 ROLL is equivalent to SWAP , and 2 ROLL is equivalent to ROT.
ROT N1 N2 N3 - N2 N3 N1 Pull the third element down in the stack onto the top of the stack.
S-D N1 - D2 Sign extend N1 to occupy two words, making it a double precision integer D2.
SWAP N1 N2 - N2 N1 Swap the order of the top two stack elements.
U< U1 U2 - FLAG Return a true FLAG if U1 is less than U2 when compared as unsigned integers.
U U1 U2 - FLAG Return a true FLAG if U1 is greater than U2 when compared as unsigned integers.
U* N1 N2 - D3 Perform unsigned integer multiplication on N1 and N2, yielding the unsigned double precision result D3.
U/MOD D1 N2 - N3 N4 Perform unsigned integer division on D1 and N2, yielding the quotient N4 and the remainder N3.
XOR N1 N2 - N3 Perform a bitwise exclusive OR on N1 and N2, giving result N3.
" -- http://users.ece.cmu.edu/~koopman/stack_computers/appb.html
- A Forth standard: http://lars.nocrew.org/dpans/dpans.htm
- the 'core words' in that standard: http://lars.nocrew.org/dpans/dpans6.htm#6.1
- according to http://excamera.com/sphinx/article-standard.html , the DPANS standard words are: ] \ [THEN] [IF] [ELSE] [COMPILE] [CHAR] ['] [ XOR WRITE-LINE WRITE-FILE WORDS WORDLIST WORD WITHIN WHILE W/O VARIABLE VALUE UNUSED UNTIL UNLOOP UM/MOD UM* U> U< U.R U. TYPE TUCK TRUE TO TIME&DATE THROW THEN SWAP STATE SPACES SPACE SOURCE-ID SOURCE SM/REM SLITERAL SIGN SFLOATS SFLOAT+ SFALIGNED SFALIGN SF@ SF! SET-PRECISION SET-ORDER SET-CURRENT SEE SEARCH-WORDLIST SEARCH SAVE-INPUT S>D S" RSHIFT ROT ROLL RESTORE-INPUT RESIZE-FILE RESIZE REPRESENT REPOSITION-FILE REPEAT RENAME-FILE REFILL RECURSE READ-LINE READ-FILE R@ R> R/W R/O QUIT PREVIOUS PRECISION POSTPONE PICK PARSE PAGE PAD OVER ORDER OR OPEN-FILE ONLY OF NIP NEGATE MS MOVE MOD MIN MAX MARKER M+ M*/ M* LSHIFT LOOP LOCALS
| LITERAL LEAVE KEY? KEY J INVERT INCLUDED INCLUDE-FILE IMMEDIATE IF I HOLD HEX HERE GET-ORDER GET-CURRENT F~ FVARIABLE FTANH FTAN FSWAP FSQRT FSINH FSINCOS FSIN FS. FROUND FROT FREE FOVER FORTH-WORDLIST FORTH FNEGATE FMIN FMAX FM/MOD FLUSH-FILE FLOOR FLOG FLOATS FLOAT+ FLNP1 FLN FLITERAL FIND FILL FILE-STATUS FILE-SIZE FILE-POSITION FEXPM1 FEXP FE. FDUP FDROP FDEPTH FCOSH FCOS FCONSTANT FATANH FATAN2 FATAN FASINH FASIN FALSE FALOG FALIGNED FALIGN FACOSH FACOS FABS F@ F>D F< F0= F0< F/ F. F- F+ F F* F! EXIT EXECUTE EVALUATE ERASE ENVIRONMENT? ENDOF ENDCASE EMIT ELSE EKEY? EKEY>CHAR EKEY DUP DUMP DU< DROP DOES> DO DNEGATE DMIN DMAX DFLOATS DFLOAT+ DFALIGNED DFALIGN DF@ DF! DEPTH DELETE-FILE DEFINITIONS DECIMAL DABS D>S D>F D= D< D2/ D2* D0= D0< D.R D. D- D+ CS-ROLL CS-PICK CREATE-FILE CREATE CR COUNT CONSTANT COMPILE, COMPARE CODE CMOVE> CMOVE CLOSE-FILE CHARS CHAR+ CHAR CELLS CELL+ CATCH CASE C@ C, C" C! BYE BLOCK BLK BLANK BL BIN BEGIN BASE AT-XY ASSEMBLER AND ALSO ALLOT ALLOCATE ALIGNED ALIGN AHEAD AGAIN ACCEPT ABS ABORT" ABORT @ ?DUP ?DO ? >R >NUMBER >IN >FLOAT >BODY > = <> <# < ;CODE ; :NONAME : 2VARIABLE 2SWAP 2ROT 2R@ 2R> 2OVER 2LITERAL 2DUP 2DROP 2CONSTANT 2@ 2>R 2/ 2* 2! 1- 1+ 0> 0= 0<> 0< /STRING /MOD / .S .R .( ." . -TRAILING - , +LOOP +! + */MOD */ * (LOCAL) ( ' #S #> # ! ok |
Opinions:
- "Forth is best used on medium-sized programming projects involving no more than two or three programmers who have compatible programming styles." -- http://users.ece.cmu.edu/~koopman/stack_computers/sec7_2.html
- http://yosefk.com/blog/my-history-with-forth-stack-machines.html
- note however https://news.ycombinator.com/item?id=21154418 , quoted here: "The author learned a little about Forth and was fascinated. He then wrote his own implementation and struggled mightily with it. Ultimately he was disenchanted. I think the heart of the problem is that the rudimentary ideas of forth- a stack, some stack-shuffling words, some arithmetic, control flow, and that's about it- are appealingly simple, but not enough of the language: >"I figured I didn't need the more interesting metaprogramming stuff for my first prototype programs, and I could add it later if it turned out that I was wrong. It was wierd to throw away everything I originally liked the most, but I was all set to start writing real programs. Solving real problems in the real world. It was among the most painful programming experiences in my life." Imagine reading the first chapter of SICP without prior exposure to Lisp, excitedly writing an interpreter for the parts of the language you'd seen so far, and then feeling your enthusiasm slowly slip away into dismay as you attempted to use that stunted fragment of a language to get work done. Forth can be small, and simple, but the real gems of the language are a bit subtle. The many implications of CREATE>/DOES. Abuses of manipulating the rstack. Striking a balance between parsing words, named locals, global variables, vocabularies, and other tools to minimize twisty stack shuffling. Bending the rules at the right times. "
- "Forth, factor and lisp (or scheme) are amongst the languages that are easiest to bootstrap, that alone gives them a right to exist." -- [13]
- "In my personal [imaginary] categorized language list, Forth is in the section named "cute things that doesn't scale", somewhere near Scheme (which they call Lisp when pushing it to unsuspecting students) and _hand-written_ PDP-11 assembly (or hand-written PDP-11 machine code: MACRO-11 is way too powerful for this category)...I'd probably start with Forth on a system with ~16-32K RAM, no C compiler and no GCC port [yet]]." -- [14]
- "You might look at Factor http://factorcode.org/ . It's sort of "scalable Forth"." -- [15]
- "Forth is a fine and capable language, but I tend to agree with those who say that it tends to produce write-only programs. Most of the books on Forth are very old and compare Forth's elegance to Fortran's or COBOL's. I'm of the opinion that C is more expressive and generally produce more elegant and maintainable code...Forth is, to my knowledge, the most compact language allowing high level constructs. It is so compact that Collapse OS' Forth implementation achieves self-hosting with about the same amount of resources than its assembler counterpart!" -- https://collapseos.org/forth.html
- "Language with extremely terse syntax. Just two things - words and spaces between them. No brackets, no operators, no function calls and parameter passing. Only words, words, words. Forth runtime is very simple. No dynamic memory allocation, no garbage collection, no type system, no support for arrays, lists, vectors or hashmaps. Just a flat bytes of memory and stack. You could write your forth system from scratch easily." -- http://brokestream.com/bestlang.html
- http://www.findinglisp.com/blog/2008/06/forth-timelessness-redux.html
- "Regarding Factor – to the extent of my understanding, it's a modern dynamic language, its single similarity to Forth being the stack (for example, its http-get call is followed by nip in their link scraping example.) Perhaps the single isolated feature of Forth most easy to take away to an altogether different environment, but the least favorite of mine (sounds moronic to like Forth and dislike stacks; well, I like Lisp but dislike its mutable, tail-sharing, cons-exposing lists – here, Clojure seems to have done a nice job of salvaging the right things, like macros, and not the wrong but iconic things, like naked cons). Forth's approach to metaprogramming (parsing words) is absent from Factor – not that it's necessarily a great thing if you want to scale, but AFAIK nobody checked (the extent to which you can salvage Forth's metaprogramming without dragging in the rest of Forth is unclear to me.)" -- [16]
- "Are you familiar with how to implement a bytecode virtual machine using a jump table? Forth is pretty much that, but the jump table is extensible. I meant it very literally." [17]
- "Forth is one of the two quintessential extensible languages, the other one being Lisp" [18]
Implementations:
- Jones Forth; an annotated-code tutorial on implementing Forth
- https://www.gnu.org/software/gforth/
- http://www.excamera.com/sphinx/fpga-j1.html
- SwapForth (runs on the j1A) "SwapForth? takes about 5K of the available RAM, and includes a full native compiler, the ANS standard CORE words, and several more modern extensions." [21]
- gforth (eg the J1 CPU's compiler and assembler run under gforth)
- eforth (the J1 paper also cites [22] a newer version of which is at [23]) (eg "much of" the J1 CPU's implementation of Forth core is based on eforth)
- ColorForth
- https://github.com/samueltardieu/picforth/blob/master/doc/picforth.texi
- https://flashforth.com/
- MicroCore? embedded CPU
- b16-small embedded CPU
- eP32 embedded CPU
- http://ficl.sourceforge.net/index.html
- https://github.com/kragen/stoneknifeforth "a tiny self-hosted Forth implementation". See section in [[plChMinLangs.txt?]]
- "Bernard Hodson.. has a Forth interpreter and a library of subroutines that occupies less than 32K"
- http://wayback.archive.org/web/20090209211708/http://www.annexia.org/forth
- https://reddit.com/r/Forth/comments/ay627/ask_rforth_whats_the_smallest_implementation_of/
- Forth for Atmel AVR8 Atmega micro controller family and some variants of the TI MSP430. "AmForth? for the AVR8 needs 8 to 12 KB Flash memory, 80 bytes EEPROM, and 200 bytes RAM for the core system. A similar code for the MSP430 fits into 8KB flash"
- Forth for MSP430 microcontrollers. "fits into 11 kb of flash or fram and runs with at least 512 bytes of ram"
- Mecrisp-Stellaris runs on various ARM Cortex M chips. "fits into 16 kb of flash and runs with at least 1 kb of ram"
- 32-bit/64-bit "Pygmy Forth is a 32-bit/64-bit Forth for Linux/MacOS/Windows. Riscy Pygness is 32-bit Pygmy Forth for ARM processors (and now the ARM Cortex). See forth.html for 16-bit Pygmy Forth for DOS (which runs on all sorts of “DOS”, including FreeDOS? and under various Microsoft Windows operating systems, and under the DOS simulator on Linux) and for links to various Forth information, including ARM my “3 Instruction Forth” paper.
- http://www.camelforth.com/page.php?5 camelforth for Z80
- "About 6K of PROM and 640 bytes of RAM are used by CamelForth?, plus whatever additional PROM and RAM is needed by your program"
- "I've been told that a set of 13 primitives is sufficient to define all of Forth -- a very slow Forth. eForth [2], designed for easy porting, had a more generous set of 31 primitives" -- http://www.bradrodriguez.com/papers/moving5.htm
- https://github.com/jkotlinski/durexforth C64 Forth
- https://forthworks.com/retro "RETRO is a clean, elegant, and pragmatic dialect of Forth. It provides a simple alternative for those willing to make a break from legacy systems. The language draws influences from many sources including traditional Forth systems, cmForth, colorForth, Factor, and Parable. It was designed to be easy to grasp and adapt to specific uses. The basic language is very portable. It runs on a tiny virtual machine (Nga), which is written in C. There are multiple interface options, the main one (rre) is buildable with just the standard C compiler and libraries on most systems." http://forthworks.com/retro/book.html
- https://gitlab.com/akkartik/reva-forth/blob/master/src/core.asm
- " I am particularly attracted to Reva Forth which bootstraps from sub-2k LoC? of (32-bit) x86 Assembly without any dependence on the C stack. But it's been challenging to understand the implementation; just figuring out what form a function expects its input data to be in has been non-trivial. " -- [24]
Retrospectives:
Forth Tutorials and books: