Forth dialects
eForth (a (popular?) portable Forth with 31 primitives)
" The following registers are required for a virtual Forth computer:
Forth Register   8086 Register   Function
IP               SI              Interpreter Pointer
SP               SP              Data Stack Pointer
RP               RP              Return Stack Pointer
WP               AX              Word or Work Pointer
UP               (in memory)     User Area Pointer
...
eForth Kernel
System interface: BYE, ?rx, tx!, !io
Inner interpreters: doLIT, doLIST, next, ?branch, branch, EXECUTE, EXIT
Memory access: !, @, C!, C@
Return stack: RP@, RP!, R>, R@, >R
Data stack: SP@, SP!, DROP, DUP, SWAP, OVER
Logic: 0<, AND, OR, XOR
Arithmetic: UM+ " -- http://www.exemark.com/FORTH/eForthOverviewv5.pdf
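To get a feel for how far such a small kernel goes, here is a minimal Python model (not eForth source) of three of the primitives on a data stack, with +, NOT, NEGATE and - built only from them. The 16-bit cell size, the Python helper names and the run function are assumptions made for this sketch; the colon definitions in the comments follow the usual eForth style of deriving + from UM+.

```python
MASK = 0xFFFF          # assumed 16-bit cells, as on the 8086 target

stack = []             # data stack; the end of the list is the top of stack

def push(x):
    stack.append(x & MASK)

# --- primitives (a subset of the kernel list above) ---
def um_plus():         # UM+ ( u1 u2 -- sum carry )
    b, a = stack.pop(), stack.pop()
    total = a + b
    push(total)        # low 16 bits of the sum
    push(total >> 16)  # carry: 0 or 1

def xor():             # XOR ( x1 x2 -- x3 )
    b, a = stack.pop(), stack.pop()
    push(a ^ b)

def drop():            # DROP ( x -- )
    stack.pop()

# --- high-level words built only from the primitives above ---
def plus():            # : + UM+ DROP ;
    um_plus()
    drop()

def invert():          # : NOT -1 XOR ;
    push(-1)
    xor()

def negate():          # : NEGATE NOT 1 + ;
    invert()
    push(1)
    plus()

def minus():           # : - NEGATE + ;
    negate()
    plus()

def run(*items):
    """Clear the stack, then push numbers and call words left to right."""
    stack.clear()
    for item in items:
        if callable(item):
            item()
        else:
            push(item)
    return list(stack)
```

For example, run(7, 5, minus) models "7 5 -" and leaves a single 2 on the stack, using nothing but UM+, XOR and DROP underneath.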
---
CamelForth
Here are CamelForth's ~70 primitives:
http://www.bradrodriguez.com/papers/glosslo.txt
design rationale for this choice:
" 1. Fundamental arithmetic, logic, and memory operators are CODE. 2. If a word can't be easily or efficiently written (or written at all) in terms of other Forth words, it should be CODE (e.g., U<, RSHIFT). 3. If a simple word is used frequently, CODE may be worthwhile (e.g., NIP, TUCK). 4. If a word requires fewer bytes when written in CODE, do so (a rule I learned from Charles Curley). 5. If the processor includes instruction support for a word's function, put it in CODE (e.g. CMOVE or SCAN on a Z80 or 8086). 6. If a word juggles many parameters on the stack, but has relatively simple logic, it may be better in CODE, where the parameters can be kept in registers. 7. If the logic or control flow of a word is complex, it's probably better in high-level Forth. " -- http://www.bradrodriguez.com/papers/moving5.htm
" ANS Forth Core words
These are required words whose definitions are specified by the ANS Forth document.
!         x a-addr --                        store cell in memory
+         n1/u1 n2/u2 -- n3/u3               add n1+n2
+!        n/u a-addr --                      add cell to memory
-         n1/u1 n2/u2 -- n3/u3               subtract n1-n2
<         n1 n2 -- flag                      test n1<n2, signed
=         x1 x2 -- flag                      test x1=x2
>         n1 n2 -- flag                      test n1>n2, signed
>R        x --   R: -- x                     push to return stack
?DUP      x -- 0 | x x                       DUP if nonzero
@         a-addr -- x                        fetch cell from memory
0<        n -- flag                          true if TOS negative
0=        n/u -- flag                        return true if TOS=0
1+        n1/u1 -- n2/u2                     add 1 to TOS
1-        n1/u1 -- n2/u2                     subtract 1 from TOS
2*        x1 -- x2                           arithmetic left shift
2/        x1 -- x2                           arithmetic right shift
AND       x1 x2 -- x3                        logical AND
CONSTANT  n --                               define a Forth constant
C!        c c-addr --                        store char in memory
C@        c-addr -- c                        fetch char from memory
DROP      x --                               drop top of stack
DUP       x -- x x                           duplicate top of stack
EMIT      c --                               output character to console
EXECUTE   i*x xt -- j*x                      execute Forth word 'xt'
EXIT      --                                 exit a colon definition
FILL      c-addr u c --                      fill memory with char
I         -- n   R: sys1 sys2 -- sys1 sys2   get the innermost loop index
INVERT    x1 -- x2                           bitwise inversion
J         -- n   R: 4*sys -- 4*sys           get the second loop index
KEY       -- c                               get character from keyboard
LSHIFT    x1 u -- x2                         logical L shift u places
NEGATE    x1 -- x2                           two's complement
OR        x1 x2 -- x3                        logical OR
OVER      x1 x2 -- x1 x2 x1                  per stack diagram
ROT       x1 x2 x3 -- x2 x3 x1               per stack diagram
RSHIFT    x1 u -- x2                         logical R shift u places
R>        -- x   R: x --                     pop from return stack
R@        -- x   R: x -- x                   fetch from rtn stk
SWAP      x1 x2 -- x2 x1                     swap top two items
UM*       u1 u2 -- ud                        unsigned 16x16->32 mult.
UM/MOD    ud u1 -- u2 u3                     unsigned 32/16->16 div.
UNLOOP    --   R: sys1 sys2 --               drop loop parms
U<        u1 u2 -- flag                      test u1<u2, unsigned
VARIABLE  --                                 define a Forth variable
XOR       x1 x2 -- x3                        logical XOR
ANS Forth Extensions
These are optional words whose definitions are specified by the ANS Forth document.
<>        x1 x2 -- flag                      test not equal
BYE       i*x --                             return to CP/M
CMOVE     c-addr1 c-addr2 u --               move from bottom
CMOVE>    c-addr1 c-addr2 u --               move from top
KEY?      -- flag                            return true if char waiting
M+        d1 n -- d2                         add single to double
NIP       x1 x2 -- x2                        per stack diagram
TUCK      x1 x2 -- x2 x1 x2                  per stack diagram
U>        u1 u2 -- flag                      test u1>u2, unsigned
Private Extensions
These are words which are unique to CamelForth. Many of these are necessary to implement ANS Forth words, but are not specified by the ANS document. Others are functions I find useful.
(do)      n1|u1 n2|u2 --   R: -- sys1 sys2   run-time code for DO
(loop)    R: sys1 sys2 --                    run-time code for LOOP
(+loop)   n --   R: sys1 sys2 --             run-time code for +LOOP
><        x1 -- x2                           swap bytes
?branch   x --                               branch if TOS zero
BDOS      DE C -- A                          call CP/M BDOS
branch    --                                 branch always
lit       -- x                               fetch inline literal to stack
PC!       c p-addr --                        output char to port
PC@       p-addr -- c                        input char from port
RP!       a-addr --                          set return stack pointer
RP@       -- a-addr                          get return stack pointer
SCAN      c-addr1 u1 c -- c-addr2 u2         find matching char
SKIP      c-addr1 u1 c -- c-addr2 u2         skip matching chars
SP!       a-addr --                          set data stack pointer
SP@       -- a-addr                          get data stack pointer
S=        c-addr1 c-addr2 u -- n             string compare n<0: s1<s2, n=0: s1=s2, n>0: s1>s2
USER      n --                               define user variable 'n' "
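The mixed-precision words in the glossary above (UM*, UM/MOD) and the byte swap >< are easier to read with concrete numbers. The following Python sketch models their stack effects under the assumption of 16-bit cells; the function names and the tuple return convention are made up for illustration and are not part of CamelForth.

```python
CELL_BITS = 16                      # assumed cell width
MASK = (1 << CELL_BITS) - 1

def um_star(u1, u2):
    """UM* ( u1 u2 -- ud ): unsigned 16x16 -> 32-bit product.

    Returned as (low cell, high cell); on a Forth stack the high
    cell of a double number sits on top.
    """
    product = (u1 & MASK) * (u2 & MASK)
    return product & MASK, product >> CELL_BITS

def um_slash_mod(ud_low, ud_high, u1):
    """UM/MOD ( ud u1 -- u2 u3 ): unsigned 32/16 division.

    Returns (remainder, quotient). A quotient that does not fit in
    one cell is not handled in this sketch.
    """
    ud = ((ud_high & MASK) << CELL_BITS) | (ud_low & MASK)
    return ud % u1, ud // u1

def swap_bytes(x):
    """>< ( x1 -- x2 ): exchange the two bytes of a 16-bit cell."""
    return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)
```

Dividing the product of two cells by one of the factors gives back the other factor with remainder 0, which is a quick sanity check on the pairing of UM* and UM/MOD.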
Retro (forth)
http://www.forthworks.com/retro https://forthworks.com/retro http://forthworks.com/retro/book.html http://forth.works/doc.html http://retroforth.org/Handbook-Latest.txt
The following notes are about Retro version 12.
Retro can be built in a reduced memory configuration that requires only 96 KiB of memory (~70 KiB for the image, 0.5 KiB data stack, 1 KiB address stack, ~24 KiB remaining RAM available for use by the VM).
By contrast, "The standard system is configured with a very deep data stack (around 2,000 items) and an address stack that is 3x deeper." [1]
Retro syntax
Much of this section is copied from http://forthworks.com/retro/book.html
Input is divided into a series of whitespace delimited tokens. Each of these is then processed individually. There are no parsing words in RETRO.
Tokens may have a single character prefix, which RETRO will use to decide how to process the token.
The major prefixes are:
Prefix   Used For
@        Fetch from variable
!        Store into variable
&        Pointer to named item
#        Numbers
$        ASCII characters
'        Strings
(        Comments
:        Define a word
[        Quotes (note: I added this; it is not in the 'major prefixes' table in the documentation)
{        Arrays (note: I added this; it is not in the 'major prefixes' table in the documentation)
Comments are delimited by ().
Constant literals:
- Numeric constants are prefixed by #
- String constants are prefixed by '
- Character constants are prefixed by $
- Floating point constants are prefixed by .
- Array literal constants are delimited by {}
- Quotation ("quote") constant literals are delimited by []
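The token-and-prefix scheme described above is simple enough to sketch. The following Python fragment is a rough model, not Retro's actual interpreter: it splits input on whitespace and uses the first character of each token to decide how the token would be handled. The kind names and the classify and scan functions are hypothetical, and only the single-character prefixes from the table are covered.

```python
PREFIXES = {
    "#": "number",
    "$": "character",
    "'": "string",
    "(": "comment",
    ":": "define",
    "@": "fetch",
    "!": "store",
    "&": "pointer",
}

def classify(token):
    """Return (kind, payload) for one whitespace-delimited token."""
    kind = PREFIXES.get(token[0])
    if kind is None:
        # No known prefix: the token would be looked up as a word name.
        return ("word", token)
    return (kind, token[1:])

def scan(source):
    """Split source on whitespace and classify every token."""
    return [classify(token) for token in source.split()]
```

Running scan on ":palindrome dup s:reverse s:eq? ;" yields one "define" token for palindrome followed by four plain words, which shows why no parsing words are needed: every token can be handled on its own.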
Data types used in Retro's stack notation (provided via comments in many word definitions):
- numeric values
- string
- variable
- pointer
- quotation
- dictionary header
- TRUE or FALSE flag
To add words (which are Forth's equivalent to functions) to the "dictionary", use the ':' prefix with the name of the word, then the word definition, then ';' to terminate, e.g.:
:palindrome dup s:reverse s:eq? ;
Most Retro code is written using a literate file format called Unu, in which only code within ~~~-delimited blocks is executed. Unu has #-style headings: "# This is a heading".
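As a rough illustration of the Unu idea, here is a small Python sketch that keeps only the lines inside ~~~-delimited blocks and drops the commentary. It assumes a fence is a line consisting of exactly ~~~ (ignoring surrounding whitespace); the function name and the sample document are invented for this example, and the real Unu tool may treat edge cases differently.

```python
def unu_code_lines(text):
    """Yield only the lines that sit inside ~~~ fenced blocks."""
    in_code = False
    for line in text.splitlines():
        if line.strip() == "~~~":
            in_code = not in_code      # a fence line toggles code mode
        elif in_code:
            yield line

SAMPLE = """\
# Palindromes

A string is a palindrome if it equals its own reverse.

~~~
:palindrome dup s:reverse s:eq? ;
~~~

This commentary is ignored.
"""
```

Here list(unu_code_lines(SAMPLE)) keeps just the one definition line; the heading and both commentary paragraphs are skipped.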
Some notes on basic data types
" Strings in RETRO are NULL terminated sequences of values representing characters. Being NULL terminated, they can’t contain a NULL (ASCII 0).
The character words in RETRO are built around ASCII, but strings can contain UTF8 encoded data if the host platform allows. Words like s:length will return the number of bytes, not the number of logical characters in this case.
...
Some RETRO systems include support for floating point numbers. When present, this is built over the system libm using the C double type. ... Floating point values exist on a separate stack, and are bigger than the standard memory cells, so can not be directly stored and fetched from memory.
The floating point system also provides an alternate stack that can be used to temporarily store values. ... Floating point words are in the f: namespace. There is also a related e: namespace for encoded values, which allows storing of floats in standard memory. ... A combinator is a function that consumes functions as input. They are used heavily by the RETRO system. " -- http://forthworks.com/retro/book.html
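The remark above that s:length counts bytes rather than logical characters for UTF-8 data can be seen with a tiny Python model. This is only a sketch of a NULL-terminated byte string, not of Retro's memory layout; both function names are hypothetical.

```python
def c_string_bytes(text):
    """Encode text as UTF-8 and append the terminating NULL."""
    data = text.encode("utf-8")
    if 0 in data:
        raise ValueError("a NULL-terminated string cannot contain NULL")
    return data + b"\x00"

def byte_length(c_string):
    """Count the values before the first NULL, as a byte-wise length would."""
    return c_string.index(0)
```

For a pure ASCII string the two counts agree, but a string such as "naïve" has 5 logical characters and a byte length of 6, because the ï takes two bytes in UTF-8.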
Retro standard library and namespaces
" Words are grouped into broad namespaces by attaching a short prefix string to the start of a name.
The common namespaces are:
Prefix Contains
- a: Words operating on simple arrays
- ASCII: ASCII character constants for control characters
- buffer: Words for operating on a simple linear LIFO buffer
- c: Words for operating on ASCII character data
- class: Contains class handlers for words
- d: Words operating on the Dictionary
- err: Words for handling errors
- io: General I/O words
- n: Words operating on numeric data
- prefix: Contains prefix handlers
- s: Words operating on string data
- v: Words operating on variables
- file: File I/O words
- f: Floating Point words
- gopher: Gopher protocol words
- unix: Unix system call words "
The "Glossary" is the (documentation of the?) standard library. It can be browsed online at http://forthworks.com:9999/
The words that I noticed mentioned in http://forthworks.com/retro/book.html are (much of the following is copied from there):
Combinators:
- Compositional Combinators:
- curry (takes a value and a quote and returns a new quote applying the specified quote to the specified value)
- example of curry: :acc (n-) here swap , [ dup v:inc fetch ] curry ;
- Execution Flow Combinators:
- call: takes a quote and executes it
- Conditional execution flow:
- choose: takes a flag and two quotes from the stack. If the flag is true, the first quote is executed. If false, the second quote is executed
- if: takes a flag and one quote from the stack. If the flag is true, the quote is executed. If false, the quote is discarded
- -if: takes a flag and one quote from the stack. If the flag is false, the quote is executed. If true, the quote is discarded
- case: takes two numbers and a quote. The initial value is compared to the second one. If they match, the quote is executed. If false, the quote is discarded and the initial value is left on the stack. Additionally, if the first value was matched, case will exit the calling function, but if false, it returns to the calling function.
- example of case: :test (n-) #1 [ 'Yes s:put ] case #2 [ 'No s:put ] case drop 'No_idea s:put ;
- s:case: like case, but for strings instead of simple values.
- Looping execution flow:
- while: takes a quote from the stack and executes it repeatedly as long as the quote returns a true flag on the stack. This flag must be well formed and equal -1 or 0.
- example of while: #10 [ dup n:put sp n:dec dup #0 -eq? ] while
- times: takes a count and quote from the stack. The quote will be executed the number of times specified.
- times<with-index>: like times, but also provides access to the loop index (via I) and parent loop indexes (via J and K).
- example of times<with-index>: #10 [ I n:put sp ] times<with-index>
- Data Flow:
- Preserving data flow (execute code while preserving portions of the data stack):
- dip: takes a value and a quote, moves the value off the main stack temporarily, executes the quote, and then restores the value.
- example of dip: "#10 #20 [ n:inc ] dip" will yield the following on the stack: 11 20
- sip: like dip, but leaves a copy of the original value on the stack during execution of the quote
- example of sip: "#10 [ n:inc ] sip" will yield the following on the stack: 11 10
- Cleave data flow (apply multiple quotations to a single value or set of values):
- bi: takes a value and two quotes. It applies each quote to a copy of the value
- tri: takes a value and three quotes. It applies each quote to a copy of the value.
- Spread data flow (apply multiple quotations to multiple values):
- bi*: takes two values and two quotes. It applies the first quote to the first value and the second quote to the second value.
- tri*: same as bi* but with 3 values and 3 quotes
- Apply data flow (apply a single quotation to multiple values):
- bi@: takes two values and a quote. It applies the quote to each value.
- tri@: same as bi@ but with 3 values and a quote.
- various for-each combinators for various data structures
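To check my reading of the combinator descriptions above, here is a small Python model in which the data stack is a list (top at the end) and a quote is any function that takes the stack. It is a sketch of the described behaviour only, not of how Retro implements these words; all Python names are made up, and flags follow the -1/0 convention mentioned for while.

```python
TRUE, FALSE = -1, 0    # well-formed flags

def curry(value, quote):
    """Return a new quote that pushes value and then runs quote."""
    def curried(stack):
        stack.append(value)
        quote(stack)
    return curried

def choose(stack, if_true, if_false):
    """Pop a flag; run the first quote if true, else the second."""
    flag = stack.pop()
    (if_true if flag != FALSE else if_false)(stack)

def times(stack, quote):
    """Pop a count and run the quote that many times."""
    for _ in range(stack.pop()):
        quote(stack)

def while_(stack, quote):
    """Run the quote repeatedly until it leaves a false flag."""
    while True:
        quote(stack)
        if stack.pop() == FALSE:
            break

def dip(stack, quote):
    """Hide the top value, run the quote, then restore the value."""
    hidden = stack.pop()
    quote(stack)
    stack.append(hidden)

def sip(stack, quote):
    """Run the quote on the top value, then push a copy of the original."""
    copy = stack[-1]
    quote(stack)
    stack.append(copy)

def bi(stack, q1, q2):
    """Apply q1 and q2 each to a copy of the same value."""
    sip(stack, q1)
    q2(stack)

def bi_star(stack, q1, q2):
    """bi*: apply q1 to the first value and q2 to the second."""
    dip(stack, q1)
    q2(stack)

def bi_at(stack, quote):
    """bi@: apply the same quote to each of two values."""
    bi_star(stack, quote, quote)

def n_inc(stack):
    stack.append(stack.pop() + 1)

def double(stack):
    stack.append(stack.pop() * 2)
```

With this model, dip applied to the stack 10 20 with n_inc gives 11 20, and sip applied to 10 with n_inc gives 11 10, matching the dip and sip examples in the list.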
Stack ops:
- Data stack shufflers (to reorder the data stack): dup drop swap over tuck nip rot
- reorder: for example, let’s say we have four values on the data stack: #1 #2 #3 #4. And we want them to become: #4 #3 #2 #1. We can do "'abcd 'dcba reorder".
- reset: empty the data stack
- push, pop: move values from the data stack to the address stack, and from the address stack to the data stack
- depth: find out how many items are on the data stack
- dump-stack: display the data stack
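The reorder word is the least obvious of the stack operations above, so here is a Python sketch of how I understand it: the first string names the top values of the stack, and the second string says in which order those names should be pushed back. The function signature is invented for this model, and letting a name repeat or be omitted in the second string is an assumption.

```python
def reorder(stack, before, after):
    """Model of: 'before 'after reorder (top of stack is the list's end)."""
    split = len(stack) - len(before)
    values = stack[split:]
    del stack[split:]
    named = dict(zip(before, values))      # e.g. a -> 1, b -> 2, ...
    stack.extend(named[name] for name in after)
    return stack
```

For the example in the list, reorder([1, 2, 3, 4], "abcd", "dcba") returns [4, 3, 2, 1].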
Arrays:
- array creation from quotations:
- a:counted-results: takes a quotation and builds an array from the values it returns; the quotation apparently also returns the number of values to store (I don't fully understand this)
- a:make: takes a quotation and returns the values it produces as an array (I don't fully understand this)
- a:nth: array access. Returns a variable that 'fetch' can be applied to
- example of a:nth: { #1 #2 #3 #4 } #3 a:nth fetch
- a:length: size
- a:dup: copy
- a:filter: takes an array and a quote, and returns an array. The quote will be passed each value in the array and should return TRUE or FALSE. Values that lead to TRUE will be collected into a new array.
- example of a:filter: { #1 #2 #3 #4 #5 #6 #7 #8 } [ n:even? ] a:filter
- a:map: applies a quotation to each item in an array and constructs a new array from the returned values.
- example of a:map: { #1 #2 #3 } [ #10 * ] a:map
- a:reduce: takes an array, a starting value, and a quote. It executes the quote once for each item in the array, passing the item and the value to the quote. The quote should consume both and return a new value.
- example of a:reduce: { #1 #2 #3 } #0 [ + ] a:reduce
- a:contains?: takes an array and a numeric value. Search the array for the numeric value and return either TRUE or FALSE.
- a:contains-string?: search an array for a string value
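For comparison, the array combinators above map closely onto familiar list operations. This Python sketch models the three examples from the list with plain lists and functions standing in for arrays and quotes; the a_* names are just Python-friendly stand-ins, not Retro words.

```python
from functools import reduce

def a_filter(array, quote):
    """Collect the values for which the quote returns a true result."""
    return [value for value in array if quote(value)]

def a_map(array, quote):
    """Build a new array from the quote's result for each item."""
    return [quote(value) for value in array]

def a_reduce(array, start, quote):
    """Fold the array into one value, starting from start."""
    return reduce(quote, array, start)

def a_contains(array, value):
    """True if the numeric value occurs in the array."""
    return value in array

evens = a_filter([1, 2, 3, 4, 5, 6, 7, 8], lambda n: n % 2 == 0)
tens = a_map([1, 2, 3], lambda n: n * 10)
total = a_reduce([1, 2, 3], 0, lambda acc, n: acc + n)
```

Here evens, tens and total correspond to the a:filter, a:map and a:reduce examples and come out as [2, 4, 6, 8], [10, 20, 30] and 6.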
Buffers:
Simple linear LIFO buffer. RETRO provides words for operating on a linear memory area. This can be useful in building strings or custom data structures.
A buffer is a linear sequence of memory. The buffer words provide a means of incrementally storing and retrieving values from it. The buffer words keep track of the start and end of the buffer. They also ensure that an ASCII:NULL is written after the last value, which makes using them for string data easy.
Only one buffer can be active at a time. RETRO provides a buffer:preserve combinator to allow using a second one before returning to the prior one.
- buffer:set: set the active buffer. Takes an address. The buffer will be assumed to be empty. The initial value will be set to ASCII:NULL.
- buffer:add: append a value to the buffer. This takes a single value and will also add an ASCII:NULL after the end of the buffer.
- buffer:get: return the last value. Removes the value and sets an ASCII:NULL in the memory location the returned value occupied.
- buffer:start: return the initial address in the buffer
- buffer:end: return the last address (ignoring the ASCII:NULL)
- buffer:size: return the number of values in the buffer
- buffer:empty: reset a buffer to the empty state
Example: 'Test d:create #1025 allot &Test buffer:set #100 buffer:add buffer:get n:put nl
(note: I think n:put prints numeric data, and nl prints a newline)
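The buffer words described above can be modelled in a few lines of Python. This is a sketch of the described behaviour (LIFO access plus an ASCII:NULL kept after the last value), not Retro's implementation; the class, its method names and the use of a Python list as memory are assumptions, and underflow is not checked.

```python
NULL = 0   # stands in for ASCII:NULL

class Buffer:
    """Linear LIFO buffer living inside a shared memory list."""

    def __init__(self, memory):
        self.memory = memory
        self.start = 0
        self.next = 0            # address where the next value will go

    def set(self, address):      # buffer:set
        self.start = self.next = address
        self.memory[address] = NULL

    def add(self, value):        # buffer:add
        self.memory[self.next] = value
        self.next += 1
        self.memory[self.next] = NULL   # keep the data NULL terminated

    def get(self):               # buffer:get
        self.next -= 1
        value = self.memory[self.next]
        self.memory[self.next] = NULL
        return value

    def size(self):              # buffer:size
        return self.next - self.start

    def empty(self):             # buffer:empty
        self.set(self.start)
```

Because add always writes the terminator, the region from start onward can be read as a NULL-terminated string at any moment, which is the property the description points out.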
Characters:
- character classification: c:consonant? c:digit? c:letter? c:lowercase? c:uppercase? c:visible? c:vowel? c:whitespace?
- and the corresponding negated forms: c:-consonant? c:-digit? c:-lowercase? c:-uppercase? c:-visible? c:-vowel? c:-whitespace?
- character conversion (takes a character and returns a character): c:to-lower c:to-number c:to-upper c:toggle-case
- c:to-string: takes a character and creates a new temporary string with the character
- c:put: display/print a character
- c:get: read input (some platforms only) (may be buffered, depending on platform)
The Dictionary:
The Dictionary is a linked list containing the dictionary headers.
'Dictionary' is a variable holding a pointer to the most recent header.
Dictionary Header Structure (may change in future)
Offset   Contains
0000     Link to Prior Header
0001     Link to XT
0002     Link to Class Handler
0003+    Word name (null terminated)
Don't use that structure directly; use the accessor words (see below).
- Dictionary Header structure access:
- d:xt, d:class, d:name to access the address of each specific field
- There is no d:link, as the link will always be the first
- Shortcuts For The Latest Header:
- d:last returns a pointer to the latest header
- d:last<xt>: the d:xt field for the latest header
- d:last<class>: similar
- d:last<name>: similar
- Adding headers:
- d:create: takes a string for the name and makes a new header with the class set to class:data and the XT field pointing to here.
- example of d:create: 'Base d:create
- d:add-header. This takes a string, a pointer to the class handler, and a pointer for the XT field and builds a new header using these.
- example of d:add-header: 'Base &class:data #10000 d:add-header
- Searching the dictionary
- d:lookup: takes a string and tries to find it in the dictionary. It will return a pointer to the dictionary header or a value of zero if the word was not found.
- d:lookup-xt: takes a pointer and will return the dictionary header that has this as the d:xt field, or zero if no match is found.
- d:for-each: iterate over all entries in the dictionary. For each entry, this combinator will push a pointer to the entry to the stack and call the quotation.
- example of d:foreach: [ d:name s:put sp ] d:for-each
- d:words: list the names of all words in the dictionary
- d:words-with: example: 'class: d:words-with
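To make the linked-list layout concrete, here is a Python sketch that stores headers in a flat list of cells using the offsets from the header table and walks the links to look a name up. It only models the structure described above: the memory size, the here allocator, the numeric class and XT values, and the function names are all hypothetical.

```python
memory = [0] * 64      # flat cell memory; address 0 doubles as "no header"
here = 1               # next free cell (hypothetical allocator)
dictionary = 0         # like the Dictionary variable: most recent header

LINK, XT, CLASS, NAME = 0, 1, 2, 3   # field offsets within a header

def add_header(name, class_handler, xt):
    """Rough model of d:add-header: build a header and link it in."""
    global here, dictionary
    header = here
    memory[header + LINK] = dictionary
    memory[header + XT] = xt
    memory[header + CLASS] = class_handler
    for i, ch in enumerate(name):
        memory[header + NAME + i] = ord(ch)
    memory[header + NAME + len(name)] = 0      # null terminator
    here = header + NAME + len(name) + 1
    dictionary = header
    return header

def name_of(header):
    """Read the null-terminated name stored in a header."""
    chars = []
    addr = header + NAME
    while memory[addr] != 0:
        chars.append(chr(memory[addr]))
        addr += 1
    return "".join(chars)

def lookup(name):
    """Rough model of d:lookup: follow the links; 0 means not found."""
    header = dictionary
    while header != 0:
        if name_of(header) == name:
            return header
        header = memory[header + LINK]
    return 0
```

Since the search starts at the most recent header, a word defined later with the same name is found first, which is the usual Forth redefinition behaviour.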
Floating point:
Ordinary (non-floating-point) numbers in RETRO are signed 32-bit integers with a range of -2,147,483,648 to 2,147,483,647. Floating point values are handled separately, by the words below.
- primary floating point stack ops: f:nip f:over f:depth f:drop f:drop-pair f:dup f:dup-pair f:dump-stack f:tuck f:swap f:rot
- secondary floating point stack ops: f:push f:pop f:adepth f:dump-astack
- constants: f:E f:-INF f:INF f:NAN f:PI
- comparisons: f:-eq? f:between? f:eq? f:gt? f:lt? f:negative? f:positive? f:case f:-inf? f:inf? f:nan?
- arithmetic ops: f:* f:+ f:- f:/ f:abs f:floor f:inc f:limit f:max f:min f:negate f:power f:ceiling f:dec f:log f:sqrt f:square f:round f:sign f:signed-sqrt f:signed-square
- trig: f:acos f:asin f:atan f:cos f:sin f:tan
- conversions: f:to-number f:to-string
- f:put: display/print
- Floating point Encoded Values: RETRO provides a means of encoding and decoding floating point values into standard integer cells. This is based on the paper “Encoding floating point values to shorter integers” by Kiyoshi Yoneda and Charles Childers.
- f:E1 f:to-e e:-INF e:-inf? e:INF e:MAX e:MIN e:NAN e:clip e:inf? e:max? e:min? e:n? e:nan? e:put e:to-f e:zero?
Pointers:
- fetch: takes a pointer and returns the value at that address
- store: takes a number and a pointer and sets that address to that value
- instead of passing a single-word quotation to a combinator, you can substitute a word with a & prefix, for example &n:negate may be substituted for [ n:negate ]. This saves a level of call/return by avoiding the quotation, and is more readable.
Variables:
- var: takes a string and creates a variable with that name (the variable is initialized to 0)
Strings:
At the interpreter, strings get allocated in a rotating buffer. If you need to keep them around, use s:keep or s:copy to move them to more permanent storage. In a definition, the string is compiled inline and so is in permanent memory.
- s:copy: copy a string (the copy is in permanent memory)
- s:keep: place a string into permanent memory
- s:temp: copy string to the rotating buffer.
Strings are mutable.
- searching: s:contains-char? s:contains-string? s:index-of s:index-of-string
- comparisons: s:eq? s:case
- substrings:
- s:left: a new string containing the first n characters from a source string. Example: 'Hello_World #5 s:left
- s:right: a new string containing the last n characters from a source string
- s:substr: takes a string, the offset of the first character, and the number of characters to extract
- joining:
- s:append: example: 'First 'Second s:append
- s:prepend: example: 'Second 'First s:prepend
- tokenization: s:tokenize s:tokenize-on-string s:split s:split-on-string
- case conversion: s:to-lower s:to-upper
- s:to-number
- s:chop: remove the last character from a string. This is done by replacing it with an ASCII:NULL.
- s:trim-left, s:trim-right, s:trim: remove leading, trailing, and (both leading and trailing) whitespace from a string
- string combinators: s:for-each s:filter s:map
- misc string ops: s:evaluate s:copy s:reverse s:hash s:length s:replace s:format s:empty
- variables controlling the temporary string buffer: TempStrings (default 32), TempStringMax
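The rotating buffer for temporary strings, and why s:keep matters, can be illustrated with a small Python model. It is a sketch under assumptions: the class and method names are invented, slots are Python strings rather than memory regions, and the slot_size default merely stands in for TempStringMax, whose real default I have not checked.

```python
class TempStringPool:
    """Rotating pool of temporary string slots plus permanent storage."""

    def __init__(self, slots=32, slot_size=512):
        # slots mirrors TempStrings (default 32); slot_size is an assumed
        # stand-in for TempStringMax
        self.slots = [""] * slots
        self.slot_size = slot_size
        self.next_slot = 0
        self.permanent = []

    def temp(self, text):
        """Like s:temp: store text in the next slot and return its index."""
        if len(text) >= self.slot_size:
            raise ValueError("string too long for a temporary slot")
        index = self.next_slot
        self.slots[index] = text
        self.next_slot = (index + 1) % len(self.slots)
        return index

    def keep(self, index):
        """Like s:keep: copy a temporary string into permanent storage."""
        self.permanent.append(self.slots[index])
        return len(self.permanent) - 1
```

With a pool of only two slots, the third temporary string overwrites the first one, while a copy made earlier with keep is unaffected.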