proj-plbook-plChForthDialects

Forth dialects

eForth (a (popular?) portable Forth with 31 primitives)

" The following registers are required for a virtual Forth computer:

  Forth Register   8086 Register   Function
  IP               SI              Interpreter Pointer
  SP               SP              Data Stack Pointer
  RP               BP              Return Stack Pointer
  WP               AX              Word or Work Pointer
  UP               (in memory)     User Area Pointer

...

eForth Kernel

  System interface:   BYE, ?rx, tx!, !io
  Inner interpreters: doLIT, doLIST, next, ?branch, branch, EXECUTE, EXIT
  Memory access:      !, @, C!, C@
  Return stack:       RP@, RP!, R>, R@, >R
  Data stack:         SP@, SP!, DROP, DUP, SWAP, OVER
  Logic:              0<, AND, OR, XOR
  Arithmetic:         UM+

" -- http://www.exemark.com/FORTH/eForthOverviewv5.pdf
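As a rough illustration of how the registers and the inner-interpreter primitives fit together, here is a toy model in Python. It is a sketch under stated assumptions, not eForth's actual code: the class `Machine` and its methods are hypothetical names, a colon definition is modelled as a Python list (standing in for doLIST), cells are assumed to be 16 bits wide, and `+` is built from UM+ followed by DROP.

```python
# Toy threaded-code machine (illustrative only; not eForth's real implementation).
class Machine:
    def __init__(self):
        self.data = []    # data stack (what SP points into)
        self.ret = []     # return stack (what RP points into)
        self.ip = None    # interpreter pointer: (thread, index) or None when idle

    def execute(self, word):
        if callable(word):            # primitive: run it directly
            word(self)
        else:                         # colon definition: nest into its thread (doLIST)
            self.ret.append(self.ip)
            self.ip = (word, 0)

    def run(self, word):
        self.ip = None
        self.execute(word)
        while self.ip is not None:    # 'next': fetch a word, advance IP, execute it
            thread, i = self.ip
            self.ip = (thread, i + 1)
            self.execute(thread[i])

def EXIT(m):                          # return to the calling thread
    m.ip = m.ret.pop()

def doLIT(m):                         # push the cell that follows in the thread
    thread, i = m.ip
    m.data.append(thread[i])
    m.ip = (thread, i + 1)

def DUP(m):
    m.data.append(m.data[-1])

def DROP(m):
    m.data.pop()

def UM_PLUS(m):                       # UM+ : add two unsigned cells, push sum and carry
    b, a = m.data.pop(), m.data.pop()
    total = a + b
    m.data.append(total & 0xFFFF)     # assumed 16-bit cell
    m.data.append(total >> 16)        # carry flag (0 or 1)

PLUS = [UM_PLUS, DROP, EXIT]          # : + UM+ DROP ;
DOUBLE = [DUP, PLUS, EXIT]            # : DOUBLE DUP + ;   (hypothetical word)
MAIN = [doLIT, 21, DOUBLE, EXIT]
```

Running `Machine().run(MAIN)` leaves a single value on the data stack and an empty return stack. The return stack holds saved interpreter pointers while definitions are nested.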

---

CamelForth

Here are CamelForth's ~70 primitives:

http://www.bradrodriguez.com/papers/glosslo.txt

design rationale for this choice:

" 1. Fundamental arithmetic, logic, and memory operators are CODE. 2. If a word can't be easily or efficiently written (or written at all) in terms of other Forth words, it should be CODE (e.g., U<, RSHIFT). 3. If a simple word is used frequently, CODE may be worthwhile (e.g., NIP, TUCK). 4. If a word requires fewer bytes when written in CODE, do so (a rule I learned from Charles Curley). 5. If the processor includes instruction support for a word's function, put it in CODE (e.g. CMOVE or SCAN on a Z80 or 8086). 6. If a word juggles many parameters on the stack, but has relatively simple logic, it may be better in CODE, where the parameters can be kept in registers. 7. If the logic or control flow of a word is complex, it's probably better in high-level Forth. " -- http://www.bradrodriguez.com/papers/moving5.htm

"              ANS Forth Core words

These are required words whose definitions are specified by the ANS Forth document.

  !         x a-addr --                      store cell in memory
  +         n1/u1 n2/u2 -- n3/u3             add n1+n2
  +!        n/u a-addr --                    add cell to memory
  =         x1 x2 -- flag                    test x1=x2
  >         n1 n2 -- flag                    test n1>n2, signed
  >R        x --  R: -- x                    push to return stack
  ?DUP      x -- 0 | x x                     DUP if nonzero
  @         a-addr -- x                      fetch cell from memory
  0<        n -- flag                        true if TOS negative
  0=        n/u -- flag                      return true if TOS=0
  1+        n1/u1 -- n2/u2                   add 1 to TOS
  1-        n1/u1 -- n2/u2                   subtract 1 from TOS
  2*        x1 -- x2                         arithmetic left shift
  2/        x1 -- x2                         arithmetic right shift
  AND       x1 x2 -- x3                      logical AND
  CONSTANT  n --                             define a Forth constant
  C!        c c-addr --                      store char in memory
  C@        c-addr -- c                      fetch char from memory
  DROP      x --                             drop top of stack
  DUP       x -- x x                         duplicate top of stack
  EMIT      c --                             output character to console
  EXECUTE   i*x xt -- j*x                    execute Forth word 'xt'
  EXIT      --                               exit a colon definition
  FILL      c-addr u c --                    fill memory with char
  I         -- n  R: sys1 sys2 -- sys1 sys2  get the innermost loop index
  INVERT    x1 -- x2                         bitwise inversion
  J         -- n  R: 4*sys -- 4*sys          get the second loop index
  KEY       -- c                             get character from keyboard
  LSHIFT    x1 u -- x2                       logical L shift u places
  NEGATE    x1 -- x2                         two's complement
  OR        x1 x2 -- x3                      logical OR
  OVER      x1 x2 -- x1 x2 x1                per stack diagram
  ROT       x1 x2 x3 -- x2 x3 x1             per stack diagram
  RSHIFT    x1 u -- x2                       logical R shift u places
  R>        -- x  R: x --                    pop from return stack
  R@        -- x  R: x -- x                  fetch from rtn stk
  SWAP      x1 x2 -- x2 x1                   swap top two items
  UM*       u1 u2 -- ud                      unsigned 16x16->32 mult.
  UM/MOD    ud u1 -- u2 u3                   unsigned 32/16->16 div.
  UNLOOP    --  R: sys1 sys2 --              drop loop parms
  U<        u1 u2 -- flag                    test u1<u2, unsigned
  VARIABLE  --                               define a Forth variable
  XOR       x1 x2 -- x3                      logical XOR
               ANS Forth Extensions

These are optional words whose definitions are specified by the ANS Forth document.

  <>        x1 x2 -- flag           test not equal
  BYE       i*x --                  return to CP/M
  CMOVE     c-addr1 c-addr2 u --    move from bottom
  CMOVE>    c-addr1 c-addr2 u --    move from top
  KEY?      -- flag                 return true if char waiting
  M+        d1 n -- d2              add single to double
  NIP       x1 x2 -- x2             per stack diagram
  TUCK      x1 x2 -- x2 x1 x2       per stack diagram
  U>        u1 u2 -- flag           test u1>u2, unsigned

               Private Extensions

These are words which are unique to CamelForth. Many of these are necessary to implement ANS Forth words, but are not specified by the ANS document. Others are functions I find useful.

  (do)      n1|u1 n2|u2 --  R: -- sys1 sys2     run-time code for DO
  (loop)    R: sys1 sys2 --  | sys1 sys2        run-time code for LOOP
  (+loop)   n --  R: sys1 sys2 --  | sys1 sys2  run-time code for +LOOP
  ><        x1 -- x2                            swap bytes
  ?branch   x --                                branch if TOS zero
  BDOS      DE C -- A                           call CP/M BDOS
  branch    --                                  branch always
  lit       -- x                                fetch inline literal to stack
  PC!       c p-addr --                         output char to port
  PC@       p-addr -- c                         input char from port
  RP!       a-addr --                           set return stack pointer
  RP@       -- a-addr                           get return stack pointer
  SCAN      c-addr1 u1 c -- c-addr2 u2          find matching char
  SKIP      c-addr1 u1 c -- c-addr2 u2          skip matching chars
  SP!       a-addr --                           set data stack pointer
  SP@       -- a-addr                           get data stack pointer
  S=        c-addr1 c-addr2 u -- n              string compare
                                                n<0: s1<s2, n=0: s1=s2, n>0: s1>s2
  USER      n --                                define user variable 'n'
"
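The stack diagrams for SKIP, SCAN and S= in the glossary above can be read as follows. This is a Python sketch of the semantics only, assuming memory is a byte sequence indexed by address; the function names are hypothetical, and returning exactly -1, 0 or 1 from the comparison is an assumption (the glossary only specifies the sign of n).

```python
def skip(mem, addr, u, c):
    """SKIP ( c-addr1 u1 c -- c-addr2 u2 ): step past leading bytes equal to c."""
    while u > 0 and mem[addr] == c:
        addr += 1
        u -= 1
    return addr, u

def scan(mem, addr, u, c):
    """SCAN ( c-addr1 u1 c -- c-addr2 u2 ): advance to the first byte equal to c."""
    while u > 0 and mem[addr] != c:
        addr += 1
        u -= 1
    return addr, u

def s_equal(mem, addr1, addr2, u):
    """S= ( c-addr1 c-addr2 u -- n ): compare u bytes; the sign of n gives the order."""
    for i in range(u):
        diff = mem[addr1 + i] - mem[addr2 + i]
        if diff != 0:
            return -1 if diff < 0 else 1
    return 0

def first_word(mem, addr, u, delimiter=32):
    """Skip leading delimiters, then scan to the next one, yielding one word."""
    start, remaining = skip(mem, addr, u, delimiter)
    end, _ = scan(mem, start, remaining, delimiter)
    return bytes(mem[start:end])
```

A text interpreter can pair the two words in this way to pull the next blank-delimited name out of an input buffer, which may be why they earn a place among the primitives.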

Retro (forth)

http://www.forthworks.com/retro
https://forthworks.com/retro
http://forthworks.com/retro/book.html
http://forth.works/doc.html
http://retroforth.org/Handbook-Latest.txt

The following notes are about Retro version 12.

Retro can be built in a reduced memory configuration that requires only 96 KiB of memory (~70 KiB for the image, 0.5 KiB for the data stack, 1 KiB for the address stack, and ~24 KiB of remaining RAM available for use by the VM).

By contrast, "The standard system is configured with a very deep data stack (around 2,000 items) and an address stack that is 3x deeper." [1]

Retro syntax

Much of this section is copied from http://forthworks.com/retro/book.html

Input is divided into a series of whitespace delimited tokens. Each of these is then processed individually. There are no parsing words in RETRO.

Tokens may have a single character prefix, which RETRO will use to decide how to process the token.

The major prefixes are:

  Prefix  Used For
  @       Fetch from variable
  !       Store into variable
  &       Pointer to named item
  #       Numbers
  $       ASCII characters
  '       Strings
  (       Comments
  :       Define a word
  [       Quotes (note: i added this; it wasn't in the 'major prefixes' table in the documentation)
  {       Arrays (note: i added this; it wasn't in the 'major prefixes' table in the documentation)

Comments are delimited by ().
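To make the prefix idea concrete, here is a toy dispatcher in Python. It is only a sketch of the general approach (split on whitespace, then look at the first character of each token), not RETRO's implementation: the function `interpret` and its `words` and `variables` dictionaries are hypothetical, and only a few of the prefixes are modelled.

```python
def interpret(source, words, variables, stack):
    """Process whitespace-delimited tokens, dispatching on a one-character prefix."""
    for token in source.split():
        prefix, rest = token[0], token[1:]
        if prefix == "#":              # number
            stack.append(int(rest))
        elif prefix == "$":            # single ASCII character
            stack.append(ord(rest))
        elif prefix == "'":            # string
            stack.append(rest)
        elif prefix == "(":            # comment token: ignored
            continue
        elif prefix == "@":            # fetch from variable
            stack.append(variables[rest])
        elif prefix == "!":            # store into variable
            variables[rest] = stack.pop()
        else:                          # no known prefix: treat as a word name
            words[token](stack)
    return stack

# Two hypothetical words, written as Python callables that act on the stack.
words = {
    "dup": lambda s: s.append(s[-1]),
    "+": lambda s: s.append(s.pop() + s.pop()),
}
result = interpret("#20 !x (comment) @x dup + $A 'hi", words, {}, [])
```

Because every token is handled on its own, no word ever needs to read ahead in the input, which matches the statement that there are no parsing words.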

Constant literals:

Data types used in Retro's stack notation (provided via comments in many word definitions):

To add words (which are Forth's equivalent to functions) to the "dictionary", use the ':' prefix with the name of the word, then the word definition, then ';' to terminate, e.g.:

  :palindrome dup s:reverse s:eq? ;

Most Retro code is written using a literate file format called Unu, in which only code within ~~~-delimited blocks is executed. Unu has #-style headings: "# This is a heading".
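The extraction step that Unu implies can be sketched in a few lines of Python. This is an illustration rather than the real tool; it assumes a fence is a line consisting only of `~~~`, and the function name is made up.

```python
def unu_code_lines(text):
    """Return only the lines that sit inside ~~~-delimited blocks."""
    inside = False
    code = []
    for line in text.splitlines():
        if line.strip() == "~~~":
            inside = not inside        # a fence line toggles code mode
        elif inside:
            code.append(line)
    return code

sample = """# This is a heading

Some commentary that is not executed.

~~~
:square dup * ;
~~~

More commentary.

~~~
#3 square
~~~
"""
```

With the sample above, only the two code lines survive; the heading and commentary are dropped.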

Some notes on basic data types

" Strings in RETRO are NULL terminated sequences of values representing characters. Being NULL terminated, they can’t contain a NULL (ASCII 0).

The character words in RETRO are built around ASCII, but strings can contain UTF8 encoded data if the host platform allows. Words like s:length will return the number of bytes, not the number of logical characters in this case.

...

Some RETRO systems include support for floating point numbers. When present, this is built over the system libm using the C double type. ... Floating point values exist on a separate stack, and are bigger than the standard memory cells, so can not be directly stored and fetched from memory.

The floating point system also provides an alternate stack that can be used to temporarily store values. ... Floating point words are in the f: namespace. There is also a related e: namespace for encoded values, which allows storing of floats in standard memory. ... A combinator is a function that consumes functions as input. They are used heavily by the RETRO system. " -- http://forthworks.com/retro/book.html
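The remark about s:length counting bytes rather than logical characters follows directly from the NULL-terminated representation. Here is a small Python model, assuming one byte value per memory cell; `store_string` and `s_length` are hypothetical helper names, not RETRO words.

```python
def store_string(memory, addr, text):
    """Write text as UTF-8 bytes, one per cell, followed by a terminating 0."""
    data = text.encode("utf-8")
    for i, byte in enumerate(data):
        memory[addr + i] = byte
    memory[addr + len(data)] = 0
    return addr

def s_length(memory, addr):
    """Count cells up to (not including) the terminating 0."""
    n = 0
    while memory[addr + n] != 0:
        n += 1
    return n

memory = [0] * 64
s = store_string(memory, 8, "na\u00efve")   # the i-diaeresis takes two bytes in UTF-8
```

Here `s_length(memory, s)` reports one more than the number of characters a reader would count, because the accented letter occupies two cells. The same representation also explains why a string cannot contain a 0 value: the scan would stop there.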

Retro standard library and namespaces

" Words are grouped into broad namespaces by attaching a short prefix string to the start of a name.

The common namespaces are:

Prefix Contains

The "Glossary" is the (documentation of the?) standard library. It can be browsed online at http://forthworks.com:9999/

The words that i noticed mentioned in http://forthworks.com/retro/book.html are (much of the following is copied from there):

Combinators:

  #1 [ 'Yes s:put ] case
  #2 [ 'No s:put ] case drop 'No idea s:put ;

Stack ops:

Arrays:

Buffers:

Simple linear LIFO buffer. RETRO provides words for operating on a linear memory area. This can be useful in building strings or custom data structures.

A buffer is a linear sequence of memory. The buffer words provide a means of incrementally storing and retrieving values from it. The buffer words keep track of the start and end of the buffer. They also ensure that an ASCII:NULL is written after the last value, which make using them for string data easy.

Only one buffer can be active at a time. RETRO provides a buffer:preserve combinator to allow using a second one before returning to the prior one.

Example:

  'Test d:create #1025 allot
  &Test buffer:set
  #100 buffer:add
  buffer:get n:put nl

(note: i think n:put prints numeric data, and nl prints newline)
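The buffer behaviour described above (one active buffer, LIFO access, a terminating NULL kept after the last value) can be modelled roughly as follows. This is a Python sketch with hypothetical names; in particular, the way `preserve` saves and restores the active buffer is an assumption about what buffer:preserve does, based only on the description here.

```python
class BufferWords:
    def __init__(self, memory):
        self.memory = memory
        self.start = 0     # first cell of the active buffer
        self.end = 0       # next free cell of the active buffer

    def set(self, addr):
        """Make the region at addr the active buffer and mark it empty."""
        self.start = self.end = addr
        self.memory[addr] = 0

    def add(self, value):
        """Append a value and keep a 0 terminator after it."""
        self.memory[self.end] = value
        self.end += 1
        self.memory[self.end] = 0

    def get(self):
        """Remove and return the most recently added value."""
        self.end -= 1
        value = self.memory[self.end]
        self.memory[self.end] = 0
        return value

    def preserve(self, action):
        """Run action (which may use another buffer), then restore the prior one."""
        saved = (self.start, self.end)
        try:
            action()
        finally:
            self.start, self.end = saved

memory = [0] * 2048
buf = BufferWords(memory)
buf.set(16)                # roughly: &Test buffer:set
buf.add(100)               # roughly: #100 buffer:add
```

After these calls the cell at address 16 holds 100 and the next cell holds the terminator; calling `buf.get()` returns the 100 and leaves the buffer empty again.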

Characters:

The Dictionary:

The Dictionary is a linked list containing the dictionary headers.

'Dictionary' is a variable holding a pointer to the most recent header.

Dictionary Header Structure (may change in future):

  Offset  Contains
  ------  ---------------------------
  0000    Link to Prior Header
  0001    Link to XT
  0002    Link to Class Handler
  0003+   Word name (null terminated)

Don't use that structure directly; use the accessor words (see below).
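For intuition about what the accessor words hide, the header layout can be modelled in Python as cells in a flat memory array. This is a sketch only: the helper names are invented, and treating a link value of 0 as the end of the list is an assumption.

```python
def add_header(memory, here, latest, xt, cls, name):
    """Lay down one header at `here`; return (new latest header, next free cell)."""
    memory[here] = latest          # offset 0: link to prior header
    memory[here + 1] = xt          # offset 1: link to XT
    memory[here + 2] = cls         # offset 2: link to class handler
    for i, ch in enumerate(name):  # offset 3+: name, one character per cell
        memory[here + 3 + i] = ord(ch)
    memory[here + 3 + len(name)] = 0
    return here, here + 4 + len(name)

def header_name(memory, header):
    chars = []
    i = header + 3
    while memory[i] != 0:
        chars.append(chr(memory[i]))
        i += 1
    return "".join(chars)

def lookup(memory, latest, name):
    """Walk the links from the most recent header; return 0 if name is not found."""
    header = latest
    while header != 0:
        if header_name(memory, header) == name:
            return header
        header = memory[header]
    return 0

memory = [0] * 64
latest, here = 0, 1                # start at 1 so that 0 can mean "no header"
latest, here = add_header(memory, here, latest, xt=100, cls=7, name="dup")
latest, here = add_header(memory, here, latest, xt=200, cls=7, name="swap")
```

Searching starts from the value held in `latest`, which plays the role of the 'Dictionary' variable, so the most recently defined word with a given name is found first.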

Floating point:

Numbers in RETRO are signed, 32 bit integers with a range of -2,147,483,648 to 2,147,483,647.

Pointers:

Variables:

Strings:

At the interpreter, strings get allocated in a rotating buffer. If you need to keep them around, use s:keep or s:copy to move them to more permanent storage. In a definition, the string is compiled inline and so is in permanent memory.
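The reason a temporary string has to be kept can be seen in a small model of a rotating buffer. This Python sketch is illustrative only: the class, the slot count of 3, and the reference tuples are all hypothetical and say nothing about how RETRO sizes or addresses its string buffer.

```python
class StringPool:
    def __init__(self, slots=3):
        self.temp = [None] * slots   # rotating area for interpreter-time strings
        self.next = 0
        self.kept = []               # permanent storage

    def temp_string(self, text):
        """Store text in the next rotating slot, overwriting whatever was there."""
        slot = self.next
        self.temp[slot] = text
        self.next = (slot + 1) % len(self.temp)
        return ("temp", slot)

    def keep(self, ref):
        """Copy a string to permanent storage and return a stable reference."""
        self.kept.append(self.fetch(ref))
        return ("kept", len(self.kept) - 1)

    def fetch(self, ref):
        area, index = ref
        return self.temp[index] if area == "temp" else self.kept[index]

pool = StringPool(slots=3)
first = pool.temp_string("first")
saved = pool.keep(first)
for text in ("second", "third", "fourth"):
    pool.temp_string(text)           # the third of these reuses the slot of "first"
```

After the loop, the temporary reference `first` yields a different string, while `saved` still yields the original.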

Strings are mutable.