Chapter 14. eForth System
The file EF24.F contains all the high-level words in P24 eForth. This implementation closely follows the eForth model. The following set of words is removed because they are not absolutely necessary for embedded applications. In this implementation, the size constraint is severe, and the existence of every word must be rigorously justified.
Words removed from the eForth model:
CATCH, THROW, PRESET, XIO, FILE, HAND, I/O, CONSOLE, RECURSE, USER
Many of the user variables are eliminated:
SP0, RP0, '?KEY, 'EMIT, 'EXPECT, 'TAP, 'ECHO, 'PROMPT, CSP, 'NUMBER, HANDLER, CURRENT, NP
Only these user variables remain and are macros:
HLD, SPAN, >IN, #TIB, 'TIB, 'EVAL, BASE, tmp, CP, CONTEXT, LAST, 'ABORT, TEXT
14.1 Overview of the P24 eForth system
Figure 14.1 is a very interesting graphic representation of the P24 eForth system operating inside a P24 chip. Upon power up, the eForth system is initialized and enters into the
Figure 14.1 The eForth system in a P24 chip
The P24 eForth system can be specified more rigorously by the following list of words, together with their pseudo code:
COLD boots Forth, prints the sign-on message, and jumps to QUIT.
QUIT repeats the sequence: accepts a line of text and executes the commands in it in sequence. The pseudo code is:
: QUIT BEGIN QUERY EVAL AGAIN ;
QUERY accepts one line of text, terminated at 80 characters or by a carriage return.
EVAL parses out tokens in the text and evaluates them:
: EVAL BEGIN TOKEN WHILE 'EVAL @EXECUTE REPEAT .OK ;
TOKEN parses out one word from the input text.
'EVAL contains $INTERPRET in the interpret mode or $COMPILE in the compiling mode.
@EXECUTE executes either $INTERPRET or $COMPILE.
.OK prints out the "OK" message.
$INTERPRET ( a ) searches the dictionary for the word named by the text string at a. If the word exists, it is executed. Otherwise, the string is converted into a number on the stack. If the string cannot be converted to a number, $INTERPRET prints an error message and aborts to QUIT.
: $INTERPRET NAME? IF EXECUTE ELSE NUMBER? IF ELSE ERROR THEN THEN ;
$COMPILE ( a ) searches the dictionary for the word named by the text string at a. If the word exists, it is compiled. Otherwise, the string is converted to a number and the number is compiled as a literal. If the conversion fails, $COMPILE prints a message and aborts to QUIT.
: $COMPILE NAME? IF , ELSE NUMBER? IF LITERAL ELSE ERROR THEN THEN ;
NAME? calls 'find' to locate a word with the name parsed out of the input text string.
NUMBER? ( a ) converts the text string at a to a number.
ERROR prints the offending text string and aborts to QUIT.
LITERAL ( n ) compiles n as a literal into the current word being compiled.
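The control flow in the list above can be modeled in a few lines of Python. This is only a sketch of the flow, not the P24 code: the dictionary is a Python dict, the data stack is a list, 'EVAL is an attribute holding a function, and the names Forth, interpret, compile_token and eval_line are illustrative assumptions.

```python
class Forth:
    """Toy model of the QUIT / EVAL / $INTERPRET / $COMPILE control flow."""

    def __init__(self):
        self.stack = []
        self.compiled = []            # stands in for the word being compiled
        self.words = {                # stands in for the dictionary
            "+": lambda: self.stack.append(self.stack.pop() + self.stack.pop()),
            "DUP": lambda: self.stack.append(self.stack[-1]),
        }
        self.eval_vector = self.interpret   # plays the role of 'EVAL

    def number(self, token):
        """NUMBER? : return an int, or None if the conversion fails."""
        try:
            return int(token, 10)
        except ValueError:
            return None

    def interpret(self, token):
        """$INTERPRET : execute a known word, else push a number, else error."""
        if token in self.words:
            self.words[token]()
            return
        n = self.number(token)
        if n is None:
            raise LookupError(token + " ?")   # ERROR: offending text plus '?'
        self.stack.append(n)

    def compile_token(self, token):
        """$COMPILE : compile a known word, else compile a literal, else error."""
        if token in self.words:
            self.compiled.append(self.words[token])
            return
        n = self.number(token)
        if n is None:
            raise LookupError(token + " ?")
        self.compiled.append(lambda n=n: self.stack.append(n))  # LITERAL

    def eval_line(self, line):
        """EVAL : hand every token to whatever 'EVAL currently points to."""
        for token in line.split():    # TOKEN
            self.eval_vector(token)   # 'EVAL @EXECUTE


f = Forth()
f.eval_line("2 3 + DUP")
print(f.stack)
```

Storing compile_token into eval_vector switches the same loop from interpreting to compiling, which is the role 'EVAL plays in the real system.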
The above words serve as a top-down map of the eForth operating system. The eForth system source code builds up to QUIT and COLD. Most words in EF24.F are necessary in the building process. The eForth system can be viewed as a very sophisticated application of P24. Most applications are much simpler than the eForth system. You can model your application code on eForth, and use all the tools contained therein.
14.2
The No-Cost UART uses very few hardware resources and gives us a powerful tool to access and examine the P24 CPU.
On executing the SHR instruction, the least significant bit in T, T(0), is shifted into a flip-flop whose output is connected to the serial output port. At the same time, the state of the serial input port is latched into the carry bit, which is bit T(24). By repeating SHR 8 times, a character is sent out. A character is captured by waiting for the start bit on the serial input port, and then testing the port at intervals of 100 us. One must be very careful in using the SHR instruction. In order not to disturb the output port, you should always set T(0) to 1 before executing SHR. This way, the serial output port stays at the mark level.
50us delays 52 us, half of a bit at 9600 baud.
100us delays 104 us, one bit frame at 9600 baud.
EMIT ( c ) sends character c to the serial output port.
KEY ( -- c ) waits for a character from the serial input port. The serial ports are actually connected to the T register.
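The two delay values and the receive loop can be checked with a small Python model. It is a sketch under stated assumptions: the frame layout (one start bit, eight data bits sent least significant bit first, one stop bit) is the usual asynchronous serial format and is not spelled out in the text, and the function names frame and receive are made up for this illustration.

```python
BAUD = 9600
BIT_US = 1_000_000 / BAUD      # one bit frame, about 104 us
HALF_BIT_US = BIT_US / 2       # half a bit, about 52 us


def frame(c):
    """Bits on the line for one character: start bit, 8 data bits (LSB first), stop bit."""
    data = [(c >> i) & 1 for i in range(8)]
    return [0] + data + [1]


def receive(bits):
    """Rebuild a character from a frame by shifting each sampled bit in at bit 7."""
    assert bits[0] == 0, "frame must begin with a start bit"
    t = 0
    for b in bits[1:9]:
        t >>= 1               # shift right, as SHR does
        if b:
            t ^= 0x80         # a sampled 1 sets bit 7
    return t & 0xFF


print(round(BIT_US), round(HALF_BIT_US))
print(hex(receive(frame(ord("A")))))
```

After eight shifts the first bit received has moved down to bit 0, which is why the receiver can insert every new bit at the same position.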
CRR .( Chararter IO ) CRR
CODE 50us
2 ldi skip
CODE 100us
1 ldi
then
sta -138 ldi
begin lda add
-until
drop
ret
CODE EMIT ( c -- )
$7F ldi and
shl $FFFF01 ldi xor
$0A ldi
FOR shr 100us NEXT
drop ret
CODE KEY ( -- c )
$FFFFFF ldi
begin shr
-until
repeat ( wait for start bit )
50us
7 ldi
FOR
100us shr
-if $80 ldi xor then
NEXT
$FF ldi and
100us ret
14.3 Simple Utility Words
These common functions are too complicated to code in machine instructions, and are left in the high level form.
CRR .( Common functions ) CRR
:: U< ( u u -- t ) 2DUP XOR 0< IF SWAP DROP 0< EXIT THEN - 0< -;'
:: < ( n n -- t ) 2DUP XOR 0< IF DROP 0< EXIT THEN - 0< -;'
:: MAX ( n n -- n ) 2DUP < IF SWAP THEN DROP ;;
:: MIN ( n n -- n ) 2DUP SWAP < IF SWAP THEN DROP ;;
:: WITHIN ( u ul uh -- t ) \ ul <= u < uh
OVER - >R - R> U< -;'
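The range test in WITHIN relies on 24-bit wraparound arithmetic, which is easy to miss when reading the listing. The following Python sketch models U< and WITHIN on 24-bit cells; the function names and the MASK constant are assumptions made for the illustration.

```python
MASK = 0xFFFFFF            # 24-bit cell


def u_less(a, b):
    """U< : unsigned comparison of two 24-bit cells."""
    return (a & MASK) < (b & MASK)


def within(u, ul, uh):
    """WITHIN : true when ul <= u < uh, computed as (u - ul) U< (uh - ul)."""
    return u_less((u - ul) & MASK, (uh - ul) & MASK)


print(within(5, 0, 10), within(10, 0, 10))
print(within(0x05, 0x7F, 0x20))   # range that wraps around through zero
```

Because both differences are taken modulo 2^24, the same single comparison also works for a range whose upper limit is numerically below its lower limit, as in the last line.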
14.4 Division
UM/MOD and /MOD share the same body to divide a 48-bit dividend by a 24-bit divisor, using the DIV machine instruction. The higher half of the dividend is placed in T and the lower half is placed in A. The divisor is negated and placed on the data stack below T. The negated divisor is added to T in the adder. If a carry is generated, indicating that T is big enough to subtract the divisor, the sum is accepted into T, and then the T-A combination is shifted left by one bit. The most significant bit in A is shifted into T(0), and Carry is shifted into A(0). If the adder does not generate a carry, the subtraction is not done. The T-A combination is shifted left by one bit, and a 0 is shifted into A(0).
The DIV divide step described above is repeated 25 times to generate the proper quotient in A. The remainder is in T, once it is shifted right by one bit.
The only restriction in this division procedure is that the divisor and the dividend must be positive. It cannot handle a negative divisor or a negative dividend. This is not a serious limitation, because the special word M/MOD does signed division by first converting both divisor and dividend to positive numbers for the division operation, and then placing the appropriate signs in front of the quotient and remainder.
UM/MOD, /MOD, /, and MOD all assume that divisors and dividends are positive. In the eForth system, this is not a problem. Nevertheless, users must be aware of this limitation when writing code which must handle negative numbers.
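The divide step can be followed more easily in a software model. The Python sketch below mimics the description above (add the negated divisor, keep the sum on carry, shift T-A left, repeat 25 times); it is not cycle-exact, and the name um_div_mod and the two range checks are assumptions added for the illustration.

```python
BITS = 24
MASK = (1 << BITS) - 1


def um_div_mod(high, low, divisor):
    """Divide the 48-bit value high:low by divisor; return (remainder, quotient)."""
    assert 0 < divisor < (1 << (BITS - 1)), "divisor must be positive"
    assert 0 <= high < divisor, "quotient must fit in one cell"
    neg = (-divisor) & MASK          # negated divisor, as kept on the stack
    t, a = high, low & MASK
    for _ in range(BITS + 1):        # 25 divide steps
        total = t + neg
        carry = total >> BITS        # 1 when t >= divisor
        if carry:
            t = total & MASK         # accept the subtraction
        # shift T-A left: A's top bit enters T(0), carry enters A(0)
        t = (t << 1) | (a >> (BITS - 1))
        a = ((a << 1) | carry) & MASK
    return t >> 1, a                 # remainder is T shifted right by one


print(um_div_mod(0, 100, 7))
```

The first of the 25 steps only tests the high half against the divisor; its quotient bit is the one that falls off the top of A, and the extra shift it causes is undone by the final right shift of T.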
CRR .( Divide ) CRR
CODE UM/MOD ( ud u -- ur uq )
com 1 ldi add sta
push lda push sta
pop pop
skip
CODE /MOD ( n n -- r q )
com 1 ldi add push
sta pop 0 ldi
then
div div div div
div div div div
div div div div
div div div div
div div div div
div div div div
div 1 ldi xor shr
push drop pop lda
ret
CODE MOD ( n n -- r )
/MOD
drop ret
CODE / ( n n -- q )
/MOD
push drop pop ret
:: M/MOD ( d n -- r q ) \ floored
DUP 0< DUP >R
IF NEGATE >R DNEGATE R>
THEN >R DUP 0< IF R@ + THEN R> UM/MOD R>
IF SWAP NEGATE SWAP THEN ;;
14.5 Multiplication
UM* multiplies two unsigned 24-bit integers and produces a 48-bit product. The multiplier is placed in the A register, and the multiplicand is placed on the data stack below T. T is cleared to zero. The MUL machine instruction looks at the A(0) bit. If it is a one, the multiplicand is added to T, and the T-A combination is shifted to the right by one bit. Carry is shifted into T(23). If A(0) is a zero, the multiplicand is not added. The T-A combination is shifted to the right, and a zero is shifted into T(23). After the MUL instruction is repeated 24 times, a 48-bit product is produced in the T-A combination. T has the more significant half and A has the less significant half of the product.
Both UM* and * do unsigned multiplication. M* does signed multiplication. For correctness, * should call M* to do the multiplication. However, here * calls UM* for speed. You should be aware of this property in your applications. As the eForth system only does unsigned multiplications, it is not a problem.
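The multiply step can be modeled the same way as the divide step. The following Python sketch follows the description above (test A(0), optionally add, shift T-A right with carry entering T(23)); the name um_star and the order of the returned pair are choices made for this illustration.

```python
BITS = 24
MASK = (1 << BITS) - 1


def um_star(multiplicand, multiplier):
    """Multiply two unsigned 24-bit cells; return (low, high) halves of the product."""
    t, a = 0, multiplier & MASK
    m = multiplicand & MASK
    for _ in range(BITS):            # 24 multiply steps
        carry = 0
        if a & 1:                    # A(0) decides whether to add
            total = t + m
            carry = total >> BITS
            t = total & MASK
        # shift T-A right: T(0) enters A's top bit, carry enters T(23)
        a = (a >> 1) | ((t & 1) << (BITS - 1))
        t = (t >> 1) | (carry << (BITS - 1))
    return a, t


low, high = um_star(0xFFFFFF, 0xFFFFFF)
print(hex(high), hex(low))
```

Each step consumes one multiplier bit from the bottom of A and frees one bit at the top of A for the product, so no extra register is needed for the 48-bit result.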
CRR .( Multiply ) CRR
CODE UM* ( u u -- ud )
sta 0 ldi
mul mul mul mul
mul mul mul mul
mul mul mul mul
mul mul mul mul
mul mul mul mul
mul mul mul mul
push drop lda pop
ret
:: * ( n n -- n ) UM* DROP ;;
:: M* ( n n -- d )
2DUP XOR 0< >R ABS SWAP ABS UM* R> IF DNEGATE THEN ;;
:: */MOD ( n n n -- r q ) >R M* R> M/MOD -;'
:: */ ( n n n -- q ) */MOD SWAP DROP ;;
14.6 Memory Access Words
There are three buffer areas used often in the eForth system. HERE returns the address of the first free location above the code dictionary, where new words are compiled. PAD returns the address of the text buffer where numbers are constructed and text strings are stored temporarily. TIB is the terminal input buffer where input text string is held.
@EXECUTE is a special word supporting the vectored execution words in eForth. It takes the word address stored in a memory location and executes the word. It is used extensively to execute the vectored words in the user area.
A memory array is generally specified by a starting address and its length in words. In a string array, the first word is a count, specifying the number of words in the following string. This is called a counted string.
COUNT converts a string array address to the address-length representation of a counted string.
CMOVE copies a memory array from one location to another. FILL fills a memory array with the same byte.
>CHAR
filters out non-printable characters for TYPE. It thus ensures that TYPEing a
non-printable character will not choke the printer.
CRR .( Bits & Bytes ) CRR
:: >CHAR ( c -- c )
$7F LIT AND DUP $7F LIT BL WITHIN
IF DROP ( CHAR _ ) $5F LIT THEN ;;
CRR .( Memory access ) CRR
:: HERE ( -- a ) CP @ ;;
:: PAD ( -- a ) CP @ 50 LIT + ;;
:: TIB ( -- a ) 'TIB @ ;;
CRR
:: @EXECUTE ( a -- ) @ ?DUP IF EXECUTE THEN ;;
:: CMOVE ( b b u -- )
FOR AFT >R DUP @ R@ ! 1+ R> 1+ THEN NEXT 2DROP ;;
:: FILL ( b u c -- )
SWAP FOR SWAP AFT 2DUP ! 1+ THEN NEXT 2DROP ;;
14.6 String Packing and Unpacking Words
PACK$ packs the string at b with length u into memory located at a, three bytes to a 24-bit program word. It calls B> to do the packing. This packing function greatly reduces the total size of the P24 code image. The packing also speeds up the dictionary searches because three bytes are compared at once. The system scratch variable TMP is used to store the byte count, which directs the bytes to their proper location. After the byte string is fully packed, the last packed program word is left justified and empty slots are filled with NUL bytes.
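A short Python model may help visualize the packed layout. It is a sketch under assumptions: the count byte is taken to occupy the first byte slot, and the first byte of each group of three is taken to sit in the most significant position of the word, which is inferred from the "left justified" remark and from the masks used later in the dictionary search. The names pack and unpack are illustrative.

```python
def pack(text):
    """Pack a count byte plus the characters, three bytes to a 24-bit word."""
    data = bytes([len(text)]) + text.encode("ascii")
    data += b"\0" * (-len(data) % 3)             # null-fill the last word
    return [int.from_bytes(data[i:i + 3], "big") for i in range(0, len(data), 3)]


def unpack(words):
    """Recover the string from packed words produced by pack()."""
    data = b"".join(w.to_bytes(3, "big") for w in words)
    count = data[0] & 0x1F                        # length lives in the low 5 bits
    return data[1:1 + count].decode("ascii")


words = pack("DUP")
print([hex(w) for w in words])
print(unpack(words))
```

With this layout a name of length n occupies n/3 + 1 words (integer division), which matches the arithmetic NAME> uses to step over a name field.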
:: PACK$ ( b u a -- a ) \ null fill
dup push
1 ldi tmp sta st
sta dup push st
lda pop
FOR AFT ( b a )
B>
tmp sta ld
IF ld 1 ldi xor
IF dup dup xor st
1 ldi add
ELSE 2 ldi st
THEN
ELSE 1 ldi st
THEN
THEN NEXT
tmp sta ld
IF ld 2 ldi xor
IF sta ld
shl shl shl shl
shl shl shl shl
st lda
THEN
sta ld
shl shl shl shl
shl shl shl shl
st lda
THEN
drop drop pop
;;
UNPACK$
unpacks a packed string at address a into a counted byte string at b. It calls >B to unpack a 24-bit word
into three bytes. It allows names of words to be printed, and in-line packed
strings to be accessed as byte strings.
:: UNPACK$ ( a b -- b )
DUP >R ( save b )
>B $1F LIT AND 3 LIT /
FOR AFT
>B DROP
THEN NEXT
2DROP R>
;;
14.7 Number Output Words
All numbers in P24 are stored internally as 24-bit binary patterns. To make the numbers visible to the user, they are converted to strings of digits to be printed. A number is converted one digit at a time. It is divided by the value stored in BASE, and the remainder is converted to a digit by DIGIT. The quotient is divided further by BASE to build a complete numeric string suitable for printing. The output numeric string is built backward below the memory buffer at PAD, using HLD as the pointer moving backward. Additional formatting characters can be inserted into the output string by HOLD.
This numeric output mechanism is extremely flexible and can produce numbers in a wide variety of formats for tables and arrays. It also allows the user to display numbers in any reasonable base, such as decimal, hexadecimal, octal, and binary, among other non-conventional bases.
DIGIT converts an integer to a digit.
EXTRACT extracts the least significant digit from a number n. n is divided by the radix, and the quotient is returned on the stack together with the extracted digit.
The output number string is built below the PAD buffer. The least significant digit is extracted from the integer on the top of the data stack by dividing it by the current radix in BASE. The digits thus extracted are added to the output string backward, from PAD toward low memory. The conversion is terminated when the integer is divided to zero. The address and length of the number string are made available by #> for outputting.
An output number conversion is initiated by <# and terminated by #>. Between them, # converts one digit at a time, #S converts all the digits, while HOLD and SIGN insert special characters into the string under construction. This set of words is very versatile and can handle all different output formats.
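The backward construction of the number string can be sketched in Python as follows. The class NumberOutput and its method names are stand-ins for <#, #, #S, HOLD, SIGN and #>; a Python list replaces the memory below PAD, so this only illustrates the order of operations, not the buffer handling.

```python
def digit(u):
    """DIGIT : 0..35 -> '0'..'9', 'A'..'Z'."""
    return chr(u + ord("0") + (7 if u > 9 else 0))


class NumberOutput:
    """Builds a number string backward, as <# # #S HOLD SIGN #> do."""

    def __init__(self, base=10):
        self.base = base
        self.held = []                 # characters, most recent first

    def begin(self):                   # <#
        self.held = []

    def hold(self, c):                 # HOLD : insert c in front of the string
        self.held.insert(0, c)

    def convert_digit(self, u):        # # : extract the least significant digit
        u, r = divmod(u, self.base)
        self.hold(digit(r))
        return u

    def convert_all(self, u):          # #S : convert until the quotient is zero
        while True:
            u = self.convert_digit(u)
            if u == 0:
                return u

    def sign(self, n):                 # SIGN
        if n < 0:
            self.hold("-")

    def end(self):                     # #>
        return "".join(self.held)


def to_str(n, base=10):
    """str : signed number to text."""
    out = NumberOutput(base)
    out.begin()
    out.convert_all(abs(n))
    out.sign(n)
    return out.end()


print(to_str(-1234), to_str(255, 16), to_str(5, 2))
```

Because every character is placed in front of the ones already held, the sign is added last even though it is printed first.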
CRR .( Numeric Output ) CRR \ single precision
:: DIGIT ( u -- c )
9 LIT OVER < 7 LIT AND +
( CHAR 0 ) 30 LIT + ;;
:: EXTRACT ( n base -- n c )
0 LIT SWAP UM/MOD SWAP DIGIT -;'
:: <# ( -- ) PAD HLD ! ;;
:: HOLD ( c -- ) HLD @ 1- DUP HLD ! ! ;;
:: # ( u -- u ) BASE @ EXTRACT HOLD -;'
:: #S ( u -- 0 ) BEGIN # DUP WHILE REPEAT ;;
CRR
:: SIGN ( n -- ) 0< IF ( CHAR - ) 2D LIT HOLD THEN ;;
:: #> ( w -- b u ) DROP HLD @ PAD OVER - ;;
:: str ( n -- b u ) DUP >R ABS <# #S R> SIGN #> -;'
:: HEX ( -- ) 10 LIT BASE ! ;;
:: DECIMAL ( -- ) 0A LIT BASE ! ;;
14.8 Number Input Words
Numbers are entered into P24 as strings of digits, delimited by spaces and other white-space characters like CR, TAB, NUL, etc. Numeric strings are converted to internal binary form, most significant digit first, by multiplying the accumulated value by the value in BASE and adding the next digit, until the digits are exhausted.
DIGIT? converts a digit to its numeric value according to the current base.
NUMBER? converts a string of digits to a single integer. If the first character is a $ sign, the number is assumed to be in hexadecimal. Otherwise, the number will be converted using the radix value stored in BASE. For negative numbers, the first character should be a - sign. No other characters are allowed in the string. If a non-digit character is encountered, the address of the string and a false flag are returned.
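The conversion rules just described can be summarized in a Python sketch. The functions digit_value and number are illustrative stand-ins for DIGIT? and NUMBER?; they return Python values instead of stack items, and the order in which the $ and - prefixes are tested follows the listing below, where $ is checked first.

```python
def digit_value(c, base):
    """DIGIT? : return the value of character c in the given base, or None."""
    v = ord(c) - ord("0")
    if v > 9:
        v -= 7                     # 'A' follows '9' after a gap of 7 codes
        if v < 10:
            return None            # characters between '9' and 'A'
    return v if 0 <= v < base else None


def number(text, base=10):
    """NUMBER? : return (value, True) on success or (text, False) on failure."""
    s = text
    if s.startswith("$"):          # a leading $ forces hexadecimal
        base, s = 16, s[1:]
    negative = s.startswith("-")
    if negative:
        s = s[1:]
    if not s:
        return text, False
    n = 0
    for c in s:                    # most significant digit first
        v = digit_value(c, base)
        if v is None:
            return text, False
        n = n * base + v
    return (-n if negative else n), True


print(number("123"), number("$1F"), number("-42"), number("12X"))
```

Note that only upper-case letters are accepted as digits above 9 in this model, which is what the subtraction of 7 in DIGIT? implies.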
CRR .( Numeric Input ) CRR \ single precision
:: DIGIT? ( c base -- u t )
>R ( CHAR 0 ) 30 LIT - 9 LIT OVER <
IF 7 LIT - DUP 0A LIT < OR THEN DUP R> U< -;'
:: NUMBER? ( a -- n T | a F )
BASE @ >R 0 LIT OVER COUNT ( a 0 b n)
OVER @ ( CHAR $ ) 24 LIT =
IF HEX SWAP 1+ SWAP 1- THEN ( a 0 b' n')
OVER @ ( CHAR - ) 2D LIT = >R ( a 0 b n)
SWAP R@ - SWAP R@ + ( a 0 b" n") ?DUP
IF 1- ( a 0 b n)
FOR DUP >R @ BASE @ DIGIT?
WHILE SWAP BASE @ * + R> 1+
NEXT DROP R@ ( b ?sign) IF NEGATE THEN SWAP
ELSE R> R> ( b index) 2DROP ( digit number) 2DROP 0 LIT
THEN DUP
THEN R> ( n ?sign) 2DROP R> BASE ! ;;
The following set of words displays characters on the output device.
DO$ is an internal system word which unpacks a packed string compiled in-line with program words. It digs up the starting address of the packed string on the return stack, unpacks the string to location a, and then moves the return address past the packed string. Execution can then continue, skipping the in-line packed string.
$"| is compiled before a packed string. It unpacks the string and returns the address of the TEXT buffer where the unpacked string is stored.
."| is also compiled before a packed string. It unpacks the string and displays it on the output device.
CRR .( Basic I/O ) CRR
:: SPACE ( -- ) BL EMIT -;'
:: CHARS ( +n c -- )
SWAP 0 LIT MAX
FOR AFT DUP EMIT THEN NEXT DROP ;;
:: SPACES ( +n -- ) BL CHARS -;'
:: TYPE ( b u -- )
FOR AFT DUP @ >CHAR EMIT 1+
THEN NEXT DROP ;;
:: CR ( -- ) ( =Cr )
0A LIT 0D LIT EMIT EMIT -;'
:: do$ ( -- a )
R> R@ TEXT UNPACK$
R@ R> @ $3FFFFF LIT AND $30000 LIT / 1+ +
>R SWAP >R ;;
CRR
:: $"| ( -- a ) do$ -;'
:: ."| ( -- ) do$ COUNT TYPE -;'
:: .R ( n +n -- )
>R str R> OVER - SPACES TYPE -;'
:: U.R ( u +n -- )
>R <# #S #> R> OVER - SPACES TYPE -;'
:: U. ( u -- ) <# #S #> SPACE TYPE -;'
:: . ( n -- )
BASE @ 0A LIT XOR
IF U. EXIT THEN str SPACE TYPE -;'
:: ? ( a -- ) @ . -;'
With the number formatting word set as shown above, one can format numbers for output in any form desired. The free output format is a number string preceded by a single space. The fixed-column format displays a number right-justified in a column of pre-determined width. The words ., U., and ? use the free format. The words .R and U.R use the fixed-column format.
14.9 String Parser
TOKEN parses out the next string in the input stream, delimited by spaces. The string is packed and placed on the top of the dictionary, so that it can be used to do dictionary searches, and becomes the name field if the string happens to be the name of a new definition.
PARSE allows the user to specify the delimiting character to parse out the next string in the input stream. It calls 'parse' to do the dirty work.
'parse' scans the input stream and skips the leading blanks if SPACE is the delimiting character. The parsed string starts with the next non-delimiting character and is terminated by the next delimiting character. It returns b, the beginning address of the parsed word, u, the length of the parsed word, and delta, the number of characters by which >IN is to be advanced. It is a very long word with many nested and interlaced structures. It is a challenge even to very experienced Forth programmers.
PARSE parses out the next string in the Terminal Input Buffer (TIB), starting where >IN is pointing. The c specifies the delimiting character of the string. It returns the address b of the string in TIB and its length u.
TOKEN is the crucial word in the Forth text interpreter, which scans the terminal input buffer for the next string delimited by spaces. It packs the string into the word buffer at HERE, ready for dictionary search.
WORD is similar to TOKEN, except that it takes the delimiting character from the stack. TOKEN is used by the system. WORD is intended for users who have to do special parsing on their input strings.
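Since the parser is hard to read in Forth, a Python model of its behavior may be useful. This is a sketch, not a translation: it works on a Python string instead of TIB, it returns an index where the Forth word returns an address, and the names parse and tokens are illustrative. Treating every code at or below a space as a blank, and returning a zero delta when only delimiters remain, follow the listing below.

```python
def parse(text, delimiter=" "):
    """Model of the parser: return (start, length, delta) for the next string in text.

    start and length locate the parsed string; delta is how far >IN should advance.
    """
    n = len(text)
    blank = delimiter == " "
    i = 0
    if blank:
        while i < n and text[i] <= " ":     # skip leading blanks and control codes
            i += 1
        if i == n:
            return i, 0, 0                  # nothing but delimiters
    start = i
    while i < n:
        c = text[i]
        if (c <= " ") if blank else (c == delimiter):
            break
        i += 1
    length = i - start
    delta = i + 1 if i < n else i           # step over the closing delimiter
    return start, length, delta


def tokens(line):
    """Repeated BL PARSE over one input line, as TOKEN does inside EVAL."""
    pos, found = 0, []
    while True:
        start, length, delta = parse(line[pos:])
        if length == 0:
            return found
        found.append(line[pos + start:pos + start + length])
        pos += delta


print(parse("  DUP SWAP"))
print(tokens("  DUP  SWAP DROP"))
```

The variable pos plays the role of >IN: each call advances it by delta, so the next call starts just past the delimiter that ended the previous string.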
CRR .( Parsing ) CRR
:: (parse) ( b u c -- b u delta ; <string> )
tmp ! OVER >R DUP \ b u u
IF 1- tmp @ BL =
IF \ b u' \ 'skip'
FOR BL OVER @ - 0< NOT
WHILE 1+
NEXT ( b) R> DROP 0 LIT DUP EXIT \ all delim
THEN R>
THEN OVER SWAP \ b' b' u' \ 'scan'
FOR tmp @ OVER @ - tmp @ BL =
IF 0< THEN WHILE 1+
NEXT DUP >R
ELSE R> DROP DUP 1+ >R
THEN OVER - R> R> - EXIT
THEN ( b u) OVER R> - ;;
:: PARSE ( c -- b u ; <string> )
>R TIB >IN @ +
#TIB @ >IN @ -
R> (parse) >IN +! ;;
:: TOKEN ( -- a ;; <string> )
BL PARSE 1F LIT MIN 2DUP
DUP TEXT ! TEXT 1+ SWAP CMOVE
HERE 1+ PACK$ -;'
:: WORD ( c -- a ; <string> )
PARSE HERE 1+ PACK$ -;'
14.10 Dictionary Search
'find' follows the linked list in the dictionary, and compares the name of each compiled word with the packed string stored at address a. va points to the starting name field of the dictionary. If a match is found, it returns the execution address (code field address) and the name field address of the matching word in the dictionary. If it fails to find a match, it returns the address of the packed string and a 0 for a false flag.
'find' runs through the dictionary very quickly, because it compares the length and the first two characters of each name in one step. Most Forth words are unique in these three bytes. For words with the same length and identical first two characters, 'find' calls SAME? to determine whether the remaining characters of the packed strings match.
NAME> converts a name field address na to a code field address xt.
NAME? searches the dictionary for the string at address a, starting from the top of the dictionary. The name field address of the last word stored in the dictionary is maintained in the variable CONTEXT. This is where the dictionary search begins.
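The two-stage comparison can be illustrated with a small Python model. It is a sketch under assumptions: a Python list searched from the end stands in for the linked list, the packed layout (count byte first, three bytes per word, first byte in the top position) is inferred from the listings, and the xt values are made-up numbers.

```python
def pack(name):
    """Pack count byte and characters three to a 24-bit word (assumed layout)."""
    data = bytes([len(name)]) + name.encode("ascii")
    data += b"\0" * (-len(data) % 3)
    return [int.from_bytes(data[i:i + 3], "big") for i in range(0, len(data), 3)]


def find(name, dictionary):
    """Search newest-first; return the matching entry or None."""
    target = pack(name)
    for entry in reversed(dictionary):           # follow the links backward
        packed = entry["name"]
        if (packed[0] & 0x3FFFFF) != target[0]:  # length + first two characters
            continue                             # rejected after one comparison
        if packed[1:] == target[1:]:             # SAME? on the remaining words
            return entry
    return None


dictionary = [
    {"name": pack("DUP"), "xt": 0x100},
    {"name": pack("DROP"), "xt": 0x104},
    {"name": pack("DUP"), "xt": 0x200},          # a later redefinition
]
dictionary[1]["name"][0] |= 0x800000             # mark DROP as immediate

print(hex(find("DUP", dictionary)["xt"]), find("NIP", dictionary))
```

Masking the first word with $3FFFFF before comparing is what lets a name match regardless of its lexicon bits, and searching from the newest entry is why a redefinition hides the older word.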
CRR .( Dictionary Search ) CRR
:: NAME> ( na -- xt )
DUP @ $3FFFFF LIT AND
$30000 LIT / + 1+ ;;
:: SAME? ( a a u -- a a f \ -0+ )
$30000 LIT /
FOR AFT OVER R@ + @
OVER R@ + @ - ?DUP
IF R> DROP EXIT THEN
THEN NEXT
0 LIT ;;
:: find ( a va -- xt na | a F )
SWAP \ va a
DUP @ tmp ! \ va a \ get cell count
DUP @ >R \ va a \ count
1+ SWAP \ a' va
BEGIN @ DUP \ a' na na
IF DUP @ $3FFFFF LIT AND
R@ XOR \ ignore lexicon bits
IF 1+ -1 LIT
ELSE 1+ tmp @ SAME?
THEN
ELSE R> DROP SWAP 1- SWAP EXIT \ a F
THEN
WHILE 1- 1- \ a' la
REPEAT R> DROP SWAP DROP
1- DUP NAME> SWAP ;;
:: NAME? ( a -- xt na | a F )
CONTEXT find -;'
14.11 Terminal Input
^H processes a Back Space encountered in the input stream. It backs up the character pointer and erases the character preceding the Back Space.
TAP echoes an input character and deposits it into the terminal input buffer.
kTAP detects a Carriage Return to terminate the input stream. It also calls ^H to process a Back Space, and TAP to process ordinary characters. These words allow the interpreter to handle a human user on the terminal smoothly and in a friendly manner.
CRR .( Terminal ) CRR
:: ^H ( b b b -- b b b ) \ backspace
>R OVER R> SWAP OVER XOR
IF ( =BkSp ) 8 LIT EMIT
1- BL EMIT \ distructive
( =BkSp ) 8 LIT EMIT \ backspace
THEN ;;
:: TAP ( bot eot cur c -- bot eot cur )
DUP EMIT OVER ! 1+ ;;
:: kTAP ( bot eot cur c -- bot eot cur )
DUP ( =Cr ) 0D LIT XOR
IF ( =BkSp ) 8 LIT XOR
IF BL TAP ELSE ^H THEN
EXIT
THEN DROP SWAP DROP DUP ;;
QUERY accepts a line of characters typed in by the user and puts them in the terminal input buffer for interpreting or compiling. The line is terminated at the 80th input character or by a Carriage Return.
'accept' waits for input characters and places them in the terminal input buffer at b with length u. It returns the same buffer address b with the length of the character string actually received.
EXPECT receives the input stream and stores the length in the variable SPAN.
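The line editing performed by 'accept', TAP, kTAP and ^H can be modeled in Python as follows. The sketch leaves out the echo to the terminal and keeps only the buffer handling; the function name accept and its limit parameter are illustrative, and the key codes are read from any iterable of integers instead of the serial port.

```python
CR, BS, BL = 0x0D, 0x08, 0x20


def accept(keys, limit=80):
    """Collect characters from keys until CR or until limit characters are stored."""
    line = []
    for k in keys:
        if len(line) >= limit:
            break
        if 0 <= k - BL < 0x5F:        # printable: TAP stores (and echoes) it
            line.append(chr(k))
        elif k == CR:                 # kTAP: carriage return ends the line
            break
        elif k == BS:                 # ^H: drop the previous character, if any
            if line:
                line.pop()
        else:                         # other control codes are stored as blanks
            line.append(" ")
    return "".join(line)


print(accept(b"DUQ\x08P SWAP\rignored"))
```

A Back Space at the start of the line is ignored, which corresponds to the comparison of the current pointer with the bottom of the buffer in ^H.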
CRR
:: accept ( b u -- b u )
OVER + OVER
BEGIN 2DUP XOR
WHILE KEY DUP BL - 5F LIT U<
IF TAP ELSE kTAP THEN
REPEAT DROP OVER - ;;
:: EXPECT ( b u -- ) accept SPAN ! DROP ;;
:: QUERY ( -- )
TIB 50 LIT accept #TIB !
DROP 0 LIT >IN ! ;;
14.12 Error Handling Words
ABORT actually executes QUIT, which is defined much later. Here it is defined as a vectored execution word which gets the execution address from the system variable 'ABORT. This mechanism also gives the user some flexibility in how the application should handle an error condition.
abort" aborts after a warning message is displayed, if the flag on the stack is true. Otherwise, it ignores the message and continues.
ERROR prints the character string stored in the TEXT buffer before aborting. The TEXT buffer contains the word just parsed out of the input stream. This is the word which the interpreter/compiler fails to recognize. The natural error message is the name of this word followed by a ? mark.
CRR .( Error handling ) CRR
:: ABORT ( -- ) 'ABORT @EXECUTE ;;
:: abort" ( f -- )
IF do$ COUNT TYPE ABORT THEN do$ DROP ;;
:: ERROR ( a -- )
SPACE TEXT COUNT TYPE
$3F LIT EMIT CR ABORT
14.13 Text Interpreter
$INTERPRET interprets the word just parsed out of the input stream. It searches the dictionary for this word. If a match is found, it executes the word, unless the word is marked as a compile-only word. If a match is not found in the dictionary, it converts the word into a number. If successful, the number is left on the data stack. If not successful, it exits with ERROR.
[ activates the text interpreter by storing the execution address of $INTERPRET into the variable 'EVAL, which is executed in EVAL while the text interpreter is in the interpretive mode.
.OK prints the familiar 'ok' prompt after executing to the end of a line. 'ok' is printed only when the text interpreter is in the interpretive mode. While compiling, the prompt is suppressed.
EVAL is the interpreter loop which parses words from the input stream and invokes whatever is in 'EVAL to handle that word, either execute it with $INTERPRET or compile it with $COMPILE.
QUIT is the operating system, or a shell, of the eForth system. It is an infinite loop from which eForth never exits. It uses QUERY to accept a line of commands from the terminal and then lets EVAL parse out the words and execute them. After a line is processed, it displays 'ok' and waits for the next line of commands. When an error occurs during execution, it displays the command which caused the error with an error message.
Because the behavior of EVAL can be changed by storing either $INTERPRET or $COMPILE into 'EVAL, QUIT exhibits the dual nature of a text interpreter and a compiler.
CRR .( Interpret ) CRR
:: $INTERPRET ( a -- )
NAME? ?DUP
IF @ 400000 LIT AND
ABORT" $LIT compile only" EXECUTE EXIT
THEN DROP TEXT NUMBER?
IF EXIT THEN ERROR
:: [ ( -- )
forth' $INTERPRET >body forth@ LIT 'EVAL !
;; IMMEDIATE
:: .OK ( -- )
forth' $INTERPRET >body forth@ LIT 'EVAL @ =
IF ."| $LIT OK" CR
THEN ;;
:: EVAL ( -- )
BEGIN TOKEN DUP @
WHILE 'EVAL @EXECUTE \ ?STACK
REPEAT DROP .OK -;'
CRR .( Shell ) CRR
:: QUIT ( -- )
( =TIB) $730 LIT 'TIB !
[ BEGIN QUERY EVAL AGAIN
14.14 Compiler
After wading through the text interpreter, the Forth compiler will be a piece of cake, because the compiler uses almost all the modules used by the text interpreter. What the compiler does, over and above the text interpreter, is to build the various structures required by the new words we want to add to the existing system. Here is a list of these structures:
Name headers
Colon definitions
Constants
Variables
Integer literals
String literals
Address literals
Control structures
The concept of immediate words is difficult to grasp at first. It is required in the compiler because of the need to build different data and control structures in a colon definition. To understand the Forth compiler fully, you have to be able to differentiate and relate the actions taken during compile time and the actions taken during execution time. Once these concepts are clear, the whole Forth system will become fairly transparent.
Here is a group of words which support the compiler to build new words in the code dictionary.
' (tick) searches the next word in the input stream for a word in the dictionary. It returns the execution address of the word if successful. Otherwise, it displays an error message.
ALLOT allocates n bytes of memory on the top of the code dictionary. Once allocated, the compiler will not touch the memory locations.
, (comma) adds the execution address of a word on the top of the data stack to the code dictionary, and thus compiles a word to the growing word list of the word currently under construction.
COMPILE is used in a colon definition. It causes the next word after COMPILE to be added to the top of the code dictionary. It therefore forces the compilation of a word at the run time.
[COMPILE] acts similarly, except that it compiles the next word immediately. It causes the following word to be compiled, even if the following word is an immediate word which would otherwise be executed.
LITERAL compiles an integer literal to the current colon definition under construction. The integer literal is taken from the data stack, and is preceded by the word doLIT. When this colon definition is executed, doLIT will extract the integer from the word list and push it back on the data stack. LITERAL compiles an address literal if the compiled integer happens to be an execution address of a word. The address will be pushed on the data stack at the run time by doLIT.
$," compiles a string literal. The string is taken from the input stream and is terminated by the double quote character. $," only copies the counted string to the code dictionary. A word which makes use of the counted string at the run time must be compiled before the string. It is used by ." and $".
CRR .( Compiler Primitives ) CRR
:: ' ( -- xt )
TOKEN NAME? IF EXIT THEN
ERROR
:: ALLOT ( n -- ) CP +! ;;
:: , ( w -- ) HERE DUP 1+ CP ! ! ;;
:: [COMPILE] ( -- ; <string> )
' $100000 LIT OR , -;' IMMEDIATE
CRR
:: COMPILE ( -- ) R> DUP @ , 1+ >R ;;
:: LITERAL $29E79E LIT , ,
-;' IMMEDIATE
:: $," ( -- ) ( CHAR " )
22 LIT WORD @ 1+ ALLOT -;'
?UNIQUE is used to display a warning message to show that the name of a new word is a duplicate to a word already existing in the dictionary. eForth does not mind your reusing the same name for different words. However, giving many words the same name is a potential cause of problems in maintaining software projects. It is to be avoided if possible and ?UNIQUE reminds you of it.
$,n builds a new entry in the name dictionary using the name already moved to the bottom of the name dictionary by PACK$. It pads the word field with the address of the top of the code dictionary where the new code is to be built, and links the link field to the current vocabulary. A new word can now be built in the code dictionary.
CRR .( Name Compiler ) CRR
:: ?UNIQUE ( a -- a )
DUP NAME?
IF TEXT COUNT TYPE ."| $LIT reDef "
THEN DROP ;;
:: $,n ( a -- )
DUP @
IF ?UNIQUE
( na) DUP DUP NAME> CP !
( na) DUP LAST ! \ for OVERT
( na) 1-
( la) CONTEXT @ SWAP ! EXIT
THEN ERROR
$COMPILE compiles the word just parsed out of the input stream. It searches the dictionary for this word. If a match is found, it compiles the word as a subroutine call, unless the word is marked as an immediate word. An immediate word is executed by the compiler. If a match is not found in the dictionary, it converts the word into a number. If successful, the number is compiled as a literal. If not successful, it exits with ERROR.
OVERT links a new definition to the current vocabulary and thus makes it available for dictionary searches.
; terminates a colon definition. It compiles an RET to the end of the word list, links this new word to the current vocabulary, and then reactivates the interpreter.
] turns the interpreter to a compiler.
: creates a new header and starts a new colon word. It takes the following string in the input stream to be the name of the new colon definition, by building a new header with this name in the name dictionary. Now, the code dictionary is ready to accept a word list. ] is now invoked to turn the text interpreter into a compiler, which will compile the following words in the input stream to a list of subroutine calls in the dictionary. The new colon definition is terminated by ;, which compiles an RET to terminate the word list, and executes [ to turn the compiler back into the text interpreter.
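The decisions made by $INTERPRET and $COMPILE depend on two lexicon bits kept in the first word of a name field. The Python sketch below collects the constants that appear in the listings and shows how they might be decoded. The interpretation is partly an assumption: that $100000 ORed with an address forms a subroutine call is inferred from the text, the 5-bit length field is inferred from the $1F mask in UNPACK$, and the function names are illustrative.

```python
IMMEDIATE = 0x800000      # tested by $COMPILE
COMPILE_ONLY = 0x400000   # tested by $INTERPRET
NAME_MASK = 0x3FFFFF      # first name word with the lexicon bits removed
CALL = 0x100000           # ORed with an address to form a subroutine call


def describe(first_name_word):
    """Split the first word of a name field into flags, length and two characters."""
    packed = first_name_word & NAME_MASK
    return {
        "immediate": bool(first_name_word & IMMEDIATE),
        "compile_only": bool(first_name_word & COMPILE_ONLY),
        "length": (packed >> 16) & 0x1F,
        "chars": chr((packed >> 8) & 0xFF) + chr(packed & 0xFF),
    }


def compile_action(first_name_word, xt):
    """What $COMPILE does with a found word: run it now, or lay down a call."""
    if first_name_word & IMMEDIATE:
        return ("execute", xt)
    return ("compile", CALL | (xt & 0x3FFFF))


word_if = IMMEDIATE | (2 << 16) | (ord("I") << 8) | ord("F")
print(describe(word_if))
print(compile_action(0x034455, 0x123))
```

An immediate word such as ; is therefore executed even while compiling, which is how it can compile the RET and switch 'EVAL back to $INTERPRET.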
CRR .( FORTH Compiler ) CRR
:: $COMPILE ( a -- )
NAME? ?DUP
IF @ $800000 LIT AND
IF EXECUTE
ELSE $3FFFF LIT AND $100000 LIT OR ,
THEN EXIT
THEN DROP TEXT NUMBER?
IF LITERAL EXIT
THEN ERROR
:: OVERT ( -- ) LAST @ CONTEXT ! ;;
:: ; ( -- )
$5E79E LIT , [ OVERT -;' IMMEDIATE
:: ] ( -- )
forth' $COMPILE >body forth@ LIT 'EVAL ! ;;
:: : ( -- ; <string> )
TOKEN $,n ] -;'
With “:” thus defined, the eForth system is essentially complete. It runs generally as a text interpreter. When “:” is encountered, it compiles a new word and adds it to the existing system. This is Forth.
14.15 Debugging Tools
eForth provides a set of very powerful tools to help users debug their programs. Since most Forth words can be executed interactively under the interpreter, there is no need to set up break points for tracing a complicated program. One simply executes the component words sequentially and examines the stack and memory to determine if the words behave properly.
What it does provide are:
DUMP to dump the contents of a range of memory.
WORDS to dump the names of words in the dictionary.
.S to dump the contents of the data stack.
SEE to decompile a colon word.
DUMP dumps u words starting at address b to the terminal. It dumps 8 words to a line. A line begins with the address of the first word, followed by 8 words shown in hex, 7 columns per word.
dm+ displays u words from b1 in one line. It leaves the address b1+u on the stack for the next dm+ command to use.
_TYPE is similar to TYPE. It displays u characters starting from b. Non-printable characters are replaced by underscores.
CRR .( Tools ) CRR
:: dm+ ( b u -- b )
OVER 7 LIT U.R SPACE
FOR AFT DUP @ 7 LIT U.R 1+
THEN NEXT ;;
:: DUMP ( b u -- )
BASE @ >R HEX 8 LIT /
FOR AFT CR 8 LIT 2DUP dm+
THEN NEXT DROP R> BASE ! ;;
WORDS allows you to examine the dictionary and to look for the correct names of words in case you are not sure of their spellings. WORDS follows the vocabulary thread in the user variable CONTEXT and displays the names of each entry in the name dictionary. The vocabulary thread can be traced easily because the link field in the header of a word points to the name field of the previous word. The link field of the next word is one cell below its name field.
WORDS displays all the names in the context vocabulary. The order of words is reversed from the compiled order: the last defined word is shown first.
.ID displays the name of a word, given the word's name field address. It also replaces non-printable characters in a name with underscores.
Since the name fields are linked into a list in the name dictionary, it is fairly easy to locate a word by searching its name in the name dictionary. However, finding the name of a word from the execution address of the word is more difficult, because the execution addresses of words are not organized in any systematic way.
To decompile the contents of a word list in the code dictionary, we must be able to find the name of a word from its execution address. This reverse search is accomplished by the word >NAME.
>NAME finds the name field address of a word from the execution address of the word. If the word does not exist in the context vocabulary, it returns a false flag. It is the mirror image of the word NAME>, which returns the execution address of a word from its name address. Since the execution address of a word is stored in the word field, two cells below the name, NAME> is trivial. >NAME is more complicated because the entire name dictionary must be searched to locate the word. >NAME only searches the vocabulary thread that starts at CONTEXT.
SEE searches the dictionary for the next word in the input stream and returns its code field address. Then it scans the list of subroutine calls (words) in the colon definition. If the address of the subroutine matches the execution address of a word in the name dictionary, the name will be displayed by the command '.ID'. If the word does not match any subroutine in the dictionary, it must be part of a structure and it is displayed by 'U.'. This way, the decompiler ignores all the data structures and control structures in the colon definition, and only displays valid subroutine calls in the word list.
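The test SEE applies to each cell can be written out as a small Python function. This is a sketch of the masking logic only, using the constants that appear in the SEE listing (FC0000, 100000 and 3FFFF, read as hex); the function name `classify_cell` and the returned tuples are hypothetical.

```python
def classify_cell(cell):
    """Decide whether SEE shows a cell as a number or looks it up as a call."""
    top = cell & 0xFC0000          # upper six bits of the 24-bit cell
    if top:
        top ^= 0x100000            # a call opcode cancels to zero here
    if top:
        return ("number", cell)    # displayed with U.
    return ("call", cell & 0x3FFFF)   # address handed to >NAME


examples = [classify_cell(c) for c in (0x100123, 0x080005, 0x05E79E, 0x000042)]
```

Note that in this model a cell whose upper six bits are all zero is also passed to >NAME; if no word has that execution address, nothing is printed for it.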
CRR
:: >NAME ( xt -- na | F )
CONTEXT
BEGIN @ DUP
WHILE 2DUP NAME> XOR
IF 1-
ELSE SWAP DROP EXIT
THEN
REPEAT SWAP DROP ;;
:: .ID ( a -- )
TEXT UNPACK$
COUNT $01F LIT AND TYPE SPACE -;'
CRR
:: SEE ( -- ; <string> )
' CR
BEGIN
20 LIT FOR
DUP @ DUP FC0000 LIT AND
DUP
IF 100000 LIT XOR THEN
IF U. SPACE
ELSE 3FFFF LIT AND >NAME
?DUP IF .ID THEN
THEN 1+
NEXT KEY 0D LIT = \ can't use ESC on terminal
UNTIL DROP ;;
:: WORDS ( -- )
CR CONTEXT
BEGIN @ ?DUP
WHILE DUP SPACE .ID 1-
REPEAT ;;
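Both WORDS and >NAME walk the same linked list, which the following Python sketch tries to make concrete. The memory addresses are made up, and each name is stored as a single Python string rather than as packed cells. The layout follows the description above: the link field is one cell below the name field, and the execution address is two cells below it.

```python
CONTEXT = 0
mem = {
    CONTEXT: 22,                     # name address of the last defined word
    10: 0x200, 11: 0, 12: "DUP",     # code pointer, link (0 ends the thread), name
    20: 0x210, 21: 12, 22: "SWAP",   # link points to the name field of DUP
}


def name_to_xt(na):
    """NAME>: the execution address sits two cells below the name field."""
    return mem[na - 2]


def words():
    """WORDS: follow the thread from CONTEXT, newest word first."""
    names, addr = [], CONTEXT
    while mem[addr] != 0:
        na = mem[addr]
        names.append(mem[na])
        addr = na - 1                # step to the link field
    return names


def to_name(xt):
    """>NAME: return the name address whose code pointer equals xt, else 0."""
    addr = CONTEXT
    while mem[addr] != 0:
        na = mem[addr]
        if name_to_xt(na) == xt:
            return na
        addr = na - 1
    return 0
```

The forward direction (`name_to_xt`) is a single fetch, while the reverse direction has to visit every header until it finds a match, which is the asymmetry the text points out.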
The data stack is the working place of the Forth computer. It is where words receive their parameters and also where they leave their results. In debugging a newly defined word which uses stack items and which leaves items on the stack, the best way to check its function is to inspect the data stack. The number output words may be used for this purpose, but they are destructive: you print out the number from the stack and it is gone. To inspect the data stack non-destructively, a special utility word .S is provided in most Forth systems.
.S dumps the contents of the data stack on the screen in free format. The top of the stack is aligned to the left. .S does not change the data stack, so it can be used to inspect the data stack non-destructively at any time. As the P24 has a 16 level hardware data stack, and the stack pointer is not available to software, we really do not know how deep the stack is. We only know the top of the stack, and can push it or pop it. However, if we dump all 16 items out, the stack will come to rest at the same point as before the dump. Hence, .S dumps T and the 16 levels of the data stack, and the stack is preserved.
CODE .S ( dump all 17 stack items )
PAD sta stp
stp stp stp stp
stp stp stp stp
stp stp stp stp
stp stp stp stp
DROP PAD $10 LIT
FOR DUP ? 1+ NEXT
DROP PAD @ CR -;'
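The idea that popping every level brings the stack back to where it started can be illustrated with a toy model. The sketch below assumes the hardware stack behaves like a 16-cell circular buffer whose pop moves the pointer without erasing the cell; that behaviour is an assumption made for illustration, not something stated in the text, and the T register is left out for simplicity.

```python
class CircularStack:
    """Toy 16-level stack with a wrapping pointer (assumed behaviour)."""

    def __init__(self, depth=16):
        self.cells = [0] * depth
        self.top = 0                 # index of the current top cell

    def push(self, value):
        self.top = (self.top + 1) % len(self.cells)
        self.cells[self.top] = value

    def pop(self):
        value = self.cells[self.top]
        self.top = (self.top - 1) % len(self.cells)
        return value


stack = CircularStack()
for item in (1, 2, 3):
    stack.push(item)

top_before = stack.top
cells_before = list(stack.cells)
dumped = [stack.pop() for _ in range(16)]    # pop every level once
```

After the sixteen pops the pointer has wrapped all the way round, so `stack.top` and `stack.cells` are what they were before the dump, and `dumped` lists the items from the top down.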
14.16 Start Up
After powering up, the P24 CPU starts executing the instruction at location 0. A small set of instructions following location 0 sets up the user variables necessary for the proper operation of the interpreter and the compiler. Then, it jumps to COLD and starts eForth.
COLD first executes DIAGNOSE, to help the hardware designer make sure that the CPU is executing the most commonly used machine instructions correctly. This routine is very useful in verifying the CPU design in VHDL. A VHDL simulator can be invoked to trace through the instructions in DIAGNOSE.
DIAGNOSE executes a sequence of words to leave the ASCII code of ‘ForthMl’ on the data stack for the hardware designer to see.
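Several of the character codes that DIAGNOSE leaves behind can be checked by hand, and the Python sketch below does that arithmetic for 'F', 'o', 'r' and 'l'. It assumes 24-bit cells, a UM+ that returns a sum and a carry, and, as in the eForth model, a FOR ... NEXT loop that runs its body count+1 times. The helper names are hypothetical.

```python
MASK = 0xFFFFFF                      # 24-bit cell


def um_plus(a, b):
    """UM+ : unsigned add returning (sum, carry)."""
    total = (a & MASK) + (b & MASK)
    return total & MASK, total >> 24


def zero_less(n):
    """0< : all ones if the sign bit is set, else zero."""
    return MASK if (n & 0x800000) else 0


# 'F': 0 0<  -2 0<  UM+ DROP  3 UM+ UM+ DROP  $43 UM+ DROP
s, _ = um_plus(zero_less(0), zero_less(-2 & MASK))   # 0 + FFFFFF
s, carry = um_plus(s, 3)             # wraps to 2 with carry 1
s, _ = um_plus(s, carry)             # 3
code_F, _ = um_plus(s, 0x43)

# 'o': $4F $6F XOR  $F0 AND  $4F OR
code_o = ((0x4F ^ 0x6F) & 0xF0) | 0x4F

# 'r': 8 6 SWAP OVER XOR 3 AND AND  $70 UM+ DROP
below, top = 6, 8                    # stack after SWAP
top = (top ^ below) & 3              # OVER XOR 3 AND
code_r, _ = um_plus(below & top, 0x70)

# 'l': 1  $6A FOR 1 UM+ DROP NEXT   (body runs $6A + 1 times)
code_l = 1
for _ in range(0x6A + 1):
    code_l, _ = um_plus(code_l, 1)

letters = "".join(chr(c) for c in (code_F, code_o, code_r, code_l))
```

Each letter only comes out right if the instructions it exercises work, which is what makes the resulting stack picture a quick visual check in a simulator trace.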
COLD starts eForth by first displaying the sign-on message ‘P24 v1.02’, and then jumps to QUIT to start the Forth interpreter.
CRR .( Hardware reset ) CRR
:: DIAGNOSE ( - )
$65 LIT
\ 'F' prove UM+ 0< \ carry, TRUE, FALSE
0 LIT 0< -2 LIT 0< \ 0 FFFF
UM+ DROP \ FFFF ( -1)
3 LIT UM+ UM+ DROP \ 3
$43 LIT UM+ DROP \ 'F'
\ 'o' logic: XOR AND OR
$4F LIT $6F LIT XOR \ 20h
$F0 LIT AND
$4F LIT OR
\ 'r' stack: DUP OVER SWAP DROP
8 LIT 6 LIT SWAP
OVER XOR 3 LIT AND AND
$70 LIT UM+ DROP \ 'r'
\ 't'-- prove BRANCH ?BRANCH
0 LIT IF $3F LIT THEN
-1 LIT IF $74 LIT ELSE $21 LIT THEN
\ 'h' -- @ ! test memory address
$68 LIT $700 LIT !
$700 LIT @
\ 'M' -- prove >R R> R@
$4D LIT >R R@ R> AND
\ 'l' -- prove 'next' can run
1 LIT $6A LIT FOR 1 LIT UM+ DROP NEXT
;;
CRR
:: COLD ( -- )
diagnose
CR ."| $LIT P24 v"
66 LIT <# # # ( CHAR . ) 2E LIT HOLD # #> TYPE
CR QUIT
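The version string in the sign-on message is built with pictured numeric output, which generates digits from right to left. The Python sketch below mimics the sequence <# # # 2E HOLD # #> under two assumptions: that the literal 66 in the listing is hexadecimal (decimal 102), and that BASE is decimal when COLD runs. The function name is hypothetical.

```python
def sign_on_version(n, base=10):
    """Mimic  <# # # 2E HOLD # #>  for the value n."""
    out = []                           # characters are added at the front

    def digit():
        nonlocal n
        n, d = divmod(n, base)
        out.insert(0, "0123456789ABCDEF"[d])

    digit()                            # #  least significant digit
    digit()                            # #
    out.insert(0, ".")                 # 2E HOLD inserts the period
    digit()                            # #  leading digit
    return "".join(out)


version = sign_on_version(0x66)
print("P24 v" + version)
```

Under those assumptions the value 0x66 yields the "1.02" quoted in the text.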
14.17 Control Structure Words
This is the set of compiler words which allows the user to build control structures in a colon word. The structures include:
Conditional:
IF … THEN
IF … ELSE … THEN
Finite loop:
FOR … NEXT
FOR … AFT … THEN … NEXT
Infinite loop:
BEGIN … AGAIN
Indefinite loop:
BEGIN … UNTIL
BEGIN … WHILE … REPEAT
These compiler directives are not compiled into a colon word like regular Forth words. Instead, they compile machine instructions such as JZ, JMP, doNEXT and >R into the colon word, with the proper address information, so that the control structures behave properly when the colon word is executed. All these words are ‘IMMEDIATE’ words, which are executed, not compiled, inside colon words.
CRR .( Structures ) CRR
:: IF ( -- A ) HERE $80000 LIT , -;' IMMEDIATE
:: FOR ( -- a ) $71E79E LIT , HERE -;' IMMEDIATE
:: BEGIN ( -- a ) HERE -;' IMMEDIATE
:: AHEAD ( -- A ) HERE 0 LIT , -;' IMMEDIATE
CRR
:: AGAIN ( a -- ) , -;' IMMEDIATE
:: THEN ( A -- ) HERE SWAP +! ;; IMMEDIATE
:: NEXT ( a -- ) COMPILE doNEXT , -;' IMMEDIATE
:: UNTIL ( a -- ) $80000 LIT + , -;' IMMEDIATE
CRR
:: REPEAT ( A a -- ) AGAIN THEN -;' IMMEDIATE
:: AFT ( a -- a A ) DROP AHEAD BEGIN SWAP ;; IMMEDIATE
:: ELSE ( A -- A ) AHEAD SWAP THEN -;' IMMEDIATE
:: WHILE ( a -- A a ) IF SWAP ;; IMMEDIATE
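How these words cooperate through the compile-time stack is easier to follow in a small model. The Python sketch below mirrors the definitions of IF, AHEAD, THEN, ELSE, BEGIN, AGAIN, UNTIL, WHILE and REPEAT given above, using $80000 as the conditional-branch opcode and 0 as the opcode of an unconditional jump. The `Compiler` class, its list-based code dictionary and the call cells used in the demonstration are hypothetical. FOR, NEXT and AFT are left out.

```python
JZ = 0x80000                           # opcode bits compiled by IF and UNTIL


class Compiler:
    def __init__(self):
        self.code = []                 # code dictionary; HERE is its length
        self.stack = []                # compile-time data stack

    def here(self):
        return len(self.code)

    def comma(self, cell):
        self.code.append(cell)

    def swap(self):
        self.stack[-1], self.stack[-2] = self.stack[-2], self.stack[-1]

    def IF(self):                      # HERE $80000 ,
        self.stack.append(self.here()); self.comma(JZ)

    def AHEAD(self):                   # HERE 0 ,
        self.stack.append(self.here()); self.comma(0)

    def THEN(self):                    # HERE SWAP +!
        self.code[self.stack.pop()] += self.here()

    def ELSE(self):                    # AHEAD SWAP THEN
        self.AHEAD(); self.swap(); self.THEN()

    def BEGIN(self):                   # HERE
        self.stack.append(self.here())

    def AGAIN(self):                   # ,
        self.comma(self.stack.pop())

    def UNTIL(self):                   # $80000 + ,
        self.comma(JZ + self.stack.pop())

    def WHILE(self):                   # IF SWAP
        self.IF(); self.swap()

    def REPEAT(self):                  # AGAIN THEN
        self.AGAIN(); self.THEN()


cond = Compiler()                      # IF <call> ELSE <call> THEN
cond.IF(); cond.comma(0x100111); cond.ELSE(); cond.comma(0x100222); cond.THEN()

loop = Compiler()                      # BEGIN <call> WHILE <call> REPEAT
loop.BEGIN(); loop.comma(0x100111); loop.WHILE(); loop.comma(0x100222); loop.REPEAT()
```

Forward branches are compiled with an empty address field and patched later by THEN, while backward branches already know their target when they are compiled.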
14.18 Redefine Macro Words
As many Forth words are actually P24 machine instructions, the P24 Metacompiler tries its best to assemble machine instructions instead of compiling subroutine calls. Macros were defined, as shown in Section 6.2, to produce optimized code in the eForth system.
However, the macros are only tools in the metacompiler, and are not available in the target system. The end users still need all these Forth words for interpreting and compiling. These words must be included in the final system as ordinary Forth words. They are defined here.
CRR .( macro words ) CRR
CODE EXIT pop drop ret
CODE EXECUTE push ret
CODE ! sta st ret
CODE @ sta ld ret
CRR
CODE R> pop sta pop lda push ret
CODE R@ pop sta pop dup push lda push ret
CODE >R sta pop push lda ret
CRR
CODE SWAP
push sta pop lda ret
CODE OVER
push dup sta pop
lda ret
CODE 2DROP
drop drop ret
CRR
CODE + add ret
CODE NOT com ret
CODE NEGATE
com 1 ldi add ret
CODE 1-
-1 ldi add ret
CODE 1+
1 ldi add ret
CRR
CODE BL
20 ldi ret
CODE +!
sta ld add st
ret
CODE -
com add 1 ldi add
ret
CRR
CODE DUP dup ret
CODE DROP drop ret
CODE AND and ret
CODE XOR xor ret
CODE COM com ret
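The definitions of NEGATE and - rely on the two's complement identity that negating a number is the same as complementing it and adding one. A short Python sketch, assuming 24-bit cells, checks that the instruction sequences above compute what their names promise. The function names simply echo the mnemonics.

```python
MASK = 0xFFFFFF                        # 24-bit cell


def com(x):
    """com: one's complement within 24 bits."""
    return ~x & MASK


def add(a, b):
    """add: 24-bit addition, carry discarded."""
    return (a + b) & MASK


def negate(n):
    """NEGATE: com 1 ldi add."""
    return add(com(n), 1)


def minus(a, b):
    """-: com add 1 ldi add, i.e. a + ~b + 1."""
    return add(add(a, com(b)), 1)


results = (negate(1), minus(5, 3), minus(3, 5))
```

Negative results appear as large unsigned values, for example FFFFFE for 3 - 5, because the model works on raw 24-bit cells.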
14.19 Final System Words
ABORT" compiles an error message. This error message is displayed when the top item on the stack is non-zero. The rest of the words in the definition are skipped and eForth re-enters the interpreter loop. This is the universal response to an error condition.
." packs and compiles a character string literal, which is printed when the word containing it is executed at run time.
$" packs and compiles a character string literal. When it is executed, only the address of the unpacked string is left on the data stack. The programmer will use this address to access the string and individual characters in the string as a string array.
CODE starts a new word containing machine code mnemonics.
CREATE defines an array in memory. Its size must be specified by ALLOT.
VARIABLE defines a variable in memory. Its initial value is 0.
( starts a comment like ( this is a comment. ) The string up to and including ) is ignored.
\ starts a comment that runs to the end of the line.
.( starts a comment which is printed on the terminal. The string up to but not including ) is printed.
IMMEDIATE marks the word last defined as ‘immediate’. Immediate words are not compiled in a colon word. They are executed to build control structures.
CRR
:: ABORT" ( -- ; <string> ) COMPILE abort" $," ;; IMMEDIATE
:: $" ( -- ; <string> ) COMPILE $"| $," ;; IMMEDIATE
:: ." ( -- ; <string> ) COMPILE ."| $," ;; IMMEDIATE
:: CODE ( -- ; <string> ) TOKEN $,n OVERT -;'
:: CREATE ( -- ; <string> ) CODE doVAR ;;
:: VARIABLE ( -- ; <string> ) CREATE 0 LIT , -;'
CRR
:: .( ( -- ) 29 LIT PARSE TYPE -;' IMMEDIATE
:: \ ( -- ) #TIB @ >IN ! ;; IMMEDIATE
:: ( 29 LIT PARSE 2DROP ;; IMMEDIATE
:: IMMEDIATE $800000 LIT LAST @ @ OR LAST @ ! ;;
CRR
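The three comment words differ only in what they do with the text they skip, as the following Python sketch tries to show. It models the terminal input buffer as a string and >IN as an index into it. The `parse` helper is a simplified stand-in for PARSE with the delimiter ) (hex 29), and all function names are hypothetical.

```python
def parse(tib, to_in, delimiter):
    """Return (text, new >IN): text up to the delimiter, >IN moved past it."""
    end = tib.find(delimiter, to_in)
    if end < 0:
        return tib[to_in:], len(tib)   # no delimiter: take the rest of the line
    return tib[to_in:end], end + 1


def paren(tib, to_in):
    """(  : parse up to ) and discard the text."""
    _, to_in = parse(tib, to_in, ")")
    return to_in


def dot_paren(tib, to_in):
    """.( : parse up to ) and hand the text back for printing."""
    return parse(tib, to_in, ")")


def backslash(tib, to_in):
    """\\ : skip the rest of the line by setting >IN to the line length."""
    return len(tib)


line = "( skip me ) DUP"
rest = line[paren(line, 2):]           # >IN is 2 after the "( " token
text, _ = dot_paren(".( FORTH Compiler ) CRR", 3)
```

In this model, interpretation of `line` resumes at `rest`, and `text` is what .( would send to the terminal.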