SPECIAL FEATURES IN LaFORTH

 

 

C. H. Ting

LaForth was developed by LaFarr Stuart and Robert L. Smith, concurrently with figForth, in 1978-79.  Both were active participants in the Forth Implementation Team, which released the figForth Model on 6 different microprocessors in 1979.  However, LaFarr was not satisfied with the figForth Model.  He put his many ideas into LaForth and used it to demonstrate the results of his experiments.  In one of the FIG meetings, he jokingly introduced himself by announcing: "I mutilating Forth".

Originally LaForth was implemented on a 6809 microprocessor.  Recently, it was moved to an 8088 under MS-DOS.  As the Silicon Valley FIG Chapter looks at various approaches to producing a Forth Model which can be readily adapted to the more powerful microprocessors of the 1990's, it has become apparent that many of the features in LaForth are very useful.  It is thus appropriate to publish LaForth in a form useful to people who might want to help produce this new Forth Model and actually move it to different processors.

Because of the many unique features in LaForth, LaFarr often claimed that LaForth is more Forth-like than Forth.  You can only appreciate this claim after experimenting with LaForth for a while.  Some of the most important features are highlighted here.

 

1.  IMMEDIATE EXECUTION

Most Forth systems process keyboard input on a line-by-line basis; i.e., they read in a whole line of text and only process the text after a carriage return is entered to signify the end of the line.  This is a hangover from the very old days when the computer knew only how to deal with punched cards, which hold up to 80 characters of text.  With personal computers, which are designed to give their full attention to their owners, forcing the computer to sleep while the user is typing in a line of text is a pure waste of its power.  The computer can and should provide much more service to the user, and help him with his typing as much as it can.

LaForth watches closely the characters the user types in.  Whenever a word-delimiting character, a space or a carriage return, is detected, it immediately goes to work.  If the word can be found in the dictionary, it is executed or compiled immediately.  If the word does not exist in the dictionary and it can be converted to a number, it is converted and the resulting number is pushed on the data stack or compiled as a literal into the dictionary.  If the number conversion fails, LaForth cannot do anything with the word, so it beeps and moves the cursor back to the beginning of the offending word to let the user type in the correct word.

LaForth thus processes the text one word at a time.  Spaces have the same effect as carriage returns: both cause LaForth to process the word just finished.  The only difference is that a carriage return causes an additional line-feed character to be echoed back to the terminal.
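The word-at-a-time behaviour described above can be modelled in a few lines.  The following Python sketch is only an illustration, not LaForth's actual code: the function name make_interpreter, the two-word dictionary, and the use of backspaces to model "moving the cursor back" are all assumptions, and only interpretation (not compilation) is modelled.

```python
def make_interpreter():
    """Return (stack, key) for a toy interpreter that acts on every delimiter."""
    stack = []
    words = {                       # hypothetical two-word dictionary
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "DUP": lambda: stack.append(stack[-1]),
    }
    pending = []                    # characters of the word being typed

    def key(ch):
        """Accept one typed character and return the characters echoed back."""
        if ch not in " \r":
            pending.append(ch)      # ordinary character: just collect it
            return ch
        word = "".join(pending)
        pending.clear()
        # A carriage return echoes an extra line feed; a space does not.
        echo = ch + ("\n" if ch == "\r" else "")
        if not word:
            return echo
        if word in words:
            words[word]()           # found in the dictionary: execute at once
            return echo
        try:
            stack.append(int(word)) # not found: try number conversion
            return echo
        except ValueError:
            # Beep and back up to the start of the offending word.
            return "\a" + "\b" * len(word)

    return stack, key


stack, key = make_interpreter()
for ch in "2 3 + ":
    key(ch)
print(stack)                        # [5]
```

Note that the sum is already on the stack when the space after + is typed; no carriage return is needed.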

Not waiting for a whole line of text allows LaForth to detect typing errors early, and thus helps the user to use the computer more efficiently.  The interaction between the user and the computer is much more intimate and intense.  Most often the user gains more confidence in the computer, and productivity improves.

 

2.  TEXT FILE AS MASS STORAGE

Since LaForth treats carriage returns similarly to spaces, it can process regular line-based text files without much additional effort.  Basically, the carriage returns in a text file can be considered equivalent to spaces.  Since LaForth stops processing a text string when it reaches a NUL character, text files can be terminated by NUL characters as well as EOF characters.
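As a rough illustration of this rule, the Python sketch below splits a text buffer into words.  The function name words_in is hypothetical; treating line feeds like carriage returns, and taking Ctrl-Z (hex 1A, the MS-DOS convention) as the EOF character, are assumptions not stated in the text.

```python
NUL = "\x00"
EOF = "\x1a"    # assumption: Ctrl-Z, the MS-DOS end-of-file convention

def words_in(text):
    """Yield the words of a text buffer, stopping at the first NUL or EOF."""
    for end_mark in (NUL, EOF):
        cut = text.find(end_mark)
        if cut != -1:
            text = text[:cut]       # ignore everything after the terminator
    # Line ends are plain word delimiters, exactly like spaces.
    for word in text.replace("\r", " ").replace("\n", " ").split(" "):
        if word:
            yield word


source = ": SQ DUP * ;\r\n5 SQ\r\n" + EOF + "ignored"
print(list(words_in(source)))       # [':', 'SQ', 'DUP', '*', ';', '5', 'SQ']
```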

Because the same text interpreter is used to process text entered from the keyboard and text read from a file, the design of the text interpreter is simple and fast.  Most text files generated by popular word processors and editors can be used by LaForth for execution and compilation.  Therefore, it is not necessary to burden LaForth with a text editor, a utility that a conventional Forth system must provide to deal with text stored in the block format.  Since the block format is not supported by most commercial editors and word processors, Forth systems must provide the editing functionality themselves, which is not a trivial application.  Most Forth block editors lack the sophistication and user friendliness that users commonly expect.  A block editor with limited functionality is the most common source of complaints and irritation for unsuspecting users.


3.  LOOPS AND CONDITIONALS IN LAFORTH

For many years, LaForth has been a private and highly experimental version of Forth, used and modified only by us.  There have been only a very few published papers and talks about LaForth, in part because we did not wish to conflict with figForth or the standardization effort.  Some of the results of LaForth have, nevertheless, had some influence on other Forth systems.  We have recently agreed to release a version of LaForth for others to use.   We anticipate that its main use will be for experimentation.  The strength of Forth is its simplicity, and in that sense we believe that LaForth is more Forth-like than Forth!  It is very easy to modify.   It is an excellent test bed for new ideas.   It is a "lean and mean" type of a Forth system, in contrast to some of the recent "fat" Forths which have come into vogue.

One of the obvious differences between LaForth and other versions of Forth is a new set of words for conditionals and loops.  The usual Forth words for conditionals, such as IF , ELSE and THEN , are, at the very least, quite confusing for beginners in Forth, whether or not they have previously learned another computer language.  Words like BEGIN and UNTIL show nesting so poorly that many Forth programmers use indentation and lots of "white space" to clarify the nesting.   In LaForth, the nesting tends to be quite obvious through the use of the left and right brackets, "[" and "]".   The bracket symbols are well-known typographical symbols that clearly suggest closure.  The set of words in this paper appears to us to be far more consistent and readable than those currently in vogue.  If you disagree, you can obviously replace our suggested words with any others of your choice.

The word [[ can be used to mark the beginning of structures previously started by the word BEGIN or Eaker's CASE.  The word ]] is our suggested replacement for the words AGAIN and REPEAT .   When used with no related conditionals it causes an indefinite repeat back to the [[ .

Three of the LaForth words have the "?" symbol immediately preceding a bracket symbol. This indicates a test of the value on the top of the stack.   Two other words use the ? symbol after a bracket to indicate an obvious "matching." We have tried to be consistent with the use of the "?" when used in conjunction with the bracket symbols.   When used for matching at the end of a conditional, it merely resolves the implied forward references and does not imply any branching back.

Thus the usual IF ... ELSE ... THEN structure turns into the sequence

 

        ?[  ...  ][  ...  ]?

 

where the ... indicates the true and false parts.  The symbol ?[ can be used not only to replace IF but also the word WHILE in either its single or multiple forms.  The "multiple while loop" has the form:

 

        [[  ...  ?[  ...  ?[  ...  ?[  ...  ]]

 

where the final word ]] causes a branch back to the [[ mark and also resolves the forward conditional branches.   An alternative form (the "andif conditional") just resolves the nested conditionals without branching back:

 

        [[  ...  ?[  ...  ?[  ...  ?[  ...  ]]?

 

The purpose of the word ]]? is to resolve the appropriate number of unresolved forward references since the occurrence of the opening mark [[ .  There is no exact equivalent for the "andif conditional" in standard Forth, but the closest functionality is the word ENDCASE from Eaker.  In the form above, one might invent a word THENS or ENDALL .  Note that the ? symbol used in either the ]? or the ]]? form implies that there is no looping back, merely a termination of a set of conditionals.  It should be noted that the word ][ can be inserted after any use of ?[ in the above examples.  For simple looping, we use the form:

 

        [[  ...  ?]

 

to replace the words BEGIN and UNTIL .

 

Although case statements are not strictly required, we found them useful enough to be included in the basic set.  By adding only one additional word we found that we could implement the entire Eaker case statement.  The suggested new word is =?[ and it has the effect of the sequence

 

        OVER  =  ?[  DROP

 

A case statement typically has the form:

 

        [[      n1  =?[  ...  ][
                n2  =?[  ...  ][
                n3  =?[  ...  ][
                default      ]]?

 

We see that we have managed to use previous words for three out of the four words generally used in a case statement.   Notice that in the default part it is likely that you will have to DROP the initial argument on the stack.

Finally we come to the counting type of loop.  Other than a simple name change, we suggest that Charles Moore's FOR - NEXT loop should replace the older DO - LOOP structure.   The simplicity it offers more than pays for the apparent slight loss of generality.  The form it takes in LaForth is simply:

 

        n  #[  ...  ]#

 

The appearance of the # sign implies a counting type of loop, and the appearance is quite similar to ?[ ... ]? in terms of a nice typographical clustering.

The above information is summarized in the following table:

 

   Construct       Usage

   Conditional     ?[   ...   ][   ...   ]?
                   ?[   ...   ]?
   Infinite Loop   [[   ...   ]]
   Until Loop      [[   ...   ?]
   While Loop      [[   ...   ?[   ...   ?[   ...   ]]
   Andif           [[   ...   ?[   ...   ?[   ...   ]]?
   Case            [[   ...   =?[   ...   ][
                        ...   =?[   ...   ][
                        ...               ]]?
   Down-count      #[   ...   ]#

 

 

The final table gives suggested pronunciations and the equivalent names of LaForth conditionals as represented in alternative Forth systems:

 

   LaForth   Pronunciation   Equivalent   Alternative

   [[        mark            CASE         BEGIN
   =?[       case            OF           OVER = IF DROP
   ?[        ifso            IF           WHILE
   ][        otherwise       ENDOF        ELSE
   ]]?       endall          ENDCASE      THEN THEN ... THEN
   ]]        repeat          AGAIN        REPEAT
   ]?        endif           THEN
   ?]        until           UNTIL
   #[        count           FOR
   ]#        down            NEXT

 

We have presented a more consistent nomenclature for looping and conditional structures than is seen in usual Forth systems. We have ignored some of the obvious additional forms to keep the presentation to a minimum.  In the While and Andif forms, additional structures using ][ and =?[ are possible, and in the Case structures one may readily add the simpler conditional structures.
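To make the forward-reference bookkeeping concrete, here is a small Python model of how the bracket words could be compiled.  It is a sketch under stated assumptions, not LaForth's compiler: the cell format, the names compile_words and run, the "0branch" and "branch" operations, and the handful of primitives are all invented for illustration, and the counting loop #[ ... ]# is left out.

```python
MARK, FWD = "mark", "fwd"

def compile_words(source):
    """Compile bracket words into [op, arg] cells with resolved branch targets."""
    code, ctrl = [], []             # compiled cells; compile-time control stack

    def emit(op, arg=None):
        code.append([op, arg])
        return len(code) - 1        # cell index, so the branch can be patched

    def patch(cells):
        for cell in cells:          # point each open forward branch at "here"
            code[cell][1] = len(code)

    def pop_to_mark():
        open_refs = []
        while ctrl[-1][0] != MARK:  # collect forward references since [[
            open_refs.append(ctrl.pop()[1])
        return ctrl.pop()[1], open_refs

    for tok in source.split():
        if tok == "[[":
            ctrl.append((MARK, len(code)))
        elif tok == "?[":
            ctrl.append((FWD, emit("0branch")))
        elif tok == "=?[":          # same effect as OVER = ?[ DROP
            emit("word", "OVER")
            emit("word", "=")
            ctrl.append((FWD, emit("0branch")))
            emit("word", "DROP")
        elif tok == "][":
            jump = emit("branch")   # skip the alternative part
            patch([ctrl.pop()[1]])
            ctrl.append((FWD, jump))
        elif tok == "]?":
            patch([ctrl.pop()[1]])
        elif tok in ("]]", "?]", "]]?"):
            start, open_refs = pop_to_mark()
            if tok == "]]":
                emit("branch", start)    # loop back unconditionally
            elif tok == "?]":
                emit("0branch", start)   # loop back until the flag is true
            patch(open_refs)             # ]]? only resolves, no branch back
        else:
            try:
                emit("lit", int(tok))
            except ValueError:
                emit("word", tok)
    return code

def run(code, stack):
    """Execute compiled cells on a data stack; true is -1, false is 0."""
    prims = {
        "DUP": lambda s: s.append(s[-1]),
        "DROP": lambda s: s.pop(),
        "OVER": lambda s: s.append(s[-2]),
        "=": lambda s: s.append(-1 if s.pop() == s.pop() else 0),
        "0=": lambda s: s.append(-1 if s.pop() == 0 else 0),
        "1-": lambda s: s.append(s.pop() - 1),
    }
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "lit":
            stack.append(arg)
        elif op == "word":
            prims[arg](stack)
        elif op == "branch":
            pc = arg
        elif op == "0branch" and stack.pop() == 0:
            pc = arg
    return stack


case = compile_words("[[ 1 =?[ 100 ][ 2 =?[ 200 ][ DROP 0 ]]?")
print(run(case, [2]))               # [200]
print(run(case, [7]))               # [0]
```

In this model ]] , ?] and ]]? share one routine and differ only in the branch, if any, that they compile before the pending forward references are resolved.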

 

4.  32-THREAD HASHED DICTIONARY

LaForth needs a very fast mechanism to find words in the dictionary so that words can be found and executed instantaneously when a space or carriage return is detected.  The dictionary is broken down into 32 different threads, and generally only one thread is traversed to locate a particular word in the dictionary.  The link addresses of the last words in the threads are kept in a table, whose address is returned by the word VOCTAB.  Words are linked through the link fields in the words.  The link field of the first word in a thread contains a zero, indicating the end of the thread.

LaForth also allows for 32 vocabularies to be declared.  Each vocabulary is assigned a vocabulary index.  The ROOT vocabulary has an index of 0, and the ASSEMBLER vocabulary has an index of 1.  Other vocabularies are assigned indices sequentially.  The hashing algorithm is very simple.  The ASCII value of the first character in a name is added to the index of the vocabulary to which the word belongs.  The resulting sum, modulo 32, is used to select the thread in VOCTAB, and the search begins with the link address thus selected from the VOCTAB table.
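The hashing rule can be written out directly.  In this Python sketch the function name thread_index is hypothetical; the arithmetic is the one described above.

```python
THREADS = 32

def thread_index(name, vocab_index):
    """Pick a thread from the first character of a name and its vocabulary index."""
    return (ord(name[0]) + vocab_index) % THREADS


print(thread_index("DUP", 0))       # 4  ('D' is ASCII 68; ROOT has index 0)
print(thread_index("DUP", 1))       # 5  (same name in ASSEMBLER, index 1)
```

Because only the first character enters the sum, all words in one vocabulary that begin with the same letter share a thread.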

Each thread links together about 20 words.  Searching for a word in a thread is thus very fast.

 

5.  UNIQUE DATA STRUCTURE IN A WORD

A word in LaForth is represented by a single address, the Code Field Address, which is returned by the dictionary search word ' , and is also what is compiled into other colon definitions.  It is also the address which EXECUTE expects.

LaForth uses the Direct Threaded Code technique to organize the information in the Code Field.  This means that the Code Field contains executable object code, rather than a pointer to a memory location where executable code is stored.  The latter method is used in most Forth systems and is known as Indirect Threaded Code.  Direct Threaded Code Forth is generally faster, because an extra level of indirection is eliminated.

In a colon definition, the first three bytes in the Code Field contain a CALL DOLIST instruction, which invokes the colon definition interpreter.  The address list of the colon definition starts from the fourth byte and is terminated by an UNNEST word.

The Link Field in a word is placed two bytes in front of the Code Field, and it contains the address of the link field of the word defined before this word in the same thread.  The link fields thus link together the words in a thread as a linear chain.  The address of the link field of the last word in a thread is stored in the vocabulary table VOCTAB.  The end of a thread is recognized because the first word defined in the thread has a zero in its link field.

The most outstanding feature of LaForth is the name field of a word.  The Name Field is a variable-length field, starting at the 3rd byte below the Code Field and extending toward lower memory.  The characters in the name are thus arranged backwards, from high memory address to low memory address.  The name field is terminated by a zero byte, or NUL character.  The null-terminated name field allows LaForth to use names of indefinite length, not limited to 31 characters as in the figForth Model.

The highest bit, bit 7, in the first character in the name field is reserved as the immediate bit.  If this bit is set, the word is an immediate word, which is executed rather than compiled when it is encountered inside a colon definition.  Non-immediate words are compiled in a colon definition.  Immediate words are used to construct control and data structures inside a colon definition.
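The layout described in this section can be modelled with a byte array.  The Python sketch below is an illustration only: the helper names add_word, name_of and find are hypothetical, the little-endian byte order of the link field is an assumption (natural for the 8088), the three zero bytes merely stand in for the CALL DOLIST instruction, and marking DROP as immediate is done purely to exercise the flag.

```python
import struct

IMMEDIATE = 0x80
CODE_STUB = bytes(3)    # placeholder for the 3-byte CALL DOLIST; not real code

def add_word(mem, here, name, prev_link, immediate=False):
    """Lay down one header at `here`; return (link_addr, code_addr, new_here)."""
    chars = bytearray(name.encode("ascii"))
    if immediate:
        chars[0] |= IMMEDIATE                   # bit 7 of the first character
    mem[here] = 0                               # NUL terminator, lowest address
    mem[here + 1:here + 1 + len(chars)] = chars[::-1]   # name stored backwards
    lfa = here + 1 + len(chars)
    struct.pack_into("<H", mem, lfa, prev_link) # 2-byte link field
    cfa = lfa + 2
    mem[cfa:cfa + 3] = CODE_STUB
    return lfa, cfa, cfa + 3

def name_of(mem, cfa):
    """Read a name downward from cfa-3 to the NUL; return (name, immediate)."""
    addr = cfa - 3
    immediate = bool(mem[addr] & IMMEDIATE)
    out = []
    while mem[addr] != 0:
        out.append(chr(mem[addr] & 0x7F))
        addr -= 1
    return "".join(out), immediate

def find(mem, thread_head, name):
    """Walk one thread from its newest word; return the code address or None."""
    lfa = thread_head
    while lfa != 0:                             # a zero link ends the thread
        if name_of(mem, lfa + 2)[0] == name:
            return lfa + 2
        lfa = struct.unpack_from("<H", mem, lfa)[0]
    return None


mem = bytearray(256)
lfa1, cfa1, here = add_word(mem, 16, "DUP", 0)
lfa2, cfa2, here = add_word(mem, here, "DROP", lfa1, immediate=True)
print(find(mem, lfa2, "DUP") == cfa1)           # True
print(name_of(mem, cfa2))                       # ('DROP', True)
```

Because the name is read downward until the NUL byte, no length byte is needed, which is what frees the name from a fixed maximum length.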

 

6.  LAFORTH WORDS BY CATEGORY

 

 

Memory Accessing Words

 

@   C@  2@  X@  XC@     P@ PC@

!   C!  2!  X!  XC! P!  PC! +!

CMOVE   FILL

 

Stack Words

 

+R  RDROP   RP! SP@ SP!

DUP ?DUP    OVER    2DUP    2OVER   ROT -ROT

DROP    2DROP   NIP SWAP    2SWAP   SWAB

>R  R>  I   2I  J

 

Math Words

 

+   -   *   /   D+  D-

1+  2+  1-  2-  2*  2/  D2* D2/

M*  UD* /MOD    UM/MOD  UDMOD/  NEG DNEG

RAND    0   1   2   3   4   -1  -2

 

Logic and Comparison Words

 

AND OR  XOR COMP    -$< $<

0=  D0= 0<> 0<  <   D<  U<  UD<

 

Interpreter Words

 

DSADDR  TEXT!   (TEXT   .NAME   .TEXT   GETCHAR -WORD

SKIP    SCAN    EXECUTE (DOS    DOS BYE MS

DECIMAL INTERPRET   RUN '  

?STACK  QUIT    WARM    COLD

 

Compiler Words

 

?DEF    ('  'PRE    'LAST   (NUM    (B

SCOMP   LIT CLIT    DLIT    ALLOT   ,   C,     

:   :CON    :BUILD  :VAR    (DEFER  (;C ;:  ;

]:  ?CSP    HERE    COMPILE ?COMP   (CALL   (JMP

HEAD,   DEFS    INSTALL :VOC    IMM ROOT

(." ("  (=?[    (?] (]  (E] (]#

 

System Variables

 

MEM INCNT   DP  LATEST  STATE   BASE    ECHO    USER

SEARCHING   GROWING VOCNUM  VOCTABLE    #VOCS

 

 
