\       Lesson 9 - Compiler Words

\       The Forth Course

\       by Richard E. Haskell

\          Dept. of Computer Science and Engineering

\          Oakland University, Rochester, MI 48309

 

comment:

 

 

 

                                Lesson 9

 

                             COMPILER WORDS

 

 

                9.1  COMPILING VS. INTERPRETING                 9-2

 

                9.2  COMPILE AND [COMPILE]                      9-5

 

                9.3  LITERALS                                   9-6

 

                9.4  CONDITIONAL COMPILER WORDS                 9-8

 

                     BEGIN...WHILE...REPEAT                     9-9

 

                     IF...ELSE...THEN                           9-10

 

                     BEGIN...AGAIN                              9-11

 

                     BEGIN...UNTIL                              9-12

 

                     DO...LOOP                                  9-13

 

                9.5  EXERCISES                                  9-14

 

 

 

 

 

 

 

 


9.1  COMPILING VS. INTERPRETING

Compiler words are IMMEDIATE words.  This means that they are executed immediately when they are encountered in a colon definition rather than being compiled into the list segment.  Immediate words have the precedence bit in the name field set (see Lesson 3, Section 3.12).

F-PC can be in one of two possible states: compiling or interpreting.  It is in the compiling state during the compilation of a colon definition.  That is, between the time the word "colon" (:) is executed and the word "semi-colon" (;) is executed.   The system variable STATE contains the following possible two values:

                TRUE  --  if compiling

                FALSE --  if interpreting

To test what state exists at different times, consider the following two definitions:

comment;

 

        : 1state?       ( -- )

                        STATE @

                        IF

                           ." Compiling"

                        ELSE

                           ." Interpreting"

                        THEN

                        CR ;

 

        : 1test         ( -- )

                        1state? ;

 

comment:

        After FLOADing lesson9 type

 

                1state?

        and

                1test

 

        Note that "interpreting" is printed in each case.  Why?

        It is because you are in the interpreting mode when you type both 1state? and 1test.

 

 

 

 

 

How can you get "Compiling" to be printed? It is necessary to have 1state? execute during the time that 1test is being compiled.  That is, we must make 1state? an immediate word.  We do this by adding the word IMMEDIATE to the efinition following the semi-colon.  Let's change the name to

        2state? and define

comment;

 

        : 2state?       ( -- )

                        STATE @

                        IF

                           ." Compiling"

                        ELSE

                           ." Interpreting"

                        THEN

                        CR ; IMMEDIATE

 

comment:

        Now type in the following colon definition:

¡@

        : 2test   2state? ;

Note that when you type this definition, the word "Compiling" is printed as soon as you press <Enter>.  That is, the word 2state? is executed immediately and does not wait for you to later type 2test.  Now type

                2test

 

Note that nothing is printed on the screen.  This is because 2state? was not compiled in the dictionary.  It was just executed immediately.  Immediate words do not get compiled in the dictionary unless forced to do so.  You can force an immediate word to be compiled, rather than be executed immediately, by preceding the word by [COMPILE].  Thus, in the following definition of 3test the word 2state? is compiled and not executed immediately.

comment;

 

        : 3test         ( -- )

                        [COMPILE] 2state? ;

 

comment:

    What do you think will be printed when you type 3test?  Try it.

 

 

 

 

 

 

It is also possible to turn the compiler off and on within a colon definition by using the two words [ and ].  The word [ is an immediate word that turns the compiling state off; that is, it returns to the interpreting mode.  The definition of [ is

        : [     ( -- )

                STATE OFF ; IMMEDIATE

 

The word ] turns the compile mode on and then enters the compiling loop.  The compiling loop consists of the following:

        DO      Get next word in input stream;

                If it is an immediate word, execute it

                Else compile it;

                If the word is not in the dictionary,

                  convert it to a number and compile it;

        UNTIL end of input stream.

¡@

        An a final example, type in the following example:

        : 4test  [ 1state? ] ;

 

        Note that when you press <Enter> the word "interpreting" is printed.

        Why?

 

 

 

 

 

 

 

 

 


9.2  COMPILE AND [COMPILE]

We have seen that [COMPILE] will compile the following immediate word into the list segment.  Its definition is

        : [COMPILE]     ( -- )

                        ' X, ; IMMEDIATE

 

The word "tick" (') will put the CFA of the next (immediate) word on the stack and the word X, compiles the integer on the stack into the next available location in the list dictionary.  Note that [COMPILE] itself is an immediate word so that it is executed at compile time of the word that contains it.

Sometimes you want to compile a word at run time.  The word COMPILE will do this.  For example, the definition of "semi-colon" is basically the following.

        : ;     ( -- )

                COMPILE UNNEST          \ compile the UNNEST routine

                REVEAL                  \ make the colon word available

                [COMPILE] [             \ go to interpreting mode

                ; IMMEDIATE             \ do ; immediately

 

Note that ; is an immediate word so that it will be executed when it is encountered in a colon definition.  It will COMPILE the CFA of the UNNEST routine in the list dictionary of the colon word.  After making the colon word available to dictionary searches by the word REVEAL it switches to the interpreting mode by executing [ which had been compiled in the definition of ; (even though [ is an immediate word) by [COMPILE].

The definition of COMPILE, which will compile the CFA of the following non-immediate word when the word containing COMPILE executes, is the following

        : COMPILE       ( -- )

                2R@                     \ get ES:SI of next CFA in list seg

                R> 2+ >R                \ inc SI past next word in list seg

                @L                      \ get CFA on next word in list seg

                ,X ;                    \ & compile it at run time

 

 

 

 

 

 

 

 


9.3  LITERALS

 

         Consider the following colon definition:

comment;

¡@

        : four+         ( n -- n+4 )

                        4 + ;

comment:

This will be compiled into the dictionary as follows:

 

                                          four+

                         ________          |

                  CFA | CODE | <------|

                          |----------|                                      _________

                  PFA |   LSO   | -------- +XSEG ------->  |     (LIT)   |   ES:0

                          |----------|                                      |-------------|

                  Code Segment ?CS:                             |        4      |   IP = ES:SI

                                                                              |-------------|

                                                                              |        +      |

                                                                              |-------------|

                                                                              | UNNEST |

                                                                              |-------------|

                                                                         List Segment XSEG

 

The word (LIT) is a code word defined as follows:

        CODE (LIT)      ( -- n )

                        LODSW ES:       \ get next word at ES:SI, SI=SI+2

                        1PUSH           \ push it on stack

                        END-CODE

 

        Thus the word (LIT) will push the number 4 on the stack and the instruction pointer ES:SI will then be pointing to the CFA of +.

 

 

 

 

If you have a number on the stack and you want to compile it as a literal in the list dictionary you can use the word LITERAL. 
This word is defined as follows:

        : LITERAL       ( n -- )

                        COMPILE (LIT)   \ compile (LIT)

                        X,              \ plus the value n

                        ; IMMEDIATE     \ immediately

 

A useful application of the word LITERAL is when you compute some constant value in a definition.  For example, suppose that the value 5 is computed as 2 + 3 (perhaps for clarity in some definition).  You might then define the word five+ as follows:

comment;

 

        : five+         ( n -- n+5 )

                        [ 3 2 + ] LITERAL + ;

¡@

comment:

        Although this produces the same answer as

¡@

        : five+         3 2 + + ;

 

the advantange is that [ 3 2 + ] LITERAL will compile the literal 5 in the list dictionary at compile time so that at run time only 5 + is executed.  On the other hand, 3 2 + + will compile both a literal 3 and a literal 2 in the list dictionary and will execute two plus operations at run time.  Therefore, the use of

        [ 3 2 + ] LITERAL produces more efficient code that executes faster.

 

 

 

 

 

 


9.4  CONDITIONAL COMPILER WORDS

The two conditional compiler words BRANCH and ?BRANCH are used to define the various conditional branching instructions in F-PC.  The word BRANCH is a code word that is defined as follows:

 

                                                                         |-------------|

                                                                         |BRANCH |

        CODE BRANCH     ( -- )                          |-------------|

                        MOV ES: SI, 0 [SI]            |-----|     addr    | IP = ES:SI

                        NEXT                                |      |-------------|

                        END-CODE                       |      |                |

                                                                  |      |-------------|

                                                                  |      |                |

                                                                  |      |-------------|

                                                                  |--->|                 | ES:addr

                                                                         |-------------|

                                                                         |                |

                                                                         |-------------|

                                                                  List Segment XSEG

 

BRANCH is compiled into the list dictionary followed by the offset address that is to be unconditionally branched to.

The word ?BRANCH will cause a branch to the address following ?BRANCH if the flag on top of the stack is FALSE.  It is defined
as follows:

 

                                                                        |--------------|

        CODE ?BRANCH     ( f -- )                     |?BRANCH|

                        POP CX                                   |--------------|

                        CX<>0                               |----|     addr     | IP = ES:SI

                        IF                                       |     |--------------|

                           ADD SI, # 2                    |     |                 |

                           NEXT                             |      |-------------|

                        THEN                                |      |                |

                        MOV ES: SI, 0 [SI]            |      |-------------|

                        NEXT                                |---> |                | ES:addr

                        END-CODE                              |-------------|

                                                                         |                |

                                                                         |-------------|

                                                                    List Segment XSEG

 

 

 

 

 

 

BEGIN...WHILE...REPEAT

As an example of a BEGIN...WHILE...REPEAT loop recall the definition of the word "factorial" given in Lesson 4:

        : factorial     ( n -- n! )

                           1 2 ROT                 \ x i n

                           BEGIN                   \ x i n

                           2DUP <=                \ x i n f

                           WHILE                   \ x i n

                           -ROT TUCK           \ n i x i

                           * SWAP                  \ n x i

                           1+ ROT                  \ x i n

                           REPEAT                 \ x i n

                           2DROP ;                 \ x

This definition will be stored in the list dictionary as follows:

                           _________

                          |      (LIT)   |

                          |--------------|

                          |        1       |         XHERE  ( -- seg offset )

                          |--------------|

                          |      (LIT)   |

                          |--------------|

                          |         2      |

        TACK        |--------------|

                          |      ROT    |

                          |--------------|

     xhere1 --->   |     2DUP   | <--- : BEGIN  XHERE NIP ; IMMEDIATE

                          |--------------|

                          |      <=       |

                          |--------------|

                          |?BRANCH| <--- : WHILE  COMPILE ?BRANCH

       xhere1        |--------------|               XHERE NIP 0 X,

     xhere2 --->   |         0       | <---|         SWAP ; IMMEDIATE

                          |--------------|       |

                          |    -ROT     |       |

                          |--------------|       |

                          |    TUCK   |       |   xhere3

                          |--------------|       |

                          |       *        |        |

                          |--------------|       |

                          |    SWAP   |       |

                          |--------------|       |

                          |       1+      |       |

                          |--------------|       |

                          |      ROT    |       |

                          |--------------|       |

                          | BRANCH |       | <--- : REPEAT  COMPILE BRANCH X,

                          |--------------|       |                XHERE -ROT

                          |    xhere1   |       |--------------  SWAP !L ;

                          |--------------|                      IMMEDIATE

     xhere3 --->   |   2DROP   |

         seg           |---------------|

      xhere2        |  UNNEST  |

                         |---------------|

                    List Segment XSEG

 

The word BEGIN leaves the offset address xhere1 on the stack.  The word WHILE compiles ?BRANCH followed by a 0 at xhere2.  This value of 0 will later be replaced with the address xhere3 of the word 2DROP.  WHILE also leaves the value of xhere2 on the stack under xhere1.  The word REPEAT compiles BRANCH and then commas in the address xhere1.  It then puts the address xhere3 on the stack and then stores it at address seg:xhere2.

IF...ELSE...THEN

 

        Consider the following colon definition:

 

        : test          ( f -- f )

                        IF

                           TRUE

                        ELSE

                           FALSE

                        THEN ;

 

This will be stored in the list dictionary as follows:

                          _________

                          |?BRANCH| <--- : IF   COMPILE ?BRANCH

                          |--------------|             XHERE NIP 0 X,

     xhere1 --->   |        0        | <-----|       ; IMMEDIATE

                          |--------------|         |

                          |    TRUE    |         | xhere3

                          |--------------|         |

                          | BRANCH |         | <--- : ELSE  COMPILE BRANCH

                          |--------------|         |              XHERE NIP 0 X,

     xhere2 --->   |         0       |<--|    |              SWAP XHERE

                          |--------------|     |    |------------  -ROT SWAP !L

     xhere3 --->   |    FALSE   |     | xhere4          ; IMMEDIATE

                          |--------------|     |

     xhere4 --->   |  UNNEST |     | <--- : THEN  XHERE

                          |--------------|     |------------  -ROT SWAP !L

                                                     ; IMMEDIATE

                       List Segment XSEG

 

The word IF compiles ?BRANCH followed by a 0 at xhere1.  This value of 0 will later be replaced with the address xhere3 of the word FALSE.  IF also leaves the value of xhere1 on the stack.

The word ELSE compiles BRANCH followed by a 0 at xhere2.  This value of 0 will later be replaced with the address xhere4 of the word UNNEST.  ELSE also leaves the value of xhere2 on the stack after putting the address xhere3 on the stack and then storing it at address seg:xhere1.

The word THEN puts the address xhere4 on the stack and then stores it at address seg:xhere2.

BEGIN...AGAIN

An example of a using BEGIN...AGAIN was given in the pop-up menu examples in Lesson 8.  The typical form was

        : main          ( -- )

                        minit

                        BEGIN

                           KEY do.key

                        AGAIN ;

¡@

This will be stored in the list dictionary as follows:

                          _________

                          |    minit    |

                          |-------------|

     xhere1 --->   |      KEY   | <--- : BEGIN  XHERE NIP ; IMMEDIATE

                          |-------------|

                          |   do.key  |

                          |-------------|

                          | BRANCH| <--- : AGAIN   COMPILE BRANCH X,

                          |-------------|                ; IMMEDIATE

                          |    xhere1  |

                          |-------------|

                          | UNNEST |

                          |-------------|

                       List Segment XSEG

 

The word BEGIN leaves the offset address xhere1 on the stack.  The word AGAIN compiles BRANCH and then commas in the address xhere1.

 

 

 

 

 

 

BEGIN...UNTIL

The following example of a using BEGIN...UNTIL was given in Lesson 4.

        : dowrite       ( -- )

                        BEGIN

                           KEY

                           DUP EMIT

                           13 =

                        UNTIL ;

 

This will be stored in the list dictionary as follows:

                          _________

      xhere1 --->   |     KEY    | <--- : BEGIN  XHERE NIP ; IMMEDIATE

                          |--------------|

                          |     DUP     |

                          |--------------|

                          |    EMIT    |

                          |--------------|

                          |     (LIT)    |

                          |--------------|

                          |       13      |

                          |--------------|

                          |              |

                          |--------------|

                          |?BRANCH| <--- : UNTIL   COMPILE ?BRANCH X,

                          |--------------|                ; IMMEDIATE

                          |    xhere1   |

                          |--------------|

                          |  UNNEST |

                          |--------------|

                       List Segment XSEG

 

The word BEGIN leaves the offset address xhere1 on the stack.  The word UNTIL compiles ?BRANCH and then commas in the address xhere1.  Note that the only difference between BEGIN...AGAIN and BEGIN...UNTIL is that in UNTIL ?BRANCH replaces the BRANCH in AGAIN.

 

 

 

 

 

 

 

 

 

 

DO...LOOP

 A DO loop will produce the following structure in the list dictionary:

                          __________

                          |     (DO)     | <--- : DO   COMPILE (DO)

                          |--------------|             XHERE NIP 0 X,

        xhere1 ---> |         0      | <-----|       ; IMMEDIATE

                          |--------------|         |

 xhere1 + 2 ---> |                  | <--|   |xhere2

                          |--------------|     |   |

                          |                 |     |   |

                          |--------------|     |   |

                          |                 |     |   |

                          |--------------|     |   |

                          |  (LOOP)   |     |   | <--- : LOOP  COMPILE (LOOP)

                          |--------------|     |   |              DUP 2+ X,

                          | xhere1+2  |----|   |              XHERE

                          |--------------|         |-----------   -ROT SWAP !L

       xhere2 ---> |                  |                    ; IMMEDIATE

                          |--------------|

 

                       List Segment XSEG

 

The word DO compiles (DO) followed by a 0 at xhere1.  This value of 0 will later be replaced with the address xhere2  of the first word following the DO loop.  DO also leaves the value of xhere1 on the stack.

The word LOOP compiles (LOOP) and then commas in the address xhere1+2.  LOOP then puts the address xhere2 on the stack and then stores it at address seg:xhere1.

        The run-time word

                (DO)  ( limit index -- )

        sets up the return stack as follows:

 

                               Return Stack

                   ___________________________