3.  ADAPTING cmFORTH TO RTX2000


 

3.1.    Indelko Implementation of cmForth

The Indelko RTX Forth Kit consists of a 100 mm x 100 mm printed circuit board and two EPROM's containing cmForth modified for RTX2000.  The user has to provide the RTX2000 chip, two static RAM chips, a 74HC04, a 74HC32, a 74HC74, a MAX232, and some resistors and capacitors.  A 12 MHz crystal is recommended to drive RTX2000 at 6 MHz.  5 V power, ground, RS-232 transmit and receive, ASIC address and data lines, and interrupts are brought to a board edge, where a 64 contact DIN C-form connector can carry them to a mother board.

When the board first powers up, the 12 MHz oscillator is divided to 6 MHz by a flipflop.  The RTX chip divides the input clock by 2 internally and thus runs at 3 MHz.  The slower speed allows cmForth to start executing from the code stored in the EPROM's.  Among other things, the boot-up procedure copies the cmForth kernel from the EPROM's to the faster SRAM's.  It then disables the EPROM's, enables the SRAM's, and disables the flipflop divider.  Now the RTX is running at 6 MHz from the SRAM's.  It then falls into a loop waiting for the user to type a 'B' character on the keyboard.  It uses the B character to determine the baud rate of the RS-232 serial line and uses this baud rate to communicate with the terminal or the host computer.

The EPROM sockets have 28 pin DIP footprints.  Up to 64 Kbytes of EPROM can be accommodated.  The SRAM sockets have 32 pin footprints to host even larger SRAM chips, up to 1 Mbyte.  150 ns SRAM's and 250 ns EPROM's are adequate for 6 MHz operation on RTX2000.

 


3.2.    Motorola Style Byte Orientation

The most significant difference between cmForth for RTX and that for NC4000 is the memory addressing method.  Addresses used on NC4000 are word addresses.  A 16 bit address can directly address 64 Kwords (128 Kbytes).  However, because the MS bit must be zero in a subroutine call instruction, Forth code can only be placed in the lower 32 Kword region.  The upper 32 Kword region can only be used to store data.

In the RTX version of cmForth, addresses are byte addresses.  Consequently, only 64 Kbytes of memory can be accessed.  All 64 Kbytes can be used to store code and data.  Although memory is addressed externally with byte addresses, internally the addresses of executable code are still word addresses.  Executable code must be aligned to word boundaries.  When we examine the object code in memory, we have to remember that addresses in the address fields of subroutine call instructions are word addresses, which are byte addresses divided by 2.
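
The relation between byte addresses and the word addresses found in call instructions can be sketched on a host computer.  The following Python model is only an illustration of the divide-by-2 rule stated above; the function names are hypothetical and are not part of cmForth.

```python
def call_address_field(byte_addr: int) -> int:
    """Word address stored in a subroutine call for a given byte address."""
    if not 0 <= byte_addr <= 0xFFFF:
        raise ValueError("byte address must fit in 16 bits")
    if byte_addr & 1:
        raise ValueError("executable code must be aligned to a word boundary")
    return byte_addr >> 1          # word address = byte address / 2


def byte_address(word_addr: int) -> int:
    """Byte address of the code a call instruction's address field points to."""
    return word_addr << 1


print(hex(call_address_field(0x2000)))  # 0x1000
print(hex(byte_address(0x1000)))        # 0x2000
```

When reading a memory dump, an address field of 1000H therefore refers to code at byte address 2000H.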

The byte addressability in RTX makes more efficient use of the memory and brings cmForth closer to the Forth-83 Standard.  It also eliminates the need to pack bytes into words and unpack words into bytes, as was required on NC4000.

The hardware byte swapping mechanism in RTX2000 allows bytes to be placed in memory using either the Intel style or the Motorola style.  To store a 16 bit word into a byte oriented memory, the Intel style puts the LS byte in a lower memory location and the MS byte in the higher memory location.  The Motorola style places the MS byte in the lower memory location.  RTX cmForth chooses the Motorola style to store 16 bit words.  This style has the advantage that when a range of memory is dumped in bytes, 16 bit numbers are displayed in a more natural and readable fashion.
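
The two byte orders can be compared with a short host-side Python sketch.  It only models the layout described above and does not touch the RTX2000 byte swapping hardware; the function name store_word is hypothetical.

```python
def store_word(value: int, style: str = "motorola") -> bytes:
    """Return the two bytes of a 16 bit word in ascending address order."""
    order = {"motorola": "big", "intel": "little"}[style]
    return (value & 0xFFFF).to_bytes(2, order)


# Motorola style: MS byte at the lower address, so a byte dump reads naturally.
print(store_word(0x1234, "motorola").hex(" "))  # 12 34
# Intel style: LS byte at the lower address.
print(store_word(0x1234, "intel").hex(" "))     # 34 12
```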


3.3.    Accessing ASIC Bus

 

NC4000 has two bidirectional I/O ports: a 16 bit B-port and a 5 bit X-port.  The B-port is a general purpose I/O port, and the X-port was intended as an extension to the memory address, allowing NC4000 to address 2 Mwords (4 Mbytes) of memory.  However, the memory extension mechanism does not work, and the X-port can only be used as a general purpose I/O port.

Both ports have 4 registers: a data register, a direction register, a mask register, and a tristate register.  These registers give the user versatility in connecting to outside peripheral devices.  NC4000 has 16 internal registers.  The B-port registers are registers 8 to 11, and the X-port registers are registers 12 to 15.

RTX2000 uses a different approach to implement the input-output devices.  Instead of dedicated registers, it uses an I/O bus, to which many peripheral devices can be attached.  Using an I/O bus makes RTX2000 an ideal CPU core in an ASIC design, because customers can add application specific peripheral devices around the RTX core and build their own chips to solve their own problems.

Harris uses RTX2000 as the bait to catch ASIC customers.  To demonstrate the speed, the power, and the versatility of RTX CPU core, Harris added several very useful devices around the core: a single cycle 16 bit integer multiplier, three 16 bit counter-timers, and an interrupt controller.  After placing these devices on the ASIC bus, there are 8 registers left unused.  These 8 unused registers are brought to the outside so that users can connect up to 8 external devices to the ASIC bus.  These external devices behave like the internal registers, and they can be accessed by RTX2000 in a single machine cycle.

The architecture uses a five bit field in the memory instructions to specify an internal/external register.  It can thus accommodate 32 registers.  Among them, the first 8 registers are used by the CPU, the next 16 registers are used to control on-chip devices, and the last 8 registers are open to external devices.  Following is a list of the registers defined in RTX2000.

 

No.    Name        Function
 0     I           Index Register.  Read/write does not pop/push the return stack.
 1     I           Index Register.  Read/write pops/pushes the return stack.
 2     I           Count Register for LOOP and REPEATS instructions.
 3     CR          Configuration Register containing setup/status information.
 4     MD          Multiply/Divide Register for step math instructions.
 5     SQ          Pseudo Register used with MD in step math instructions.
 6     SR          Square Root Register for square root math instructions.
 7     PC          Program Counter.
 8     IMR         Interrupt Mask Register to enable/disable individual interrupt requests.
 9     J/K         Stack Pointer Register.  Bits 0-7 are the data stack pointer.  Bits 8-15 are the return stack pointer.
10                 Reserved.
11     IVR         Interrupt Vector/Stack Limit Register.  Reading returns the current interrupt vector.  Writing sets the stack limits.
12     IPR         Index Page Register.  Contains the upper 5 bits of a subroutine return address.
13     DPR         Data Page Register.  Contains the memory page number for memory instructions if the DPRSEL bit is 1.
14     UPR         User Page Register.  Bits 8-11 contain the memory page number for user memory instructions.
15     CPR         Code Page Register.  Contains the memory page number for instruction fetch cycles, and for memory instructions if DPRSEL is 0.
16     IBCR        Interrupt Base/Control Register.  Contains the interrupt base address and other processor setup values.
17     UBR         User Base Register.  Points to a 32 word memory block for user memory instructions.
18                 Reserved.
19     TCR0/TPR0   Counter/Timer 0 Register.  Reading returns the current count.  Writing loads the pre-load count.
20     TCR1/TPR1   Counter/Timer 1 Register.
21     TCR2/TPR2   Counter/Timer 2 Register.
22     MLR         Multiplier Low Register.  Writing initiates an unsigned multiply cycle.  Reading returns the lower 16 bits of the product.
23     MHR         Multiplier High Register.  Writing initiates a signed multiply cycle.  Reading returns the higher 16 bits of the product.
24-31  ASIC        Bus Registers 0-7.
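
The division of the 32 register numbers into three groups can be expressed as a small lookup.  The following Python sketch merely restates the ranges given above (0-7, 8-23, 24-31); the function name and the group labels are hypothetical.

```python
def register_group(n: int) -> str:
    """Classify a 5 bit register number according to the ranges in the text."""
    if not 0 <= n <= 31:
        raise ValueError("register number must fit in the 5 bit field")
    if n < 8:
        return "CPU"
    if n < 24:
        return "on-chip device"
    return "external ASIC bus"


print(register_group(3))    # CPU
print(register_group(22))   # on-chip device
print(register_group(24))   # external ASIC bus
```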

 

In RTX cmForth, these registers are accessed using the following commands:

 

n G@    Fetch the contents of register n and push the value onto the data stack.
n G!    Pop the top of the data stack and store its value into register n.

 

Since many of the registers contain important setup and configuration information which control the functioning of RTX2000, the user should be very careful in storing new data into them.  Mistakes are generally fatal.

 


3.4.    Single Cycle Multiply

 

RTX2000 has a hardware 16 bit single cycle multiplier, which can greatly accelerate computation intensive algorithms like filters and signal processing.  This special feature makes RTX2000 look more like a DSP chip than a controller. 

 

Multiplication is initiated by writing to register 22 or 23.  The multiplicand and multiplier are taken from the TOP and NEXT registers at the top of the data stack.  The resulting 32 bit product appears in registers 22 and 23, which have to be read immediately after the initiating write.  During these three cycles the data stack is neither pushed nor popped, and the product replaces the multiplicand and multiplier.

The following words are defined in cmForth to effect multiplication:

22 G!    MULU    Initiate unsigned multiplication.
23 G!    MULS    Initiate signed multiplication.
22 G@    MLR@    Fetch low 16 bits of product.
23 G@    MHR@    Fetch high 16 bits of product.

 

A few examples of these instructions are:

: M* ( n n -- d )       MULS MLR@ MHR@ ;

: *  ( n n -- n )         MULS MLR@ MHR@ DROP ;

: UM* ( u u -- ud )     MULU MLR@ MHR@ ;
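
The stack effects of these three words can be checked on a host computer.  The following Python sketch models only the arithmetic (a 16 x 16 multiply giving a 32 bit product split into low and high words); it says nothing about cycle timing, and the Python function names are hypothetical stand-ins for the Forth words.

```python
def to_signed16(x: int) -> int:
    """Interpret the low 16 bits of x as a two's complement number."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def um_star(u1: int, u2: int) -> tuple:
    """Model of UM* : unsigned product as (low word, high word)."""
    product = (u1 & 0xFFFF) * (u2 & 0xFFFF)
    return product & 0xFFFF, product >> 16


def m_star(n1: int, n2: int) -> tuple:
    """Model of M* : signed product as (low word, high word)."""
    product = (to_signed16(n1) * to_signed16(n2)) & 0xFFFFFFFF
    return product & 0xFFFF, product >> 16


def star(n1: int, n2: int) -> int:
    """Model of * : keep the low word and drop the high word."""
    low, _high = m_star(n1, n2)
    return low


print([hex(w) for w in um_star(0xFFFF, 0xFFFF)])  # ['0x1', '0xfffe']
print([hex(w) for w in m_star(0xFFFF, 2)])        # ['0xfffe', '0xffff']
print(hex(star(300, 300)))                        # 0x5f90
```

The last line shows why * is only safe when the product fits in 16 bits: 300 times 300 is 90000, and only its low word survives.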

 

Bit 6 in the IBCR (Interrupt Base/Control Register, register 16) is the ROUND bit.  If this bit is set to 1, the lower 16 bits of the product are rounded into the higher 16 bits of the product.  Rounding is generally preferred to truncation when a sequence of multiplications is performed.
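
The difference between rounding and truncation can be illustrated with a host-side Python sketch.  It assumes that rounding means adding one half of the low word's range (8000H) before taking the high word; the text above does not spell out how the hardware treats a low word of exactly 8000H, so this is an assumption, and the function names are hypothetical.

```python
def truncated_high_word(product: int) -> int:
    """High 16 bits of a 32 bit product with the low word discarded."""
    return ((product & 0xFFFFFFFF) >> 16) & 0xFFFF


def rounded_high_word(product: int) -> int:
    """High 16 bits after rounding the low word into it (assumed round half up)."""
    return (((product & 0xFFFFFFFF) + 0x8000) >> 16) & 0xFFFF


print(hex(truncated_high_word(0x00018000)))  # 0x1
print(hex(rounded_high_word(0x00018000)))    # 0x2
print(hex(rounded_high_word(0x00017FFF)))    # 0x1
```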

 


3.5.    Timers

RTX2000 has three 16 bit counter/timers.  They can either use the processor clock as input for timing applications or take their inputs from the EI3-5 pins to perform counting operations.  As shown in Section 3.3, each counter/timer uses a register on the ASIC bus for setup and counting.

Writing to a counter/timer register loads an initial count from TOP into the counter register.  The counter is decremented with every input or clock pulse.  When the counter is decremented to zero, an interrupt is sent to the interrupt controller.  The counter is then re-initialized with the initial count previously loaded into it.

Reading a counter/timer register copies its current count to the TOP register on the data stack and counting is not affected.
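
The reload behavior described above can be sketched as a small Python model.  It is a simplified illustration only: the exact cycle in which the hardware reloads the counter is an assumption, and the class and method names are hypothetical.

```python
class CounterTimer:
    """Host-side model of one 16 bit down counter with automatic reload."""

    def __init__(self) -> None:
        self.preload = 0      # initial count written by the program
        self.count = 0        # current count
        self.interrupts = 0   # number of interrupt requests raised so far

    def write(self, n: int) -> None:
        """Model of writing the counter/timer register: load the initial count."""
        self.preload = n & 0xFFFF
        self.count = self.preload

    def read(self) -> int:
        """Model of reading the register: counting is not affected."""
        return self.count

    def pulse(self) -> None:
        """One input or clock pulse: decrement, then interrupt and reload at zero."""
        self.count = (self.count - 1) & 0xFFFF
        if self.count == 0:
            self.interrupts += 1
            self.count = self.preload


timer = CounterTimer()
timer.write(3)
for _ in range(9):
    timer.pulse()
print(timer.interrupts)  # 3
```

With an initial count of 3, the model raises one interrupt for every 3 pulses.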

Bits 8 and 9 in the Interrupt Base/Control Register (IBCR, register 16) determine the sources of clock to the counter/timers. 

 


3.6.    Interrupt Controller

Interrupts are globally enabled or disabled by bit 4 in the CR (Configuration Register).  If this bit is 1, interrupts are disabled.  If it is cleared, interrupts will be serviced.  The prioritized interrupt controller responds to 14 sources of interrupts.  The following table shows the interrupts and their priorities, among other information.

Source                  Priority   Sensitivity   IMR bit   Vector address (hex)
Nonmaskable Interrupt   0 (high)   edge          N/A       1E0
Ext. Interrupt 1        1          level         01        1C0
Stack Underflow         2          level         02        1A0
R Stack Underflow       3          level         03        180
Stack Overflow          4          level         04        160
R Stack Overflow        5          level         05        140
Ext. Interrupt 2        6          level         06        120
Timer/Counter 0         7          edge          07        100
Timer/Counter 1         8          edge          08        E0
Timer/Counter 2         9          edge          09        C0
Ext. Interrupt 3        A          level         0A        A0
Ext. Interrupt 4        B          level         0B        80
Ext. Interrupt 5        C          level         0C        60
Software Interrupt      D (low)    level         0D        40

 

The Non-Maskable Interrupt (NMI) cannot be masked.  Other interrupts can be individually masked by setting the corresponding bit in the Interrupt Mask Register (IMR).  When an interrupt is recognized, the interrupt controller generates an interrupt vector, forcing the CPU to make a subroutine call to the routine pointed to by the interrupt vector.  The interrupt vector is formed by combining bits 10-15 of the Interrupt Base/Control Register (IBCR) with the vector address in the above table.  Address bits 16-19 are always 0.  Consequently, the interrupt service routines have to reside in memory page 0.
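
The formation of the interrupt vector can be sketched in Python.  The sketch assumes that IBCR bits 10-15 supply address bits 10-15 directly and that the table entry supplies the lower bits; it also uses the observation that the table entries step down by 20H per priority level.  The function names are hypothetical.

```python
def vector_offset(priority: int) -> int:
    """Vector address from the table: 1E0H for priority 0 down to 40H for D."""
    if not 0 <= priority <= 0xD:
        raise ValueError("priority must be in the range 0 to D (hex)")
    return 0x1E0 - 0x20 * priority


def interrupt_vector(ibcr: int, priority: int) -> int:
    """Combine IBCR bits 10-15 with the table entry (assumed bit layout)."""
    return (ibcr & 0xFC00) | vector_offset(priority)


print(hex(interrupt_vector(0x0000, 0)))    # 0x1e0
print(hex(interrupt_vector(0x0400, 7)))    # 0x500
print(hex(interrupt_vector(0xFC00, 0xD)))  # 0xfc40
```

Under these assumptions, setting the base to 0400H places the Timer/Counter 0 service routine entry at 0500H in page 0.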

The Software Interrupt SWI is generated by writing to the IBCR register.  It is cleared by reading IBCR.  The following instructions are used to control SWI:

 

16 G@ DROP      Clear software interrupt.

16 G@ 16 G!         Request software interrupt but preserve contents in IBCR.

 


3.7.    Internal Stacks

NC4000 uses external stacks, which reside in external memories separated from the main memory.  The stack pointers are contained in register 0.  Bits 0-7 are the data stack pointer and bits 8-15 are the return stack pointer.  It was necessary to leave the stacks off the chip, because NC4000 did not have enough cells to implement internal stacks.  However, having external stacks is sometimes advantageous, because stack pages can be swapped conveniently and stack data can be accessed easily.

RTX2000 holds both stacks on chip, with an integral stack controller.  The stacks are controlled with two registers: a Stack Pointer Register (SPR) and a Stack Limit Register (SLR).  Bits 0-7 in SPR are the data stack pointer, and bits 8-15 are the return stack pointer.  Corresponding bits in the SLR are the upper limits to the data stack and the return stack.  The lower limits to the stacks are 0.

When a stack pointer is decremented to 0, a stack underflow interrupt is sent to the interrupt controller.  When a stack pointer is incremented past the stack limit set in SLR, a stack overflow interrupt is generated.  If these interrupts are enabled, proper interrupt service routines must be installed to respond to them.

During power-up, cmForth initializes both stack pointers to 1 and both stack limits to 255.
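
The packing of the two 8 bit fields into one 16 bit register can be sketched in Python.  The register values printed for power-up are derived from the bit layout described above, not read from the cmForth source, and the function names are hypothetical.

```python
def pack_stack_register(data: int, ret: int) -> int:
    """Bits 0-7 hold the data stack field, bits 8-15 the return stack field."""
    return ((ret & 0xFF) << 8) | (data & 0xFF)


def unpack_stack_register(value: int) -> tuple:
    """Return (data stack field, return stack field) of a 16 bit register value."""
    return value & 0xFF, (value >> 8) & 0xFF


print(hex(pack_stack_register(1, 1)))      # 0x101   both pointers at 1
print(hex(pack_stack_register(255, 255)))  # 0xffff  both limits at 255
print(unpack_stack_register(0x0312))       # (18, 3)
```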

 


3.8.    Meta Compilation

The meta compiler allows cmForth to produce a copy of a ROMmable object image into which a user can add custom application code.  To regenerate cmForth itself for testing purposes, type:

    1 LOAD      Load optimized compiler.

    3 LOAD      Load meta compiler in screen 2 and the entire cmForth.

The target object image is stored between 2000H and 2FFFH, which can be used to burn a set of EPROM's.  This image is identical to that contained in the EPROM's supplied in the Indelko Forth Kit.

Screens 8 to 23 contain the Forth kernel and the text interpreter.  Screens 24 to 30 contain the Forth compiler.  User applications should be loaded after screen 23 and before the compiler is loaded, so that the application is not compiled with the wrong version of the compiler.

 
