Interfacing to Assembly Language

The programmer is free to program in assembly, preferably as low-level Forth 'code' words, with a minimum of restrictions to avoid conflict with the Forth machine model. WREG, XHIGH and XLOW may be used as desired, but their contents are not preserved between calls. There are four general-purpose scratch registers (tempA to tempD) which may also be used, but their contents must be saved before use and restored afterwards. (In my experience these registers are rarely needed; indeed, they are only used in the high-precision division and multiplication routines.) There are convenient predefined macros and functions for accessing the two Forth stacks, should the assembly function need to interface to Forth words or push registers. The Forth words assume that the bank select register always points to the stacks bank.

Invoking Assembly Language

Words can be defined in assembly by using the word code instead of : (colon). Words defined in this way should contain a return instruction (or defer the return, for example by jumping to another word that returns) and must be ended with endcode.

Inline assembly can also be embedded into a regular Forth word by using [code]. Note that endcode is also used to delimit this inline assembly block, but in this case it does not end compilation of the word. Also, no return instruction should go into an inline assembly block.

PicForth does not have an internal assembler. All assembly words and blocks are echoed into the generated .asm file, to be assembled by MPASM or a compatible assembler.

Calling Forth Words from Assembly

Calling Forth words from assembly language is generally easy, apart from a couple of caveats. Due to restrictions on symbol naming in MPASM, some Forth symbols must be mangled to make them legal MPASM symbols (an example is +, which would otherwise be interpreted as the addition operator by the assembler). For all other symbols, an underscore is simply prepended to the Forth name. One way to avoid calling mangled symbols from assembly is to use inline assembly language. You can do something like this:
: someword
[code]
        ; do some assembly stuff here
endcode
        \ Back to forth.
        dup swap rot
[code]
        ; back to assembly
endcode
;
You must also avoid calling words defined with ::, unless you use the same calling sequence as the compiler generates. These words do not use the hardware return stack; instead they put the return address on the software return stack, so their call/return mechanism is non-standard.

Direct Access to Memory

PicForth may allocate variables in any bank, so a bank switch is necessary before a variable is accessed. PicForth variables are pointers to memory locations, with the high byte containing the bank location. Also, the big-endian byte ordering convention (high byte first) must be maintained for multi-byte integer types. An example of accessing a PicForth variable in a code word would be:
variable count   \ Declare a variable.

code getcount
   call _pushtop                ; make room on the stack
   gprbank  high _count_PTR     ; set the bank
   movfp _count_PTR, THIGH      ; get the variable high byte
   movfp _count_PTR+1, TLOW     ; get the variable low byte
   gprbank BANKSTACKS           ; reset the bank
   return
endcode
Code words should follow all Forth register assignment rules. There are numerous examples of code words in the file picf16.f.
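
The two conventions used by getcount (bank in the high byte of the variable pointer, high byte of the value stored first) can be modelled in a few lines of Python. This is only an illustrative sketch, not PicForth code: the helper names split_pointer, store_cell and fetch_cell, the dictionary used as memory, and the sample address 0x0320 are all assumptions made for the example.

```python
def split_pointer(ptr):
    """Split a 16-bit variable pointer into (bank, offset within the bank)."""
    return (ptr >> 8) & 0xFF, ptr & 0xFF


def store_cell(memory, ptr, value):
    """Store a 16-bit value big-endian: high byte at ptr, low byte at ptr + 1."""
    bank, offset = split_pointer(ptr)
    memory[(bank, offset)] = (value >> 8) & 0xFF
    memory[(bank, offset + 1)] = value & 0xFF


def fetch_cell(memory, ptr):
    """Return (high, low), mirroring the two movfp instructions in getcount."""
    bank, offset = split_pointer(ptr)
    return memory[(bank, offset)], memory[(bank, offset + 1)]


memory = {}             # toy memory, keyed by (bank, offset)
COUNT_PTR = 0x0320      # hypothetical pointer: bank 3, offset 0x20

store_cell(memory, COUNT_PTR, 0x1234)
thigh, tlow = fetch_cell(memory, COUNT_PTR)
print(split_pointer(COUNT_PTR), hex(thigh), hex(tlow))
```

In the model, the value that would end up in THIGH comes from the lower address and the value for TLOW from the address one above it, which is the ordering the assembly example relies on.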

The Forth Virtual Machine Implementation

What follows is a description of the Forth stacks and associated resources. THIGH and TLOW function as the top of stack registers and are located in unbanked memory. The rest of the stack is implemented using INDF1 (renamed STACK) and FSR1 (renamed SP). The parameter stack is located at the very end of physical memory, in bank BANKSTACKS, and grows toward low memory. During normal operation, BSR points to BANKSTACKS. BSR may temporarily be changed to directly address variables in other banks, but must be immediately reset to point to BANKSTACKS.

The low memory in BANKSTACKS is used for a secondary stack that holds loop indexes and is used for data stack manipulations. This is equivalent to the Forth return stack, but it does not normally hold return addresses, as the PIC already has a hardware return stack (not addressable). We will call it (the software one) the return stack anyway, to be consistent with Forth. The rule for using the return stack is that, since it holds the limit and index of counted loops, manipulations must retain the stack integrity within loops. That is, anything pushed onto the return stack must be popped off before the end-of-loop condition is tested, or the count will be corrupted. It would also be a bad thing to blindly drop count values off the return stack. The return stack is addressed by the FSR0 register (renamed RSP, for return stack pointer) and dereferenced through the INDF0 register (renamed RSTACK).
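
The balancing rule for the return stack can be illustrated with a small Python model. It is a sketch under stated assumptions, not a description of how PicForth compiles loops: the layout (limit pushed first, index on top), the function run_loop and the two sample loop bodies are hypothetical.

```python
def run_loop(limit, body):
    """Run a counted loop whose limit and index live on a software return stack.

    Returns the number of times the body was executed.
    """
    rstack = [limit, 0]              # assumed layout: limit below, index on top
    iterations = 0
    while iterations < 100:          # safety guard for the model only
        body(rstack)
        iterations += 1
        index = rstack.pop() + 1     # end-of-loop test takes the top item as index
        if index >= rstack[-1]:      # and the item below it as the limit
            rstack.pop()             # loop finished: discard the limit
            break
        rstack.append(index)
    return iterations


def balanced_body(rstack):
    rstack.append(99)                # temporary pushed inside the loop...
    rstack.pop()                     # ...and popped before the loop test


def unbalanced_body(rstack):
    rstack.append(99)                # temporary left on the return stack


good = run_loop(3, balanced_body)
bad = run_loop(3, unbalanced_body)
print(good, bad)
```

With the balanced body the loop runs the requested three times. With the unbalanced body the end-of-loop test mistakes the leftover temporary for the index, and in this model the loop exits after a single pass.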

Variables are automatically allocated to the memory banks.

Registers and Indirect Memory Access

If indirect memory access is needed, you will need to push the RSP register and disable interrupts; the FSR0 and INDF0 registers can then be used for indirect access as usual. It is suggested that you not use FSR1 and INDF1 for indirect operations, and their names have been removed from the include file.

Due to the implementation of the stack, the two highest locations in memory (FFh and FEh) never receive a valid stack value. A push operation with the stack empty pushes whatever transient value happens to be in the top-of-stack registers into FFh and FEh. Those memory cells therefore become two valuable general-purpose registers, XHIGH and XLOW. No assumption about the value of the X registers should be made between calls, and it is important to realize that a stack push when the stack is empty will corrupt the X registers. The supplied set of Forth words makes extensive use of the X registers as scratch registers.
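
The following Python model sketches why FFh and FEh never hold a valid stack item when the top of stack is cached in registers. It is an illustration under assumptions, not the actual implementation: the class ParamStack, the choice of a pointer that addresses the next free cell, the order in which the two bytes are spilled, and the "transient" values 0xAB and 0xCD are all made up for the example.

```python
class ParamStack:
    """Toy 16-bit stack with the top item cached in THIGH/TLOW."""

    def __init__(self, transient=(0xAB, 0xCD)):
        self.mem = [0] * 256
        self.sp = 0xFF                        # next free cell; grows toward low memory
        self.thigh, self.tlow = transient     # meaningless while the stack is empty

    def push(self, value):
        # Spill the old top-of-stack registers to memory, then load the new value.
        self.mem[self.sp] = self.tlow
        self.sp -= 1
        self.mem[self.sp] = self.thigh
        self.sp -= 1
        self.thigh = (value >> 8) & 0xFF
        self.tlow = value & 0xFF

    def pop(self):
        # Hand back the cached top, then refill the cache from memory.
        value = (self.thigh << 8) | self.tlow
        self.sp += 1
        self.thigh = self.mem[self.sp]
        self.sp += 1
        self.tlow = self.mem[self.sp]
        return value


stack = ParamStack()
stack.mem[0xFF] = 0x11        # use the cell as an X scratch register while empty
stack.push(0x1234)            # first push spills the transient registers to FEh/FFh
stack.push(0x5678)
x_cells = (stack.mem[0xFE], stack.mem[0xFF])
popped = (stack.pop(), stack.pop())
print([hex(c) for c in x_cells], [hex(p) for p in popped])
```

In the model the first push overwrites the scratch value 0x11 with whatever was in the top-of-stack registers, while the real data (0x1234 and 0x5678) never passes through FEh or FFh.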

There are also four temporary registers in BANKSTACKS: tempA through tempD. These temporary registers should be saved before use and restored afterwards.

Special Considerations due to Interrupts

Since interrupt handlers are free to use the stacks, there are some special considerations for words defined in assembly language if interrupts are to be used. Pushes and pops to the stacks must be atomic operations; that is, the stack pointers must not be used to index up and down through valid stack items. If an interrupt does occur while a pointer is displaced, stack values above the stack pointer will change. (I inadvertently did this once, and it created the most subtle and insidious bug of this project. Don't make the same mistake.) As an example of proper indexing through the stack, see the code for PICK.

Operations that use one of the stack pointers for indirect addressing must follow these steps:

  1. disable the interrupts (with disable_ints)
  2. temporarily save the stack pointer (SP or RSP)
  3. perform the operation
  4. restore the stack pointer
  5. call enable_ints
For an example see the code for '!' and '@'.
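
The hazard and the five steps can be sketched with a toy Python model. Everything in it is an assumption made for illustration: the Machine class, the 16-cell memory, a stack pointer that addresses the next free cell, and an interrupt handler that pushes two temporaries. The methods disable_ints and enable_ints are Python stand-ins named after the routines mentioned above, not their real implementation.

```python
class Machine:
    def __init__(self):
        self.mem = [0] * 16
        self.sp = 15                  # next free cell; stack grows toward low memory
        self.ints_enabled = True
        self.pending = False

    def push(self, value):
        self.mem[self.sp] = value
        self.sp -= 1

    def interrupt(self):
        """Handler that uses the stack for two temporaries, then releases them."""
        if not self.ints_enabled:
            self.pending = True       # held off until interrupts are re-enabled
            return
        saved = self.sp
        self.push(0xEE)
        self.push(0xEE)
        self.sp = saved

    def disable_ints(self):
        self.ints_enabled = False

    def enable_ints(self):
        self.ints_enabled = True
        if self.pending:
            self.pending = False
            self.interrupt()


def peek_third(m, guarded):
    """Read the third stack item by moving SP itself, with or without the guard."""
    if guarded:
        m.disable_ints()              # step 1
    saved_sp = m.sp                   # step 2
    m.sp += 3                         # step 3: SP displaced into valid items
    m.interrupt()                     # an interrupt request arrives right now
    value = m.mem[m.sp]
    m.sp = saved_sp                   # step 4
    if guarded:
        m.enable_ints()               # step 5
    return value


def fresh():
    m = Machine()
    for v in (10, 20, 30):
        m.push(v)
    return m


safe, unsafe = fresh(), fresh()
safe_value = peek_third(safe, guarded=True)
unsafe_value = peek_third(unsafe, guarded=False)
print(safe_value, safe.mem[13:16], hex(unsafe_value), unsafe.mem[13:16])
```

In the guarded run the interrupt is deferred until the pointer has been restored, so the three items survive. In the unguarded run the handler treats the displaced pointer as the true top of stack and overwrites valid items.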

Tail Call Optimisations

If you go through PICF16.F or through assembly code generated by the compiler, you will find that procedures are often missing a return instruction. This only happens when the last thing the procedure does is call another procedure. In that case it is sufficient to jump to the other procedure instead of calling it; the return instruction becomes unnecessary, because the other procedure's own return goes straight back to the original caller. This is called a "tail call optimisation," and is used whenever possible by the compiler. You may want to use the same technique in your code definitions.
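
The effect of the optimisation can be shown with a toy interpreter written in Python. This is a model, not PIC code: the instruction names emit, call, goto and return, the run function and the word names main, outer and inner are hypothetical, and the list rstack merely stands in for the hardware return stack.

```python
def run(program, entry):
    """Execute a toy program; return (trace of emitted values, deepest return-stack use)."""
    rstack, trace, max_depth = [], [], 0
    word, pc = entry, 0
    while True:
        op, arg = program[word][pc]
        pc += 1
        if op == "emit":
            trace.append(arg)
        elif op == "call":
            rstack.append((word, pc))            # remember where to come back to
            max_depth = max(max_depth, len(rstack))
            word, pc = arg, 0
        elif op == "goto":
            word, pc = arg, 0                    # jump: nothing is pushed
        elif op == "return":
            if not rstack:
                return trace, max_depth          # returning from the entry word ends the run
            word, pc = rstack.pop()


inner = [("emit", "inner"), ("return", None)]
main = [("call", "outer"), ("emit", "done"), ("return", None)]

plain = {
    "main": main,
    "outer": [("emit", "outer"), ("call", "inner"), ("return", None)],
    "inner": inner,
}
optimised = {
    "main": main,
    "outer": [("emit", "outer"), ("goto", "inner")],   # tail call: no return needed
    "inner": inner,
}

plain_result = run(plain, "main")
optimised_result = run(optimised, "main")
print(plain_result, optimised_result)
```

Both versions produce the same trace, but the optimised one uses one level less of the return stack and one instruction fewer, which matters on a part with a small, fixed-depth hardware stack.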