
Using the e4thcom MSP430 assembler with Mecrisp Forth

Jon McAuliffe
e4thcom MSP430 Assembler - This article is part of a series.
Part 1: This Article

Introduction

In this short series of posts, we’ll look at using an assembler with Mecrisp Forth on the MSP430 family of devices. You’ll need an MSP430 board with Mecrisp Forth installed (the mecrisp-2.0.7 download) and a serial port connection to it. You’ll also need a working install of e4thcom, the forth terminal communication program.

Why you might want to use the assembler

The main reasons to use the assembler are execution speed and code size. MSP430 devices typically have relatively low clock speeds and small flash memories, so any opportunity to reduce resource use is valuable, especially on hot code paths. We’ll get to an example of this in a bit (along with some benchmarks). Whilst the code that forth itself compiles is generally fast, and Mecrisp does attempt to optimise the code it generates somewhat, having direct control over the instructions used for some words can really improve performance.

A bit about forth

Without getting too much into the details, forth functions are called ‘words’ and live in the ‘dictionary’, which is implemented as a linked list of words. New words can be defined as combinations of existing words, and they are added to the dictionary as they are defined. The forth system compiles a new word as calls to these existing words (and inlines some things). Parameters are placed on the data stack before the word is called and are consumed by the word; return values are placed back on the same stack.
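As a minimal sketch of that idea, here are two ordinary colon definitions. The names square and sum-of-squares are made up for this example and are not part of Mecrisp; only standard words (dup, swap, *, +, .) are assumed.

```forth
\ A new word built only from existing words.
: square ( n -- n*n )  dup * ;

\ This word reuses square. It consumes two values from the
\ data stack and leaves a single result in their place.
: sum-of-squares ( a b -- a*a+b*b )  square swap square + ;

3 square .            \ prints 9
3 4 sum-of-squares .  \ prints 25
```

Both words are compiled by the forth system itself; nothing here touches the assembler yet.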

What we are aiming to do here is bypass the forth compiler when building the body of a new word, and instead use an assembler to place machine instructions directly in the word’s dictionary entry. For this we first need to define two new words, code and end-code.

Run e4thcom with the basic Mecrisp plugin (your baud rate and device may be different):

e4thcom -t mecrisp -d ttyUSB0 -b B115200

The following definitions can be typed directly into the forth interpreter.

: code      ( "name" -- )   :  postpone [ ;  ok.
: end-code  ( -- )          ] postpone ;  ;  ok.

The purpose of code is to start a new named dictionary entry. Instead of compiling the body from forth source (as in a regular ‘colon’ definition), it leaves the dictionary entry half complete, ready for us to insert the instructions manually as hex values. Once that is done, end-code cleans up and finishes the dictionary entry (including appending a RET instruction to the end of the definition).

Don’t type the ok. shown at the end of each line. It is the response from the forth interpreter after you press Enter at the end of the word definition (after the closing ;).

The text inside the round brackets ( ) in both definitions is a stack comment showing what is expected on the stack before the call and what is left on the stack once the word has completed. This is a forth convention and doesn’t generate any code.

We can now use the newly defined code/end-code words to create a new word, plus_one. This word consists of two 16-bit hex values, $5394 and $0000 (the machine code for adding 1 to the value at the data stack pointer address). The , word writes the value on top of the stack to the address held in the dictionary pointer (‘here’), then advances the dictionary pointer by 2 bytes, ready for the next entry. We can then test the word (. prints the current top of stack, which is the return value of the word):

code plus_one $5394 , 0 , end-code  ok.
1 plus_one . 2  ok.

If we create a helper word, dump-cells, we can print out values from memory. The word ' gives us the execution token (the code address, much like a pointer) of a word, so we can see that we have in fact created a dictionary entry with the correct machine code ($4130 is the machine code for the RET instruction). We can also compare this to the built-in Mecrisp Forth word with exactly the same functionality, 1+.

: dump-cells ( addr u -- ) 0 do dup @ hex. 2+ loop drop ;
' plus_one 3 dump-cells
5394 0000 4130  ok.
' 1+ 3 dump-cells
5394 0000 4130  ok.
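To see why those cells mean what they do, the values can be pulled apart by hand. The following is a sketch that assumes the standard MSP430 two-operand instruction layout (opcode in bits 15..12, source register in bits 11..8, Ad in bit 7, B/W in bit 6, As in bits 5..4, destination register in bits 3..0) and that rshift and and behave as in standard forth. The i-... helper names are hypothetical and exist only for this example.

```forth
\ Field extractors for a two-operand MSP430 instruction word.
: i-op  ( instr -- n )  12 rshift ;         \ bits 15..12: opcode
: i-src ( instr -- n )  8 rshift $F and ;   \ bits 11..8: source register
: i-ad  ( instr -- n )  7 rshift 1 and ;    \ bit 7: destination addressing mode
: i-bw  ( instr -- n )  6 rshift 1 and ;    \ bit 6: 0 = word, 1 = byte
: i-as  ( instr -- n )  4 rshift 3 and ;    \ bits 5..4: source addressing mode
: i-dst ( instr -- n )  $F and ;            \ bits 3..0: destination register

$5394 i-op  .   \ prints 5  (ADD)
$5394 i-src .   \ prints 3  (R3, the constant generator)
$5394 i-as  .   \ prints 1  (with R3 this selects the constant 1)
$5394 i-ad  .   \ prints 1  (indexed destination, offset in the next cell)
$5394 i-dst .   \ prints 4  (R4)

$4130 i-op  .   \ prints 4  (MOV)
$4130 i-src .   \ prints 1  (R1, the hardware stack pointer)
$4130 i-as  .   \ prints 3  (indirect with post-increment)
$4130 i-dst .   \ prints 0  (R0, the program counter)
```

Read this way, $5394 $0000 is ADD #1, 0(R4), where the $0000 cell is the index offset, and $4130 is MOV @SP+, PC, which is how RET is encoded.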

This is all well and good, but it’s not very convenient to have to work out the machine code for everything by hand. We still need the actual assembler, and this is where e4thcom comes in. If we run e4thcom with the assembler/disassembler enabled for Mecrisp, a second thing happens when we use code and end-code.

e4thcom -t mecrisp-msp430xas -d ttyUSB0 -b B115200

code and end-code still behave the same on the target device, but everything in between is instead interpreted by the assembler running in e4thcom and then sent to the target device as a list of hex values and commas, much like our hand-crafted example. One minor inconvenience is that this hijacking of the text between code and end-code only works when including the source from a file, not when you are typing interactively in the forth interpreter (the code <name> and end-code also have to be on their own lines).

Create a file simple.fs:

code 1++
 #1 0 sp x) add
end-code

Import into the forth environment:

#i simple.fs

We again see the expected behaviour and the same machine instructions, only this time the source looks a little more like traditional assembly.

1 1++ . 2  ok.
' 1++ 3 dump-cells
5394 0000 4130  ok.

Since this assembler is also running in forth, the operand order differs from conventional assembly: the operands come first and the instruction comes last. There isn’t really a good source of documentation for this other than the source of the assembler itself (msp430xas.efx in the e4thcom distribution). LLMs can be helpful in this case.
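As a rough illustration, here is the single instruction from simple.fs next to its conventional form. This is a reading inferred from the machine code it produced ($5394 $0000), not from any assembler documentation, and the per-token notes are an interpretation of this one example only.

```forth
\ Conventional MSP430 syntax (mnemonic first):
\     add  #1, 0(r4)
\
\ e4thcom assembler syntax from simple.fs (mnemonic last):
\     #1 0 sp x) add
\
\ #1        source operand: the constant 1
\ 0 sp x)   destination operand: offset 0 from the data stack pointer
\ add       the instruction, written after its operands
```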

In the next part we’ll flesh out the syntax of the forth assembler a bit more, then look at some sample programs that demonstrate features you might like to use.
