K4 Blog
K4 now
In the last post we covered some basics of how K4 accepts input, looks up words in a dictionary, executes them, and even how new words are defined. In this post I’ll talk about what K4 has been developed for, and how you could run it for yourself.
I want to develop this blog as a story of K4. This project has been going now for about six months. The language was prototyped in C++ (on OSX) and has been ported to Linux, our ESP32 BOB boards and the ATSAMD51 (ARM Cortex-M4F) CPUs in our EDDIE project. Work is underway to put a miniaturized version on the ATSAMD21 (Cortex-M0+) CPUs we use.
At this time, K4 is being expanded to allow for:
Improved string handling (eg comparison, appending and copying)
Bootstrapping behaviour to allow loading of dictionaries and applications on start-up
Interrupts for reactive and real-time applications
Local variables and arguments for words: simplifies stack handling in frequently used words.
Multiple vocabularies (aka dictionary namespaces)
Rudimentary “objects” which is very experimental
Word linking to ease building complex data structures
Optionally embedding a web server.
Variable and object persistence: saving and loading of the current values of some words to NV memory
“System” interfaces - dictionaries unique to particular ports:
K4-Kairos - adds seven segment display drivers, WiFi
K4-51 - Control of Eddie busses and communications interfaces
etc.
K4 is a very extensive and exciting project. I will bring the blog up to date with its complete features. These blogs read backwards - the latest blog entry is at the top of the page, so it is hard to make it read as a story - so please bear with me and we will work it through.
Let’s build a real K4 program
The saying is that the “proof is in the pudding”. So we present some “pudding” in the form of a two player tic-tac-toe game. Two player means that two people play each other and the computer is not your opponent. A version with a computer opponent would be an interesting project later.
I will break the code down and explain each part. The final code will be available on our github repo at (????).
1. Initializing: Our board consists of nine cells, so we need to declare an array of nine elements (bytes). We also need byte sized constants for an EMPTY cell, a cell with an O in it and a cell with an X in it:
." stdext.k4" include
#
variable board 9 allotc
#
# Cell values for empty, x and o
#
0 constant EMPTY
1 constant O
2 constant X
This code loads the K4 standard extensions: K4 comes with a lot of built-in words. These are compiled from C++ code into the K4 binary. There is good reason to standardise other useful words (for portability, readability etc). These are being standardised into a file ‘stdext.k4’ that comes with K4:
#
# K4 Standard Extensions (non-built-in)
# (use: '." stdext.k4" include' to read into your dictionary)
# ----------------------------------------
#
# <n> historic : print the nth last item in the history
#
: historic $cmd $> . ;
#
# .r, .xr, ..r, ..xr - prints with returns
#
: nl 10 emit ;
: .r . nl ;
: .xr .x nl ;
: ..r .. nl ;
: ..xr ..x nl ;
# tab : emits a tab character
: tab 9 emit ;
If you care to study this code, you will see words for emitting a new line and nicer versions of the ‘dot’ words that print the stack.
After including this code, we initialise a variable ‘board’ with 9 bytes of storage. The reason should be obvious that this is an array containing all the cells of our tic-tac-toe board.
Lastly, we declare constant values that are stored in cells of the board for a nought, a cross and an empty cell.
You will notice that any line that starts with ‘#’ in the leftmost column is a comment line. At this time this is the only way that a comment can be put into a K4 program - and I think it’s enough of a shortcoming to warrant fixing at some point in the near future!
2. Let’s have some way of displaying a cell! We write another word ‘xo’ that takes a cell value and prints ‘X’, ‘O’ or ‘ ’. We have two ways of addressing cells in the program: the raw way numbers the cells 0 to 8 left to right, top to bottom. In a moment we will introduce a row/column method of addressing the cells instead.
#
# given a cell value (X, O or EMPTY) print
# 'X', 'O' or ' '. If the cell value
# is not expected, print '?'
#
: xo 1 0 {
1 () X = if
\X emit
else
1 () O = if
\O emit
else
1 () EMPTY = if
32 emit
else
\? emit
then
then
then
} ;
This section of code introduces a couple of important K4 concepts. The first is our invention called “frames”, and the second is the ‘if-else-then’ construct.
The ‘if’ statement takes a true/false value off the stack. If the value is true, execution continues at the word after the ‘if’. If the value is false, execution continues after the ‘else’ statement. (If there is no ‘else’ then it continues after the ‘then’ at the end of the block.) Note that when the true branch reaches the ‘else’, execution skips to after the ‘then’; it doesn’t proceed into the ‘else’ part.
Frames are a more difficult concept. Frames begin at ‘{‘ words and end at ‘}’ words. Preceding the ‘{‘ you will see two numbers:
<args> <locals> { … }
The <args> number is the number of stack entries expected before the frame to treat as ‘arguments’ inside the frame code. The <locals> number allocates this many entries on the stack that can be directly addressed inside the frame. When the frame ends at ‘}’ the stack is returned to the position it was at before the frame started. We can address arguments with:
<n> ()
Which pushes a copy of the nth argument onto the stack. (Arguments are numbered 1 to n, with 1 being the top of stack at the start of the frame, 2 being the next down and so on.) We address local variables with:
# get the nth local and push a copy onto the stack
<n> #ld
# pop the stack and store it in the nth local.
<n> #st
The code here doesn’t yet use locals; we meet them a bit later in the program.
So this code takes the argument it was supplied and compares it to the constant X. If it is an X, the program emits the ‘X’ character. If it isn’t, it has another look at the first argument and compares it to the O constant and likewise emits an ‘O’ if it is. If it isn’t that, it checks if the cell is empty using the same expressions and emits a space if it is. If it’s still none of those it prints ‘?’ to signal its confusion at the odd value in the cell.
3. Some little worker words. These words do what it says on the tin:
#
# <col> <row> cell - push cell number
# (col, row range is 0..2)
#
: cell 3 * + ;
#
# <col> <row> cell@ - push the value at cell
#
: cell@ cell board _ @c ;
: xcell@ cell board _ @c dup 0 = if drop 4 then ;
#
# <mark> <col> <row> cell! - mark the board
# at row, col with X, O or EMPTY
#
: cell! cell board _ !c ;
#
# <col> <row> .cell - print the mark on the
# board at <col> and <row>
#
: .cell cell@ xo ;
The word ‘cell’ multiplies a row number by 3 and adds it to the column number, magically providing a number from 0 to 8 to address the cells on the board. This is our second way of addressing the board.
The word ‘cell@’ takes the column and row on the stack and calculates the cell number. The expression ‘board _’ then adds the cell number to the board address to get the cell address. Finally ‘@c’ loads the value at that address and pushes it onto the stack. The word ‘xcell@’ is similar, however if the cell value is EMPTY (0), it returns 4 on the stack - the reason for this will be explained shortly.
The word ‘cell!’ does the reverse work of ‘cell@’ - it calculates the cell address from the row, column and board. Then it stores the value (X, O or EMPTY), which sits lowest on the stack (below the row/col values), into the cell.
Finally ‘.cell’ uses the row and column to get the cell value. Then it uses the previous ‘xo’ word to display its symbol.
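As a rough illustration of the row/column addressing, here is a small sketch in Python (not K4, purely so the arithmetic can be checked). The names `cell`, `cell_fetch` and `cell_store` are stand-ins I have made up for the K4 words ‘cell’, ‘cell@’ and ‘cell!’, and the byte array stands in for ‘board’.

```python
EMPTY, O, X = 0, 1, 2          # same values as the K4 constants

board = bytearray(9)            # nine byte-sized cells, all EMPTY


def cell(col, row):
    """Mirror of ': cell 3 * + ;' -> row * 3 + col."""
    return row * 3 + col


def cell_fetch(col, row):
    """Stand-in for 'cell@': read the mark at (col, row)."""
    return board[cell(col, row)]


def cell_store(mark, col, row):
    """Stand-in for 'cell!': write a mark at (col, row)."""
    board[cell(col, row)] = mark


cell_store(X, 2, 1)             # X in column 2, row 1
print(cell(2, 1))               # 5
print(cell_fetch(2, 1) == X)    # True
```

Walking the rows top to bottom and the columns left to right visits cell numbers 0 through 8 in order, which is why both addressing schemes describe the same board.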
4. Displaying the board. We can put all the previous words together and display the board:
#
# display the board with the # markers
#
: bar ." -+-+-\n" ;
: board?
X 0 0
3 0 do
3 0 do
I J .cell
I 2 < if
\| emit
then
loop
10 emit
I 2 = not if
bar .
then
loop
nl ;
Output (for example)… (this uses commands to set cells on the board and display it)
The ‘bar’ word defines a string literal that we can load just by invoking the ‘bar’ word. Typing ‘bar .’ would print ‘-+-+-’.
The ‘board?’ word uses two ‘do’ loops that we haven’t discussed before. A do loop is of the form:
<end> <start> do <words> loop
Given a starting value, the do loop counts up until its loop value is equal to or greater than the end value. Usefully, the word ‘I’ pushes a copy of the loop value for the nearest loop. The word ‘J’ pushes a copy of the loop value in the next loop out. (There are no words for even deeper nested loops, but there are ways around this issue.) Note also that ‘do’ loops (like if statements and other flow control constructs) can only be compiled, not interpreted on the command line.
So we see two loops, then the line ‘I J .cell’ which uses the loop values as the row and column to get the cell value.
Crafty tests with ‘if’ statements place the vertical bars and the horizontal lines between cells. See that prior to each if statement there is a construct like:
<expr1> <expr2> <comparison> [ <not> ] if …
The comparison could be ‘=’, ‘<’ or ‘>’ (from the available built-in words). This reads as <expr1> compared to <expr2>, and the result is used by the if. We can invert the comparison logic with the word ‘not’. If you think about it, ‘<=’ is the same as ‘not >’, so only the three simple comparators are needed.
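To make the loop structure of ‘board?’ easier to follow, here is a Python sketch of the same nesting. It is an approximation under my own assumptions: `render` and `SYMBOLS` are made-up names, and the function builds a string rather than emitting characters one at a time.

```python
EMPTY, O, X = 0, 1, 2
SYMBOLS = {EMPTY: " ", O: "O", X: "X"}


def render(board):
    """Return the board as text, following the loop structure of 'board?'."""
    lines = []
    for row in range(3):                      # outer loop (J, seen from the inner loop)
        line = ""
        for col in range(3):                  # inner loop (I)
            line += SYMBOLS.get(board[row * 3 + col], "?")
            if col < 2:                       # 'I 2 < if': bar between cells only
                line += "|"
        lines.append(line)
        if row != 2:                          # 'I 2 = not if': rule between rows only
            lines.append("-+-+-")
    return "\n".join(lines)


print(render([X, EMPTY, O,
              EMPTY, X, EMPTY,
              O, EMPTY, X]))
```

The two tests are what keep separators strictly between cells and between rows, so the board has no trailing bar on the right and no rule underneath the last row.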
5. Clearing the board for a new game:
#
# Empty the board for a new game
#
: newgame 9 0 do EMPTY board I _ !c loop ;
This is very simple code to fill every cell in the board with the EMPTY value by using a loop, ‘_’ for indexing and ‘!c’ to store the value.
6. Testing rows, columns and diagonals for a winning situation. This big chunk of code creates words for the three possible winning permutations on the board:
#
# <row> checkrow
# push X, O or EMPTY if X wins row, O wins
# or nobody's winning
#
: checkrow 1 1 {
0 1 #st
3 0 do
I 1 () xcell@ 1 #ld or 1 #st
loop
1 #ld X = 1 #ld O = or not
if
EMPTY 1 #st
then
1 #ld
1 ->} ;
#
: checkrows
0
3 0 do
I checkrow
ifnz
swap drop leave
else
drop
then
loop ;
#
This first chunk introduces more interesting aspects of K4. First of all, we actually make use of a single local variable inside the frame. We start by setting the local variable to zero with ‘0 1 #st’, which puts 0 on the stack and stores it in local 1. Then, inside the do loop, we get each cell value by combining the input argument ‘1 ()’ as the row number with I from the do loop as the column number. Notice we are using ‘xcell@’ here, which returns the value 4 for an empty cell. A row of all ‘O’ values (1) ORed together gives 1, which is ‘O’. Likewise a row of all ‘X’ values gives 2, which is ‘X’. If the row contains both ‘X’ and ‘O’ we OR together 2 and 1 and get 3. If the row has any empty cell the 4 bit is set, so we get 4, 5, 6 or 7. Only the values 1 and 2 represent rows that are all O’s or all X’s. Outside the do loop, the statement ‘1 #ld X = 1 #ld O = or not’ reads as ‘push true if the result is neither X nor O’. The if statement then replaces the local with EMPTY, which is what we want to return if nobody’s winning. Finally ‘1 #ld’ puts the result on the stack.
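The OR trick is easier to trust once you see all the cases. The following Python sketch reproduces the logic of ‘checkrow’ for any three cell values, using the constants as declared at the top of the program (O is 1, X is 2). The names `line_winner` and `EMPTY_MARK` are my own and do not exist in K4.

```python
EMPTY, O, X = 0, 1, 2
EMPTY_MARK = 4          # what 'xcell@' substitutes for an EMPTY cell


def line_winner(cells):
    """OR three cell values together; return X, O or EMPTY (no winner)."""
    acc = 0
    for value in cells:
        acc |= EMPTY_MARK if value == EMPTY else value
    return acc if acc in (X, O) else EMPTY


print(line_winner([X, X, X]))          # 2 (X wins)
print(line_winner([O, O, O]))          # 1 (O wins)
print(line_winner([X, O, X]))          # 0 (mixed: 2 | 1 == 3)
print(line_winner([X, EMPTY, X]))      # 0 (gap: 2 | 4 == 6)
```

Substituting 4 for an empty cell is what makes this work: with EMPTY left as 0, a row such as X, EMPTY, X would OR together to 2 and look like a win.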
Then there is a more curious ending to the frame. Previously I said that frames were ended with ‘}’. But what if we want to return a value from a frame’s calculation? In this case we can use:
<n> ->}
This construct tells K4 to put aside the <n> values at the top of the stack, reset the stack to the state before the frame began and push those <n> values for the code outside the frame to see. Effectively ‘checkrow’ returns the result on the stack this way.
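For readers who think better in a conventional language, here is a rough Python model of a frame with a return value. It rests on two assumptions of mine: that the arguments are consumed when the frame exits (which is how ‘checkrows’ appears to use ‘checkrow’), and that locals can be modelled as a separate list rather than as stack entries. `run_frame` and `add_and_double` are hypothetical names.

```python
def run_frame(stack, n_args, n_locals, body, n_results=0):
    """Run 'body' in a frame, roughly like '<args> <locals> { ... <n> ->}'.

    body(stack, arg, local) may push and pop freely. 'arg(n)' reads the
    nth argument (1 = top of stack on entry) and 'local' is a list of
    slots, so local[0] plays the part of local 1.
    """
    base = len(stack) - n_args            # stack depth to restore on exit
    if base < 0:
        raise IndexError("not enough arguments on the stack")
    args = stack[base:]                   # snapshot of the arguments
    local = [0] * n_locals

    def arg(n):
        return args[-n]                   # 1 -> last value pushed

    body(stack, arg, local)
    results = stack[len(stack) - n_results:]
    del stack[base:]                      # reset to the pre-frame position
    stack.extend(results)                 # '<n> ->}' hands the results back


def add_and_double(stack, arg, local):
    local[0] = arg(1) + arg(2)            # like '1 () 2 () + 1 #st'
    stack.append(99)                      # scratch value, discarded on exit
    stack.append(local[0] * 2)            # result left on top for '1 ->}'


stack = [7, 3, 4]
run_frame(stack, n_args=2, n_locals=1, body=add_and_double, n_results=1)
print(stack)                              # [7, 14]
```

The point of the model is that whatever mess the body leaves behind, only the requested number of results survives, so words built on frames always have a predictable stack effect.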
Lastly we define the word ‘checkrows’ that uses a ‘do’ loop to run through rows 0, 1, 2 looking for a win. In that code there are more new things!
ifz
ifnz
We use ‘ifnz’ here, but the logic for ‘ifz’ is the same. These built-in words look at the top of the stack - without popping that value from the stack - and compare it with zero. ‘ifz’ executes if the value is zero and ‘ifnz’ executes if it is not. In this case, if the result is non zero, we swap the result with the 0 that’s already on the stack and drop the 0, then we ‘leave’ the loop. If it is zero, the result is just dropped and the loop continues with a balanced stack. Consequently, if ‘checkrows’ exits with a non zero value, someone has won.
To leave the loop we use the word:
leave
This sets the loop variable to the limit and the loop will exit the next time the ‘loop’ word is executed.
The code to check the columns follows the same pattern, only the loop variable and the argument swap places in the call to ‘xcell@’: the argument is now the column and the loop variable is the row. The rest of the logic is the same.
# <col> checkcol
# push X, O or EMPTY if X wins col, O wins
# or nobody's winning
#
: checkcol 1 1 {
0 1 #st
3 0 do
1 () I xcell@ 1 #ld or 1 #st
loop
1 #ld X = 1 #ld O = or not
if
EMPTY 1 #st
then
1 #ld
1 ->} ;
#
: checkcols
0
3 0 do
I checkcol
ifnz
swap drop leave
else
drop
then
loop ;
#
Finally we have a word that scans the two diagonals, left to right and right to left, using two local variables this time. At the end our logic tests both variables and returns non zero if someone wins.
# Check the left-right and right to left diagonals and
# push X, O or EMPTY if X wins, O wins
# or nobody's winning
: checkdia 0 2 {
0 1 #st
0 2 #st
3 0 do
I I xcell@ 1 #ld or 1 #st
2 I - I xcell@ 2 #ld or 2 #st
loop
1 #ld X = 1 #ld O = or not
if
EMPTY 1 #st
then
2 #ld X = 2 #ld O = or not
if
EMPTY 2 #st
then
1 #ld 2 #ld or 1 #st
1 #ld X = 1 #ld O = or not
if
EMPTY 1 #st
then
1 #ld
1 ->} ;
We have three words: ‘checkrows’, ‘checkcols’ and ‘checkdia’ we can use to test if there’s a winner.
If there is a winner we use ‘win?’ in this block to print who won and return true or false (using the frame and ‘1 ->}’ mechanism):
#
: Xwin ." X Wins!" ;
: Owin ." O Wins!" ;
#
# win? - print who wins, return true if
# game is over
#
: win? 1 0 {
1 () X = if
Xwin .
true
else 1 () O = if
Owin .
true
else
false
then
then
1 ->} ;
Putting all these words in one place, we can write a single word that checks for and prints a win and subsequently returns true for a win or false:
#
# full - return true if the board is full
#
: full 9 0 do
I board _ @c
ifz
drop false leave
else
drop true
then
loop ;
#
# gameends
#
: gameends
checkrows
checkcols
checkdia
or
or
win?
if
true
else
full if
." It's a draw!" .r
true
else
false
then
then
;
We have to take care of the situation where the board becomes full but there is no winner. We have a word: ‘full’ that tests that there are no empty cells and returns true if there aren’t any. Now if win? returns false, we also check for a full board and indicate a draw if so.
This is a good example of how K4 programs build up from basic to more and more sophisticated functions in layers of words in the dictionary.
Now we can start working out the logic to play a game. Firstly, we have a variable called player that takes on the value X or O. We use this to control who is playing now and who will play next. The simple word ‘toggleplayer’ switches to the opposite player every time it is invoked:
variable player
: toggleplayer
player @ X =
if
O player !
else
X player !
then ;
To tell the users who is playing we use ‘player?’ that tests the player variable and displays a string on the console:
: player?
player @ X =
if
." 'X' to play:\n" .
else
." 'O' to play:\n" .
then ;
Then we have to ask for the row and column the player wants to play in. The row number is a single digit from 1 to 3, and the same is true for the column number. So how about one word that prints a prompt and asks for input? It verifies that the input is a single digit from 1 to 3. If it isn’t, it loops, re-prompting the user until the input is correct.
#
: rowcolstr ." Enter 1,2 or 3 please! " ;
: readrc 1 2 {
false 2 #st
begin
1 () .
input
$> strlen 2 = if
$> @c \0 - 1 #st
1 #ld 1 < 1 #ld 3 > or if
rowcolstr .
else
true 2 #st
then
else
rowcolstr .
then
2 #ld
until
1 #ld 1 ->} ;
See that the ‘readrc’ word takes one argument (the prompt string) and uses two local variables. Variable 1 is the output value and variable 2 is used as a flag to control the loop. Notice the new words used here:
input
$>
strlen
begin .. <words> .. until
The word ‘input’ waits for a line of input from the user. The text that is typed is put into a global variable known as the ‘StringBuffer’. There are various words that access and manipulate the StringBuffer. Used here is the ‘$>’ word that puts the address of the StringBuffer on the stack. The ‘strlen’ word uses the top of the stack as the address of a string and returns the number of characters in the string. (Strings in K4 are null terminated.) We use strlen to make sure the typed input is just one character long.
Lastly we’ve introduced the begin..until control construct. The words between begin and until are executed at least once. When the program gets to the until word, a boolean (true/false) should be on the stack. A false value causes the program to jump back to the begin and try again. A true value lets the program continue.
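The begin..until shape is what other languages call a loop that tests at the bottom. Here is a Python sketch of the same validation loop. The function name `read_rc` and its `lines` and `out` parameters are my own stand-ins: `lines` is any iterator of strings playing the part of the ‘input’ word, so the loop can be tried without a console.

```python
def read_rc(prompt, lines, out=print):
    """Keep prompting until a single digit 1..3 is supplied, then return it."""
    while True:                               # begin
        out(prompt)
        text = next(lines).strip()
        if len(text) == 1 and text in "123":
            return int(text)                  # flag true: fall out of the loop
        out("Enter 1,2 or 3 please! ")        # flag false: go round again


print(read_rc("Row? ", iter(["7", "2"])))
```

The body always runs at least once, and the only way out is a valid answer, which is the behaviour the K4 word gets from its flag in local 2.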
Now that we have the general purpose word ‘readrc’, we can wrap it in two words to set the prompt for Row? or Col? The result from readrc is passed out to the calling program, after ‘1-’ brings it into the 0..2 range that the board words expect:
#
: readrow 0 0 {
." Row? " readrc
1-
1 ->}
;
#
: readcol 0 0 {
." Col? " readrc
1-
1 ->}
;
Now we can make a move for a specific player:
#
: empty cell@ 0 = not
if
." Cell is not empty! Try Again\n" .
false
else
true
then ;
#
#
: move
player?
begin
clear
player @
readcol
readrow
over over
empty
until
cell!
;
Here we start with the word ‘empty’. This returns true if the cell at the given row and column is empty. Otherwise it prints a message and returns false.
The ‘move’ word prints out the ‘player?’ prompt and reads a column then a row. Notice the use of the word ‘clear’ at the beginning of the loop. This stops the stack from growing out of hand from unused values. After reading the column and the row we use the peculiar word ‘over’. This word copies the value one below the top of the stack to the top. Using ‘over over’ duplicates the top two values on the stack:
Example: with 3, 2, 1 on the stack, execute ‘over over’ to duplicate 1 and 2
The top two values on the stack are now copies of the column and row. We use ‘empty’ to see if that cell is empty. If it is, ‘empty’ returns true on the stack and the until allows the program to store the player symbol at the column and row still on the stack. If it isn’t empty, the program loops around.
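The effect of ‘over over’ can be checked with a Python list standing in for the stack (the top of the stack is the right-hand end). The function `over` below is my own model of the K4 word.

```python
def over(stack):
    """Copy the value one below the top of the stack onto the top."""
    stack.append(stack[-2])


stack = [3, 2, 1]        # 1 is on top
over(stack)              # [3, 2, 1, 2]
over(stack)              # [3, 2, 1, 2, 1]
print(stack)
```

The first ‘over’ copies the 2, which pushes the 1 down into second place, so the second ‘over’ copies the 1. The pair ends up duplicated in its original order.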
Finally we use the built-in ‘random’ word (returns a random integer on the stack). Take this value modulo two to get a value of 0 or 1. Since O and X are the numbers 1 and 2, we add one and have a way to choose a random player to start the game:
#
# randomplayer - choose a random player to start
#
: randomplayer random 2 % 1 + ;
And now we can play(2)! The program initializes the board, chooses a random player to start and begins looping:
#
# play2 - two player (non computer) game
#
: play2
newgame
randomplayer player !
board?
begin
toggleplayer
move
board?
gameends
until ;
The loop switches players, asks for and makes a move and shows the new board. To see if it should keep looping, it uses the ‘gameends’ word to test for some form of win or a draw. That word is kind enough to return true if the game ends and false otherwise. We can use the ‘until’ to keep playing until a conclusion is reached.
Finally!
If you’ve read all that you are almost a K4 pro! This program covers many practical and real aspects of our language. It should serve to demonstrate that K4 is a fully fledged programming language - albeit limited to the console in this demonstration.
In coming posts we’ll talk about K4’s implementation and some more advanced aspects of the language that make it useful in embedded environments.
Experiencing K4 First Hand
K4 is an esoteric language. That does not mean it is difficult. In our case it matches the uses we put it to very well.
Let’s go off on a short tangent…
I hear many critiques of various computer languages from across the spectrum of experience, needs and quality control. “Language A is superior to Language B because” etc etc. Let’s be careful here: separate the language from its eco-system and then talk about each - you will get a better evaluation. Firstly the merits of the actual language relate to:
How it expresses the kinds of problems you need to solve. People doing numeric vector processing, building user interfaces or working on communications (and so on) work in very different “domains” of logic and thinking.
How “correct” and/or “maintainable” your source code needs to be. Web UI developers have very different needs in this respect to say embedded avionics developers.
The performance of the generated code in time, energy and memory.
And lastly - the need to speak the language of your local tribe!
The eco-system provides you with something different. Anyone working in data-science or science in general will greatly appreciate powerful frameworks and libraries that do complex numeric work without them resorting to programming the basics. If they move to a different language, that power might not be with them. The same applies to embedded software, or database processing or web interfaces. There are frameworks everywhere, but often they are part of a particular language.
The conclusion here is simple: horses-for-courses. So we come back to K4: it’s our pony, used to inexpensively run modest embedded hardware. We believe it provides us with:
A low-cost (to develop and to deploy) language.
It requires little memory and is fast for what it does.
It is very extensible by its dictionary based defining words (functions)
And we get the freedom to shape the language as we develop it.
So let’s get into it.
K4 is like FORTH (https://en.wikipedia.org/wiki/Forth_(programming_language)). To be clear though: it’s not an implementation of FORTH, nor of its cousin STOIC, nor for that matter of any other particular stack based language. K4 is Bored Owl’s own language, borrowing heavily from this heritage, but deviating to suit the needs of our systems (and to some extent my own whims and interests in computer languages).
So what is a “stack-based language”?
A stack is a type of data structure in programming that implements a last-in, first-out operation. Imagine a stack of plates in a busy restaurant. The dishwasher puts plates onto the stack, and the servers remove them from the top. The last plate onto the stack is the first plate off the stack. In stack based languages we substitute pieces of data (numbers) for plates and do all our operations in this order.
A conventional programming language (or even your desktop calculator) understands ‘1 + 2 =’ and returns ‘3’. This is OK until you say ‘1 + 2 * 3 =’ and you argue endlessly on social media whether the answer is 9 or 7. (It’s 7, no arguments!) The rules of precedence apply so that * and / happen before + and -. We can override those rules with parentheses: ‘(1 + 2) * 3 =’ and get 9 if we wish. Precedence complicates the life of the computer: it has to read the whole expression and then evaluate it in order of precedence. So what though - computers are fast and have tons of memory, it’s not a big deal. We need to elaborate some more.
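A stack sidesteps the precedence question, because when the operator is written after its operands the order of evaluation is simply the order of the tokens. The Python sketch below is a generic postfix evaluator, not K4 itself, and `eval_postfix` is a name I made up. It shows both readings of the disputed sum.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}


def eval_postfix(line):
    """Evaluate a space-separated postfix expression of integers."""
    stack = []
    for token in line.split():
        if token in OPS:
            right = stack.pop()          # last in, first out
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))
    return stack.pop()


print(eval_postfix("1 2 3 * +"))   # 7, i.e. 1 + (2 * 3)
print(eval_postfix("1 2 + 3 *"))   # 9, i.e. (1 + 2) * 3
```

No parentheses and no precedence table are needed: the person writing the expression decides the order, and the machine just follows the plates on and off the stack.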
K4 is neither an interpreting language nor a compiling one. To straighten out the jargon:
An Interpreter keeps its source code at hand (generally in memory) and when the program is run, the source code is interpreted (understood) and its instructions are executed.
A Compiler reads its source code once (often from a file on disk) and produces instructions for a hardware machine or a virtual machine to execute in memory.
This is a very great simplification, as there are combinations of both in the myriad of available languages. The distinction has become less clear and less important with time. The main things to note are that interpreters provide the convenience of being available to run all the time, but are slower and more memory bound, while compilers require machines with editors and storage and the ability to run the binary after it has been compiled. Compilers and operating systems are good friends.
K4 (like all Forth type systems) does things another way.
It presents the user with a command line - like an old school command line operating system would. What you type at the command line is interpreted and executed. (If you are a modern language user this is like the Python or Scala REPL.) However with a simple change to what we type, the input is “compiled” into a “word” and stored in memory.
What gets typed on this command line is a mixture of constants and words. There is no other concept for K4. The very simplest example of a K4 command line might be:
>1 1 + .
The output from this line would be:
2>
(I’ve included the prompt character ‘>’ to illustrate that K4 tells you it’s ready for input.)
What happened here? The line was scanned and executed as each ‘token’ was recognised. All parts of a K4 line are separated by a space and each part between spaces is called a token. Tokens are evaluated - firstly to see if they represent a constant (decimal or hexadecimal number, or character constant) and if not to see if they are a “word”. Step-by-step the line was broken into tokens:
| Token: | What happens: |
| 1 | One is a decimal number, a constant, and it is “pushed” onto the stack. |
| 1 | The same: it also goes on the stack. (Now the stack contains two numbers, 1 and 1.) |
| + | This is not a number, so the dictionary is searched. It happens there is a built-in word for ‘+’, so K4 proceeds by executing the word. The built-in code for ‘+’ removes two values from the stack, adds them together and pushes the result ‘2’ back onto the stack. |
| . | This also is not a number, so the same process of finding the word in the dictionary is followed. The word ‘.’ has a built-in executable that is called. The function of the ‘dot’ executable is to remove a number from the stack and print it on the console. See that the print function does not automatically add a new-line, so the ‘>’ prompt appears immediately after the ‘2’ output. |
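The token-by-token process can be sketched in a few lines of Python. This is a toy model under my own assumptions: `run_line` and `WORDS` are hypothetical names, only decimal constants and single-character constants are recognised (hexadecimal is left out because its notation has not been shown), and output is collected in a list rather than printed.

```python
def run_line(line, stack, words, out):
    """Interpret one line: constants are pushed, anything else is looked up."""
    for token in line.split():
        if token.isdigit():                          # decimal constant
            stack.append(int(token))
        elif len(token) == 2 and token[0] == "\\":   # character constant, e.g. \A
            stack.append(ord(token[1]))
        elif token in words:                         # dictionary lookup
            words[token](stack, out)
        else:
            raise KeyError(f"unknown word: {token}")


WORDS = {
    "+": lambda s, out: s.append(s.pop() + s.pop()),
    ".": lambda s, out: out.append(str(s.pop())),       # print a number
    "emit": lambda s, out: out.append(chr(s.pop())),    # print a character
}

out = []
run_line("1 1 + .", [], WORDS, out)
run_line(r"\A emit", [], WORDS, out)
print("".join(out))        # 2A
```

Constants are tested for first and the dictionary second, which matches the order described for K4: a token only reaches the dictionary search once it has failed to be a number or a character.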
Another possibility is that we can emit a character using a character constant and the ‘emit’ word:
>\A emit
Output:
A>
In this case, the ‘\A’ token is interpreted as a character constant for the letter ‘A’ and this is pushed onto the stack. The next token ‘emit’ is found as a built-in word. The function of ‘emit’ is to print the single character from the top of the stack to the console. Hence the output ‘A’.
Note that the character constant in this interpretation is not a C-style escaped character: ‘\n’ means the letter ’n’. To emit a new-line we need to know the character code for a new-line which is 0x0A or 10. So the expression ’10 emit’ outputs a new-line command. We’ll use this a bit later.
The concept of a “word” is central to K4. A word is an executable unit and can be “built-in” or “defined”. Built-in words are compiled with K4 and they perform the fundamental operations like stack manipulation, arithmetic and I/O. Defined words are built upon the fundamentals using “colon”, “variable” and “constant” definitions.
Either built-in or defined, words exist in the “dictionary”. A word has a structure in memory like this:
| WORD: | Purpose: |
| Identifier | The name of the word |
| Flags | Special information about the word. There are eight available flags but presently only two are used. (eg: can its output be treated as a string?) |
| Previous Word | Points to the address of the previous word in the dictionary. (See the explanation of the dictionary below) |
| Executor | Points to a function that can execute this word. |
| Parameter | A single number for any useful purpose to the word |
| Extensions... | A word can continue on for as long as required and how this extended data is treated is up to the function that executes the word. |
(Only the first 14 characters of a word are significant)
The dictionary pointer in K4 points to the last word that was defined. A search of the dictionary simply starts at this pointer and if the identifier we are looking for does not match, we use the previous word pointer and move backwards through the dictionary until we find what we need. When the previous pointer refers to nothing our search ends unsuccessfully and we have to deal with an identifier we can’t treat as a word.
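The backwards search is a walk along a singly linked list. The Python sketch below models it with field names borrowed from the table above. The classes `Word` and `Dictionary` and their methods are my own illustration, not K4’s actual C++ structures.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Word:
    identifier: str                     # name of the word
    executor: Callable                  # function that runs the word
    previous: Optional["Word"] = None   # link to the word defined before this one
    flags: int = 0
    parameter: int = 0


class Dictionary:
    def __init__(self):
        self.last = None                # the "dictionary pointer"

    def define(self, identifier, executor, parameter=0):
        self.last = Word(identifier, executor, self.last, parameter=parameter)
        return self.last

    def find(self, identifier):
        word = self.last
        while word is not None:         # walk backwards through the chain
            if word.identifier == identifier:
                return word
            word = word.previous
        return None                     # ran off the start: not a known word


d = Dictionary()
d.define("+", lambda stack: stack.append(stack.pop() + stack.pop()))
first = d.define(".r", lambda stack: print(stack.pop()))
second = d.define(".r", lambda stack: print(stack.pop(), end="\n\n"))
print(d.find(".r") is second)   # True
print(d.find("nope"))           # None
```

One consequence of searching from the newest entry, at least in this model, is that defining a word a second time hides the earlier definition without removing it.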
The current implementation of K4 has about 150 built in words. These provide:
Arithmetic and logical operations on the stack
Console I/O (printing etc.)
Stack manipulations
Dictionary and compilation management
Constant and variable definition
Control structures (if/then, while/until etc.)
String creation, alteration and control
File operations (when supported)
Interrupts (when supported)
System functions - specifically supported by the environment
All the built-ins are registered in the dictionary at start-up.
The words that are built-in cannot do everything though. Defining new words based on the basic functions is at the heart of K4.
The Colon Definition
The console operations that we showed earlier are executed immediately. This isn’t much help towards writing actual programs. We need a mechanism for defining new functions that we can use over and over. For this we have a “special” word known as “the colon”. It is literally the colon character, which is executed as a word (from the built-in dictionary).
When ‘:’ is executed, the K4 interpreter starts compiling a new word. This means space is found at the end of the dictionary and each token is ‘compiled’ into this space. Before we discuss these technicalities, let’s look at a typical colon definition:
>: add10 10 + dup . ;
>
>4 add10
14>
Colon definitions (that don’t have errors) produce no console output. They just compile their definition.
Looking at this expression, we see that it ends with the token ‘;’. This is another special word that ends compilation. Immediately after the colon, we see the token ‘add10’ which is the identifier for our new word. This is the identifier used in the dictionary definition. Following that are the words that are executed later when the word is invoked.
The first token is a ‘literal’, meaning the value is loaded onto the stack - so 10 is pushed onto the stack. The next is ‘+’ which pulls the top two values from the stack, adds them and pushes the result back onto the stack. The word ‘dup’ is a built-in that takes the value on the top of the stack and pushes another copy. So if the stack was [ 7 ], then we execute ‘dup’ , the stack becomes: [ 7 7 ]. Finally we use the dot word again to print the value on the stack. Dot removes one value from the stack so the previous ‘dup’ leaves the result of our addition on the stack for us to keep using.
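The switch between interpreting and compiling can be modelled in Python as a small state machine. This is only a sketch under my own assumptions: `make_k4_sketch` and `feed` are invented names, and the “compiled” body is stored as a list of token strings that are looked up again each time the word runs, which is a simplification of whatever K4 really stores in its dictionary.

```python
def make_k4_sketch():
    stack, out = [], []
    words = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
        ".": lambda: out.append(str(stack.pop())),
    }
    state = {"compiling": None}          # None, or [name, body...] inside a ':'

    def execute(token):
        if token.isdigit():
            stack.append(int(token))     # literal: push its value
        else:
            words[token]()               # word: look it up and run it

    def feed(line):
        for token in line.split():
            current = state["compiling"]
            if token == ":":
                state["compiling"] = []              # start a definition
            elif current is None:
                execute(token)                       # interpret immediately
            elif token == ";":
                name, *body = current                # first token is the identifier
                words[name] = lambda body=body: [execute(t) for t in body]
                state["compiling"] = None            # back to interpreting
            else:
                current.append(token)                # store for later
        return stack, out

    return feed


feed = make_k4_sketch()
feed(": add10 10 + dup . ;")      # compiles, prints nothing
stack, out = feed("4 add10")
print(out, stack)                  # ['14'] [14]
```

Between ‘:’ and ‘;’ nothing is executed, tokens are only remembered. Afterwards the new word behaves exactly like a built-in, including leaving its duplicated result on the stack.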
The word is used by typing ‘4 add10’. The definition is consulted, 10 is pushed, added to the 4 already on the stack and 14 is pushed back, duplicated and printed. If we typed:
>.
14>
The dot prints the top of stack which is still 14 because we duplicated the previous result. See that the stack retains what it contained from one word to the next regardless of whether we are simply interpreting or compiling. The stack will be cleared if an error occurs though (and we have the word ‘clear’ if we want to clear the stack manually). If we type (continuing on from the last):
14>.
0
Stack Underflow: - ?
? (. )
>
This execution of dot is a problem! There is nothing else on the stack, so the attempt to pop a value off the stack fails: the pop operation returns 0 and sets a stack exception (for an underflow). The consequential output is ‘0 \ Stack Underflow: - ?’, followed by ‘? (. )’. The first bit is the result of the failed pop and the second prints (between the parentheses) the point in the original input where the error occurred (at the ‘.’).
Let’s define something different but useful:
>: .r . 10 emit ;
>
This defines a new word ‘.r’ that prints the top of stack and then emits a new line. This helps us see our output a bit more easily. We can redefine ‘add10’ thus:
>: add10 10 + dup .r ;
>4 add10
14
>
The subtle difference is the use of our newly defined ‘.r’ word, which means the answer 14 is nicely separated from the next ‘>’ prompt. Now try:
>4 add10 2 * .r
14
28
>
And voilà! We used the answer from add10 left on the stack, multiplied it by two and printed the extra answer ‘28’. Note also that without the use of ‘.r’ the output would have been the confusing string ‘1428>’!
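To see how definitions build on each other, here is a toy interpreter in Python that replays this session, with ‘add10’ written out in full including its literal 10. It is a minimal sketch, not K4: the names `make_interpreter`, `execute` and `run` are hypothetical, there is no error handling, and a colon definition is simply stored as a list of tokens that is looked up again each time the word runs, which is an assumption of this model rather than a description of how K4 compiles.

```python
def make_interpreter():
    stack, out = [], []
    words = {}  # the dictionary: name -> Python function (built-in) or token list

    def binary(op):
        def run_op():
            b, a = stack.pop(), stack.pop()
            stack.append(op(a, b))
        return run_op

    words["+"] = binary(lambda a, b: a + b)
    words["*"] = binary(lambda a, b: a * b)
    words["dup"] = lambda: stack.append(stack[-1])
    words["."] = lambda: out.append(str(stack.pop()))
    words["emit"] = lambda: out.append(chr(stack.pop()))

    def execute(tokens):
        tokens = iter(tokens)
        for tok in tokens:
            if tok == ":":                   # start of a definition
                name = next(tokens)          # the token after ':' is the new word
                body = []
                for t in tokens:             # collect the body up to ';'
                    if t == ";":
                        break
                    body.append(t)
                words[name] = body
            elif tok in words:
                entry = words[tok]
                if callable(entry):
                    entry()                  # built-in word
                else:
                    execute(entry)           # user-defined word: run its tokens
            else:
                stack.append(int(tok))       # not in the dictionary: a number

    def run(line):
        execute(line.split())
        text = "".join(out)
        out.clear()
        return text

    return run, stack


run, stack = make_interpreter()
run(": .r . 10 emit ;")
run(": add10 10 + dup .r ;")
print(run("4 add10 2 * .r"), end="")
```

Running it prints 14 and 28 on separate lines, and the stack is empty afterwards because the final ‘.r’ consumes the 28.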
Concluding (for now)
This has been a very basic introduction to K4: enough to start understanding expressions, words and the dictionary. In future posts we will talk about how K4 is implemented, and we’ll start to talk about more of the built-in functions and how we can handle strings - look forward to it.
K4: An old language for a new machine
It all begins with an idea.
I’ve been messing about with computers for a long time. I probably should be thinking about other things nowadays, but I can’t help my fascination with the things.
Now I’m coming back to my roots, building embedded systems in my time at Bored Owl. As much as I love software, having a connection to actual hardware - the switching of things and measuring of stuff - makes computers relevant and important. My projects have involved making new hardware and ultimately writing software to get that hardware going.
Embedded hardware has come a long way in 45 years. I started writing and manually loading assembly language programs - compilers were an expensive luxury in more ways than one. Not only did you have to afford the compiler and development system, but you also generally needed larger and more expensive target hardware to run it on.
Now for a few tens of dollars and an evening on the internet, a 32-bit 200MHz processor will run MicroPython and empower you in so many ways.
But what about K4?
I started a project to build digital clocks using LED filaments called “Kairos”. This project uses the aforementioned 32-bit processor (an ESP32 Arduino Nano) to drive LED digit cards that we make. You can buy the digit cards in our shop, and these are discussed in the Kairos blog.
Initially, the clock software reached a point where I wanted to enable the user to use simple scripts to define some action that could happen at a specific time or times. The action was simply to play a chime from a sound file on an SD card. The more I thought about a suitable user interface, the less appealing a GUI was: it was simply too limited. A proper script language would be better, but I didn’t really want to design and document one.
About the same time, this ancient book about “FORTH” fell out of my shelves while I was clearing up:
A book bought in 1983 - out of curiosity.
Somewhat beaten up and well read over time.
This book also reminded me of running a similar language called “STOIC” (https://en.wikipedia.org/wiki/STOIC) on a Motorola 68000 CPU that I built between 1981 and 1982. Forth is somewhat better known, but the concepts behind the two languages are very similar.
A 1980’s 68000 computer:
Made in my early university days, similar to and contemporary with the original Apple Macintosh 128K. Only this wasn’t graphical: it was designed for music sampling and synthesis. It ran a port of “STOIC”.
Ok Ok Ok enough of the reminiscing! Back to a reality where I needed a lightweight language that could script some digital clocks. It didn’t take me long to open a new C++ project in Eclipse (the IDE I’m most familiar with, please don’t judge!).
K4 was born as a pure C++ implementation of a stack based language. This blog will gradually take you through the process and get you to a working implementation you can put on your desktop OS, or an ESP32 or an ARM or lots of other possibilities.
Sure, there are real FORTH implementations out there. They are far more “standard” and maybe more what you want. What I wanted was a piece of code that I had the freedom to mold and shape and play with. And I wanted the pure satisfaction of the exercise.
What is a “Stack Based Language”?
The languages that get this description all use the concept of a “stack”. That’s a simple data structure that stores data in a first-in, last-out manner. Which tells you next to nothing! So I’ll give you some more details:
K4 is an “interactive compiler”. So are Python, Scala, etc. You can enter commands at a command line and they will be executed immediately. If the input tells K4 to compile, the input will be stored as a new “word”. Using that word in later input causes the expression compiled into it to be executed.
All input is broken into words - everything that is delimited by white-space (except special cases like strings) - and each word is executed in sequence. Words are contained in a “dictionary” that is searched for each word encountered in the input. If a word is not found, it is treated as a number