Let’s be clear about this: I have no idea how you write an operating system. I just know you need to have one.
When I first started writing ROM code for the Zolatron 64 (Z64) 6502-based homebrew computer, all that really mattered was getting things to work. And I was amazed when they did.
I was following in the footsteps of Ben Eater and his 6502 project (although our paths started to diverge a while back). Getting to the point of being able to print messages to an LCD screen and communicate via serial were achievements in themselves. I wasn’t thinking about how the hardware and software would support future projects.
Routine access
But then I reached the point where not all of the code was running from the ROM. I can now load user code from a Raspberry Pi and run it in RAM. And that brought up an issue.
It’s all to do with labels and addresses.
The ROM code is one homogeneous lump. Within the code are useful routines that you might call from various points in the program. The most common way of doing this is to write these code sections as subroutines. And you invoke these subroutines using the command:
jsr <address_of_subroutine>
Let’s say you have a subroutine that sends a character to the serial port. And that this subroutine starts at address $C2FE. This is in the ROM section of the computer’s address map. To invoke that subroutine, you could use:
jsr $C2FE
But there are a couple of problems with that. For a start, it’s hard to remember all the addresses for all the various subroutines. And more importantly, those addresses are likely to change as the code is developed and debugged.
Instead, we use labels. At the start of the subroutine code, we place a label – let’s call it lcd_wr_chr. It might look something like this (in the syntax used by Beebasm):
.lcd_wr_chr
  ldx #0
  ldy #0
  ; ... rest of code for subroutine
  rts
Now, when we want to invoke the subroutine, we simply use: jsr lcd_wr_chr and not worry about the address. This works because, during assembly, the assembler calculates what the address of the lcd_wr_chr label will be and replaces the label in the calling code with the appropriate value.
But now we have another problem. We can use that lcd_wr_chr label anywhere in our ROM code. But when writing user programs, designed to be loaded into RAM and which will be assembled separately, that label has no meaning. So if we want to call that handy subroutine from user code, how do we do that?
The answer is, indirectly.
What’s your vector, Victor?
We could go back to putting such subroutines at fixed points within the ROM code. But that’s clunky and prone to error. A better solution is to have something that is at a predictable location that we can use to find the code we’re after.
And we’re already doing this.
When the 6502 powers up, it goes straight to address $FFFC and reads the two bytes starting there. These two bytes constitute another address – to which the processor will redirect. This is a vector. The processor jumps to that address and executes whatever code is there.
In my case, I start the ROM code at address $C000, so I ensure that the bytes at addresses $FFFC and $FFFD contain the values $00 and $C0 respectively (the low byte first because the 6502 is little-endian). How do I do this? With a label.
At the very beginning of the ROM code I have:
ORG $C000
.startcode
The startcode label is then always given the address $C000. At the end of my ROM code I have:
ORG $FFFC
equw startcode
Well, actually, that’s a lie. Because the 6502 also likes you to have a couple of other vectors – one to the interrupt service routine and one to the NMI handler. In my code, I use the labels .ISR_handler and .NMI_handler. The NMI vector lives at $FFFA, just below the reset vector, so the block has to start there. The end of the ROM code looks like this:
ORG $FFFA
equw NMI_handler ; vector for NMI
equw startcode ; reset vector to start of ROM code
equw ISR_handler ; vector for ISR
While the start address of the ROM ($C000) is fixed, the two interrupt handling routines don’t live at predictable places. So you can see how the use of labels prevents problems when they move. We can always find them by first looking at the vector addresses which are in a predictable place.
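To make the byte order concrete, here is a minimal sketch of the same three vectors written out one byte at a time. It assumes Beebasm's unary < and > operators for the low and high bytes of an address. The handler bodies are placeholders that exist only so the fragment assembles on its own; they are not the Zolatron's real handlers.

```asm
ORG $C000
.startcode
  jmp startcode   ; placeholder so the fragment assembles; real start-up code goes here
.NMI_handler
  rti             ; placeholder NMI handler
.ISR_handler
  rti             ; placeholder interrupt service routine

ORG $FFFA         ; the three hardware vectors occupy $FFFA to $FFFF
  equb <NMI_handler, >NMI_handler   ; $FFFA/$FFFB: NMI vector, low byte then high byte
  equb <startcode, >startcode       ; $FFFC/$FFFD: reset vector, assembles to $00, $C0
  equb <ISR_handler, >ISR_handler   ; $FFFE/$FFFF: IRQ vector
```

Each equb pair produces the same two bytes that a single equw would, which is why equw is the more convenient form.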
Vectors of our own
How can we use this to our own advantage? With the same technique of having things at fixed addresses and using vectors.
There are two ways we could do this. Let’s start with the simple one and then progress to the one I’m actually using.
Let’s say we want user programs to be able to call that lcd_wr_chr routine. To do that, they need to be able to jump (using JSR) to the subroutine’s address. But the user program doesn’t know where that is.
The simple solution is a jump table. Towards the end of my ROM code, I can insert something like:
ORG $FFF0
jmp lcd_wr_chr
In the user program, we can use jsr $FFF0. This jumps to the $FFF0 address which has the instruction to jump to the subroutine. Better yet, we can define some constants to be included in both the ROM code and the user software so that we don’t have to remember the addresses. Something like:
OS_LCD_WR_CHR = $FFF0
Now, to call the subroutine from the user code, we just jsr OS_LCD_WR_CHR.
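As a rough sketch, a separately assembled user program might then look like the fragment below. The load address of $0800 and the convention that the character to print is passed in the accumulator are assumptions made for illustration; the real Zolatron values may differ.

```asm
OS_LCD_WR_CHR = $FFF0   ; shared constant: address of the jump-table entry in ROM

ORG $0800               ; assumed load address for user code in RAM
.user_start
  lda #65               ; ASCII 'A'; assumes the routine expects its character in A
  jsr OS_LCD_WR_CHR     ; pushes the return address, then lands on the jmp at $FFF0
  rts                   ; execution resumes here once the ROM routine has returned
```

The table entry is a jmp rather than a jsr because jmp leaves the stack alone. The return address pushed by the user program's jsr is still on top of the stack when the ROM routine reaches its rts, so control goes straight back to the user code.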
More flexible
However, there’s a better and more flexible approach, although it does require a tad more effort in setting up and is ever so slightly slower. Instead of using direct labels in the jump table, we use vector addresses.
First, we define a list of constants containing vector addresses, where those addresses are in RAM. For our purposes, let’s use locations in page 2 of memory. So we might have an entry in our code something like this:
OS_LCD_WR_CHR_VEC = $0200
We’re just defining a constant here – this is not executable code in itself. Also note how I’ve added _VEC to the end to make clear the purpose of this address.
Now, during the first initialisation stages of the ROM code, we define what is going to live at the address we just defined. Each vector will actually take up two bytes.
lda #<lcd_wr_chr
sta OS_LCD_WR_CHR_VEC
lda #>lcd_wr_chr
sta OS_LCD_WR_CHR_VEC + 1
Let’s walk through this. First, we get the low byte of the address defined by the label lcd_wr_chr and we store this at the address indicated by OS_LCD_WR_CHR_VEC. Then we get the high byte of the label address and store that at the byte right above OS_LCD_WR_CHR_VEC. In other words, the two bytes starting at OS_LCD_WR_CHR_VEC contain the address of the subroutine in the ROM code.
Now we go back to our jump table but write it slightly differently.
ORG $FFF0
jmp (OS_LCD_WR_CHR_VEC)
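Note how the constant is in parentheses. This is an indirect jump. (The instruction exists on both the NMOS 6502 and the CMOS 65C02, but the NMOS part fetches the wrong high byte if the vector sits at the very end of a page, at $xxFF. The 65C02 fixes that, and the even vector addresses used here avoid the problem anyway.) What it’s saying is, don’t jump to the address OS_LCD_WR_CHR_VEC, but rather jump to the two-byte address which you’ll find stored at OS_LCD_WR_CHR_VEC.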
On the fly
Why is this better? After all, it does seem to require an awful lot more work.
The main reason is that the address to which you’re jumping is not hard-coded into the ROM but is sitting in RAM, where you can change it programmatically.
So far, we’ve been using the example of writing a character to an LCD. In other words, the LCD is the destination for the output stream. But your user code could change the two bytes at $0200 to point instead to a routine to write to the printer or a screen.
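Redirecting the output from user code could look something like the sketch below. The my_wr_chr routine, the old_vec storage, the $0800 load address and the character-in-A convention are all hypothetical names and assumptions chosen for illustration. The fragment also assumes the ROM has already initialised the vector.

```asm
OS_LCD_WR_CHR_VEC = $0200   ; two-byte vector in RAM, as defined by the ROM
OS_LCD_WR_CHR     = $FFF0   ; jump-table entry containing jmp (OS_LCD_WR_CHR_VEC)

ORG $0800                   ; assumed load address for user code
.user_start
  lda OS_LCD_WR_CHR_VEC     ; keep a copy of the current vector so it can be restored
  sta old_vec
  lda OS_LCD_WR_CHR_VEC + 1
  sta old_vec + 1
  php                       ; save the interrupt flag
  sei                       ; precaution: no IRQ while the vector is half-written
  lda #<my_wr_chr
  sta OS_LCD_WR_CHR_VEC
  lda #>my_wr_chr
  sta OS_LCD_WR_CHR_VEC + 1
  plp                       ; put the interrupt flag back as it was
  lda #65                   ; ASCII 'A'
  jsr OS_LCD_WR_CHR         ; now arrives at my_wr_chr instead of the ROM routine
  rts

.my_wr_chr                  ; hypothetical replacement output routine
  ; ... send the character in A to some other device ...
  rts

.old_vec
  equw 0                    ; saved copy of the original vector
```

The php/sei/plp bracket only matters if an interrupt handler might call through the same vector. Without that risk, the two stores alone would do.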
To give you an idea of the work involved, here are some snippets from my current code.
First, here’s where we define the locations for the vectors. All of this is only in the ROM code.
OSRDHBYTE_VEC = $0200
OSRDHADDR_VEC = $0202
OSRDCH_VEC = $0204
OSWRBUF_VEC = $0206
OSWRCH_VEC = $0208
OSWRERR_VEC = $020A
OSWRMSG_VEC = $020C
OSWRSBUF_VEC = $020E
OSLCDCH_VEC = $0210
OSLCDCLS_VEC = $0212
OSLCDERR_VEC = $0214
OSLCDMSG_VEC = $0216
OSLCDSC_VEC = $0218
Note how each location is two bytes after the previous one. Now, when the ROM code first runs, we set up the contents of the vector locations.
; SETUP OS CALL VECTORS
lda #<read_hex_byte ; OSRDHBYTE
sta OSRDHBYTE_VEC
lda #>read_hex_byte
sta OSRDHBYTE_VEC + 1
lda #<read_hex_addr ; OSRDHADDR
sta OSRDHADDR_VEC
lda #>read_hex_addr
sta OSRDHADDR_VEC + 1
lda #<acia_sendbuf ; OSWRBUF
sta OSWRBUF_VEC
lda #>acia_sendbuf
sta OSWRBUF_VEC + 1
lda #<acia_writechar ; OSWRCH
sta OSWRCH_VEC
lda #>acia_writechar
sta OSWRCH_VEC + 1
lda #<os_print_error ; OSWRERR
sta OSWRERR_VEC
lda #>os_print_error
sta OSWRERR_VEC + 1
lda #<acia_println ; OSWRMSG
sta OSWRMSG_VEC
lda #>acia_println
sta OSWRMSG_VEC + 1
lda #<acia_prt_strbuf ; OSWRSBUF
sta OSWRSBUF_VEC
lda #>acia_prt_strbuf
sta OSWRSBUF_VEC + 1
lda #<lcd_prt_chr ; OSLCDCH
sta OSLCDCH_VEC
lda #>lcd_prt_chr
sta OSLCDCH_VEC + 1
lda #<lcd_cls ; OSLCDCLS
sta OSLCDCLS_VEC
lda #>lcd_cls
sta OSLCDCLS_VEC + 1
lda #<lcd_prt_err ; OSLCDERR
sta OSLCDERR_VEC
lda #>lcd_prt_err
sta OSLCDERR_VEC + 1
lda #<lcd_println ; OSLCDMSG
sta OSLCDMSG_VEC
lda #>lcd_println
sta OSLCDMSG_VEC + 1
lda #<lcd_set_cursor ; OSLCDSC
sta OSLCDSC_VEC
lda #>lcd_set_cursor
sta OSLCDSC_VEC + 1
Yeah, there’s a lot of it – four lines for each vector. Maybe there’s a smarter way of doing this. If there is, let me know.
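One possible answer, offered as a sketch rather than a tested drop-in: because the vectors sit next to each other in RAM, the default addresses can be kept as a table of equw entries in ROM and copied across with a short loop. The names VEC_BASE, vec_defaults and init_vectors are hypothetical, and only two vectors are shown, with placeholder routine bodies so the fragment assembles on its own.

```asm
VEC_BASE = $0200          ; address of the first vector; the rest follow contiguously

ORG $C000
.read_hex_byte
  rts                     ; placeholder for the real routine
.read_hex_addr
  rts                     ; placeholder for the real routine

.vec_defaults             ; one equw per vector, in the same order as the _VEC constants
  equw read_hex_byte      ; copied to $0200/$0201 (OSRDHBYTE_VEC)
  equw read_hex_addr      ; copied to $0202/$0203 (OSRDHADDR_VEC)
.vec_defaults_end

.init_vectors
  ldx #vec_defaults_end - vec_defaults - 1   ; index of the last byte in the table
.init_vectors_loop
  lda vec_defaults,x      ; read one byte of the default table from ROM
  sta VEC_BASE,x          ; write it to the matching vector byte in RAM
  dex
  bpl init_vectors_loop   ; loop until X wraps below zero
  rts
```

Each vector then costs a single equw line. The trade-offs are that the table order must match the order of the vector constants exactly, every slot needs an entry, and the bpl loop only works for tables of up to 128 bytes (64 vectors).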
Now we set up the jump table at the end of the ROM code:
ORG $FF00
.os_calls
  jmp (OSRDHBYTE_VEC)
  jmp (OSRDHADDR_VEC)
  jmp (OSRDCH_VEC)
  jmp (OSWRBUF_VEC)
  jmp (OSWRCH_VEC)
  jmp (OSWRERR_VEC)
  jmp (OSWRMSG_VEC)
  jmp (OSWRSBUF_VEC)
  jmp (OSLCDCH_VEC)
  jmp (OSLCDCLS_VEC)
  jmp (OSLCDERR_VEC)
  jmp (OSLCDMSG_VEC)
  jmp (OSLCDSC_VEC)
Now you have access to all the ROM subroutines defined in this way. For example, to use the OSRDHBYTE subroutine (the first one listed) in your user code, you’d use: jsr $FF00.
But what about the next one, OSRDHADDR? As each of these jump commands takes up three bytes, OSRDHADDR will be at $FF03, OSRDCH at $FF06 and so on. And that’s pretty much how the BBC Micro does it. To use one of its OS commands in your assembler code, you look up the function in the user manual to find what its address is.
But there is a better method. I have a configuration file that defines certain constants and which gets included into every program, including the ROM. In part, this contains:
OSRDHBYTE = $FF00
OSRDHADDR = $FF03
OSRDCH = $FF06
OSWRBUF = $FF09
OSWRCH = $FF0C
OSWRERR = $FF0F
OSWRMSG = $FF12
OSWRSBUF = $FF15
OSLCDCH = $FF18
OSLCDCLS = $FF1B
OSLCDERR = $FF1E
OSLCDMSG = $FF21
OSLCDSC = $FF24
Note how the addresses are three bytes apart. I calculate these addresses by hand, but as a sanity check, I look at the logged output of the Beebasm assembler. Here’s the relevant section:
.os_calls
FF00 6C 00 02 JMP (&0200)
FF03 6C 02 02 JMP (&0202)
FF06 6C 04 02 JMP (&0204)
FF09 6C 06 02 JMP (&0206)
FF0C 6C 08 02 JMP (&0208)
FF0F 6C 0A 02 JMP (&020A)
FF12 6C 0C 02 JMP (&020C)
FF15 6C 0E 02 JMP (&020E)
FF18 6C 10 02 JMP (&0210)
FF1B 6C 12 02 JMP (&0212)
FF1E 6C 14 02 JMP (&0214)
FF21 6C 16 02 JMP (&0216)
FF24 6C 18 02 JMP (&0218)
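The hand calculation could also be left to the assembler. The sketch below derives each entry address from a base and an index. On the ROM side it then checks one entry against its constant, assuming a Beebasm version that provides the ASSERT directive. OS_CALLS_BASE, OS_CALL_SIZE and os_call_rdch are hypothetical names, and only the first three entries are shown.

```asm
OS_CALLS_BASE = $FF00               ; start of the jump table
OS_CALL_SIZE  = 3                   ; each jmp (vec) assembles to three bytes

OSRDHBYTE = OS_CALLS_BASE + 0 * OS_CALL_SIZE   ; $FF00
OSRDHADDR = OS_CALLS_BASE + 1 * OS_CALL_SIZE   ; $FF03
OSRDCH    = OS_CALLS_BASE + 2 * OS_CALL_SIZE   ; $FF06

; ROM side only: the vector locations and the jump table itself
OSRDHBYTE_VEC = $0200
OSRDHADDR_VEC = $0202
OSRDCH_VEC    = $0204

ORG OS_CALLS_BASE
.os_calls
  jmp (OSRDHBYTE_VEC)
  jmp (OSRDHADDR_VEC)
.os_call_rdch
  jmp (OSRDCH_VEC)

ASSERT os_call_rdch = OSRDCH        ; stops the build if table and constants disagree
```

The constants at the top are the part that would go in the shared configuration file. The labelled entry and the ASSERT belong only in the ROM source, where they flag an entry inserted or removed in the wrong place.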
Now, calling a ROM-based OS function from user code is as simple as jsr OSWRMSG.
The zero page in the 6502’s RAM is a great resource for storing things like rapid lookup tables.
I would consider deciding on a bunch of API calls that your OS would provide, and using some of the zero page as a vector and lookup table for them, which would allow you to write a sort of BIOS. Add in basic semaphore handling, message passing, timers and interrupts, and you’ve got a kernel.
For example, a serial device handler would receive messages from other processes and gradually write the payload in the message to the device and then free up the message.
To be honest, I used page 2 for the vectors because that’s where the BBC Micro puts them. And I don’t need zero-page addressing for them.