Interfacing assembly with BASIC via DEFUSR, part 5

See also: Part 1, Part 2Part 3, and Part 4.

Now that I’ve gotten my digressions with BASIC variable access speeds and input speeds and INKEY/INSTR and GOTO/GOSUB speeds and HEX versus DECimal speeds out of the way, I can finally get back to digressing on using assembly language to speed up BASIC.

Where were we?

Oh, right…

In part 4 of this article I presented an example of using BASIC to scroll a PAC-MAN style maze that was too tall to fit on 16 the line screen.

I also presented some assembly code that would scroll the screen much faster than BASIC could ever hope to.

Today, let’s combine these two items and try to create a fast-scrolling maze playfield.

Let’s get started!

START SHOUTING AT ME! (revisited)

But before we get started, let’s revisit the uppercase routine I presented in Part 3.

Simon Jonassen is well-known in the CoCo Community for doing some amazing things on the original CoCo 1 and 2 hardware (and, lately, the CoCo 3 as well). He is quite the master of optimization, and has created some stunning sound players that allow the original CoCo to have cool background music while doing other things (if only the game programmers of 1980 knew about this!). He also has a cool web-based CoCo semigraphics editor. He provided a few enhancements:

* UCASE.ASM v1.01
* by Allen C. Huffman of Sub-Etha Software
* www.subethasoftware.com / alsplace@pobox.com
*
* 1.01 a bit smaller per Simon Jonassen
*
* DEFUSRx() uppercase output function
*
* INPUT:   VARPTR of a string
* RETURNS: # chars processed
*
* EXAMPLE:
*   CLEAR 200,&H3F00
*   DEFUSR0=&H3F00
*   A$="Print this in uppercase."
*   PRINT A$
*   A=USR0(VARPTR(A$))
*
ORGADDR     EQU     $3f00

GIVABF      EQU     $B4F4   * 46324
INTCNV      EQU     $B3ED   * 46061
CHROUT      EQU     $A002

            opt	    6809    * 6809 instructions only
            opt	    cd      * cycle counting

            org     ORGADDR

start       jsr     INTCNV  * get passed in value in D
            tfr     d,x     * move value (varptr) to X
            ldy     2,x     * load string addr to Y
;           ldb     ,x      * load string len to B
            beq     null    * exit if strlen is 0
            ldb     ,x      * load string len to B
            ldx     #0      * clear X (count of chars conv)

loop        lda     ,y+	    * get next char, inc Y
;           lda     ,y      * load char in A
            cmpa    #'a     * compare to lowercase A
            blt     nextch  * if less, no conv needed
            cmpa    #'z     * compare to lowercase Z
            bgt     nextch  * if greater, no conv needed
lcase       suba    #32     * subtract 32 to make uppercase
            leax    1,x     * inc count of chars converted
nextch      jsr     [CHROUT] * call ROM output character routine
;           leay    1,y     * increment Y pointer
cont        decb            * decrement counter
            bne	    loop    * not done yet
;           beq     exit    * if 0, go to exit
;           bra     loop    * go to loop

exit        tfr     x,d     * move chars conv count to D
;           bra     return
            jmp     GIVABF  * return to caller

null        ldd     #-1     * load -2 as error
return      jmp     GIVABF  * return to caller

* lwasm --decb -o ucase2.bin ucase2.asm -l
* lwasm --decb -f basic -o ucase2.bas ucase2.asm -l
* lwasm --decb -f ihex -o ucase2.hex ucase2.asm -l
* decb copy -2 -r ucase2.bin ../Xroar/dsk/DRIVE0.DSK,UCASE2.BIN

This code is 46 bytes long, compared to my original which was 49 bytes. The changes are:

  1. Move the initial LDB with string length to after the string length check, since it’s only needed if we get past that check and have a string.
  2. Change my LDA ,Y to LDA ,Y+ to increment Y there and not need the LEAY 1,Y later.
  3. Changed my “characters left” check from BEQ EXIT and BRA LOOP to BNE LOOP since it can just fall through and continue otherwise.
  4. Change a BRA RETURN to JMP GIVABF, since the branch would just end up at a JMP, and doing a JMP is faster than branching to a JMP.

Minor changes, but every little bit helps.

Simon also pointed out an embarrassing oversight in my very first example shown in part 1:

ORGADDR EQU $3f00

GIVABF EQU $B4F4   * 46324
INTCNV EQU $B3ED   * 46061

       org  ORGADDR
start  jsr  INTCNV * get passed in value in D
       tfr  d,x    * transfer D to X so we can manipulate it
       leax 1,x    * add 1 to X
       tfr  x,d    * transfer X back to D
return jmp  GIVABF * return to caller

He reminded me about the “addd” instruction which can add to D. For some reason, I was thinking I needed to use LEA to add to a 16-bit register, and since “LEAD 1,D” wasn’t a thing, I did the whole transfer to X, add one to X, transfer back to D thing.

He said I should just do this:

* ADDONE.ASM v1.01
* by Allen C. Huffman of Sub-Etha Software
* www.subethasoftware.com / alsplace@pobox.com
*
* 1.01 made less stupid per Simon Jonassen
*
* DEFUSRx() add one routine
*
* INPUT:   integer to add one to
* RETURNS: value +1
*
* EXAMPLE:
*   CLEAR 200,&H3F00
*   DEFUSR0=&H3F00
*   A=USR0(42)
*   PRINT A
*
ORGADDR EQU     $3f00

INTCNV  EQU     $B3ED   * 46061
GIVABF  EQU     $B4F4   * 46324

        org     ORGADDR

start   jsr     INTCNV  * get passed in value in D
;       tfr     d,x     * transfer D to X so we can manipulate it
;       leax    1,x     * add 1 to X
;       tfr     x,d     * transfer X back to D
        addd    #1      * add 1 to D
return  jmp     GIVABF  * return to caller

* lwasm --decb -o -9 addone2.bin addone2.asm
* lwasm --decb -f basic -o addone2.bas
* decb copy -2 -r addone2.bin ../Xroar/dsk/DRIVE0.DSK,ADDONE2.BIN

See what happens when people who actually know 6809 assembly language look at my code? Thanks, Simon!

Moving Day

My simple examples have been building up to slight less-simple ones that do something more useful, like moving data that would take days to move in BASIC. Previously, I presented a PAC-MAN maze that could “scroll” up and down the screen by PRINTing the whole screen each time with just the lines of the maze that should be visible. I also presented some assembly code that could be used to move the screen up, down, left or right.

Today, the first thing I want to do is integrate that assembly routine in to the PAC-MAN maze code. Instead of redrawing the entire screen each time, BASIC will only need to redraw the top or bottom line depending on which was the screen just scrolled. If my math is correct, printing one line instead of sixteen lines should be at least twice faster.

First, let’s revisit the screen moving assembly code, which, thanks to comments from L. Curtis Boyle, now has a smarter routine for checking which direction the user passed in to scroll (though it could still be thrown off by values larger than 255):

* SCRNMOVE.ASM v1.01
* by Allen C. Huffman of Sub-Etha Software
* www.subethasoftware.com / alsplace@pobox.com
*
* DEFUSRx() screen moving function
*
* INPUT:   direction (1=up, 2=down, 3=left, 4=right)
* RETURNS: 0 on success
*         -1 if invalid direction
*
* 1.01 better param parsing per L. Curtis Boyle
*
* EXAMPLE:
*   CLEAR 200,&H3F00
*   DEFUSR0=&H3F00
*   A=USR0(1)
*
ORGADDR EQU     $3f00

INTCNV  EQU     $B3ED   * 46061
GIVABF  EQU     $B4F4   * 46324

UP      EQU     1
DOWN    EQU     2
LEFT    EQU     3
RIGHT   EQU     4
SCREEN  EQU     1024    * top left of screen
END     EQU     1535    * bottom right of screen

        org     ORGADDR

start   jsr     INTCNV  * get incoming param in D
;       cmpb    #UP
        decb            * decrement B
        beq     up      * if one DEC got us to zero
;       cmpb    #DOWN
        decb            * decrement B
        beq     down    * if two DECs...
;       cmpb    #LEFT
        decb            * decrement B
        beq     left    * if three DECs...
;       cmpb    #RIGHT
        decb            * decrement B
        beq     right   * if four DECs...
error   ldd     #-1     * load D with -1 for error code
        bra     exit

up      ldx     #SCREEN+32
loopup  lda     ,x
        sta     -32,x
        leax    1,x
        cmpx    #END
        ble     loopup
        bra     return

down    ldx     #END-32
loopdown lda    ,x
        sta     32,x
        leax    -1,x
        cmpx    #SCREEN
        bge     loopdown
        bra     return

left    ldx     #SCREEN+1
loopleft lda    ,x
        sta     -1,x
        leax    1,x
        cmpx    #END
        ble     loopleft
        bra     return

right   ldx     #END-1
loopright lda   ,x
        sta     1,x
        leax    -1,x
        cmpx    #SCREEN
        bge     loopright
    
return  ldd     #0      * return code (0=success)
exit    jmp     GIVABF  * return to BASIC

* lwasm --decb -9 -o scrnmove2.bin scrnmove2.asm
* lwasm --decb -f basic -o scrnmove2.bas scrnmove2.asm
* decb copy -2 -r scrnmove2.bin ../Xroar/dsk/DRIVE0.DSK,SCRNMOVE2.BIN

The generated BASIC program looks like:

10 READ A,B
20 IF A=-1 THEN 70
30 FOR C = A TO B
40 READ D:POKE C,D
50 NEXT C
60 GOTO 10
70 END
80 DATA 16128,16217,189,179,237,90,39,14,90,39,28,90,39,42,90,39,55,204,255,255,32,67,142,4,32,166,132,167,136,224,48,1,140,5,255,47,244,32,47,142,5,223,166,132,167,136,32,48,31,140,4,0,44,244,32,30,142,4,1,166,132,167,31,48,1,140,5,255,47,245,32,14
90 DATA 142,5,254,166,132,167,1,48,31,140,4,0,44,245,204,0,0,126,180,244,-1,-1

Let’s take the original maze program and modify it to use the assembly routines instead:

0 REM MAZETEST.BAS
10 DIM MZ$(31)
20 FOR A=0 TO 30:READ MZ$(A):NEXT
30 CLS
40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 FOR LN=0 TO 15
70 PRINT @LN*32,MZ$(LN+ST);
80 NEXT:NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 FOR LN=0 TO 15
120 PRINT @LN*32,MZ$(LN+ST);
130 NEXT:NEXT
140 GOTO 40
999 GOTO 999
1000 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"    
1010 DATA "X            XX            X"    
1020 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1030 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1040 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1050 DATA "X                          X"
1060 DATA "X XXXX XX XXXXXXXX XX XXXX X"   
1070 DATA "X XXXX XX XXXXXXXX XX XXXX X"    
1080 DATA "X      XX    XX    XX      X"   
1090 DATA "XXXXXX XXXXX XX XXXXX XXXXXX"    
2100 DATA "     X XXXXX XX XXXXX X     "    
2110 DATA "     X XX          XX X     "    
2120 DATA "     X XX XXXXXXXX XX X     "   
2130 DATA "XXXXXX XX X      X XX XXXXXX"   
2140 DATA "          X      X          "   
2150 DATA "XXXXXX XX X      X XX XXXXXX"   
2160 DATA "     X XX XXXXXXXX XX X     "   
2170 DATA "     X XX          XX X     "   
2180 DATA "     X XX XXXXXXXX XX X     "   
2190 DATA "XXXXXX XX XXXXXXXX XX XXXXXX"   
3200 DATA "X            XX            X"   
3210 DATA "X XXXX XXXXX XX XXXXX XXXX X"   
3220 DATA "X XXXX XXXXX XX XXXXX XXXX X"   
3230 DATA "X   XX                XX   X"   
3240 DATA "XXX XX XX XXXXXXXX XX XX XXX"   
3250 DATA "XXX XX XX XXXXXXXX XX XX XXX"   
3260 DATA "X      XX    XX    XX      X"   
3270 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"   
3280 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"   
3290 DATA "X                          X"   
4200 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"

The maze is 31 lines tall. The fake scrolling is done by redrawing the entire screen line-by-line. The screen is 16 lines tall, so initially we draw maze lines 0-15. Then we redraw maze lines 1-16, giving the appearance that the screen is scrolling up and a line has scrolled off the top of the screen. This repeats for lines 2-17, 3-18 and so on until we’ve drawn the last 16 lines of 15-30.

After “scrolling” all the way to the bottom of the maze, a second block of FOR/NEXT loops reverses the process, starting with maze lines 15-30, then 15-30 and so on until it is back to displaying the top lines 0-15.

The scrolling is done by the FOR/NEXT loops using the LN variables in lines 60-80 and 110-130.

Rather than redrawing all sixteen lines each time, we could use the assembly routine to move the screen, and then we’d just draw one line – top or bottom, depending on which was the screen scrolled.

In effect, we’d replace this:

40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 FOR LN=0 TO 15
70 PRINT @LN*32,MZ$(LN+ST);
80 NEXT:NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 FOR LN=0 TO 15
120 PRINT @LN*32,MZ$(LN+ST);
130 NEXT:NEXT

…with this:

35 FOR LN=0 TO 15:PRINT @LN*32,MZ$(LN+ST);:NEXT
40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 Z=USR0(1)
70 PRINT @480,MZ$(ST+15);
80 NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 Z=USR0(2)
120 PRINT @0,MZ$(ST);
130 NEXT

Line 35 was added to initially draw the screen. After that, the assembly routine can move it up or down, and let BASIC redraw just the one line that needs to be drawn.

This, of course, requires the assembly routine to be loaded. We can take the BASIC loader of that and renumber it so we can call it from our test program. Here is the scrnmove2.asm updated code from the top of this article, renumbered and changed in to a subroutine:

5000 REM ASSEMBLY ROUTINE
5010 READ A,B
5020 IF A=-1 THEN 5070
5030 FOR C = A TO B
5040 READ D:POKE C,D
5050 NEXT C
5060 GOTO 5010
5070 RETURN
5080 DATA 16128,16217,189,179,237,90,39,14,90,39,28,90,39,42,90,39,55,204,255,255,32,67,142,4,32,166,132,167,136,224,48,1,140,5,255,47,244,32,47,142,5,223,166,132,167,136,32,48,31,140,4,0,44,244,32,30,142,4,1,166,132,167,31,48,1,140,5,255,47,245,32,14
5090 DATA 142,5,254,166,132,167,1,48,31,140,4,0,44,245,204,0,0,126,180,244,-1,-1

Now, I can add this to the end of the mazetest.bas program and set it up so the USR0() calls will work:

10 CLEAR 200,&H3F00:DIM MZ$(31)
25 GOSUB5000:DEFUSR0=&H3F00

Now the program will use the CLEAR command to protect memory starting at &H3F00 (where the assembly will load), then after it reads all the maze strings in to memory (those DATA statements appear first), it will GOSUB 5000 and that READs the assembly code statements and POKEs them in to memory starting at &H3F00. The DEFUSR call is then done to make USR0(x) work.

With just a few lines changed, and getting our assembly routine in memory, now the maze scrolling is very fast! And, if we optimized the BASIC code around it, it could be even faster since most of the time is spent processing the BASIC program.

Here is the full listing:

0 REM MAZETST2.BAS - W/ASM!
10 CLEAR 200,&H3F00:DIM MZ$(31)
20 FOR A=0 TO 30:READ MZ$(A):NEXT
25 GOSUB5000:DEFUSR0=&H3F00
30 CLS

40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 FOR LN=0 TO 15
70 PRINT @LN*32,MZ$(LN+ST);
80 NEXT:NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 FOR LN=0 TO 15
120 PRINT @LN*32,MZ$(LN+ST);
130 NEXT:NEXT

35 FOR LN=0 TO 15:PRINT @LN*32,MZ$(LN+ST);:NEXT
40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 Z=USR0(1)
70 PRINT @480,MZ$(ST+15);
80 NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 Z=USR0(2)
120 PRINT @0,MZ$(ST);
130 NEXT
140 GOTO 40
999 GOTO 999
1000 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"    
1010 DATA "X            XX            X"    
1020 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1030 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1040 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1050 DATA "X                          X"
1060 DATA "X XXXX XX XXXXXXXX XX XXXX X"   
1070 DATA "X XXXX XX XXXXXXXX XX XXXX X"    
1080 DATA "X      XX    XX    XX      X"   
1090 DATA "XXXXXX XXXXX XX XXXXX XXXXXX"    
2100 DATA "     X XXXXX XX XXXXX X     "    
2110 DATA "     X XX          XX X     "    
2120 DATA "     X XX XXXXXXXX XX X     "   
2130 DATA "XXXXXX XX X      X XX XXXXXX"   
2140 DATA "          X      X          "   
2150 DATA "XXXXXX XX X      X XX XXXXXX"   
2160 DATA "     X XX XXXXXXXX XX X     "   
2170 DATA "     X XX          XX X     "   
2180 DATA "     X XX XXXXXXXX XX X     "   
2190 DATA "XXXXXX XX XXXXXXXX XX XXXXXX"   
3200 DATA "X            XX            X"   
3210 DATA "X XXXX XXXXX XX XXXXX XXXX X"   
3220 DATA "X XXXX XXXXX XX XXXXX XXXX X"   
3230 DATA "X   XX                XX   X"   
3240 DATA "XXX XX XX XXXXXXXX XX XX XXX"   
3250 DATA "XXX XX XX XXXXXXXX XX XX XXX"   
3260 DATA "X      XX    XX    XX      X"   
3270 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"   
3280 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"   
3290 DATA "X                          X"   
4200 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"

5000 REM ASSEMBLY ROUTINE
5010 READ A,B
5020 IF A=-1 THEN 5070
5030 FOR C = A TO B
5040 READ D:POKE C,D
5050 NEXT C
5060 GOTO 5010
5070 RETURN
5080 DATA 16128,16217,189,179,237,90,39,14,90,39,28,90,39,42,90,39,55,204,255,255,32,67,142,4,32,166,132,167,136,224,48,1,140,5,255,47,244,32,47,142,5,223,166,132,167,136,32,48,31,140,4,0,44,244,32,30,142,4,1,166,132,167,31,48,1,140,5,255,47,245,32,14
5090 DATA 142,5,254,166,132,167,1,48,31,140,4,0,44,245,204,0,0,126,180,244,-1,-1

Try the original BASIC-only version and then this new assembly-enhanced version and see what you think.

Next time, I will share a version of this scrolling maze that has a character you can control and move through the maze.

Until then…

Leave a Reply