Category Archives: BASIC Programming

Optimizing Color BASIC, part 9

See also: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7 and Part 8.

Have no fear! Today’s installment is a short one. It will address a few miscellaneous things I have been told about.

“.” versus “0”

In a comment posted to Part 5, Darren Atkinson (designer of the fabulous CoCoSDC interface) pointed out a place where a decimal point makes things faster!

One place where a decimal point will give an improvement is when using the value 0 in an expression. Basic will accept a stand-alone decimal point as the number 0, but it will process it faster than the ‘0’ character.

Try comparing the speed of:
IF N < 0 THEN …

with that of:
IF N < . THEN …

– Darren Atkinson

I, of course, had to test this. Using the benchmark program:

0 DIM Z
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 Z=0
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

Setting Z=0 1000 times produced the value of 178.

30 Z=&H0

Using a hexadecimal zero produced 164.

30 Z=.

And using just a period to represent zero produces … 141!
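To put those three numbers side by side, here is a small Python sketch used purely as a calculator. The timings are the averaged TIMER values reported above; the names `timings` and `percent_saved` are made up for this illustration. Keep in mind the values include the FOR/NEXT overhead of the benchmark itself, so the saving on the assignment alone would be larger than these percentages suggest.

```python
# Averaged TIMER values reported above for line 30 of the benchmark.
timings = {"Z=0": 178, "Z=&H0": 164, "Z=.": 141}

baseline = timings["Z=0"]


def percent_saved(ticks, base=baseline):
    """Percentage of the baseline time saved by a faster variant."""
    return round((base - ticks) / base * 100, 1)


savings = {form: percent_saved(ticks) for form, ticks in timings.items()}

for form, pct in savings.items():
    print(f"{form:6} {timings[form]:4} ticks  {pct:5.1f}% saved")
```

By this arithmetic the hexadecimal zero saves roughly 8% of the loop time and the lone decimal point roughly 21%.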

Okay, no more zeros.

Fake FOR

George Phillips chimed in about GOTO and GOSUB with a sneaky way to jump around faster:

However, because BASIC stores lines as a singly linked list the fastest places to GOTO and GOSUB are either the top of the program or anywhere after you do the GOTO/GOSUB. BASIC is clever enough to look forward if the line # is after the current one, otherwise it must search from the top.
Using a “fake” FOR/NEXT loop instead of GOTO could well be faster in such cases since it doesn’t have to search for the start of the FOR. If you have:

… whole bunch of code

1000 PRINT"Here we go."

… do some stuff

2000 GOTO1000

It is likely faster to do this:

1000 FORX=0TO1STEP0:PRINT"Here we go."

2000 NEXT

Possibly tricky to use in practice, but you can have a lot of FOR/NEXT loops on the stack and skip over interior ones. Fairly unpleasant, but optimizing BASIC is not a pretty undertaking.

– George Phillips

Wow. James Gerrie and Johann Klasek also mentioned this in response to Part 4. Johann gave an example:

I think something like that was in mind:

20 FOR A=. TO 1: A=. : REM FOREVER
30 ON INSTR(" UDLR",INKEY$) GOTO 100,200,300,400,500
35 REM FALL THROUGH ACTION
40 NEXT
50 END
100 REM IDLE LOOP
110 NEXT
200 REM MOVE UP
210 NEXT
300 REM MOVE DOWN
310 NEXT
400 REM MOVE LEFT
410 NEXT
500 REM MOVE RIGHT
510 NEXT

Compared to the ON GOSUB this is some kind of “redo” or “loop retry” not reaching the fall through action in 35.

– Johann Klasek

Any time the overhead of scanning through lines (from the first line to the destination) is more than the overhead of a NEXT, this would be faster (and it only uses a bit of extra memory for remembering where the FOR loop starts).

I don’t have any benchmarks for this one, but there is probably a threshold where the number of lines before the GOTO has to be more than X before this is always faster.
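Lacking a real benchmark, here is a rough Python model of the line search George describes. It only counts how many program lines get visited, under the simplifying assumption that a forward jump starts looking at the line after the current one and a backward jump starts again from the top. The function name and the 200-line program are hypothetical, and the counts are not TIMER values.

```python
def lines_scanned(line_numbers, current, target):
    """Count how many program lines a GOTO/GOSUB search would visit.

    Simplified model: a target after the current line is searched for
    starting at the following line; anything else is searched for from
    the first line of the program.
    """
    if target > current:
        start = line_numbers.index(current) + 1
    else:
        start = 0
    visited = 0
    for number in line_numbers[start:]:
        visited += 1
        if number == target:
            return visited
    raise ValueError(f"undefined line {target}")  # stands in for ?UL ERROR


# Hypothetical program with lines 10, 20, ... 2000 (200 lines).
program = list(range(10, 2001, 10))

backward = lines_scanned(program, current=2000, target=1000)  # GOTO 1000 from 2000
forward = lines_scanned(program, current=1000, target=1010)   # GOTO 1010 from 1000
to_top = lines_scanned(program, current=2000, target=10)      # GOTO 10 from 2000

print(backward, forward, to_top)
```

In this model the backward jump to line 1000 visits 100 lines, while the jump to the very next line or to the top of the program visits just one. A NEXT does no line search at all, which is why the fake FOR loop should start to win once enough lines sit in front of the destination.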

Thanks, Darren, George, James and Johann (and any others I might have missed).

Until next time…

Optimizing Color BASIC, part 8

See also: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6 and Part 7.

Arrays and Variable Length

In part 3, I demonstrated a simple “game” where you could move a character around the screen and try to avoid running in to enemies. The original version hard coded four enemies, each with their own variable. It looked like this:

0 REM GAME.BAS
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=256+16
15 E1=32:E2=63:E3=448:E4=479
20 PRINT@P,"*";:PRINT@E1,"X";:PRINT@E2,"X";:PRINT@E3,"X";:PRINT@E4,"X";
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$):IF LN=0 THEN 30
45 PRINT@P," ";
50 ONLN GOSUB100,200,300,400
60 IF P=E1 OR P=E2 OR P=E3 OR P=E4 THEN 90
80 GOTO 20
90 PRINT@267,"GAME OVER!":END
100 IF P>31 THEN P=P-32
110 RETURN
200 IF P<479 THEN P=P+32:RETURN
210 RETURN
300 IF P>0 THEN P=P-1
310 RETURN
400 IF P<510 THEN P=P+1
410 RETURN

I then modified it to use an array for the enemy variables, so less code was needed to cycle through them, while also making it easy to change the number of enemies. It looked like this:

0 REM GAME2.BAS
1 EN=10-1 'ENEMIES
2 DIM E(EN)
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=256+16
15 FOR A=0 TO EN:E(A)=RND(510):NEXT
20 PRINT@P,"*";
25 FOR A=0 TO EN:PRINT@E(A),"X";:NEXT
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$):IF LN=0 THEN 30
45 PRINT@P," ";
50 ONLN GOSUB100,200,300,400
60 FOR A=0 TO EN:IF P=E(A) THEN 90 ELSE NEXT
80 GOTO 20
90 PRINT@267,"GAME OVER!":END
100 IF P>31 THEN P=P-32
110 RETURN
200 IF P<479 THEN P=P+32:RETURN
210 RETURN
300 IF P>0 THEN P=P-1
310 RETURN
400 IF P<510 THEN P=P+1
410 RETURN

Arrays are an easy way to reduce code. What I did not realize is how slow they are! Consider this example:

4 DIM E1,E2,E3,E4
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 E1=1:E2=1:E3=1:E4=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

In the test loop we simply set four different variables. Notice that the ones we use most often (inside the test loop) I have declared first in line 4. This makes them faster, since they are found earlier when BASIC looks up the variables.

This results in 576.

Instead of using four separate variables, we could use an array of four elements:

4 DIM E(3)
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 FORC=0TO3:E(C)=1:NEXT
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

Although this simplifies the code, and allows us to easily change it from four to more variables, it adds another FOR/NEXT loop.

The time climbs to 1408! And that’s using a one-character variable name. It might be a tad slower if I had chosen “EN” or some other two-character name for the array to match the original.

It seems that, if you can get away with it, manually handling separate variables is much faster.

Arrays will win for code size, but lose for speed. I probably won’t want to use arrays to track the enemies in my BASIC arcade action games.

Can we make arrays faster? Let’s remove the FOR/NEXT loop and access them manually, so it’s as direct a compare to non-arrays as we can get:

4 DIM E(3)
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 E(0)=1:E(1)=1:E(2)=1:E(3)=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

Now we are doing “E(0)=1” versus “E1=1”. Arrays should still be slower because there are more characters to parse, and an array lookup has to be done.

This produces 1021. It appears that setting the variable takes about a third of the time, looking up the array another third, and the FOR/NEXT loop a third third. Or something.
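A quick check of that "thirds" guess, using Python as a calculator on the three averages reported above. This is only a rough split: the 576 figure still includes the outer benchmark loop, and the looped version indexes with a variable rather than a constant. The variable names are invented for the illustration.

```python
# Averaged TIMER values reported above.
separate_vars = 576    # E1=1:E2=1:E3=1:E4=1
array_direct = 1021    # E(0)=1:E(1)=1:E(2)=1:E(3)=1
array_in_loop = 1408   # FORC=0TO3:E(C)=1:NEXT

array_overhead = array_direct - separate_vars   # extra cost of array access
loop_overhead = array_in_loop - array_direct    # extra cost of the inner FOR/NEXT


def share(part, whole=array_in_loop):
    """Part of the slowest timing, as a percentage."""
    return round(part / whole * 100, 1)


breakdown = {
    "plain assignment": share(separate_vars),
    "array access": share(array_overhead),
    "inner FOR/NEXT": share(loop_overhead),
}

for label, pct in breakdown.items():
    print(f"{label:17} {pct:5.1f}%")
```

That works out to roughly 41%, 32% and 27%, so "about a third each" is in the right neighborhood.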

As brute-force as it looks, it appears that using separate variables is the faster way, even though it produces more code.

And speaking of more code…

Variable Length

One character variable names can get confusing real quick, but the two-character limit in Color BASIC isn’t much better. Well, you can specify longer variable names, but only the first two characters are honored:

USERNUM=1

…turns in to…

US=1

If you could make sure the first two characters were unique, you could make your program more readable, but you would be wasting speed and code size.
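The uniqueness check is easy to get wrong by eye, so here is a minimal Python sketch of it. It assumes only what is stated above (the first two characters are the ones that count); the helper names and the sample variable names are hypothetical.

```python
def significant_name(name):
    """Return the part of a variable name this model treats as significant.

    Assumption from the text above: only the first two characters count.
    """
    return name[:2]


def find_collisions(names):
    """Group long names that would end up referring to the same variable."""
    groups = {}
    for name in names:
        groups.setdefault(significant_name(name), []).append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}


# Hypothetical "readable" names a program might want to use.
names = ["USERNUM", "USERCOUNT", "PLAYER", "PLANET", "LIVES"]
collisions = find_collisions(names)
print(collisions)
```

Here USERNUM and USERCOUNT both collapse to US, and PLAYER and PLANET both collapse to PL, so each pair would silently share one variable.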

Consider this benchmark example, which uses a 9-character variable:

0 REM VARLEN.BAS '211
4 DIM USERCOUNT
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 USERCOUNT=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

This produces 211. It’s wasteful, since BASIC is only honoring the first two characters. Let’s try this:

0 REM VARLEN.BAS '182
4 DIM US
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 US=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

This produces 182. All those extra useless characters did nothing but slow things down a tad.

But using a one-character variable is the fastest we can get:

0 REM VARLEN.BAS '177
4 DIM U
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 U=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

This produces 177.

For the most-used variables, use a one-character variable name for the best speed.

And remember, every time a variable is referenced, BASIC has to start with the first variable it knows about and walk through all of them until it finds a match. That is the purpose of the DIM statements in lines 4 and 5. They declare variables in priority order from most-used to least, sorta. The variable used in the inner FOR/NEXT loop is U, so I declare it first.

If I had done U last:

0 REM VARLEN.BAS '188
5 DIM TE,TM,B,A,TT
6 DIM U
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 U=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

This slows it down to 188.
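A simplified Python model of that lookup, assuming a plain linear search through the variables in the order they were declared. It counts table entries checked and nothing more; the function and list names are made up for the illustration.

```python
def comparisons_to_find(declared, name):
    """Number of table entries checked before `name` is found.

    Simplified model of the linear variable search described above.
    """
    return declared.index(name) + 1


u_first = ["U", "TE", "TM", "B", "A", "TT"]   # DIM U, then DIM TE,TM,B,A,TT
u_last = ["TE", "TM", "B", "A", "TT", "U"]    # DIM TE,TM,B,A,TT, then DIM U

first_cost = comparisons_to_find(u_first, "U")
last_cost = comparisons_to_find(u_last, "U")

print(first_cost, last_cost)
```

Declared first, U is found on the first check; declared last, it takes six. That extra walk on every pass through line 30 is consistent with the 177 versus 188 result.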

Putting it all together: Avoid arrays, use one-character variable names, and variables you want to be the fastest should be declared earlier.

Just an FYI.

Interfacing assembly with BASIC via DEFUSR, part 6

See also: Part 1, Part 2, Part 3, Part 4 and Part 5.

Previously, we finally got to do something semi-useful with assembly: we replaced a slow full-screen scrolling routine in BASIC with a turbo-charged assembly routine, all called via the DEFUSR command.

Today, let’s apply this concept a bit further with the shell of a Pac-Man style video game written in BASIC, but enhanced with assembly.

In my Optimizing Color BASIC, part 3 article, I set the groundwork for writing a game in BASIC that involved moving a character around the screen and detecting collisions with enemy characters. Today I will combine that with the previous maze demo and create the world’s easiest Pac-Man game (no enemies, and no bothersome dots to eat).

The Maze

A few years ago, I started toying with a video output project for the Arduino computers. I began by simply bouncing a circle around the screen and then, for some reason, turned that in to an animated Pac-Man. This led me to digging in to some wonderful websites that had reverse engineered the original Pac-Man source code to explain how everything worked. You can find the series here:

http://subethasoftware.com/2014/02/09/arduino-pac-man-project/

Although I have yet to finish the game, I learned quite a bit about how Pac-Man works, including how the ghosts behave. I don’t know if BASIC would be fast enough to handle the logic of four ghosts and all the other stuff, but it sure would be fun to try — it would be much easier to write it in BASIC than C, I think.

But I digress.

The reason I mention this series is so I can show this picture:

Pac-Man!

For the Arduino project, I started with a screen shot of the original game and downsized it to fit the low resolution, black and white Arduino TVOut graphics library. It ended up looking like this:

Arduino Pac-Man!

Pac-Man was designed on a tile system. The original game resolution was 224×288. The screen was made up of 8×8 tiles, 28 across and 36 down. Without the score lines at the top and the players left lines at the bottom, the playfield itself was 28×31. The maze tiles looked like this:

Pac-Man maze tiles.

…and since the CoCo’s screen is 32×16, if we used one character per tile, we could replicate the same horizontal dimensions, but we’d need to scroll up and down to get to all 31 lines of the maze.
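The tile arithmetic, sketched in Python. The constants come from the figures quoted above; the names `visible_rows` and `max_scroll_start` are hypothetical and only describe which maze rows a 16-line window could show.

```python
TILE = 8                        # pixels per tile side
ARCADE_W, ARCADE_H = 224, 288   # original resolution in pixels
MAZE_ROWS = 31                  # playfield rows without score and lives lines
COCO_COLS, COCO_ROWS = 32, 16   # CoCo text screen

tiles_across = ARCADE_W // TILE            # tiles per row in the arcade
tiles_down = ARCADE_H // TILE              # tile rows in the arcade

spare_columns = COCO_COLS - tiles_across   # unused columns on the CoCo
max_scroll_start = MAZE_ROWS - COCO_ROWS   # last row the window can start at


def visible_rows(start):
    """Maze rows shown when the 16-line window starts at row `start`."""
    if not 0 <= start <= max_scroll_start:
        raise ValueError("start row out of range")
    return range(start, start + COCO_ROWS)


print(tiles_across, tiles_down, spare_columns, max_scroll_start)
print(list(visible_rows(7)))
```

So the maze fits horizontally with four columns to spare, and the window can start at any row from 0 to 15 to reach all 31 lines.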

I was initially working on this for a 4K programming challenge I started (and have yet to complete). Using ASCII, the maze looks like this:

XXXXXXXXXXXXXXXXXXXXXXXXXXXX
X            XX            X
X XXXX XXXXX XX XXXXX XXXX X
X X  X X   X XX X   X X  X X
X XXXX XXXXX XX XXXXX XXXX X
X                          X
X XXXX XX XXXXXXXX XX XXXX X
X XXXX XX XXXXXXXX XX XXXX X
X      XX    XX    XX      X
XXXXXX XXXXX XX XXXXX XXXXXX
     X XXXXX XX XXXXX X     
     X XX          XX X     
     X XX XXX--XXX XX X     
XXXXXX XX X      X XX XXXXXX
          X      X          
XXXXXX XX X      X XX XXXXXX
     X XX XXXXXXXX XX X     
     X XX          XX X     
     X XX XXXXXXXX XX X     
XXXXXX XX XXXXXXXX XX XXXXXX
X            XX            X
X XXXX XXXXX XX XXXXX XXXX X
X XXXX XXXXX XX XXXXX XXXX X
X   XX                XX   X
XXX XX XX XXXXXXXX XX XX XXX
XXX XX XX XXXXXXXX XX XX XXX
X      XX    XX    XX      X
X XXXXXXXXXX XX XXXXXXXXXX X
X XXXXXXXXXX XX XXXXXXXXXX X
X                          X
XXXXXXXXXXXXXXXXXXXXXXXXXXXX

It may look odd presented as Xs, and the aspect ratio is different, but it’s the exact Pac-Man layout used in the arcade. Here is the full play field that will scroll on the CoCo’s 32×16 screen:

Pac-Man full maze.

Since the original Pac-Man played on a monitor that was turned sideways, it was taller than it was wide. Most home ports either shrink the screen down, or flatten it out. By scrolling, maybe we can keep the aspect ratio similar.

And this is how my ASCII Pac-Man maze came to be.

As I referenced at the top of this article, I have been covering ways to Optimize Color BASIC in another article series. A recent part discussed reading the keyboard and moving a character around the screen. I took some of this code and used it to place a character in the Pac-Man maze and move it around. I also added collision detection making sure the player could not run through any of the walls.

Today I would like to present my work-in-progress Pac-Man maze, entirely in BASIC, and the changes I made to integrate the screen moving assembly routines. The assembly calls (and all the DATA statements) are in this listing, but are commented out. The lines commented out with an apostrophe (9, 21, 915 and 965) are the ones to uncomment to see the assembly-enhanced version, and the BASIC redraw lines they replace (910 and 960) would need to be commented out.

The Listing

Here is the current listing, with comments to follow explaining how it works. I have been writing this on my Mac in a text editor, then loading it in to the XRoar emulator for testing. Because of this, you will notice I put spaces between program sections to make them easier to see. When this loads in to an emulator as an ASCII program, those empty lines are ignored. It works out nice.

0 REM
1 REM      PAC-MAZE 1.00
2 REM   BY ALLEN C. HUFFMAN
3 REM WWW.SUBETHASOFTWARE.COM
4 REM
6 REM
7 REM
8 REM
9 'CLEAR200,&H3F00

10 DIM MZ$(30)

15 REM
16 REM READ MAZE IN TO ARRAY
17 REM
20 FOR A=0 TO 30:READ MZ$(A):NEXT
21 'GOSUB2000:DEFUSR0=&H3F00

25 REM
26 REM UP+DOWN+LEFT+RGHT CHARS
27 REM
30 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)

35 REM
36 REM PLAYER/WALL/BG CHARS
37 REM
40 PC=159 'PAC-MAN CHAR
41 WC=ASC("X") 'WALL CHAR
42 BG=96 'BACKGRND CHAR

50 REM
51 REM INITIALIZATION
52 REM
60 ST=7 'SCRN START LINE
61 PM=1360 'PAC-MAN START LOC
62 DR=0 'CURRENT DIRECTION
63 DN=0 'NEXT DIRECTION

80 REM
81 REM DRAW INITIAL MAZE
82 REM
90 CLS:FOR A=0 TO 15:PRINT @A*32+2,MZ$(A+ST);:NEXT

100 REM
101 REM MAIN LOOP
102 REM
110 POKE PM,PC
120 A$=INKEY$:IF A$="" THEN 140
130 KB=INSTR(KB$,A$):IF KB=0 THEN 140 ELSE DN=KB
135 REM TRY NEXT DIRECTION
140 ON DN GOSUB 500,600,700,800
145 REM THEN TRY CURRENT DIR
150 IF DR<>DN THEN ON DR GOSUB 500,600,700,800
160 GOTO 100

500 REM
501 REM UP
502 REM
510 IF PEEK(PM-32)<>BG THEN RETURN
520 POKE PM,BG:DR=1
530 IF PM<1183 AND ST>0 THEN ST=ST-1:GOSUB 950 ELSE PM=PM-32
540 RETURN

600 REM
601 REM DOWN
602 REM
610 IF PEEK(PM+32)<>BG THEN RETURN
620 POKE PM,BG:DR=2
630 IF PM>1376 AND ST<15 THEN ST=ST+1:GOSUB 900 ELSE PM=PM+32
640 RETURN

700 REM
701 REM LEFT
702 REM
710 IF PEEK(PM-1)<>BG THEN RETURN
720 POKE PM,BG:DR=3
730 PM=PM-1:RETURN

800 REM
801 REM RIGHT
802 REM
810 IF PEEK(PM+1)<>BG THEN RETURN
820 POKE PM,BG:DR=4
830 PM=PM+1:RETURN

900 REM
901 REM SCROLL SCREEN UP
902 REM
910 FOR A=0 TO 15:PRINT @A*32+2,MZ$(A+ST);:NEXT
915 'Z=USR0(1):PRINT@482,MZ$(ST+15);
920 RETURN

950 REM
951 REM SCROLL SCREEN DOWN
952 REM
960 FOR A=0 TO 15:PRINT @A*32+2,MZ$(A+ST);:NEXT
965 'Z=USR0(2):PRINT@2,MZ$(ST);
970 RETURN

999 GOTO 999

1000 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"
1001 DATA "X            XX            X"
1002 DATA "X XXXX XXXXX XX XXXXX XXXX X"
1003 DATA "X X  X X   X XX X   X X  X X"
1004 DATA "X XXXX XXXXX XX XXXXX XXXX X"
1005 DATA "X                          X"
1006 DATA "X XXXX XX XXXXXXXX XX XXXX X"
1007 DATA "X XXXX XX XXXXXXXX XX XXXX X"
1008 DATA "X      XX    XX    XX      X"
1009 DATA "XXXXXX XXXXX XX XXXXX XXXXXX"
1010 DATA "     X XXXXX XX XXXXX X     "
1011 DATA "     X XX          XX X     "
1012 DATA "     X XX XXX--XXX XX X     "
1013 DATA "XXXXXX XX X      X XX XXXXXX"
1014 DATA "<         X      X         >"
1015 DATA "XXXXXX XX X      X XX XXXXXX"
1016 DATA "     X XX XXXXXXXX XX X     "
1017 DATA "     X XX          XX X     "
1018 DATA "     X XX XXXXXXXX XX X     "
1019 DATA "XXXXXX XX XXXXXXXX XX XXXXXX"
1020 DATA "X            XX            X"
1021 DATA "X XXXX XXXXX XX XXXXX XXXX X"
1022 DATA "X XXXX XXXXX XX XXXXX XXXX X"
1023 DATA "X   XX                XX   X"
1024 DATA "XXX XX XX XXXXXXXX XX XX XXX"
1025 DATA "XXX XX XX XXXXXXXX XX XX XXX"
1026 DATA "X      XX    XX    XX      X"
1027 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"
1028 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"
1029 DATA "X                          X"
1030 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"
1100 REM
1101 REM MAZE ARRAY TO GRAPHICS
1102 REM
1110 FOR R=0 TO 30
1120 DIM P,PL,PS,C:P=VARPTR(MZ$(R))
1130 PL=PEEK(P):PS=PEEK(P+2)*256+PEEK(P+3)
1140 FOR C=PS TO PS+PL-1
1150 PRINT CHR$(PEEK(C));
1155 IF PEEK(C)=ASC("X") THEN POKEC,175
1160 NEXT:PRINT
1170 NEXT

2000 REM
2001 REM LOAD ASSEMBLY ROUTINE
2002 REM
2010 READ A,B
2020 IF A=-1 THEN 2070
2030 FOR C = A TO B
2040 READ D:POKE C,D
2050 NEXT C
2060 GOTO 2010
2070 RETURN 'END
2080 DATA 16128,16217,189,179,237,90,39,14,90,39,28,90,39,42,90,39,55,204,255,255,32,67,142,4,32,166,132,167,136,224,48,1,140,5,255,47,244,32,47,142,5,223,166,132,167,136,32,48,31,140,4,0,44,244,32,30,142,4,1,166,132,167,31,48,1,140,5,255,47,245,32,14
2090 DATA 142,5,254,166,132,167,1,48,31,140,4,0,44,245,204,0,0,126,180,244,-1,-1

As listed, this will do the game entirely in BASIC. Using the arrow keys, you can move around the yellow PAC-BLOCK and explore the maze. When you get near the top or bottom of the maze, the screen will sluggishly scroll so you can access the rest of the maze.

Give that a try and explore the top and bottom of the maze so you can get an idea of the speed BASIC scrolls at.

Then, to make it use the assembly language routines:

  1. Uncomment line 9. This protects memory beyond &H3F00 for the assembly language code.
  2. Uncomment line 21. This will GOSUB to the routine that reads in the assembly language and POKEs it in to memory starting at &H3F00.
  3. Comment line 910 (BASIC redraw/scroll up code).
  4. Uncomment line 915. This calls the assembly routine to scroll the screen up, then redraws a new line at the bottom.
  5. Comment line 960 (BASIC redraw/scroll down code).
  6. Uncomment line 965. This calls the assembly routine to scroll the screen down, then redraws a new line at the top.

Make those changes and re-run the program, then move from top to bottom and see how much faster the screen “scrolls.”

Assembly!

And, the assembly could be made almost twice as fast, and the BASIC code could be optimized to be faster, too.

But before we do that, let’s dig in to how the code actually works.

Dissection

  • 20-21 read in all the maze lines in to an array called MZ$. The maze strings are in the DATA statements starting at line 1000.
  • 30 builds a string that contains the ASCII characters for Up, Down, Left and Right. It is much faster to use INSTR and parse through a string rather than have to build one with CHR$() inside the INSTR call every time.
  • 40-42 sets some default variables:
    • PC is the character of Pac-Man to POKE to the screen (159 is a yellow block).
    • WC is what character to use for wall detection (an ASCII “X” letter). The move code will PEEK screen memory, and not let you move in any direction that contains an “X”.
    • BG is the background character (a space) that will be used to erase Pac-Man before moving him.
  • 60-63 initialize some game play variables:
    • ST is which of the 31 lines of the maze should be the first line to display. Thus, ST=7 means we will initially draw lines 7-22 on the screen to display that middle section of the maze.
    • PM is the memory location where Pac-Man will be POKEd. The screen memory starts at 1024, so this default is somewhere in the middle of the screen under the ghost house.
    • DR is the direction Pac-Man is currently moving.
    • DN is the next direction the Pac-Man will try to move at an intersection. Like the arcade, this version will let you press UP while Pac-Man is moving left, and as soon as there is an opening in the wall, the direction will turn UP.
  • 90 draws the initial 16 maze lines that will fit on the screen.
  • 110 POKEs the Pac-Man character on to the screen (showing the yellow block).
  • 120-130 wait for one of the four keys in KB$ (up, down, left or right) to be pressed. If no key is pressed, it skips to line 140, else it sets DN (direction next) to match the key that was pressed.
  • 140 uses DN (next direction) to call a routine to try to move Pac-Man up, down, left or right.
  • 150 assumes that if DN and DR don’t match, a new direction has been pressed, so it will use DR (current direction) to call the up, down, left or right routine.
  • 160 goes back to 100 to keep doing this forever.
  • 510 is the UP routine. It will PEEK the memory location 32 bytes lower in memory (one line up from the current Pac-Man PM location) and if it is NOT the background character (i.e., not some place we can move), it returns.
  • 520 POKEs the background character where Pac-Man is, erasing him, then sets DR (direction) to 1 for up.
  • 530 checks to see if the Pac-Man location is before a certain spot on the screen and that the screen is starting at a line later than the first one. If so, then the screen is allowed to scroll up (start line ST is decremented). A GOSUB to 950 will handle scrolling the screen. Otherwise, we don’t need to scroll and can just subtract 32 from the Pac-Man location, moving him up one line.
  • 540 returns us back to the main loop.
  • 610-640 is the same process for moving Pac-Man down, but we check for locations at the bottom of the screen and memory +32 from Pac-Man.
  • 710-730 is the same code for moving Pac-Man left. We never scroll left or right so we don’t have to do as much here.
  • 810-830 is the same code for moving Pac-Man right.
  • 910-920 is the routine to scroll the screen up:
    • 910 scrolls the screen up in BASIC by redrawing all 16 lines of the maze.
    • 915 uses the assembly language routine to move the screen up, then PRINTs the next line at the bottom that would be displayed.
  • 960-970 are the same thing for scrolling down.
  • 1000-1030 is the 31 line maze.
  • 2000-2090 is the assembly language loader generated by lwasm and renumbered to fit. It READs in the assembly from DATA statements and POKEs it in to memory.
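The address arithmetic behind PM and the four move routines can be sketched in Python. The constants 1024, 32 and 16 and the values 1360, 1183 and 1376 come from the listing and notes above; the helper names `to_row_col` and `neighbours` are hypothetical.

```python
SCREEN_START = 1024   # first byte of the 32x16 text screen
COLS, ROWS = 32, 16


def to_row_col(address):
    """Convert a screen memory address to (row, column)."""
    offset = address - SCREEN_START
    if not 0 <= offset < COLS * ROWS:
        raise ValueError("address is not on the text screen")
    return divmod(offset, COLS)


def neighbours(address):
    """Addresses PEEKed by the up, down, left and right routines."""
    return {
        "up": address - COLS,
        "down": address + COLS,
        "left": address - 1,
        "right": address + 1,
    }


start = to_row_col(1360)        # PM=1360 from line 61
scroll_up_limit = to_row_col(1183)    # compared against in line 530
scroll_down_limit = to_row_col(1376)  # compared against in line 630

print(start, scroll_up_limit, scroll_down_limit)
print(neighbours(1360))
```

PM=1360 lands on row 10, column 16. The comparison values in lines 530 and 630 fall at the end of row 4 and the start of row 11, so scrolling only kicks in when the block is near the top or bottom of the screen.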

Baby steps.

Next time, let’s improve this a bit.

Optimizing Color BASIC, part 7

See also: Part 1, Part 2, Part 3, Part 4, Part 5 and Part 6.

GOSUB Revisited

In response to part 4, William Astle wrote a very nice expansion to my musings about INKEY and GOTO versus GOSUB. If you have been following my ramblings, I highly recommend you check out his posting. Unlike me, he actually understands what is going on behind the scenes:

http://lost.l-w.ca/0x05/optimizing-color-basic-on-goto-vs-on-gosub/

One of the things he pointed out, then explained further in a comment after I didn’t understand, was how the GOSUB processing works. After the GOSUB keyword is found, BASIC acquires the line number and then scans to the end of the line or the next colon. That is where RETURN will RETURN to. This sounds as one might expect, but there was a bit of weirdness I didn’t “get” at first.

William demonstrated that anything after the line number is ignored, thus:

GOSUB 1000 I CAN TYPE STUFF HERE WITHOUT ERROR

…is valid. This surprised me. If you do this:

GOSUB 1000 I CAN TYPE STUFF HERE WITHOUT ERROR:PRINT "BACK FROM GOSUB"

…when the RETURN returns, you will see the “BACK FROM GOSUB” message printed as expected. Anything between the line number and the colon (or start of next line, whichever is found first) is ignored. William explains what is going on in his article.

This, of course, made me do some more stupid testing. First, I modified my benchmark program to run multiple tests and then average out the results. It looks like this:

0 REM BENCH.BAS
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 REM
40 REM PUT CODE TO BENCHMARK
50 REM HERE.
60 REM
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

Then I reran my GOSUB test:

0 REM GOSUB3.BAS
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 GOSUB100
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END
100 RETURN

When I run this, it prints the time taken for each run, and then the average:

GOSUB benchmarks.

Now for the stupid test, I added some junk after “GOSUB 100” and filled it up to the end of the line.

Side Note: I am loading BASIC programs in ASCII, so the program lines load in as if they were being typed in. Thus, it counts the characters “100 GOSUB ” as part of it. But, as soon as you press ENTER, that line is tokenized and GOSUB becomes a 1-byte token (is it 1-byte?). Then you can EDIT the line and Xtend it and type in a few more characters. So what I show here isn’t the max line size, but it is the max line size I could load in from an ASCII BASIC file. But I digress.

My line looks like this:

30 GOSUB100 ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRST*

Now when I run this, the extra “scan to the end” time causes the benchmark to show 1507!

But who would do that? If anything, you would have a colon and real stuff after the GOSUB. So I tried this by changing the space after “GOSUB 100” to a colon and “REM”:

30 GOSUB100:REMABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOP*

Now that’s a completely legitimate line. (Pretend the ABC/123 gibberish is a really long comment.)

This benchmark shows 1508, so no real difference. When GOSUB is encountered, BASIC has to scan to the end of line or a colon, whichever comes first, so it should find the colon instantly, BUT, after the RETURN it still has to scan through that REM to find the next line. Thus, it’s the same amount of scanning.

This is a meaningless test.

With real code, you might be doing something like this:

30 GOSUB 100:PRINT "BACK FROM ROUTINE"

Or you could have written it out as two lines:

30 GOSUB 100
40 PRINT "BACK FROM ROUTINE"

I thought the first one should be faster, since it has one less line.

And combining lines is good.

Right?

Well, in my silly example #2 above, what if I moved the REM to the next line, like this:

30 GOSUB100
40 REMABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ABCDEFGHIJKLMNOPQRST*

That’s basically the same, just with an extra line number.

When I run this, I get a benchmark value of … 860!

Look how much faster it is by moving code to a separate line! I guess we should use separate lines after all, then…

What’s going on here? The key seems to be the “REM” keyword. When BASIC encounters a REM, it can just skip to the next line. That makes it faster. But, it seems to be doing something different when a REM is in the middle of a line.

It appears it is faster to NOT put REMarks after a GOSUB.

30 GOSUB 100:REM MOVE PLAYER UP

…shows 266. This is slower than…

30 GOSUB 100
40 REM MOVE PLAYER UP

…which shows 220. And I’ve certainly seen programmers make use of the apostrophe REMark shortcut.

The apostrophe represents “:REM” (colon REM) so these two are the same:

30 GOSUB 100:REM MOVE PLAYER UP
30 GOSUB 100' MOVE PLAYER UP

 

REM versus ‘ for comments.

Thus, using the one character apostrophe may look like it saves code space versus “:REM” but it does not. It does save printer paper, though :)

But I digress…

It looks like I’m going to need another test. In the meantime, don’t put things after a GOSUB on the same line. It appears to be faster to put them on the next line:

30 REM MOVE PLAYER UP
40 GOSUB100

That is 219.

30 GOSUB100:REM MOVE PLAYER UP

That is 266!

30 GOSUB100
40 REM MOVE PLAYER UP

That is backwards. But it produces 220, so it doesn’t penalize you for being backwards.

Oh, and as Steve Bjork pointed out in the Facebook group, a faster solution is not to use REMs at all. I think I need smarter examples. There are too many real programmers watching. For you folks:

0 REM THIS TAKES 371
5 DIM Z,TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 GOSUB100:Z=Z+1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END
100 RETURN

0 REM THIS TAKES 357
5 DIM Z,TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 GOSUB100
40 Z=Z+1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END
100 RETURN

Easy peasy.

A few additional REMarks…

Above, I mentioned that the apostrophe represented “:REM”. Thus, doing something like this:

100 'MOVE UP

Is slower than doing:

100 REM MOVE UP

It may look smaller, but the first example is like scanning “:REM MOVE UP” and the second is just “REM MOVE UP” so it has less work to do.

And yes, I tested it inside the benchmark program:

30 REM

…is 82.

30 '

…is 90.

30 :REM

…is also 90.

I guess it’s just treating the apostrophe as “:REM” internally, or maybe it’s a 2-byte token for “:REM” versus a different 1-byte token just for “REM” or something. Dunno.

But interesting.

Until next time…

Optimizing Color BASIC, part 6

See also: Part 1, Part 2, Part 3, Part 4 and Part 5.

Size Matters. Or Space Matters. You decide.

Sometimes we want to optimize for code space, and other times for variable and string space. For example, if you want to create a 32 character string like this:

<------------------------------->

…you could either declare it as a static string:

A$="<------------------------------->"

Or build it programmatically like this:

A$="<"+STRING$(30,"-")+">"

The second version takes about 16 bytes less of program space because the string is generated dynamically in string memory rather than being stored in the tokenized BASIC program.

Doing it the second way seems like a good idea, but keep in mind when you make this string, somewhere in string memory will be those 32 characters, PLUS you still have the BASIC statements that created it. It’s actually larger, overall, to do it this way.

BUT, any temporary strings like that might make sense to create on-the-fly as you need them since that memory can be reused by other strings.

10 A$="<------------------------------->"
20 PRINT A$:PRINT "MAIN MENU":PRINT A$
30 INPUT "COMMAND";C$

In the above example, A$ points to that sequence of characters INSIDE the BASIC program itself. It is always there. But, if you generated the string only when needed, the memory used by A$ could be used for other purposes:

10 A$="<"+STRING$(30,"-")+">"
20 PRINT A$:PRINT "MAIN MENU":PRINT A$
30 INPUT "COMMAND";A$

Above, A$ is allocated and turned in to the long 32 character string, printed, and then the memory used by A$ can be reused by INPUT. I suppose just setting it to A$="" might give it back, too.

This would come with a speed penalty since the creation and destruction of strings takes more CPU time than just using a static string.

I think I may have also mentioned that, even if a string is part of the BASIC program, if you do anything to it, it has to duplicate it in string memory which creates a second copy of it:

10 A$="<------------------------------->"
20 A$=A$+"HELLO"

Above, A$ initially starts out pointing inside the program itself, taking up none of that string memory. At line 20, the entire A$ gets copied in to string memory and then the extra characters are added to it. At that point, that string is now using over twice the memory (program space plus string space).
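One way to watch this happen is to peek at the string descriptor that VARPTR points to. This is only a sketch, and it assumes the usual Color BASIC descriptor layout (length in the first byte, address in the third and fourth bytes). The variable P and the subroutine at line 100 are just names picked for illustration:

```basic
10 REM WHERE DOES A$ LIVE? (SKETCH)
20 P=0:A$="<------------------------------->"
30 GOSUB 100
40 A$=A$+"HELLO"
50 GOSUB 100
60 END
100 REM SHOW LENGTH AND ADDRESS OF A$
110 P=VARPTR(A$)
120 PRINT "LEN";PEEK(P);"ADDR";PEEK(P+2)*256+PEEK(P+3)
130 RETURN
```

The first address printed should fall inside the BASIC program itself. After line 40 the length goes from 32 to 37, and the address should jump up in to the string space reserved by CLEAR.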

Let’s try to prove that. The CLEAR command is used to reserve memory for strings. By default, 200 bytes are reserved. We can change that by doing CLEAR 0. Here is a program that has no string memory, yet it works because the string is inside the program space:

10 CLEAR 0
20 A$="<------------------------------->"

If you run this, you can PRINT A$ and prove it exists, but the moment you try to declare a second string like B$="HELLO" or even manipulate A$ like A$=A$+"" you will get an Out of String Space error (?OS ERROR):

Proving strings can live inside program space.

Sometimes you choose speed over size, and sometimes you choose size over speed. Thus, you can optimize for speed (which we have been doing so far), or optimize for size.

But I digress.

Elementary, my dear DATA

Today I want to discuss DATA statements. In my assembly language series, I showed how the lwasm assembler can generate a small BASIC program that has the assembly code in DATA statements, and a small loader which will READ them and POKE them in to memory:

10 READ A,B
20 IF A=-1 THEN 70
30 FOR C = A TO B
40 READ D:POKE C,D
50 NEXT C
60 GOTO 10
70 END
80 DATA 16128,16167,142,63,14,166,128,39,6,173,159,160,2,32,246,57,84,104,105,115,32,105,115,32,97,32,115,101,99,114,101,116,32,109,101,115,115,97,103,101,46,0,-1,-1

DATA statements can contain base-10 numbers, base-16 hexadecimal numbers, or strings (and I guess base-8 octal numbers too, but who would do that?). This means you could have the data stored as numbers:

100 DATA 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
100 DATA &H0,&H1,&H2,&H3,&H4,&H5,&H6,&H7,&H8,&H9,&HA,&HB,&HC,&HD,&HE,&HF

…or as strings like “FE” or “1F” that you could READ and convert to hex numbers in the loader:

100 DATA 59,4F,55,20,4D,55,53,54,20,42,45,20,42,4F,52,45,44

When it comes to size, hexadecimal numbers without the “&H” in front are never larger than their base-10 equivalents, and are often smaller. Single-digit decimal values 0-9 are single digits 0-9 in hex. Double-digit decimal values 10-15 are represented by the single-digit hexadecimal values A-F, so every time a value from 10-15 appears, representing it in decimal takes up twice as much space. And the three-digit decimal values 100-255 are the two-digit hex values 64-FF.

If you store the data as strings, like this:

100 DATA 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
101 DATA 10,11,12,13,14,15,16,17,18,19,1A,1B,1C,1D,1E,1F

…you can read each string in, and convert it to a number by adding “&H” to the start and using the VAL() function:

READ B$:B=VAL("&H"+B$)

For the decimal and hexadecimal versions, you just read it as a number:

READ B

The smallest version would be the string approach, since three digit numbers can be represented with two digits. But, doing the string conversion with VAL() makes it slower.
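To make the string approach concrete, here is a minimal sketch that decodes the hex-string DATA line shown earlier and prints each value as a character. The "*" end marker is something I added for the example so the loop knows when to stop:

```basic
10 REM DECODE HEX STRINGS FROM DATA (SKETCH)
20 READ B$
30 REM "*" IS A MADE-UP END MARKER
40 IF B$="*" THEN END
50 B=VAL("&H"+B$)
60 PRINT CHR$(B);
70 GOTO 20
100 DATA 59,4F,55,20,4D,55,53,54,20,42,45,20,42,4F,52,45,44,*
```

A real loader would POKE the value of B in to memory instead of printing it, but the READ and VAL() steps would be the same.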

The fastest version would be using hexadecimal numbers since BASIC can parse hex values faster than base-10 numbers. But, this is the largest version since 255 in decimal (3 characters) or FF as a hex string (2 characters) would be represented as &HFF as a hex number (4 characters). Those numbers would take up twice as much space as the string version!

In the middle are base-10 numbers. They are not the largest, or the smallest, or the fastest or the slowest. They make an ideal compromise.

Let’s do a test. I have DATA statements representing values from 0 to 255. I have three versions: the first will use base-10 numbers, the second will use hexadecimal numbers, and the third will use strings that are just the hex part of the “&H” number.

Base 10 Numbers

0 REM DATADEC.BAS
10 TIMER=0:TM=TIMER
20 FOR A=0 TO 255
30 READ B
40 NEXT
50 PRINT TIMER-TM
60 END
100 DATA 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
101 DATA 16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
102 DATA 32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47
103 DATA 48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63
104 DATA 64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79
105 DATA 80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95
106 DATA 96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111
107 DATA 112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127
108 DATA 128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143
109 DATA 144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159
110 DATA 160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175
111 DATA 176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191
112 DATA 192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207
113 DATA 208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223
114 DATA 224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239
115 DATA 240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255

Hexadecimal Base-16 Numbers

0 REM DATAHEX.BAS
10 TIMER=0:TM=TIMER
20 FOR A=0 TO 255
30 READ B
40 NEXT
50 PRINT TIMER-TM
60 END
100 DATA &H0,&H1,&H2,&H3,&H4,&H5,&H6,&H7,&H8,&H9,&HA,&HB,&HC,&HD,&HE,&HF
101 DATA &H10,&H11,&H12,&H13,&H14,&H15,&H16,&H17,&H18,&H19,&H1A,&H1B,&H1C,&H1D,&H1E,&H1F
102 DATA &H20,&H21,&H22,&H23,&H24,&H25,&H26,&H27,&H28,&H29,&H2A,&H2B,&H2C,&H2D,&H2E,&H2F
103 DATA &H30,&H31,&H32,&H33,&H34,&H35,&H36,&H37,&H38,&H39,&H3A,&H3B,&H3C,&H3D,&H3E,&H3F
104 DATA &H40,&H41,&H42,&H43,&H44,&H45,&H46,&H47,&H48,&H49,&H4A,&H4B,&H4C,&H4D,&H4E,&H4F
105 DATA &H50,&H51,&H52,&H53,&H54,&H55,&H56,&H57,&H58,&H59,&H5A,&H5B,&H5C,&H5D,&H5E,&H5F
106 DATA &H60,&H61,&H62,&H63,&H64,&H65,&H66,&H67,&H68,&H69,&H6A,&H6B,&H6C,&H6D,&H6E,&H6F
107 DATA &H70,&H71,&H72,&H73,&H74,&H75,&H76,&H77,&H78,&H79,&H7A,&H7B,&H7C,&H7D,&H7E,&H7F
108 DATA &H80,&H81,&H82,&H83,&H84,&H85,&H86,&H87,&H88,&H89,&H8A,&H8B,&H8C,&H8D,&H8E,&H8F
109 DATA &H90,&H91,&H92,&H93,&H94,&H95,&H96,&H97,&H98,&H99,&H9A,&H9B,&H9C,&H9D,&H9E,&H9F
110 DATA &HA0,&HA1,&HA2,&HA3,&HA4,&HA5,&HA6,&HA7,&HA8,&HA9,&HAA,&HAB,&HAC,&HAD,&HAE,&HAF
111 DATA &HB0,&HB1,&HB2,&HB3,&HB4,&HB5,&HB6,&HB7,&HB8,&HB9,&HBA,&HBB,&HBC,&HBD,&HBE,&HBF
112 DATA &HC0,&HC1,&HC2,&HC3,&HC4,&HC5,&HC6,&HC7,&HC8,&HC9,&HCA,&HCB,&HCC,&HCD,&HCE,&HCF
113 DATA &HD0,&HD1,&HD2,&HD3,&HD4,&HD5,&HD6,&HD7,&HD8,&HD9,&HDA,&HDB,&HDC,&HDD,&HDE,&HDF
114 DATA &HE0,&HE1,&HE2,&HE3,&HE4,&HE5,&HE6,&HE7,&HE8,&HE9,&HEA,&HEB,&HEC,&HED,&HEE,&HEF
115 DATA &HF0,&HF1,&HF2,&HF3,&HF4,&HF5,&HF6,&HF7,&HF8,&HF9,&HFA,&HFB,&HFC,&HFD,&HFE,&HFF

String HEX Numbers

0 REM DATASTR.BAS
10 TIMER=0:TM=TIMER
20 FOR A=0 TO 255
30 READ B$:B=VAL("&H"+B$)
40 NEXT
50 PRINT TIMER-TM
60 END
100 DATA 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
101 DATA 10,11,12,13,14,15,16,17,18,19,1A,1B,1C,1D,1E,1F
102 DATA 20,21,22,23,24,25,26,27,28,29,2A,2B,2C,2D,2E,2F
103 DATA 30,31,32,33,34,35,36,37,38,39,3A,3B,3C,3D,3E,3F
104 DATA 40,41,42,43,44,45,46,47,48,49,4A,4B,4C,4D,4E,4F
105 DATA 50,51,52,53,54,55,56,57,58,59,5A,5B,5C,5D,5E,5F
106 DATA 60,61,62,63,64,65,66,67,68,69,6A,6B,6C,6D,6E,6F
107 DATA 70,71,72,73,74,75,76,77,78,79,7A,7B,7C,7D,7E,7F
108 DATA 80,81,82,83,84,85,86,87,88,89,8A,8B,8C,8D,8E,8F
109 DATA 90,91,92,93,94,95,96,97,98,99,9A,9B,9C,9D,9E,9F
110 DATA A0,A1,A2,A3,A4,A5,A6,A7,A8,A9,AA,AB,AC,AD,AE,AF
111 DATA B0,B1,B2,B3,B4,B5,B6,B7,B8,B9,BA,BB,BC,BD,BE,BF
112 DATA C0,C1,C2,C3,C4,C5,C6,C7,C8,C9,CA,CB,CC,CD,CE,CF
113 DATA D0,D1,D2,D3,D4,D5,D6,D7,D8,D9,DA,DB,DC,DD,DE,DF
114 DATA E0,E1,E2,E3,E4,E5,E6,E7,E8,E9,EA,EB,EC,ED,EE,EF
115 DATA F0,F1,F2,F3,F4,F5,F6,F7,F8,F9,FA,FB,FC,FD,FE,FF

Here are the results. Speed is the TIMER value printed by each program, and size is what JUST the DATA statement lines (lines 100-115) take up:

  • DATADEC.BAS – speed 78, size 1010
  • DATAHEX.BAS – speed 49, size 1360
  • DATASTR.BAS – speed 109, size 848

As you can see, using hex values is over twice as fast as using string versions and converting them to hex.

For size, using strings is about 15% smaller in my test program than using decimal values.

If load time is important, use hex. If program space is important, use strings. Otherwise, normal decimal values are a good compromise between speed and size.
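Those sizes can be sanity-checked with a little counting. The sketch below adds up the digits, commas and per-line overhead for the three formats. The 7 bytes per line is my assumption about how a tokenized line is laid out (2-byte link, 2-byte line number, 1-byte DATA token, the space after it, and a 1-byte end marker):

```basic
10 REM ESTIMATE DATA SIZES FOR 0-255 (SKETCH)
20 D=0:H=0
30 FOR V=0 TO 255
40 REM STR$ ADDS A LEADING SPACE, SO SUBTRACT 1
50 D=D+LEN(STR$(V))-1
60 H=H+LEN(HEX$(V))
70 NEXT
80 REM 16 LINES, 15 COMMAS EACH
90 C=16*15
100 REM ASSUMED 7 BYTES PER LINE: LINK(2), LINE NUMBER(2), DATA TOKEN(1), SPACE(1), END(1)
110 O=16*7
120 PRINT "DECIMAL";D+C+O
130 PRINT "HEX";H+2*256+C+O
140 PRINT "STRING";H+C+O
```

With 658 decimal digits, 496 hex digits, 240 commas and 112 bytes of overhead, this works out to 1010, 1360 and 848, which lines up with the sizes listed above.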

Bonus Data

One more thing… If we are going to use strings anyway, we could save more space by making the hex strings long, and parsing through them to pull out the individual hex values. Every number has to be two characters (00, 01, 02 … 0E, 0F) and this additional string parsing makes it even slower, but if code size is most important, try this:

0 REM DATASTR2.BAS
10 TIMER=0:TM=TIMER
20 FOR A=0 TO 15
30 READ B$:FOR I=1 TO 32 STEP 2:B=VAL("&H"+MID$(B$,I,2)):NEXT
40 NEXT
50 PRINT TIMER-TM
60 END
100 DATA 000102030405060708090A0B0C0D0E0F
101 DATA 101112131415161718191A1B1C1D1E1F
102 DATA 202122232425262728292A2B2C2D2E2F
103 DATA 303132333435363738393A3B3C3D3E3F
104 DATA 404142434445464748494A4B4C4D4E4F
105 DATA 505152535455565758595A5B5C5D5E5F
106 DATA 606162636465666768696A6B6C6D6E6F
107 DATA 707172737475767778797A7B7C7D7E7F
108 DATA 808182838485868788898A8B8C8D8E8F
109 DATA 909192939495969798999A9B9C9D9E9F
110 DATA A0A1A2A3A4A5A6A7A8A9AAABACADAEAF
111 DATA B0B1B2B3B4B5B6B7B8B9BABBBCBDBEBF
112 DATA C0C1C2C3C4C5C6C7C8C9CACBCCCDCECF
113 DATA D0D1D2D3D4D5D6D7D8D9DADBDCDDDEDF
114 DATA E0E1E2E3E4E5E6E7E8E9EAEBECEDEEEF
115 DATA F0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF
  • DATASTR2.BAS – speed 172, size 624
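Typing packed lines like those by hand is error-prone, so it may be easier to let BASIC build them. This is just a sketch that prints the sixteen lines to the screen for checking. It pads every value to two digits, which is the part that matters:

```basic
10 REM PRINT PACKED HEX DATA LINES (SKETCH)
20 FOR R=0 TO 15
30 L$=""
40 FOR V=R*16 TO R*16+15
50 REM PAD TO TWO DIGITS: 0-F BECOME 00-0F
60 L$=L$+RIGHT$("0"+HEX$(V),2)
70 NEXT V
80 PRINT 100+R;"DATA ";L$
90 NEXT R
```

Each L$ is 32 characters long, so the default string space is enough.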

By removing all those commas, it’s the smallest data size yet. And, since the longest line you can type* in BASIC is 249 characters…

BASIC allows for typing up to 249 characters on a line.

…you could really pack some data in to it.

Side Note: *The BASIC editor allows for 249 characters, but when you press ENTER, the line is tokenized. Keywords like PRINT get reduced to smaller tokens. You may have typed a five character keyword (taking up part of that 249 byte buffer), but when you press ENTER, that five characters may be converted to a one byte token. This means it’s possible for a BASIC line to contain more valid code than you could actually type. There have been utilities for BASIC (such as Carl England‘s CRUNCH) that do this, packing program lines as big as they can be, and making them un-editable since the moment you try, they get detokenized and you lose anything past 249 characters. We’ll have to discuss this in a later installment.

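With that in mind, we could pack any type of DATA in to fewer lines and save a bit. Each DATA line carries about 6 bytes of overhead before any data (the pointer to the next line, the line number, the DATA token and the end-of-line marker), so every line we can eliminate makes our program smaller.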

Through some trial-and-error experimentation, I got this:

0 REM DATASTR3.BAS
10 TIMER=0:TM=TIMER
20 FOR A=0 TO 15
30 READ B$:IF B$="*" THEN 50
35 FOR I=1 TO LEN(B$) STEP 2:B=VAL("&H"+MID$(B$,I,2)):NEXT
40 NEXT
50 PRINT TIMER-TM
60 END
100 DATA000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F404142434445464748494A4B4C4D4E4F505152535455565758595A5B5C5D5E5F606162636465666768696A6B6C6D6E6F7071727374757677
107 DATA78797A7B7C7D7E7F808182838485868788898A8B8C8D8E8F909192939495969798999A9B9C9D9E9FA0A1A2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6B7B8B9BABBBCBDBEBFC0C1C2C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8D9DADBDCDDDEDFE0E1E2E3E4E5E6E7E8E9EAEBECEDEEEF
108 DATAF0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF,*
  • DATASTR3.BAS – speed 167, size 532

As you can see, this is slightly faster than the previous combined hex string version because it does fewer READs. It is also slightly smaller because it has fewer line numbers. I think it could be packed even more, but I am loading these test programs as ASCII files in to XRoar, so the lines cannot exceed 249 characters (the same as typing them in), and this was as much as I could fit on them. (Using EDIT on these lines suggested I could still type about 6 more characters, though it only showed me 5 more after I re-listed it.)

Fun with DATA, eh?

Until next time, I leave you with this:

A virtual cookie goes to the first person that finds them.

Interfacing assembly with BASIC via DEFUSR, part 5

See also: Part 1, Part 2, Part 3, and Part 4.

Now that I’ve gotten my digressions with BASIC variable access speeds and input speeds and INKEY/INSTR and GOTO/GOSUB speeds and HEX versus DECimal speeds out of the way, I can finally get back to digressing on using assembly language to speed up BASIC.

Where were we?

Oh, right…

In part 4 of this article I presented an example of using BASIC to scroll a PAC-MAN style maze that was too tall to fit on the 16-line screen.

I also presented some assembly code that would scroll the screen much faster than BASIC could ever hope to.

Today, let’s combine these two items and try to create a fast-scrolling maze playfield.

Let’s get started!

START SHOUTING AT ME! (revisited)

But before we get started, let’s revisit the uppercase routine I presented in Part 3.

Simon Jonassen is well-known in the CoCo Community for doing some amazing things on the original CoCo 1 and 2 hardware (and, lately, the CoCo 3 as well). He is quite the master of optimization, and has created some stunning sound players that allow the original CoCo to have cool background music while doing other things (if only the game programmers of 1980 knew about this!). He also has a cool web-based CoCo semigraphics editor. He provided a few enhancements:

* UCASE.ASM v1.01
* by Allen C. Huffman of Sub-Etha Software
* www.subethasoftware.com / alsplace@pobox.com
*
* 1.01 a bit smaller per Simon Jonassen
*
* DEFUSRx() uppercase output function
*
* INPUT:   VARPTR of a string
* RETURNS: # chars processed
*
* EXAMPLE:
*   CLEAR 200,&H3F00
*   DEFUSR0=&H3F00
*   A$="Print this in uppercase."
*   PRINT A$
*   A=USR0(VARPTR(A$))
*
ORGADDR     EQU     $3f00

GIVABF      EQU     $B4F4   * 46324
INTCNV      EQU     $B3ED   * 46061
CHROUT      EQU     $A002

            opt	    6809    * 6809 instructions only
            opt	    cd      * cycle counting

            org     ORGADDR

start       jsr     INTCNV  * get passed in value in D
            tfr     d,x     * move value (varptr) to X
            ldy     2,x     * load string addr to Y
;           ldb     ,x      * load string len to B
            beq     null    * exit if strlen is 0
            ldb     ,x      * load string len to B
            ldx     #0      * clear X (count of chars conv)

loop        lda     ,y+	    * get next char, inc Y
;           lda     ,y      * load char in A
            cmpa    #'a     * compare to lowercase A
            blt     nextch  * if less, no conv needed
            cmpa    #'z     * compare to lowercase Z
            bgt     nextch  * if greater, no conv needed
lcase       suba    #32     * subtract 32 to make uppercase
            leax    1,x     * inc count of chars converted
nextch      jsr     [CHROUT] * call ROM output character routine
;           leay    1,y     * increment Y pointer
cont        decb            * decrement counter
            bne	    loop    * not done yet
;           beq     exit    * if 0, go to exit
;           bra     loop    * go to loop

exit        tfr     x,d     * move chars conv count to D
;           bra     return
            jmp     GIVABF  * return to caller

null        ldd     #-1     * load -1 as error
return      jmp     GIVABF  * return to caller

* lwasm --decb -o ucase2.bin ucase2.asm -l
* lwasm --decb -f basic -o ucase2.bas ucase2.asm -l
* lwasm --decb -f ihex -o ucase2.hex ucase2.asm -l
* decb copy -2 -r ucase2.bin ../Xroar/dsk/DRIVE0.DSK,UCASE2.BIN

This code is 46 bytes long, compared to my original which was 49 bytes. The changes are:

  1. Move the initial LDB with string length to after the string length check, since it’s only needed if we get past that check and have a string.
  2. Change my LDA ,Y to LDA ,Y+ to increment Y there and not need the LEAY 1,Y later.
  3. Change my “characters left” check from BEQ EXIT and BRA LOOP to BNE LOOP since it can just fall through and continue otherwise.
  4. Change a BRA RETURN to JMP GIVABF, since the branch would just end up at a JMP, and doing a JMP is faster than branching to a JMP.

Minor changes, but every little bit helps.

Simon also pointed out an embarrassing oversight in my very first example shown in part 1:

ORGADDR EQU $3f00

GIVABF EQU $B4F4   * 46324
INTCNV EQU $B3ED   * 46061

       org  ORGADDR
start  jsr  INTCNV * get passed in value in D
       tfr  d,x    * transfer D to X so we can manipulate it
       leax 1,x    * add 1 to X
       tfr  x,d    * transfer X back to D
return jmp  GIVABF * return to caller

He reminded me about the “addd” instruction which can add to D. For some reason, I was thinking I needed to use LEA to add to a 16-bit register, and since “LEAD 1,D” wasn’t a thing, I did the whole transfer to X, add one to X, transfer back to D thing.

He said I should just do this:

* ADDONE.ASM v1.01
* by Allen C. Huffman of Sub-Etha Software
* www.subethasoftware.com / alsplace@pobox.com
*
* 1.01 made less stupid per Simon Jonassen
*
* DEFUSRx() add one routine
*
* INPUT:   integer to add one to
* RETURNS: value +1
*
* EXAMPLE:
*   CLEAR 200,&H3F00
*   DEFUSR0=&H3F00
*   A=USR0(42)
*   PRINT A
*
ORGADDR EQU     $3f00

INTCNV  EQU     $B3ED   * 46061
GIVABF  EQU     $B4F4   * 46324

        org     ORGADDR

start   jsr     INTCNV  * get passed in value in D
;       tfr     d,x     * transfer D to X so we can manipulate it
;       leax    1,x     * add 1 to X
;       tfr     x,d     * transfer X back to D
        addd    #1      * add 1 to D
return  jmp     GIVABF  * return to caller

* lwasm --decb -9 -o addone2.bin addone2.asm
* lwasm --decb -f basic -o addone2.bas addone2.asm
* decb copy -2 -r addone2.bin ../Xroar/dsk/DRIVE0.DSK,ADDONE2.BIN

See what happens when people who actually know 6809 assembly language look at my code? Thanks, Simon!

Moving Day

My simple examples have been building up to slightly less-simple ones that do something more useful, like moving data that would take days to move in BASIC. Previously, I presented a PAC-MAN maze that could “scroll” up and down the screen by PRINTing the whole screen each time with just the lines of the maze that should be visible. I also presented some assembly code that could be used to move the screen up, down, left or right.

Today, the first thing I want to do is integrate that assembly routine in to the PAC-MAN maze code. Instead of redrawing the entire screen each time, BASIC will only need to redraw the top or bottom line depending on which way the screen just scrolled. If my math is correct, printing one line instead of sixteen lines should be at least twice as fast.

First, let’s revisit the screen moving assembly code, which, thanks to comments from L. Curtis Boyle, now has a smarter routine for checking which direction the user passed in to scroll (though it could still be thrown off by values larger than 255):

* SCRNMOVE.ASM v1.01
* by Allen C. Huffman of Sub-Etha Software
* www.subethasoftware.com / alsplace@pobox.com
*
* DEFUSRx() screen moving function
*
* INPUT:   direction (1=up, 2=down, 3=left, 4=right)
* RETURNS: 0 on success
*         -1 if invalid direction
*
* 1.01 better param parsing per L. Curtis Boyle
*
* EXAMPLE:
*   CLEAR 200,&H3F00
*   DEFUSR0=&H3F00
*   A=USR0(1)
*
ORGADDR EQU     $3f00

INTCNV  EQU     $B3ED   * 46061
GIVABF  EQU     $B4F4   * 46324

UP      EQU     1
DOWN    EQU     2
LEFT    EQU     3
RIGHT   EQU     4
SCREEN  EQU     1024    * top left of screen
END     EQU     1535    * bottom right of screen

        org     ORGADDR

start   jsr     INTCNV  * get incoming param in D
;       cmpb    #UP
        decb            * decrement B
        beq     up      * if one DEC got us to zero
;       cmpb    #DOWN
        decb            * decrement B
        beq     down    * if two DECs...
;       cmpb    #LEFT
        decb            * decrement B
        beq     left    * if three DECs...
;       cmpb    #RIGHT
        decb            * decrement B
        beq     right   * if four DECs...
error   ldd     #-1     * load D with -1 for error code
        bra     exit

up      ldx     #SCREEN+32
loopup  lda     ,x
        sta     -32,x
        leax    1,x
        cmpx    #END
        ble     loopup
        bra     return

down    ldx     #END-32
loopdown lda    ,x
        sta     32,x
        leax    -1,x
        cmpx    #SCREEN
        bge     loopdown
        bra     return

left    ldx     #SCREEN+1
loopleft lda    ,x
        sta     -1,x
        leax    1,x
        cmpx    #END
        ble     loopleft
        bra     return

right   ldx     #END-1
loopright lda   ,x
        sta     1,x
        leax    -1,x
        cmpx    #SCREEN
        bge     loopright
    
return  ldd     #0      * return code (0=success)
exit    jmp     GIVABF  * return to BASIC

* lwasm --decb -9 -o scrnmove2.bin scrnmove2.asm
* lwasm --decb -f basic -o scrnmove2.bas scrnmove2.asm
* decb copy -2 -r scrnmove2.bin ../Xroar/dsk/DRIVE0.DSK,SCRNMOVE2.BIN

The generated BASIC program looks like:

10 READ A,B
20 IF A=-1 THEN 70
30 FOR C = A TO B
40 READ D:POKE C,D
50 NEXT C
60 GOTO 10
70 END
80 DATA 16128,16217,189,179,237,90,39,14,90,39,28,90,39,42,90,39,55,204,255,255,32,67,142,4,32,166,132,167,136,224,48,1,140,5,255,47,244,32,47,142,5,223,166,132,167,136,32,48,31,140,4,0,44,244,32,30,142,4,1,166,132,167,31,48,1,140,5,255,47,245,32,14
90 DATA 142,5,254,166,132,167,1,48,31,140,4,0,44,245,204,0,0,126,180,244,-1,-1
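For comparison, here is roughly what the “up” case of that routine does, written as a plain BASIC loop over the same screen addresses (1024 to 1535). It is only a sketch to time against, and the TIMER lines and the variable T are additions for the comparison:

```basic
10 REM SCROLL TEXT SCREEN UP ONE LINE IN BASIC (SKETCH)
20 TIMER=0
30 REM COPY EACH BYTE UP ONE ROW (32 BYTES)
40 FOR A=1024+32 TO 1535
50 POKE A-32,PEEK(A)
60 NEXT
70 T=TIMER
80 PRINT T;"TICKS"
```

Like the assembly version, this leaves the old bottom row in place, so whoever calls it still has to draw the new bottom line.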

Let’s take the original maze program and modify it to use the assembly routines instead:

0 REM MAZETEST.BAS
10 DIM MZ$(31)
20 FOR A=0 TO 30:READ MZ$(A):NEXT
30 CLS
40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 FOR LN=0 TO 15
70 PRINT @LN*32,MZ$(LN+ST);
80 NEXT:NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 FOR LN=0 TO 15
120 PRINT @LN*32,MZ$(LN+ST);
130 NEXT:NEXT
140 GOTO 40
999 GOTO 999
1000 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"    
1010 DATA "X            XX            X"    
1020 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1030 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1040 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1050 DATA "X                          X"
1060 DATA "X XXXX XX XXXXXXXX XX XXXX X"   
1070 DATA "X XXXX XX XXXXXXXX XX XXXX X"    
1080 DATA "X      XX    XX    XX      X"   
1090 DATA "XXXXXX XXXXX XX XXXXX XXXXXX"    
2100 DATA "     X XXXXX XX XXXXX X     "    
2110 DATA "     X XX          XX X     "    
2120 DATA "     X XX XXXXXXXX XX X     "   
2130 DATA "XXXXXX XX X      X XX XXXXXX"   
2140 DATA "          X      X          "   
2150 DATA "XXXXXX XX X      X XX XXXXXX"   
2160 DATA "     X XX XXXXXXXX XX X     "   
2170 DATA "     X XX          XX X     "   
2180 DATA "     X XX XXXXXXXX XX X     "   
2190 DATA "XXXXXX XX XXXXXXXX XX XXXXXX"   
3200 DATA "X            XX            X"   
3210 DATA "X XXXX XXXXX XX XXXXX XXXX X"   
3220 DATA "X XXXX XXXXX XX XXXXX XXXX X"   
3230 DATA "X   XX                XX   X"   
3240 DATA "XXX XX XX XXXXXXXX XX XX XXX"   
3250 DATA "XXX XX XX XXXXXXXX XX XX XXX"   
3260 DATA "X      XX    XX    XX      X"   
3270 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"   
3280 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"   
3290 DATA "X                          X"   
4200 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"

The maze is 31 lines tall. The fake scrolling is done by redrawing the entire screen line-by-line. The screen is 16 lines tall, so initially we draw maze lines 0-15. Then we redraw maze lines 1-16, giving the appearance that the screen is scrolling up and a line has scrolled off the top of the screen. This repeats for lines 2-17, 3-18 and so on until we’ve drawn the last 16 lines of 15-30.

After “scrolling” all the way to the bottom of the maze, a second block of FOR/NEXT loops reverses the process, starting with maze lines 15-30, then 14-29 and so on until it is back to displaying the top lines 0-15.

Each full-screen redraw is done by the inner FOR/NEXT loops using the LN variable in lines 60-80 and 110-130.

Rather than redrawing all sixteen lines each time, we could use the assembly routine to move the screen, and then we’d just draw one line – top or bottom, depending on which way the screen scrolled.

In effect, we’d replace this:

40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 FOR LN=0 TO 15
70 PRINT @LN*32,MZ$(LN+ST);
80 NEXT:NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 FOR LN=0 TO 15
120 PRINT @LN*32,MZ$(LN+ST);
130 NEXT:NEXT

…with this:

35 FOR LN=0 TO 15:PRINT @LN*32,MZ$(LN+ST);:NEXT
40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 Z=USR0(1)
70 PRINT @480,MZ$(ST+15);
80 NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 Z=USR0(2)
120 PRINT @0,MZ$(ST);
130 NEXT

Line 35 was added to initially draw the screen. After that, the assembly routine can move it up or down, and let BASIC redraw just the one line that needs to be drawn.

This, of course, requires the assembly routine to be loaded. We can take the BASIC loader of that and renumber it so we can call it from our test program. Here is the scrnmove2.asm updated code from the top of this article, renumbered and changed in to a subroutine:

5000 REM ASSEMBLY ROUTINE
5010 READ A,B
5020 IF A=-1 THEN 5070
5030 FOR C = A TO B
5040 READ D:POKE C,D
5050 NEXT C
5060 GOTO 5010
5070 RETURN
5080 DATA 16128,16217,189,179,237,90,39,14,90,39,28,90,39,42,90,39,55,204,255,255,32,67,142,4,32,166,132,167,136,224,48,1,140,5,255,47,244,32,47,142,5,223,166,132,167,136,32,48,31,140,4,0,44,244,32,30,142,4,1,166,132,167,31,48,1,140,5,255,47,245,32,14
5090 DATA 142,5,254,166,132,167,1,48,31,140,4,0,44,245,204,0,0,126,180,244,-1,-1

Now, I can add this to the end of the mazetest.bas program and set it up so the USR0() calls will work:

10 CLEAR 200,&H3F00:DIM MZ$(31)
25 GOSUB5000:DEFUSR0=&H3F00

Now the program will use the CLEAR command to protect memory starting at &H3F00 (where the assembly will load), then after it reads all the maze strings in to memory (those DATA statements appear first), it will GOSUB 5000 and that READs the assembly code statements and POKEs them in to memory starting at &H3F00. The DEFUSR call is then done to make USR0(x) work.

With just a few lines changed, and getting our assembly routine in memory, now the maze scrolling is very fast! And, if we optimized the BASIC code around it, it could be even faster since most of the time is spent processing the BASIC program.

Here is the full listing:

0 REM MAZETST2.BAS - W/ASM!
10 CLEAR 200,&H3F00:DIM MZ$(31)
20 FOR A=0 TO 30:READ MZ$(A):NEXT
25 GOSUB5000:DEFUSR0=&H3F00
30 CLS

35 FOR LN=0 TO 15:PRINT @LN*32,MZ$(LN+ST);:NEXT
40 REM SCROLL MAZE DOWN
50 FOR ST=0 TO 15
60 Z=USR0(1)
70 PRINT @480,MZ$(ST+15);
80 NEXT
90 REM SCROLL MAZE UP
100 FOR ST=15 TO 0 STEP-1
110 Z=USR0(2)
120 PRINT @0,MZ$(ST);
130 NEXT
140 GOTO 40
999 GOTO 999
1000 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"    
1010 DATA "X            XX            X"    
1020 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1030 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1040 DATA "X XXXX XXXXX XX XXXXX XXXX X"    
1050 DATA "X                          X"
1060 DATA "X XXXX XX XXXXXXXX XX XXXX X"   
1070 DATA "X XXXX XX XXXXXXXX XX XXXX X"    
1080 DATA "X      XX    XX    XX      X"   
1090 DATA "XXXXXX XXXXX XX XXXXX XXXXXX"    
2100 DATA "     X XXXXX XX XXXXX X     "    
2110 DATA "     X XX          XX X     "    
2120 DATA "     X XX XXXXXXXX XX X     "   
2130 DATA "XXXXXX XX X      X XX XXXXXX"   
2140 DATA "          X      X          "   
2150 DATA "XXXXXX XX X      X XX XXXXXX"   
2160 DATA "     X XX XXXXXXXX XX X     "   
2170 DATA "     X XX          XX X     "   
2180 DATA "     X XX XXXXXXXX XX X     "   
2190 DATA "XXXXXX XX XXXXXXXX XX XXXXXX"   
3200 DATA "X            XX            X"   
3210 DATA "X XXXX XXXXX XX XXXXX XXXX X"   
3220 DATA "X XXXX XXXXX XX XXXXX XXXX X"   
3230 DATA "X   XX                XX   X"   
3240 DATA "XXX XX XX XXXXXXXX XX XX XXX"   
3250 DATA "XXX XX XX XXXXXXXX XX XX XXX"   
3260 DATA "X      XX    XX    XX      X"   
3270 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"   
3280 DATA "X XXXXXXXXXX XX XXXXXXXXXX X"   
3290 DATA "X                          X"   
4200 DATA "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"

5000 REM ASSEMBLY ROUTINE
5010 READ A,B
5020 IF A=-1 THEN 5070
5030 FOR C = A TO B
5040 READ D:POKE C,D
5050 NEXT C
5060 GOTO 5010
5070 RETURN
5080 DATA 16128,16217,189,179,237,90,39,14,90,39,28,90,39,42,90,39,55,204,255,255,32,67,142,4,32,166,132,167,136,224,48,1,140,5,255,47,244,32,47,142,5,223,166,132,167,136,32,48,31,140,4,0,44,244,32,30,142,4,1,166,132,167,31,48,1,140,5,255,47,245,32,14
5090 DATA 142,5,254,166,132,167,1,48,31,140,4,0,44,245,204,0,0,126,180,244,-1,-1

Try the original BASIC-only version and then this new assembly-enhanced version and see what you think.

Next time, I will share a version of this scrolling maze that has a character you can control and move through the maze.

Until then…

Optimizing Color BASIC, part 5

See also: Part 1, Part 2, Part 3 and Part 4.

Updates:

  • 2/14/2017 – Fixed numeric typo (thanks, George P!).

HEX versus DECimal Numbers

As Barbie once said*…

Math is hard! – Barbie

While Mattel’s Math-Is-Hard Barbie never quite made the splash the marketing team had hoped for, her sentiment lives on.

Side Note: *This is in reference to the Teen Talk Barbie doll released in 1992, and out of the 270 phrases the doll could say, that was not one of them. The real quote was “Math class is tough!”

Earlier in this series, I touched on the fact that dealing with numbers is time consuming for BASIC. Something as simple as B=65535 takes time to process as the interpreter translates that base-10 decimal number in to an internal floating point value. The more digits, the more work. For instance:

0 REM NUMBERS.BAS
10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 B=1
40 NEXT
50 PRINT TIMER-TM

That prints a value of 183. If you change line 30 to read “B=12345” the number jumps to 485. You can see the increase:

  • B=1 – 183
  • B=12 – 262
  • B=123 – 337
  • B=1234 – 408
  • B=12345 – 485

Obviously, the more numbers to parse and convert, the more time it will take. It also seems to matter if the value has a decimal point in it:

  • B=1.0 – 403
  • B=1.1 – 476

Even though that is only three characters to process, it takes longer than B=123. Clearly, more work is being done when a decimal point is involved. Even though all Color BASIC numbers are represented internally as floating point, it still makes sense to avoid typing decimal points unless you really need them.

You can also write a number in base-16 hexadecimal. For the value of 1, it feels like parsing “&H1” should take longer than parsing “1”. Let’s try:

  • B=&H1 – 180
  • B=&H12 – 175
  • B=&H123 – 200
  • B=&H1234 – 203

It seems that parsing a hexadecimal value is much faster than dealing with base-10 values. Using this, you could speed up a program just by switching to hex, provided that your numbers are between 0 and 65535 (the values that can be represented in hex). I was surprised to see that negative values also work:

  • B=&HFFFF – 201
  • B=-&HFFFF – 230

It seems dealing with the negative takes a bit more time, though, so it makes sense to avoid using them unless you really need them. ;-)

With this in mind, let’s test a FOR/NEXT loop:

10 TIMER=0:TM=TIMER
20 FOR A=&H1 TO &H3E8
30 B=&H1
40 NEXT
50 PRINT TIMER-TM

This prints 182, which is basically the same speed as the original that used 1 TO 1000. I guess hexadecimals don’t really help out FOR/NEXT.

Why? Because the FOR/NEXT statement is only parsed once, then the loop counters are set up and done. It is probably a tad faster to use hex, but that savings only happens once in the “do it 1000 times” test.
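A small sketch shows that the FOR line really is evaluated only once. Here the limit comes from a variable (N, picked just for this example) that gets changed inside the loop:

```basic
10 REM THE FOR LIMIT IS READ ONLY ONCE (SKETCH)
20 N=5
30 FOR A=1 TO N
40 REM CHANGING N NOW HAS NO EFFECT ON THE LOOP
50 N=100
60 PRINT A;
70 NEXT
80 PRINT:PRINT "N IS NOW";N
```

The loop still runs five times even though N is 100 by the end, because the limit was worked out when the FOR was first executed. That is also why a faster-to-parse limit only saves time once.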

But, as you will see, USING numbers gets faster. Any place we use a number, it seems using a hex version of that number may speed it up:

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 IF A>&HFF THEN REM
40 NEXT
50 PRINT TIMER-TM

This prints 278. Doing it with A>255 prints 427! Imagine if you could speed up every time you used a number in your code:

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 PRINT@&H20,"HELLO"
40 NEXT
50 PRINT TIMER-TM

That prints 391, but changing it to PRINT@32 prints 469! If you use a bunch of PRINT@s in your code, you can speed them up just by switching to hex!

Math could be accelerated, too, simply due to the number conversion being faster. The more digits, the better advantage hex has:

  • B=A+&H270F – 285
  • B=A+9999 – 483
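A quick way to try this kind of comparison is to time both forms in one program. The sketch below is only an illustration: the variable names T1 and T2 are made up for this example, and the numbers it prints will depend on the machine or emulator it runs on.

```basic
10 REM TIME A DECIMAL CONSTANT, THEN THE SAME VALUE IN HEX
20 TIMER=0:FOR A=1 TO 1000:B=A+9999:NEXT:T1=TIMER
30 TIMER=0:FOR A=1 TO 1000:B=A+&H270F:NEXT:T2=TIMER
40 PRINT "DECIMAL:";T1
50 PRINT "HEX:";T2
```

Running both loops back to back keeps everything else (program size, line positions, variables) the same, so any difference should come from the constant alone.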

And the more numbers, the more time you can save by using hex. A common PRINT thing is to use the length of a string to figure out how to center it on the screen:

0 REM NUMBERS.BAS
10 TIMER=0:TM=TIMER
15 CLS:A$="HELLO, WORLD!":LN=LEN(A$)
20 FOR A=1 TO 1000
25 PRINT@32*8+16-LN/2,A$
40 NEXT
50 PRINT TIMER-TM

That prints 1284. Converting line 25 to HEX:

25 PRINT@&H20*&H8+&H10-LN/&H2,A$

And now it prints 1097.

In a game where you might be PRINTing things on the screen constantly, those savings could really add up.

Pity that math is hard, else we could just use hex in our programs and get a free speed boost.
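One way to take the math out of it is to let BASIC do the conversion. This is a minimal sketch, assuming Extended Color BASIC (which provides HEX$ as well as the &H prefix); the prompt text and the variable N are just for illustration.

```basic
10 REM PRINT THE &H FORM OF A DECIMAL NUMBER
20 INPUT "DECIMAL (0-65535)";N
30 IF N<0 OR N>65535 THEN 20
40 PRINT N;"= &H";HEX$(N)
50 GOTO 20
```

Entering 1000, for example, should show &H3E8, the value used in the FOR/NEXT test above.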

Until next time…

 

An effect of PAL versus NTSC…

Just a quick note about my Optimizing Color BASIC series…

In later installments, I started doing some benchmarking using BASIC’s TIMER command. TIMER returns an incrementing count value that is based on the 60 Hz TV sync. At 60 Hz, the timer advances by about 60 every second.

I have been doing my tests in the XRoar emulator, which was created to emulate the Dragon, a CoCo clone manufactured in Wales. In the UK, TV systems run at 50 Hz, so the Dragon and UK versions of the CoCo operate using a different TV sync rate.

Thus, in Dragon/UK mode, TIMER counts up to 50 every second.

And by default, XRoar starts up in PAL mode.

So it’s likely that some of the TIMER values I have given have been off, because I may have been running in PAL/50 Hz mode some of the time.

So if you catch any, let me know and I will go back and retest and correct. I already did that in a recent part, but I may have missed others.
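To compare numbers taken at different rates, it helps to turn TIMER ticks into seconds. The following is a rough sketch; the variable HZ is an assumption you set by hand to match the mode you are running in, since the program has no way to know.

```basic
10 REM CONVERT TIMER TICKS TO SECONDS
20 HZ=60:REM USE 50 FOR PAL (DRAGON/UK)
30 TIMER=0
40 FOR A=1 TO 1000:NEXT
50 T=TIMER
60 PRINT T;"TICKS =";T/HZ;"SECONDS"
```

The same idea works for fixing up old results: a tick count measured at 50 Hz can be multiplied by 60/50 (1.2) to estimate what the 60 Hz count would have been.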

Mea culpa.

Optimizing Color BASIC, part 4

See also: Part 1, Part 2 and Part 3.

Updates:

  • 2/14/2017 – added section header.

INSTR and GOTO/GOSUB

Here’s a quickie that discusses making INSTR faster, and GOTO versus GOSUB.

Side Note: In the code examples, I am using spaces for readability. Spaces slow things down, so instead of “FOR A=1 TO 1000” you would write “FORA=1TO1000”. If you remove the unnecessary spaces from these examples, they get faster.
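If you want to see the cost of spaces for yourself, a small test like this may help. It is only a sketch (T1 and T2 are names chosen for this example), and no particular result is claimed here.

```basic
10 REM SAME LOOP WITH AND WITHOUT SPACES
20 TIMER=0:FOR A = 1 TO 1000:B = A + 1:NEXT:T1=TIMER
30 TIMER=0:FORA=1TO1000:B=A+1:NEXT:T2=TIMER
40 PRINT "SPACES:";T1
50 PRINT "NO SPACES:";T2
```

The assignment inside the loop is the part that gets processed 1000 times, so that is where the spaces should matter most.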

In the previous installment, I discussed ways to speed up doing things based on INPUT by using the INSTR command. INSTR will return the position of where a string is inside another string:

PRINT INSTR("ABC", "B")
2

Above, INSTR returns 2, indicating the string “B” was found starting at position 2 in the search string “ABC”. If the string is not found, it returns zero:

PRINT INSTR("ABC", "X")
0

You could use this in weird ways. For instance, if you wanted to match only certain animals, you could do something like this:

10 INPUT "ENTER AN ANIMAL";A$
20 A=INSTR("CATDOGCOWCHICKEN", A$)
30 IF A>0 THEN PRINT "I KNOW THAT ANIMAL!" ELSE PRINT "WHAT'S THAT?"
40 GOTO 10

INSTR can help you identify animals!

…but why would you want to do that? And, it just matches strings, so any combination that appears in the search string will be matched:

…or not.

Above, searching for “A” was a match, since there is an “A” in that weird animal string, as well as a “C”. There was no “B”, so…

Okay, nevermind. Forget I mentioned it.

I am sure there are many good uses for INSTR, but I mostly use it to match single letter commands (as mentioned previously) like this:

10 A$=INKEY$:IF A$="" THEN 10
20 LN=INSTR("ABCD", A$):IF LN=0 THEN 10
30 ON LN GOTO 100,200,300,400

Since INSTR returns 0 when there is no match, it’s an easy way to validate that the character entered is valid.

According to the documentation in the CoCo 3 BASIC manual, the full syntax is this:

INSTR(start-position, search-string, target-string)

You can use the optional start-position to begin scanning later in the string. For instance:

PRINT INSTR(3, "ABCDEF", "A")

That would print 0 since we are searching for “A” in the string “ABCDEF” starting at the third character (so, searching “CDEF”).

The manual also notes conditions where a 0 can be returned:

  • The start-position is greater than the number of characters in the search-string: INSTR(4, “ABC”, “A”)
  • The search-string is null: INSTR(“”, “A”)
  • It cannot find the target: INSTR(“ABC”, “Z”)
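The start-position and the "returns 0" rules combine nicely when you want every occurrence of something, not just the first. A minimal sketch, with a made-up test string:

```basic
10 REM LIST EVERY POSITION OF "A" IN A STRING
20 S$="BANANA":P=0
30 P=INSTR(P+1,S$,"A")
40 IF P>0 THEN PRINT P:GOTO 30
```

This should print 2, 4 and 6. The last call starts at position 7, which is past the end of the string, so INSTR returns 0 and the loop stops.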

I was surprised today to (re)discover that INSTR considers a null (empty) string to be a match, sorta:

PRINT INSTR("ABC", "")
1

If the target-string is empty, it returns the current search position (which starts at 1, for the first character). This seems like a bug to me, but this behavior is the same in later, more advanced Microsoft BASICs.
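To see how the start-position interacts with an empty target, a short test like this can be run. Going by the description above, each line would be expected to print its own start-position, but that is an assumption worth checking on real hardware or an emulator.

```basic
10 REM EMPTY TARGET WITH DIFFERENT START POSITIONS
20 PRINT INSTR(1,"ABC","")
30 PRINT INSTR(2,"ABC","")
40 PRINT INSTR(3,"ABC","")
```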

I bring this up now because I was almost going to show you something really clever. Normally, I use INSTR with a string I get back from INKEY$. But, you can also use INKEY$ directly. And, since ON GOTO/GOSUB won’t go anywhere if the value is 0, I thought it might be clever to use it like this:

10 ON INSTR("ABCD",INKEY$) GOTO 100,200,300,400

…and this is smaller and much faster and works great … if there is a key waiting! If no key is waiting, INKEY$ returns a null (“”) and … INSTR returns a 1, and then ON GOTO goes to 100 even though that was not the intent.

Darnit. I thought I had a great way to speed things up. Consider the speed of this version:

INSTR example … workaround for not using a variable.

I thought by replacing line 30 with…

ON INSTR("ABC",INKEY$)GOSUB70,80,90

…I would be set. But, since an empty INKEY$ is returning “”, it’s always GOSUBing to line 70.

I tried a hacky workaround, by adding a bogus character to the start of the string, and making that GOSUB to a RETURN located real close to that code (so it didn’t have to search as far to find it):

INSTR example.

…but, the overhead of that extra GOSUB/RETURN that happens EVERY TIME there is no key waiting was enough to make it slightly slower. If it wasn’t for that, we could do this maybe 30% faster and use fewer variables :)

So, unfortunately, I guess I have no optimization to show you… Just a failed attempt at one.

But wait, there’s more!

I posted about this on the CoCo mailing list and in the CoCo Facebook group to figure out if this behavior was a bug. There were several responses confirming this behavior in other versions of BASIC and languages.

On the list, Robert Hermanek responded with a great solution:

Issue is just getting a return of 1 when searching for empty string? Then why not:

10 ON INSTR(" ABC",INKEY$) GOTO 10,100,200,300

…notice the space before A.
-RobertH

His brilliant suggestion works by adding a bogus character to the search-string, and making any match of that string (or “”) GOTO the same line. Thus, problem solved!

This won’t work with ON GOSUB since every GOSUB expects a RETURN. Each time you use GOSUB, it takes up seven bytes (?) of memory to remember where to RETURN to in the program. If you run something like this, you’ll see free memory shrink with every GOSUB that never RETURNs:

10 PRINT MEM:GOSUB 10

I made a quick change to my program to use GOTO instead:

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 ON INSTR(" ABC",INKEY$) GOTO 40,70,80,90
40 NEXT
50 PRINT TIMER-TM:END
70 PRINT"A PRESSED":GOTO 40
80 PRINT"B PRESSED":GOTO 40
90 PRINT"C PRESSED":GOTO 40

For my timing test inside the FOR/NEXT loop, I made the first GOTO point to the NEXT in line 40, but if I wanted to wait “forever” until a valid key was pressed, I would make that 30.

This version shows a time of 412, so let’s compare that to doing it with A$:

10 TIMER=0:TM=TIMER
20 FORA=1 TO 1000
30 A$=INKEY$:IFA$="" THEN 40 ELSE ON INSTR("ABC",A$) GOTO 70,80,90
40 NEXT
50 PRINT TIMER-TM:END
70 PRINT"A PRESSED":GOTO 40
80 PRINT"B PRESSED":GOTO 40
90 PRINT"C PRESSED":GOTO 40

This produces 486. We now have a way to avoid using A$ and speed up code just a bit.

This made me wonder … what is faster? Using GOTO, or doing a GOSUB/RETURN? Let’s try to predict…

Both GOTO and GOSUB will have to take time to scan through the program to find the destination line, but GOSUB will also have to take time to store the “where are we” return location so RETURN can get back there. This makes me think GOSUB will be slower.

BUT, we need a GOTO to return from a GOTO, and a RETURN to return from a GOSUB. GOTO always has to scan through the program line by line to find the destination, while RETURN just jumps back to a location that was saved by GOSUB. So, if we have to scan through many lines, the return GOTO is probably slower than a RETURN.

Let’s try.

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 GOSUB 1000
40 NEXT
50 PRINT TIMER-TM:END
1000 RETURN

That prints 140.

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 GOTO 1000
40 NEXT
50 PRINT TIMER-TM:END
1000 GOTO 40

That prints 150.

I expect the GOTO version is slower because line 1000 says “GOTO 40” and, since 40 is lower than 1000, BASIC has to start at the top and go line by line looking for 40. If you GOTO a higher line number, it starts from the current line and moves forward. The speed of GOTO (and GOSUB) varies based on where the lines are:

GOTO should be quick when going to a line right after it:

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 GOTO 31
31 GOTO 32
32 GOTO 33
40 NEXT
50 PRINT TIMER-TM:END

That prints 170.

Line 30 has to scan one line down to find 31, then 31 scans one line down to find 32, and 32 scans one line down to find 40.

But if you change the order of the GOTOs:

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 GOTO 32
31 GOTO 40
32 GOTO 31
40 NEXT
50 PRINT TIMER-TM:END

…that prints 175.

It is the same number of GOTOs, but line 30 scans two lines ahead to find 32, then 32 has to start at the top and scan four lines in to find line 31, then line 31 has to scan two lines ahead to find 40.

If we add more lines (even REMs), more things have to be scanned:

0 REM
1 REM
2 REM
3 REM
4 REM
5 REM
6 REM
7 REM
8 REM
9 REM
10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 GOTO 32
31 GOTO 40
32 GOTO 31
40 NEXT
50 PRINT TIMER-TM:END

We are now up to 192 and the one with the lines in order is still 170.

The more lines GOTO (or GOSUB) has to search, the slower it gets. So while MAYBE there might be a case where a GOTO could be quicker than a RETURN, it seems that even with a tiny program, GOSUB wins.
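One way to act on this: since a backward GOTO or GOSUB searches from the top of the program, a routine that is called often can live at the very top, with line 0 jumping over it. Here is a minimal sketch of that layout. The line numbers are arbitrary and no timing is claimed for it.

```basic
0 GOTO 100
10 RETURN:REM OFTEN-USED ROUTINE LIVES NEAR THE TOP
100 TIMER=0
110 FOR A=1 TO 1000
120 GOSUB 10
130 NEXT
140 PRINT TIMER
```

The GOSUB in line 120 only has to look past line 0 to find line 10, no matter how long the rest of the program grows.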

So … the question is, is the time we save doing the INKEY like this:

30 ON INSTR(" ABC",INKEY$) GOTO 40,70,80,90

…going to offset the time we lose because those functions all have to GOTO back, rather than using a RETURN?

If this was a normal “wait for a keypress” then it probably wouldn’t matter much. We are just waiting, so there is no time to save by making that faster.

If we were reading keys for an action game, the actual “is there a keypress?” code would be faster, giving more time for the actual program. But, every time a key was pressed, the time taken to get in and out of that code would be slower. I guess it depends on how often the key is pressed.

A game like Frogger, where a key would be pressed every time the frog jumps to the next spot, might be worse than a game like Pac-Man where you press a direction and then don’t press anything again until the character is at the next intersection to turn down.

I am not sure how I would benchmark that, yet, but let’s try this… We’ll modify the code so “” actually is honored as an action:

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 ON INSTR(" UDLR",INKEY$) GOTO 100,200,300,400,500
40 NEXT
50 PRINT TIMER-TM:END
100 REM IDLE LOOP
110 GOTO 40
200 REM MOVE UP
210 GOTO 40
300 REM MOVE DOWN
310 GOTO 40
400 REM MOVE LEFT
410 GOTO 40
500 REM MOVE RIGHT
510 GOTO 40

Now if no key is waiting (“”), INSTR will return a 1 causing the code to GOTO 100 where the background (keep objects moving, animate stars, etc.) action would happen. Any other value would go to the handler for up, down, left or right.

This prints 457. Doing the same thing with GOSUB:

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 ON INSTR(" UDLR",INKEY$) GOSUB 100,200,300,400,500
40 NEXT
50 PRINT TIMER-TM:END
100 REM IDLE LOOP
110 RETURN
200 REM MOVE UP
210 RETURN
300 REM MOVE DOWN
310 RETURN
400 REM MOVE LEFT
410 RETURN
500 REM MOVE RIGHT
510 RETURN

…prints 487.  It appears GOSUB/RETURN is slower for us than GOTO here. But why? GOSUB seemed faster in the first example. Is ON GOSUB slower than ON GOTO?

Quick side test:

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 ON 1 GOSUB 1000
40 NEXT
50 PRINT TIMER-TM:END
1000 RETURN

That prints 249. GOSUB by itself was 140, so ON GOSUB is much slower.

10 TIMER=0:TM=TIMER
20 FOR A=1 TO 1000
30 ON 1 GOTO 1000
40 NEXT
50 PRINT TIMER-TM:END
1000 GOTO 40

…also prints 249, and GOTO by itself was 150. I am a bit surprised by this. I will have to look into this further for an explanation.

But I digress…

We can still slow down GOTO. If we had a bunch of extra lines for GOTO to have to scan through:

0 REM
1 REM
2 REM
3 REM
4 REM
5 REM
6 REM
7 REM
8 REM
9 REM

…that will slow down every GOTO to an earlier number. With those REMs added, we have:

GOTO/GOTO version with REMs: 473 (up from 457 without REMs)

GOSUB/RETURN version with REMs: 487 (it never passes through the REMs)

It appears that, while GOSUB/RETURN may be faster on its own, when I put it in this test program, GOTO/GOTO is slightly faster, but that can change depending on how big the program is. More research is needed…

So I guess, for now, I’m going to avoid using a variable for INKEY$ and use GOTO/GOTO for my BASIC game…

Until next time…

Optimizing Color BASIC, part 3

See also: Part 1 and Part 2.

Updates:

  • 2/14/2017 – Added some section headers, bolded benchmark values.

Since I am right in the middle of a multi-part article on interfacing assembly with BASIC, now is a great time to discuss something completely different.

INPUT, INKEY, INSTR, and POKE

The reason I do this now is that it ties in with the next part of the assembly article. Since I have been discussing using assembly to speed things up, it is a good time to address a few more things that can be done to speed up BASIC before resorting to 6809 code. Since BASIC will be the weakest link, we should try to make it as strong a weak link as possible.

In 1980, Color BASIC offered a simple way to input a string or number:

10 INPUT "WHAT IS YOUR NAME";A$
20 PRINT "HELLO, ";A$
30 INPUT "HOW OLD ARE YOU";A
40 PRINT A;"IS PRETTY OLD."

My First Input

The original INPUT command was a very simple way to get data in to a program, but it was quite limited. It didn’t allow for string input containing commas, for instance, unless you typed it in quotes:

INPUT hates commas.

INPUT likes quotes.

INPUT also prints the question mark prompt.

LINE INPUT

When Extended Color BASIC was introduced, it brought many new features including the LINE INPUT command. This command did not force the question mark prompt, and would accept commas without requiring quotes:

LINE INPUT likes everything.

If you were trying to write a text based program (or a text adventure game), INPUT or LINE INPUT would be fine.

INKEY$

For times when you just wanted to get one character, without requiring the characters to be echoed as the user types them, and without requiring the user to press ENTER, there was INKEY$.

INKEY$ returns whatever key is being pressed, or nothing (“”) if no key is ready.

10 PRINT "PRESS ANY KEY TO CONTINUE..."
20 IF INKEY$="" THEN 20
30 PRINT "THANK YOU."

It can also be used with a variable:

10 PRINT "ARE YOU READY? ";
20 A$=INKEY$:IF A$="" THEN 20
30 IF A$="Y" THEN PRINT "GOOD!" ELSE PRINT "BAD."

This is the method we might use for a keyboard-controlled BASIC video game. For instance, if we want to read the arrow keys (up, down, left and right), each one of those keys generates an ASCII character when pressed:

UP    - CHR$(94) - ^ character
DOWN  - CHR$(10) - line feed
LEFT  - CHR$(8)  - backspace
RIGHT - CHR$(9)  - tab
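The codes above can be checked (and the codes for other keys discovered) with a tiny program that prints the ASCII value of whatever key is pressed. This is just a sketch; press BREAK to stop it.

```basic
10 REM SHOW THE CODE OF EACH KEY PRESSED
20 A$=INKEY$:IF A$="" THEN 20
30 PRINT ASC(A$)
40 GOTO 20
```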

Knowing this, we can detect arrow keys using INKEY$:

10 CLS:P=256+16
20 PRINT@P,"*";
30 A$=INKEY$:IF A$="" THEN 30
40 IF A$=CHR$(94) AND P>31 THEN P=P-32
50 IF A$=CHR$(10) AND P<479 THEN P=P+32
60 IF A$=CHR$(8) AND P>0 THEN P=P-1
70 IF A$=CHR$(9) AND P<510 THEN P=P+1
80 GOTO 20

The above program uses PRINT@ to print an asterisk (“*”) in the middle of the screen. Then, it waits until a key is pressed (line 30). Once a key is pressed, it looks at which key it was (up, down, left or right) and then will move the position of the asterisk (assuming it’s not going off the end of the screen).

Side Note: The CoCo’s text screen is 32×16 (512 characters). PRINT@ can print at 0-511, but if you print to 511 (the bottom right location), the screen will scroll up. I have adjusted this code to disallow moving the asterisk to that location.
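Since the screen is 32 characters wide, a PRINT@ position is just row times 32 plus column. A small sketch of going both ways (R, C, R2 and C2 are names picked for this example):

```basic
10 REM PRINT@ POSITION FROM ROW (0-15) AND COLUMN (0-31)
20 R=8:C=16
30 P=R*32+C
40 CLS:PRINT@P,"*";
50 REM AND BACK AGAIN
60 R2=INT(P/32):C2=P-R2*32
70 PRINT@0,P;R2;C2
```

With row 8 and column 16 this gives position 272, the same 256+16 used as the starting point in the program above, and converting back gives 8 and 16 again.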

You now have a really crappy text drawing program. To make it less crappy, you could check for other keys to change the character that is being drawn, or make it use simple color graphics:

10 CLS0:X=32:Y=16:C=0
20 SET(X,Y,C)
30 A$=INKEY$:IF A$="" THEN 30
40 IF A$=CHR$(94) AND Y>0 THEN Y=Y-1
50 IF A$=CHR$(10) AND Y<31 THEN Y=Y+1
60 IF A$=CHR$(8) AND X>0 THEN X=X-1
70 IF A$=CHR$(9) AND X<63 THEN X=X+1
80 IF A$="C" THEN C=C+1:IF C>8 THEN C=0
90 GOTO 20

That program uses the primitive SET command to draw in beautiful 64×32 resolution with eight colors. The arrow keys move the pixel, and pressing C toggles through the colors. Spiffy!

Color graphics from 1980!

Instead of using “C” to just cycle through the colors, you could check the character returned and see if it was between “0” and “8” and use that value to set the color (0=RESET pixel, 1-8=SET pixel to color).

10 CLS0:X=32:Y=16:C=1
20 IF C=0 THEN RESET(X,Y) ELSE SET(X,Y,C)
30 A$=INKEY$:IF A$="" THEN 30
40 IF A$=CHR$(94) AND Y>0 THEN Y=Y-1
50 IF A$=CHR$(10) AND Y<31 THEN Y=Y+1
60 IF A$=CHR$(8) AND X>0 THEN X=X-1
70 IF A$=CHR$(9) AND X<63 THEN X=X+1
80 IF A$=>"0" AND A$<="8" THEN C=ASC(A$)-48
90 GOTO 20

Now that we have refreshed our 1980 BASIC programming, let’s look at lines 40-70 which are used to determine which key has been pressed.

If we are going to be reading the keyboard over and over for an action game, doing so with a bunch of IF/THEN statements is not very efficient. Let’s do some tests to find out how not very efficient it is.

For our example, we would be using INKEY$ to read a keypress, then GOSUBing to four different subroutines to handle up, down, left and right actions. To see how fast this is, we will once again use the TIMER command and do our test 1000 times. We’ll skip doing the actual INKEY$ for now, and hard code a keypress. Since we will be checking for keys in the order of up, down, left then right, we will simulate pressing the last key checked, right, to get the worst possible condition.

Here is version 1 that does a brute-force check using IF/THEN/ELSE.

0 REM KEYBD1.BAS
10 TM=TIMER:FORA=1TO1000
15 A$=CHR$(9)
20 REM A$=INKEY$:IFA$=""THEN20
30 IFA$=CHR$(94)THENGOSUB100ELSEIFA$=CHR$(10)THENGOSUB200ELSEIFA$=CHR$(8)THENGOSUB300ELSEIFA$=CHR$(9)THENGOSUB400
50 NEXT:PRINT TIMER-TM
60 END
100 RETURN
200 RETURN
300 RETURN
400 RETURN

When I run this in the XRoar emulator, I get back 1821. That is now our benchmark to beat.

INSTR

Rather than doing a bunch of IF/THENs, if we are using Extended Color BASIC, there is the INSTR command. It will take a string and a pattern, and return the position of that pattern in the string. For example:

PRINT INSTR("CAT DOG RAT", "DOG")

If you run this line, it will print 5. The string “DOG” appears in “CAT DOG RAT” starting at position 5. You can use INSTR to parse single characters, too:

PRINT INSTR("ABCDEFGHIJ", "F")

This will print 6, because “F” is found in the search string starting at position 6.

If the pattern is not found, it returns 0. Using this, you can parse a string containing all the possible keypress options, and turn them in to a number. You could then use that number in an ON GOTO/GOSUB statement, like this:

0 REM KEYBD2.BAS
10 A$=INKEY$:IF A$="" THEN 10
20 A=INSTR("ABCD",A$)
30 IF A=0 THEN 10
40 ON A GOSUB 100,200,300,400
50 GOTO 10
100 PRINT "A WAS PRESSED":RETURN
200 PRINT "B WAS PRESSED":RETURN
300 PRINT "C WAS PRESSED":RETURN
400 PRINT "D WAS PRESSED":RETURN

A long line of four IF/THEN/ELSE statements is now replaced by INSTR and ON GOTO/GOSUB.

Let’s rewrite our test program slightly, this time using INSTR:

10 TM=TIMER:FORA=1TO1000
15 A$=CHR$(9)
20 REM A$=INKEY$:IFA$=""THEN20
30 LN=INSTR(CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9),A$)
35 IF LN=0 THEN 20
40 ONLN GOSUB100,200,300,400
50 NEXT:PRINT TIMER-TM
60 END
100 RETURN
200 RETURN
300 RETURN
400 RETURN

Running this gives me 1724. We are now slightly faster.

We can do better.

One of the reasons this version is so slow is line 30. Every time that line is processed, BASIC has to dynamically build a string containing the four target characters — CHR$(94), CHR$(10), CHR$(8) and CHR$(9). String manipulation in BASIC is slow, and we really don’t need to do it every time. Instead, let’s try a version 3 where we create a string containing those characters at the start, and just use the string later:

0 REM KEYBD3.BAS
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 TM=TIMER:FORA=1TO1000
15 A$=CHR$(9)
20 REM A$=INKEY$:IFA$=""THEN20
30 LN=INSTR(KB$,A$)
35 IF LN=0 THEN 20
40 ONLN GOSUB100,200,300,400
50 NEXT:PRINT TIMER-TM
60 END
100 RETURN
200 RETURN
300 RETURN
400 RETURN

Running this gives me 902! It appears to be twice as fast as the original IF/THEN version!

Speed comparisons…

Now we have a much faster way to handle the arrow keys. Let’s go back to the original program and update it:

0 REM INKEY3.BAS
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=256+16
20 PRINT@P,"*";
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$)
50 ONLN GOSUB100,200,300,400
60 GOTO 20
100 IF P>31 THEN P=P-32
110 RETURN
200 IF P<479 THEN P=P+32:RETURN
210 RETURN
300 IF P>0 THEN P=P-1
310 RETURN
400 IF P<510 THEN P=P+1
410 RETURN

Side Note: Using GOSUB/RETURN may be slower than using GOTO, but that will be the subject of another installment.

Now that we have a faster keyboard input routine, let’s do one more thing to try to speed it up.

POKE

We are currently using PRINT@ to print a character on the screen in positions 0-510 (remember, we can’t print to the bottom right position because that will make the screen scroll). Instead of using PRINT, we can also use POKE to put a byte directly in to screen memory:

POKE location,value

Location is an address in the up-to-64K memory space (0-65535) and value is an 8-bit value (0-255).

Let’s see if it’s faster.

First, PRINT@ wants positions 0-511, and POKE wants an actual memory address. The 32 column screen is located from 1024-1535 in memory, so PRINT@0 is like POKE 1024. PRINT@511 is like POKE 1535. Let’s make some changes:
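The relationship between the two can be checked by printing a character with PRINT@ and then reading the same spot back with PEEK. This sketch assumes the default 32 column text screen at 1024; the variable names are only for illustration.

```basic
10 REM PRINT@ POSITION P IS SCREEN ADDRESS 1024+P
20 CLS:P=272
30 PRINT@P,"*";
40 V=PEEK(1024+P)
50 PRINT@0,"BYTE AT";1024+P;"IS";V
```

The value shown is the byte to POKE if you want the same character to appear, which is presumably where a number like 106 for the asterisk comes from.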

0 REM INKEY4.BAS
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=1024+256+16
20 POKE P,106
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$)
50 ONLN GOSUB100,200,300,400
60 GOTO 20
100 IF P>1024+31 THEN P=P-32
110 RETURN
200 IF P<1024+479 THEN P=P+32:RETURN
210 RETURN
300 IF P>1024+0 THEN P=P-1
310 RETURN
400 IF P<1024+511 THEN P=P+1
410 RETURN

WARNING: While PRINT@ is safe (a bad value just generates an error), POKE is dangerous! If you POKE the wrong value, you could crash the computer. Instead of POKEing a character on the screen, you could accidentally POKE to memory that could crash the system.

This program will behave identically to the original, BUT since we are using POKE, we can now go all the way to the bottom right of the screen :) That is just one of the reasons we might use POKE over PRINT@.

But is it faster, or slower? Let’s find out…

10 TM=TIMER:FORA=1TO1000
20 PRINT@0,"*";
30 NEXT:PRINT TIMER-TM

…versus…

10 TM=TIMER:FORA=1TO1000
20 POKE1024,106
30 NEXT:PRINT TIMER-TM

The PRINT@ version shows 259, and the POKE version shows 655. POKE appears to be significantly slower. Some reasons could be:

  1. POKE has to translate four digits (1024) instead of just one (0) so that’s longer to parse.
  2. POKE has to also translate the value (106) where PRINT can probably just jump to the string that is in the quotes.

Let’s try to test this… By giving PRINT@ a three digit number, 510, it slows down from 259 to 424. Parsing that number is definitely part of the problem. Let’s eliminate the number parsing completely by using variables:

5 P=0
10 TM=TIMER:FORA=1TO1000
20 PRINT@P,"*";
30 NEXT:PRINT TIMER-TM

This gives us 229, so it’s a bit faster than the original. Now let’s try the POKE version:

5 P=1024
10 TM=TIMER:FORA=1TO1000
20 POKEP,106
30 NEXT:PRINT TIMER-TM

This gives us 400, so it’s faster than the original 655, but still nearly twice as slow as using PRINT@. But wait, there’s still that 106 value. Let’s replace that with a variable, too.

5 P=1024:V=106
10 TM=TIMER:FORA=1TO1000
20 POKEP,V
30 NEXT:PRINT TIMER-TM

This gives us 234, down from 400! We are now almost as fast as PRINT@ (229)! But now the POKE version has to look up two variables, while the PRINT@ version only looks up one, so that might give PRINT@ an advantage. Let’s test this by making the PRINT@ version also use a variable for the character:

5 P=0:V$="*"
10 TM=TIMER:FORA=1TO1000
20 PRINT@P,V$;
30 NEXT:PRINT TIMER-TM

That slows it down to 231. This seems to indicate the speed difference is really not between PRINT@ and POKE, but between how much number conversion or variable lookup each needs to do. You can use PRINT@ without having to look up the string to print (“*”), but POKE always has to either convert a numeric value (106) or do a variable lookup (V).

So why bother with POKE if the only advantage, so far, is that you can POKE to the bottom right character on the screen?

Because PEEK.

PEEK lets us see what byte is at a specified memory location. If I were writing a game and wanted to tell if the player’s character ran in to an enemy, I’d have to compare the player position (X/Y address, or PRINT@ location) with the locations of all the other objects. The more objects you have, the more compares you have to do and the slower your program becomes.

For example, here’s a simple game where the player (“*”) has to avoid four different enemies (“X”):

0 REM GAME.BAS
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=256+16
15 E1=32:E2=63:E3=448:E4=479
20 PRINT@P,"*";:PRINT@E1,"X";:PRINT@E2,"X";:PRINT@E3,"X";:PRINT@E4,"X";
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$):IF LN=0 THEN 30
45 PRINT@P," ";
50 ONLN GOSUB100,200,300,400
60 IF P=E1 OR P=E2 OR P=E3 OR P=E4 THEN 90
80 GOTO 20
90 PRINT@267,"GAME OVER!":END
100 IF P>31 THEN P=P-32
110 RETURN
200 IF P<479 THEN P=P+32:RETURN
210 RETURN
300 IF P>0 THEN P=P-1
310 RETURN
400 IF P<510 THEN P=P+1
410 RETURN

In this example, the four enemies (“X”) remain static in the corners, but if you move your player (“*”) in to one, the game will end.

It’s not much of a game, but with a few more lines you could make the enemies move around randomly or chase the player.

Take a look at line 60. Every move we have to compare the position of the player with four different enemies. This is a rather brute-force check. We could also use an array for the enemies. Not only would this simplify our code, but it would make the number of enemies dynamic.

Here is a version that lets you have as many enemies as you want. Just set the value of EN in line number 1 to the number of enemies-1.

Side Note: Arrays are base-0, so if you DIM A(10) you get 11 elements — A(0) through A(10). Thus, if you want ten elements in an array, you would do DIM A(9), and cycle through them using base-0 like FOR I=0 TO 9:PRINT A(I):NEXT I.

0 REM GAME2.BAS
1 EN=10-1 'ENEMIES
2 DIM E(EN)
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=256+16
15 FOR A=0 TO EN:E(A)=RND(510):NEXT
20 PRINT@P,"*";
25 FOR A=0 TO EN:PRINT@E(A),"X";:NEXT
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$):IF LN=0 THEN 30
45 PRINT@P," ";
50 ONLN GOSUB100,200,300,400
60 FOR A=0 TO EN:IF P=E(A) THEN 90 ELSE NEXT
80 GOTO 20
90 PRINT@267,"GAME OVER!":END
100 IF P>31 THEN P=P-32
110 RETURN
200 IF P<479 THEN P=P+32:RETURN
210 RETURN
300 IF P>0 THEN P=P-1
310 RETURN
400 IF P<510 THEN P=P+1
410 RETURN

In 1980, this was a game.

Now the program is more flexible, but it has gotten slower. After every move, the code must now compare locations of every enemy. This limits BASIC from being able to do a fast game with a ton of objects.

Which brings me back to PEEK… Instead of comparing the player against every enemy, all we really need to know is if the location of the player is where an enemy is. If we are using POKE to put the player on the screen, we know the location the player is, and can just PEEK that location to see if anything is there.

Let’s change the program to use POKE and PEEK:

0 REM GAME2.BAS
1 EN=10-1 'ENEMIES
2 DIM E(EN)
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=1024+256+16:V=106:VS=96:VE=88
15 FOR A=0 TO EN:E(A)=1024+RND(511):NEXT
20 POKEP,V
25 FOR A=0 TO EN:POKEE(A),VE:NEXT
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$):IF LN=0 THEN 30
45 POKEP,VS
50 ONLN GOSUB100,200,300,400
60 IF PEEK(P)=VE THEN 90
80 GOTO 20
90 PRINT@267,"GAME OVER!":END
100 IF P>1024+31 THEN P=P-32
110 RETURN
200 IF P<1024+479 THEN P=P+32:RETURN
210 RETURN
300 IF P>1024+0 THEN P=P-1
310 RETURN
400 IF P<1024+510 THEN P=P+1
410 RETURN

Now, instead of looping through an array containing the locations of all the enemies, we simply PEEK to our new player location and if the byte there is our enemy value (VE, character 88), game over.

It should be a bit faster now.

We should also change the “1024+XX” things to the actual values to avoid doing math each time, but I was being lazy.
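For the less lazy, here is a sketch of what the four movement routines might look like with the additions folded in by hand (1024+31=1055, 1024+479=1503, 1024+0=1024, 1024+510=1534). These lines would replace lines 100 through 410 of the listing above. The redundant RETURN on the end of line 200 is also dropped here, since line 210 already handles it.

```basic
100 IF P>1055 THEN P=P-32
110 RETURN
200 IF P<1503 THEN P=P+32
210 RETURN
300 IF P>1024 THEN P=P-1
310 RETURN
400 IF P<1534 THEN P=P+1
410 RETURN
```

Line 10 could get the same treatment, but it only runs once, so there is little to gain there.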

Now we know a way to improve the speed of reading key presses, and ways to use POKE/PEEK to avoid having to do manual comparisons of object locations. Maybe this will come in handy someday when you need to write a game where an asterisk is being chased by a bunch of Xs.

Until next time…