Optimizing Color BASIC, part 8

See also: part 1, part 2, part 3, part 4, part 5, part 6, part 7, part 8 and part 9.

Arrays and Variable Length

In part 3, I demonstrated a simple “game” where you could move a character around the screen and try to avoid running in to enemies. The original version hard coded four enemies, each with their own variable. It looked like this:

0 REM GAME.BAS
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=256+16
15 E1=32:E2=63:E3=448:E4=479
20 PRINT@P,"*";:PRINT@E1,"X";:PRINT@E2,"X";:PRINT@E3,"X";:PRINT@E4,"X";
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$):IF LN=0 THEN 30
45 PRINT@P," ";
50 ONLN GOSUB100,200,300,400
60 IF P=E1 OR P=E2 OR P=E3 OR P=E4 THEN 90
80 GOTO 20
90 PRINT@267,"GAME OVER!":END
100 IF P>31 THEN P=P-32
110 RETURN
200 IF P<479 THEN P=P+32:RETURN
210 RETURN
300 IF P>0 THEN P=P-1
310 RETURN
400 IF P<510 THEN P=P+1
410 RETURN

I then modified it to use an array for the enemy variables, so less code was needed to cycle through them, while also allowing easy changing of the amount of enemies. It looked like this:

0 REM GAME2.BAS
1 EN=10-1 'ENEMIES
2 DIM E(EN)
5 KB$=CHR$(94)+CHR$(10)+CHR$(8)+CHR$(9)
10 CLS:P=256+16
15 FOR A=0 TO EN:E(A)=RND(510):NEXT
20 PRINT@P,"*";
25 FOR A=0 TO EN:PRINT@E(A),"X";:NEXT
30 A$=INKEY$:IF A$="" THEN 30
40 LN=INSTR(KB$,A$):IF LN=0 THEN 30
45 PRINT@P," ";
50 ONLN GOSUB100,200,300,400
60 FOR A=0 TO EN:IF P=E(A) THEN 90 ELSE NEXT
80 GOTO 20
90 PRINT@267,"GAME OVER!":END
100 IF P>31 THEN P=P-32
110 RETURN
200 IF P<479 THEN P=P+32:RETURN
210 RETURN
300 IF P>0 THEN P=P-1
310 RETURN
400 IF P<510 THEN P=P+1
410 RETURN

Arrays are an easy way to reduce code. What I did not realize is how slow they are! Consider this example:

4 DIM E1,E2,E3,E4
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 E1=1:E2=1:E3=1:E4=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

In the test loop we simply set four different variables. Notice that the ones we use most often (inside the test loop) I have declared first in line 4. This makes them faster, since they are found earlier when BASIC looks up the variables.

This results in 576.

Instead of using four separate variables, we could use an array of four elements:

4 DIM E(3)
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 FORC=0TO3:E(C)=1:NEXT
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

Although this simplifies the code, and allows us to easily change it from 3 to more variables, it adds another FOR/NEXT loop.

The speed drops to 1408! And that’s using a one-character variable name. It might be a tad slower if I had chosen “EN” for the array or something two-characters to match the original.

It seems that, if you can get away with it, manually handling separate variables is much faster.

Arrays will win for code size, but lose for speed. I probably won’t want to use arrays to track the enemies in my BASIC arcade action games.

Can we make arrays faster? Let’s remove the FOR/NEXT loop and access them manually, so it’s as direct a compare to non-arrays as we can get:

4 DIM E(3)
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 E(0)=1:E(1)=1:E(2)=1:E(3)=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

Now we are doing “E(0)=1” versus “E1=1”. Arrays should still be slower because there are more characters to parse, and an array lookup as to be done.

This produces 1021. It appears that setting the variable takes about a third of the time, looking up the array another third, and the FOR/NEXT loop a third third. Or something.

As brute-force as it looks, it appears that is the faster way for handling variables even though it produces more code.

And speaking of more code…

Variable Length

One character variable names can get confusing real quick, but the two-character limit in Color BASIC isn’t much better. Well, you can specify longer variable names, but only the first two characters are honored:

USERNUM=1

…turns in to…

US=1

If you could make sure the first two characters were unique, you could make your program more readable, but you would be wasting speed and code size.

Consider this benchmark example, which uses a 10-character variable:

0 REM VARLEN.BAS '211
4 DIM USERCOUNT
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 USERCOUNT=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

This produces 211. It’s wasteful, since BASIC is only honoring the first two characters. Let’s try this:

0 REM VARLEN.BAS '182
4 DIM US
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 US=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

This produces 182. All those extra useless characters did nothing but slow things down a tad.

But using a one-character variable is the fastest we can get:

0 REM VARLEN.BAS '177
4 DIM U
5 DIM TE,TM,B,A,TT
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 U=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

This produces 177.

For the most-used variables, use a one-character variable name for the best speed.

And remember, every time a variable is references, BASIC has to start with the first variable it knows about and walk through all of them until it finds a match. That is the purpose of the DIM statements in lines 4 and 4. They are declaring variables in the priority of most-used to least, sorta. The variable used in the inner FOR/NEXT loop is U, so I declare it first.

If I had done U last:

0 REM VARLEN.BAS '177
5 DIM TE,TM,B,A,TT
6 DIM U
10 FORA=0TO4:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 U=1
70 NEXT:TE=TIMER-TM
80 TT=TT+TE:PRINTA,TE
90 NEXT:PRINTTT/A:END

This slows it down to 188.

Putting it all together: Avoid arrays, use one-character variable names, and variables you want to be the fastest should be declared earlier.

Just an FYI.

19 thoughts on “Optimizing Color BASIC, part 8

  1. William Astle

    It actually follows that array accesses would be slower. Unsurprisingly, it has to look up the array itself, though that isn’t particularly slow since arrays have their own table that lives just after the regular variable table. Of course, if you have multiple arrays, the first array defined will be found quickly than the tenth since it’s still a linear scan of the array table.

    Incidentally, it’s better to allocate all your regular variables and *then* dimension any arrays for just that reason – the arrays have to be relocated any time a new scalar variable is allocated. That can substantially slow down variable allocation. Thus, you should have your array defining DIM appear after the scalar defining DIM.

    Anyway, in addition to just looking up the array, the subscript has to be calculated which involves a trip through the expression evaluator which could lead to any number of things like variable lookups, etc, but which will at a minimum be either a variable lookup or a constant conversion. Then the subscript is converted to an integer and then bounds checked.

    Once the subscript is determined, it is a simple calculation to get the offset into the array data to find the actual value, but it is a calcuation that is not required for scalars.

    Note that for multidimensional arrays, you get to repeat the subscript lookup mutlipe times so they will be correspondingly slower. Theoretically you could have up to 255 dimensions though you would be hard pressed to actually do anything useful with that.

    If you think about it, array type accesses are slower compared to a direct variable access even in low level languages like assembly language. It’s unavoidable since additional work is required to do the offsetting into the data.

    Reply
  2. Johann Klasek

    Just as side note, a story about array usage on a BASIC dialect with same ancestry, Commodore BASIC V2 on a C64. For a program doing Hires graphics only in plain BASIC we tried to accelerate the pixel access by using a integer array (16 bit value, alas, not implemented in ColorBASIC). Some would expect, the clumsy address calculation into the bitmap could be boosted using array access, but this failed. It was slower! It seemed elegant, but the index calculation (expression evaluation as mentioned above) took its toll, beside the fact that the adjustment calculation to signed 16 bit value to pixel location was clumsy too.

    Reply
          1. Johann Klasek

            No, the BASIC timer is alway 60 Hz even on PAL machines. It is based on the hardware timer which starting value is set accordingly to the base frequency of the system. It’s not derived from the raster interrupt.

          2. Allen Huffman Post author

            +1 for Commodore! I think the CoCo’s timing is based on the screen refresh, so programs that relied on it would have to ask the user if they were playing on a 50Hz or 60Hz machine :)

            Do you know of an online C= emulator? I think I found a C64 JavaScript one, and I’d like to find a VIC-20 one as well.

          3. Johann Klasek

            > +1 for Commodore! I think the CoCo’s timing is based on the screen refresh, so programs that relied on it would have to ask the user if they were playing on a 50Hz or 60Hz machine :)

            Hmm, just partially +1, let’s say better +0.5, because on a C128 with BASIC 7.0 additional sound(!) and graphic commands which are timing related are based on the raster interrupt timing. Not a bad idea for graphics, but sound!? Therefore, BASIC 7.0 programs has to ported or carefully coded to support NTSC and/or PAL … :(

            > Do you know of an online C= emulator? I think I found a C64 JavaScript one, and I’d like to find a VIC-20 one as well.

            Just know of one in Java, see http://www.z64k.com ….

          4. Johann Klasek

            Raster interrupt timing hits graphics in multiple ways. First, the mixed modes (graphic and textmode on the same screen) depends on raster timing. Second, controlling sprites with command MOVSPR allows you to let automatically move a sprite over the screen with a certain direction (angle) and speed. The latter is based on raster interrupt’s screen refresh rate. Beside this, the official commodore manuals are horrible, containing many more or less severe errors or just missing information regarding this issue (at least in the german translation).

          5. Allen Huffman Post author

            I was unaware of Sprites through BASIC, and split modes. Very nice. A friend of mine had a C128, and I recall playing with the basic graphics commands (seeing what it had compared to the CoCo’s EXTENDED BASIC commands). I had a Commodore friend who introduced me to GEOS, and we went to another C128 user’s house to use the 128 GEOS and some really NICE desktop publishing program to make tickets and flyers for a rock and roll band thing I was helping with. There was some impressive non-game stuff for the 128.

            Was the TI$ different, then?

          6. Johann Klasek

            Nice use-case of a C128 … another kind of of “band aid” :)
            I appreciate any non-game stuff.

            TI$ is always 60 Hz. It’s coming from the depth of the Kernal which provides the timer function. BASIC only maps it into pseudo variables.
            All CBMs do it the same way, with one exception: the B-series (AKA CBM-II, PET successor) has no pseudo variable TI, only TI$, but with 7 digits instead of 6, carrying an additional 1/10th second digit. But this CBM model branch didn’t play a role, so compatibility to them wasn’t an issue.

  3. Johann Klasek

    > Putting it all together: Avoid arrays, use one-character variable names, and variables you want to be the fastest should be declared earlier.

    In addition to that (I think this fits well here), I would also suggest to prevent constants in preference of using variables. Even the variable name space is quite limited (26+26*36), a value already in floating-point representation store in a variable -assuming it is defined “early” as visualized above- could be faster than the parsing and conversion of constants (involving a trimmed by 10 float multiplication for each additional digit). Caveat: Variables don’t make it for all constants, there is a break-even point between number of digits and the position in the variable table. If I correctly remember, for a 6502-based MS BASIC one digit constant is faster then a single-char variable if its table position is greater than 11, which may probably vary for ColorBASIC. But this might be a subject for further investigation and another posting in this interesting article series Allen will bring to us eventually …

    Reply
  4. James Jones

    There’s always READ/DATA, but the global nature of Color BASIC, along with not having RESTORE like BASIC09, would make it obnoxious. (Not to mention that I was kind of bummed that neither BASIC would let you READ and read the individual elements for you. Drat.)

    Reply
      1. William Astle

        RESTORE is there. You just can’t have it reset the pointer to a particular line number – only to the start of the program. The kicker is that it wouldn’t be terribly difficult to implement.

        I can see that READ [arrayname] could be useful. However, that is something of quite limited use but would require some special (read as duplicating existing variable parsing and lookup code) coding in the interpreter. Given that there is a straight forward alternative for the relatively few cases where it would be useful, it’s not surprising it isn’t there. It’s better to use the code space for other more useful things.

        Now, if anyone can explain to me how the ROM code for READ and INPUT (it’s the same routine, believe it or not) actually works from studying it, I’ll be seriously impressed. That’s probably the most obscure code in the entire interpreter save for the floating point code. Adding a “fill array” option for READ would be seriously hard given the code that’s already there.

        Reply
  5. Pingback: CoCo bouncing BASIC ball, part 5 | Sub-Etha Software

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.