See also: part 1
My original “nicer to read, slower to run” version took 25.28 seconds to run. Here are some things that can speed it up.
RENUMber by 1
Renumbering the program by 1 can save program memory (since “GOTO 77” takes less space than “GOTO 1500”, and if the program is large enough, you get a slight speed increase since it’s also less work to parse shorter line numbers. But this program is too small for this to matter.
Remove comments / avoid comments at the end of lines
Removing comments, in general, is a good way to make a program smaller (and thus, a bit faster since BASIC will have less lines to skip). But, as I found out, comments at the end of lines are a real problem, since BASIC cannot just skip the entire line like it can if the line starts with a comment:
10 X=42:REM SET X TO THE ANSWER.
10 REM SET X TO THE ANSWER.
The second version is faster, since even though BASIC has to scan an extra line, it can immediately skip it, while the first example must be parsed, character-by-character, to find the end of that line. But for this example, this did not help. It would have, if there were more GOTOs to earlier line numbers, since they would have to parse through a few lines with end comments, which would be slower. Since our looping was done with a FOR/NEXT, we didn’t have to do that.
Remove unnecessary spaces
Spaces are great for readability, but bad for code size and speed. There are only a few times when a space is needed, such as when BASIC can’t tell if the next character is a keyword or part of a variable:
10 IF A=B THEN 100
BASIC must have the space between the variable and the keyword THEN, else the parser isn’t sure if the variable is “B” or “BT” followed by the characters HEN. BUT, if it’s a number, it can figure it out:
Removing all other spaces reduces the amount of code BASIC has to parse through. My program had plenty of spaces. Removing them in this tiny program only reduced it to 25.15 seconds, but it can have a bigger impact on larger programs.
This one is easy. The less lines BASIC has to scan through, the quicker it will run. But, there are exceptions to this. Any time BASIC has to do more work at the end of a line, such as checking an ELSE, if that condition is not satisfied, BASIC has to scan forward character-by-character to get to the end of the line:
10 IF X=42 THEN 100 ELSE PRINT "I DON'T KNOW THE ANSWER"
Above, if X is 42, the ELSE code will not be ran. BUT, before the program can find 100, it has to scan through all the rest of the bytes in that line to find the end of it, then it continues scanning a line at a time looking for 100. Based on line lengths, sometime it might be faster to do:
10 IF X=42 THEN 100
20 IF X<>42 THEN PRINT "I DON'T KNOW THE ANSWER"
Combining lines in the program reduced the speed slightly to 25.13 seconds. The more code, the more you save by doing this (and it makes the program smaller, since every line number takes up space).
NEXT without variable
Unless you are doing really weird stuff with FOR/NEXT loops, it’s faster to leave off the variable in the NEXT. For example:
10 FOR A=1 TO 10:NEXT A
…will be slower than:
10 FOR A=1 TO 10:NEXT
NEXT without a variable will go to the most recent FOR anyway, so this works fine:
10 FOR A=1 TO 10
20 FOR B=1 TO 20
30 PRINT A,B
For this program, just removing the variables from NEXT A and NEXT Z increased the speed to 24.66 seconds.
Use HEX instead of decimal
It takes much more work to convert base-10 (i.e., “normal”) decimal numbers than it does to convert base-16 (hex) values. Every time a number is encountered, time can be saved by simply changing the number to a HEX — even though a hex digit has to start with “&H”. It will make the program larger, since 15 takes less bytes than &HF. For this example, it reduced time down to 21.66 seconds!
If you have to PRINT a generated string more than once, you are better off pre-rendering that string first so it only gets generated one time. This includes using things like STRING$, CHR$, etc. For example:
10 PRINT CHR$(128);"HELLO";CHR$(128)
Each time BASIC has to print that, it allocates a new string a few times and copies things over. See my string theory article for more details. For this program, I use STRING$() to generate a repeating block of characters. But, since it’s in a loop to print the block, it generates the same string over and over for each line of the block. Instead of this:
10 FOR A=1 TO 10
20 PRINT STRING$(10,128)
30 NEXT A
…I could generate that string ahead of time and just print it in the loop:
10 FOR A=1 TO 10
20 PRINT LN$
30 NEXT A
In this program, doing this trick reduces the time to 20.98 seconds! String are slow, you see…
The time savings so far…
Combining all of these (the ones that matter, at least), reduces time taken down to 18.45 seconds – saving 6.83 seconds overall.
Here is what it looks like, with the stuff outside of the benchmark timing loop left alone:
0 REM scale.bas 10 SW=32/4 ' SCALE WIDTH 20 SH=16/3 ' SCALE HEIGHT 30 SM=.1 ' SCALE INC/DEC 40 S=.5 ' SCALE FACTOR 70 TM=TIMER:FOR Z=1 TO 100 80 W=INT(SW*S):H=INT(SH*S):P=&HF-INT(W/&H2)+(&H7-INT(H/&H2))*&H20:LN$=STRING$(W,&HAF):CLS:FORA=&H1 TOH:PRINT@P+A*&H20,LN$:NEXT:S=S+SM:IFH<&H1 ORH>&HF THEN SM=-SM:S=S+(SM*&H2) 170 NEXT 180 ' 60=NTSC 50=PAL 190 PRINT:PRINT (TIMER-TM)/60;"SECONDS"
If we’d done any GOTO/GOSUBing to earlier lines, that would have to scan through those lines with ending REMarks, so cleaning all that up would be useful. If this was a larger program, RENUMbering by 1s might also squeak out a bit more speed.
But wait! There’s more!
I originally wrote this follow-up the evening that I wrote the original article. However, several folks have contributed their own optimizations. Let’s see what they did…
First, we go to the comments of the original article…
George Phillips – inner loop optimization
My current best effort is going full strength reduction on the inner loop. Replace lines 120 to 170 with:
The STRING$ requires Extended Color Basic which starts at a baseline of 20.4 seconds and those changes get it down to 13.George Phillips
George pre-rendered the string, removed spaces, and vastly sped things up by simplifying the math. On Xroar, it reports 16.15 – faster than any of my efforts. Nice job!
I took his updated code and converted the integers to HEX and that dropped it to 15.1 seconds! Even faster.
Jason Pittman – hex and start row calculation
Interestingly, changing the blue square value on line 130 to hex (“&HAF”) saves about 11.5% for me (27.55 to 24.33). Then, moving the row start calculation out of the print statement takes it down to 22.87.
120 FOR A=32 TO H*32 STEP 32Jason Pittman
140 NEXT A
Ah, the magic of hex! Jason did a similar trick with the row calculation. I tried Jason’s updates, and it was 20.28 on my Xroar. George is still in the lead.
Adam – DIM, reductions and variable substituion
So I was able to cut out 5 seconds doing the following:
– DIM all variables
– Adding variables for constants (2, 32, etc.) for lines 130, 160.
– Removing the math in lines 10, 20.
– Removing the variables after NEXT.
– Removing all remarks.
No change to the original algorithm or PRINT line.
Coco3 (6309): Original code = 27.15s. Optimized code 22.13s.
With high speed poke: 11.05s. :D
I’m intrigued with the post that mentioned using HEX values. I may have to try that.Adam
Ah! Using a variable rather than a constant can be significantly faster. I didn’t think about that, even though I’ve previously benchmarked and tested that. He adds:
I just put the STRING$ statement into a variable as suggested by George Phillips. That took out another 2 seconds! I also ran RENUM 1,,1. Run time is down from 27.2 s to 20.1 s.
It looks like George has gotten us all thinking. I don’t have Adam’s code to test, but I am going to experiment with replacing the constants with variables and see what that does.
Ciaran Anscomb – uh… not really sure!
The author of Xroar chimes in:
So I cheated and read the comments on your page and combined some of those techniques (and replace /2 with *1/2, forgot about that…).
13.1s? Not quite double speed, but not bad :)
0 REM scale.bas
1 GOSUB100:TM=TIMER:FORZ=1TO M:CLS:A$=STRING$(E-INT(W*B),L)+STRING$(W,C):PRINT@(8-INT(H*B))*L,””;:FORA=1TOH:PRINTA$:NEXT:IFH<1ORH>=&H10 THENQ=-Q:R=-R
3 ‘ 60=NTSC 50=PAL
4 T=TIMER:PRINT:PRINT (T-TM)/60;”SECONDS”
110 I=32/4 ‘ SCALE WIDTH
120 J=16/3 ‘ SCALE HEIGHT
130 D=.1 ‘ SCALE INC/DEC
140 S=.5 ‘ SCALE FACTOR
Ciaran reorganized the code, placing initialization at the end. This is a habbit I need to get into since GOTO/GOSUB to an earlier line number have to start at the FIRST line of the program and scan forward EVERY TIME. Code that only needs to run once, if placed at the top, slows down every GOTO to an earlier line number.
I just ran this and got 13.13 seconds!
There is alot to dissect here. Numbers are replaced with variables. Stuff is pre-calculated. And he’s using a variable for the end of a FOR/NEXT, for some reason. I’m going to have to dig in to that one.
Twice as fast, indeed!
I feel like there were a few more responses, but I can’t locate them at the moment, so maybe there will be a second attempt article down the line.
Until next time, kudos to all of you for giving me new things to consider — especially Ciaran with that amazing improvement. Very impressive.