A recent post by Art “ADOS” Flexser on the CoCo mailing list mentioned a few commercial programs for the Radio Shack Color Computer that were designed to optimize BASIC programs:
On Jan 2, 2015, at 12:26 AM, Arthur Flexser wrote:
A little googling identified the utility I was referring to as “The Stripper” from Eigen Systems. Apparently, there was also a similar program by Bob van der Poel called “Packer”. Maybe one or the other is available online someplace.
These should be useful for those finding tight memory for Basic programs in
the 4K contest.
The 4K contest he mentions is a programming challenge I recently organized to see what someone could do if they had the original 1980 4K Color Computer. You can read about it here:
In addition to the programs Art mentioned, I also had one given to me by Carl England. I used it on one of my programs at Sub-Etha Software. I do not know how it compares with The Stripper or Packer, but beyond true optimization of the code, it did about all you could do. Here are the tricks he used which produce the smallest and fastest version of a BASIC program, but they usually rendered it un-editable and hard to read.
NOTE: Code can be written for size or speed. Carl’s program optimized for size which in some cases made it faster since BASIC had less to parse, but as you will see, some optimizations might increase execution time making it slower. Maybe.
Remove all spaces
The less bytes to parse, the smaller and faster it will run. It has to be smart enough to know when spaces are required, such as “FORA=1TOQ STEP1” when a variable needs a space after it before another keyword.
Pack/combine lines where possible.
Any line called by a GOTO had to remain at the start, by following lines would be combined up to the max size of a BASIC line. Only when it had to be broken by logic (ELSE, etc.) would it not be combined. For example:
10 FOR I=1 TO 100 20 GOSUB 50 30 NEXT I 40 END 50 REM PRINT SOMETHING 60 PRINT "HELLO WORLD!" 70 RETURN
…would end up as:
10 FORI=1TO100:GOSUB60:NEXTI:END 60 PRINT"HELLO WORLD!":RETURN
Remove all REMs.
Obvious. I *think* his program would adjust a GOTO/GOSUB that went to a REM line to go to the next line after it, and remove the REM:
10 GOSUB 1000 20 END 1000 REM PRINT SOMETHING 1010 PRINT "HELLO" 1020 RETURN
…would become like:
10 GOSUB1010:END 1010 PRINT"HELLO":RETURN
NOTE: Even without packing, I learned not to GOTO or GOSUB to a REM since it required the interpreter to parse through the line. I would GO to the first line of code after the REM:
10 GOSUB 1010 ... 1000 REM MY ROUTINE 1010 PRINT "HERE IS WHERE IT STARTS" ...
Every thing you do like that saves a bit of parsing time since it doesn’t have to start parsing 1000 and look up a token then realize it can ignore the rest of the line.
Not bad so far
But that’s not all!
The BASIC input buffer is some maximum number of bytes (250?). When you type a line to max, it stops you from typing more. When you hit ENTER, it is tokenized and could be much smaller. For instance, “PRINT” turns in to a one-byte token instead of the five characters. Carl’s program would combine and pack the tokens up to the max line size. Thus, it created lines that BASIC could run which were IMPOSSIBLE to enter on the keyboard. If I recall, you could LIST and they would print (?) but if you did an EDIT you were done.
Renumber by 1.
GOSUB2000 would take up one bye for the token, then four bytes for the line number. If you renumber by 1s, it might make that function be GOSUB582 and save a byte. Multiply that by every use of GOTO/GOSUB/ON GOTO/etc. and you save a bit.
Even without one of these compressors, try a before/after just doing a RENUM 1,1,1. I recently posted an article with a short (27 line) word wrap routine in BASIC. Doing this renum saved 7 bytes.
Removing NEXT variables.
I am not sure if this is just something I knew, or if his program also did it. If you use FOR/NEXT and it is used normally, you don’t need the NEXT variables like this:
10 FOR D=0 TO 3 'DRIVES 20 FOR T=0 TO 34 'TRACKS 30 FOR S=1 TO 18 'SECTORS 40 DSKI$ D,T,S,S1$,S2$ 50 NEXT S 60 NEXT T 70 NEXT D
The NEXTs could also be
50 NEXT S,T,D
Ignoring spaces, “NEXT:NEXT:NEXT” takes up one less byte than “NEXTS,T,D” (NEXT is turned in to a token, so it is TOKEN COLON TOKEN COLON TOKEN versus TOKEN BYTEVAR COMMA BYTEVAR COMMA BYTEBAR”.
This takes up less memory, but there is a speed difference:
10 T=TIMER 20 FOR I=1 TO 10000 30 REM DO NOTHING 40 NEXT I 50 PRINT TIMER-T
Run this and it shows something in the 1181 range (speed varies based on other things BASIC does; pound on keys while it runs and it takes more time, for example). Change the “NEXT I” to “NEXT” and it shows something in the range of 1025. The more FOR variables, the more the savings, too.
10 TM=TIMER 20 FOR D=0 TO 3 30 FOR T=0 TO 34 40 FOR S=1 TO 18 50 REM DO SOMETHING 60 NEXT S 70 NEXT T 80 NEXT D 90 PRINT TIMER-TM
…shows around 345. Changing it to:
…changes it to around 336 – slightly smaller and faster since it is not parsing three different lines.
…changes it to around 293! One byte smaller and slightly faster.
AND THERE’S MORE! But this post is already too long.
What I think would be cool is an actual BASIC OPTIMIZER that could do some smart things to programs, much like we do with the C language optimizers for asm and C. For example, noticing stupid things:
A$ = B$+"."+"."+C$
A$ = B$+".."+C$
And removing NEXT variables, or doing minor restructuring based on what it knows about the interpreter.
I suppose, back in the 80s, processors were so slow and memory so limited that doing such an optimizer might not be practical. But today, imagine how you could optimize a .BAS file on a modern computer (just like we have 6809 cross-hosted compilers that compile and link in the link of an eye).
Maybe this has already been done for some other flavor of BASIC? Hmmm. I may have to do some Googling.