See also: part 1, part 2, part 3, part 4, part 5, part 6, part 7, part 8 and part 9.
A recent post by Art “ADOS” Flexser on the CoCo mailing list mentioned a few commercial programs for the Radio Shack Color Computer that were designed to optimize BASIC programs:
On Jan 2, 2015, at 12:26 AM, Arthur Flexser wrote:
A little googling identified the utility I was referring to as “The Stripper” from Eigen Systems. Apparently, there was also a similar program by Bob van der Poel called “Packer”. Maybe one or the other is available online someplace.
These should be useful for those finding tight memory for Basic programs in
the 4K contest.
The 4K contest he mentions is a programming challenge I recently organized to see what someone could do if they had the original 1980 4K Color Computer. You can read about it here:
In addition to the programs Art mentioned, I also had one given to me by Carl England. I used it on one of my programs at Sub-Etha Software. I do not know how it compares with The Stripper or Packer, but beyond true optimization of the code, it did about all you could do. Here are the tricks he used which produce the smallest and fastest version of a BASIC program, but they usually rendered it un-editable and hard to read.
NOTE: Code can be written for size or speed. Carl’s program optimized for size which in some cases made it faster since BASIC had less to parse, but as you will see, some optimizations might increase execution time making it slower. Maybe.
Remove all spaces
The less bytes to parse, the smaller and faster it will run. It has to be smart enough to know when spaces are required, such as “FORA=1TOQ STEP1” when a variable needs a space after it before another keyword.
Pack/combine lines where possible.
Any line called by a GOTO had to remain at the start, by following lines would be combined up to the max size of a BASIC line. Only when it had to be broken by logic (ELSE, etc.) would it not be combined. For example:
10 FOR I=1 TO 100 20 GOSUB 50 30 NEXT I 40 END 50 REM PRINT SOMETHING 60 PRINT "HELLO WORLD!" 70 RETURN
…would end up as:
10 FORI=1TO100:GOSUB60:NEXTI:END 60 PRINT"HELLO WORLD!":RETURN
Remove all REMs.
Obvious. I *think* his program would adjust a GOTO/GOSUB that went to a REM line to go to the next line after it, and remove the REM:
10 GOSUB 1000 20 END 1000 REM PRINT SOMETHING 1010 PRINT "HELLO" 1020 RETURN
…would become like:
10 GOSUB1010:END 1010 PRINT"HELLO":RETURN
NOTE: Even without packing, I learned not to GOTO or GOSUB to a REM since it required the interpreter to parse through the line. I would GO to the first line of code after the REM:
10 GOSUB 1010 ... 1000 REM MY ROUTINE 1010 PRINT "HERE IS WHERE IT STARTS" ...
Every thing you do like that saves a bit of parsing time since it doesn’t have to start parsing 1000 and look up a token then realize it can ignore the rest of the line.
Not bad so far
But that’s not all!
The BASIC input buffer is some maximum number of bytes (250?). When you type a line to max, it stops you from typing more. When you hit ENTER, it is tokenized and could be much smaller. For instance, “PRINT” turns in to a one-byte token instead of the five characters. Carl’s program would combine and pack the tokens up to the max line size. Thus, it created lines that BASIC could run which were IMPOSSIBLE to enter on the keyboard. If I recall, you could LIST and they would print (?) but if you did an EDIT you were done.
Renumber by 1.
GOSUB2000 would take up one bye for the token, then four bytes for the line number. If you renumber by 1s, it might make that function be GOSUB582 and save a byte. Multiply that by every use of GOTO/GOSUB/ON GOTO/etc. and you save a bit.
Even without one of these compressors, try a before/after just doing a RENUM 1,1,1. I recently posted an article with a short (27 line) word wrap routine in BASIC. Doing this renum saved 7 bytes.
Removing NEXT variables.
I am not sure if this is just something I knew, or if his program also did it. If you use FOR/NEXT and it is used normally, you don’t need the NEXT variables like this:
10 FOR D=0 TO 3 'DRIVES 20 FOR T=0 TO 34 'TRACKS 30 FOR S=1 TO 18 'SECTORS 40 DSKI$ D,T,S,S1$,S2$ 50 NEXT S 60 NEXT T 70 NEXT D
The NEXTs could also be
50 NEXT S,T,D
Or
50 NEXT:NEXT:NEXT
Ignoring spaces, “NEXT:NEXT:NEXT” takes up one less byte than “NEXTS,T,D” (NEXT is turned in to a token, so it is TOKEN COLON TOKEN COLON TOKEN versus TOKEN BYTEVAR COMMA BYTEVAR COMMA BYTEBAR”.
This takes up less memory, but there is a speed difference:
10 T=TIMER 20 FOR I=1 TO 10000 30 REM DO NOTHING 40 NEXT I 50 PRINT TIMER-T
Run this and it shows something in the 1181 range (speed varies based on other things BASIC does; pound on keys while it runs and it takes more time, for example). Change the “NEXT I” to “NEXT” and it shows something in the range of 1025. The more FOR variables, the more the savings, too.
10 TM=TIMER 20 FOR D=0 TO 3 30 FOR T=0 TO 34 40 FOR S=1 TO 18 50 REM DO SOMETHING 60 NEXT S 70 NEXT T 80 NEXT D 90 PRINT TIMER-TM
…shows around 345. Changing it to:
60 NEXTS,T,D
…changes it to around 336 – slightly smaller and faster since it is not parsing three different lines.
60 NEXT:NEXT:NEXT
…changes it to around 293! One byte smaller and slightly faster.
AND THERE’S MORE! But this post is already too long.
What I think would be cool is an actual BASIC OPTIMIZER that could do some smart things to programs, much like we do with the C language optimizers for asm and C. For example, noticing stupid things:
A$ = B$+"."+"."+C$
to
A$ = B$+".."+C$
And removing NEXT variables, or doing minor restructuring based on what it knows about the interpreter.
I suppose, back in the 80s, processors were so slow and memory so limited that doing such an optimizer might not be practical. But today, imagine how you could optimize a .BAS file on a modern computer (just like we have 6809 cross-hosted compilers that compile and link in the link of an eye).
Maybe this has already been done for some other flavor of BASIC? Hmmm. I may have to do some Googling.
You’ve pretty well covered the things that are specific to the peculiarities of the BASIC interpreter, though come to think of it here’s something you could do: read the source and count how many times the variables are used. Give the 26 most-used ones one-letter names.
Does Color BASIC have DEF FN() = ? If it does, it might be worth looking for expression schemata that appear several times, generating the appropriate DEF FNx, and replacing the occurrences of the expression with calls. You’d want to test to see whether it really saves space.
Past that,. you’re talking real live optimization, with real live data and control flow analysis. Things like:
See how many places GOSUB to a particular subroutine. If the answer is “one”, inline it.
Parse expressions and do constant folding (like the “:” + “:”) and other optimizations on them.
Strength reduction could be tricky–you’d have to flatten arrays into one dimension to let you do strength reduction in the language.
Another tricky part: that code you’re optimizing may be some carefully-tweaked delay loop, so you’ll have to know when not to optimize..Probably that means programmer marking:
100 REM NOOPT
code to be left alone
200 REM OPT
Speaking of REM, I presume you’d replace, um, is it PRINT or INPUT with ? and REM with ‘
Ah yes! Once character variable names. I completely forgot to mention that. I am not sure if Carl England’s program touched variable names, but that would make sense. One thing I did (after reading about it somewhere) even back in 1983 was declare variables at the start with a “DIM A$,A,B,C” which caused it to pre-allocate a spot for them. Perhaps, if BASIC does not sort them as they are created, doing such a DIM makes them in that order, so the most often used ones could be declared first so it would not need to search through 50 variables lookging for “I”.
DEF FN does exist in Extended Color BASIC. I am going to have to read up on that. I recall it existing, but do not remember what it is for.
In CoCo BASIC, REM is one token, and ‘ turns in to :REM and takes two tokens, so doing this:
10 REM This is a note
versus
10 ‘ This is a note
…it is better to use the first, “REM(SPACE)” being two bytes, and “(COLON)REM(SPACE)” being three. BUT, if you left out the space:
10 REM This is a note
10 REM ‘This is a note
…they are the same.
It looks like I have a few more updates to do on this topic — thanks!
Great notes, James! I understand why no one cared or bothered doing this for CoCo BASIC, but it would be a fun experiment just to see how much of a difference it can make.
Wow, DEFFN looks quite useful! I do not recall ever using it. In a program I am playing with, I turn an X and Y in to a memory location based on the starting address of screen memory:
10 P=1024+Y*32+X:POKEP,42
I converted that to:
DEF FNP(X Y)=1024+Y*32+X
10 P=FNP(X Y):POKE P,42
That could be a good code space savings if it is used many times. Speed tests with one parameter seemed a tiny bit faster, but with two, it was slower. Tradeoffs.
Excellent.
Drats. DEFFN in CoCo BASIC claims to handle multiple variables, but it was not implemented. It works with one, but with multiple, it is just using one parameter and then reading the global variables used at DF() time. That’s a workaround, but not much different than a GOSUB.
Oh, yeah… Someone long ago did and wrote up a paper on source-to-source optimizations in FORTRAN. Perhaps it would give you some ideas.
Pingback: Word wrap revisited… | Sub-Etha Software
Another Color BASIC optimization I recall was to put subroutines you call often at low line numbers because when you call GOSUB it starts from the the first line number and scans up until it finds the line number that matches the target.
You are correct! Less search time. Though I think it searches forward if it can, so if there were a bunch of subroutines up front there might be some instances where having one after the call could be found faster. Interesting!
Oh yeah. If anyone cares to experiment with modifying the BASIC interpreter, it might be fun to make the symbol table “adaptive”. When you find a variable in the symbol table, if it’s not the first one, swap it with its predecessor. The idea is that the more frequently looked up symbols migrate towards the front and thus are more quickly found. The question is whether the migration improves things enough to make up for the swapping.
I will have to do some tests. At the very least, perhaps doing a “DIM” at the top defining all the variables would put them in that order, so you could put the most used up there first. I’ll do some quick timing tests…
Pingback: Optimizing Color BASIC, part 2 | Sub-Etha Software
Pingback: Optmizing Color BASIC, part 4 | Sub-Etha Software
Pingback: An effect of PAL versus NTSC… | Sub-Etha Software
Pingback: Optimizing Color BASIC, part 3 | Sub-Etha Software
Pingback: Optimizing Color BASIC, part 5 | Sub-Etha Software
Pingback: CoCo Cross Development, part 1 – Vintage is the New Old
Pingback: Optimizing Color BASIC, part 6 | Sub-Etha Software
Pingback: Optimizing Color BASIC, part 7 | Sub-Etha Software
Pingback: CoCo Cross Development, part 2 – Vintage is the New Old
Pingback: CoCo Cross Development, part 3 – Vintage is the New Old
Pingback: Optimizing Color BASIC, part 8 | Sub-Etha Software
Pingback: Optimizing Color BASIC, part 9 | Sub-Etha Software
Pingback: Color BASIC optimization challenge – scaling | Sub-Etha Software
Pingback: The fastest way to zero in Color BASIC. | Sub-Etha Software
Pingback: The marbles in a bag puzzle in Color BASIC | Sub-Etha Software
Pingback: Interfacing assembly with BASIC via DEFUSR, part 6 | Sub-Etha Software
Pingback: IF AND/OR THEN versus IF THEN IF | Sub-Etha Software