See also: part 1, part 2, part 3 and part 4.
A Faster Base-64
I had planned to end this series with this third part, giving a simple way to turn 8-bit value DATA statements into Base-64 DATA statements. But smarter folks than I have looked at my previous work, so now my plans have changed. We will need an extra part or two… or three.
Today, let’s highlight some comments made to previous installments.
The always thought-provoking MiaM wrote:
I would had written out the 2 exponent values directly as 4, 16 and 64 rather than using INT(2^2).
The decdode-4-“chars”-to-3-bytes parts could use a modified base 64 thingy where you have the first 6 bits of three bytes in a row, and the fourth “char” contains the upper two bits for the previous three “chars”/bytes. Or the other way around, start with the “char” that contains the upper two bits for the following three chars and put that in a numerical variable. Then have a loop that runs three times. Each time first shift the “upper two bits” variable two steps left, i.e. multiply by 4, and then read a “char” and to that char OR the result of the “upper two bit” variable ANDed with 192 (=%110000).
That format would be incompatible with the standardized representation used by BASE 64, but it would indeed be a format with the same density and using the same characters.
Btw you could use the “BASE 90” as a way to slightly compress some data, by just having the values >63 represent two instances of the actual value minus 64. That might not save much, but perhaps worth investigating.
MiaM
Very good point! Eliminating “power of two” calculations and changing them to hard coded values should offer a noticeable speed increase. Pre-calculating values (i.e. writing 4 instead of 2^2) is a good way to save some time, and possibly space too (since “2^2” takes up more memory than “4”).
Normally I would go on a Benchmarking BASIC tangent, but I will save that for later.
I was more intrigued by the concept of making an easier to parse Base-64 format. Since the original goal of this article was to cram as much type-able DATA numbers as possible in to a BASIC program, there is nothing that says it needs to follow the standard Base-64 encoding format. Any format that gets more bits of data in to type-able DATA values would suffice.
This opens up an opportunity to tweak the encode/decode method to be easier to do in BASIC. Mia suggests something like this:
Instead of the standard Base-64 encoding of three 8-bit values into four 6-bit values:
+- Byte 1 --+- Byte 2 --+- Byte 3 --+ | 000000|00 | 0000|0000 | 00|000000 | | \__A__/\___B___/ \___C___/ \_D__/
We could alter the encoding into a different version:
+- Byte 1 --+- Byte 2 --+- Byte 3 --+ | 00|000000 | 00|000000 | 00|000000 | | \| \_A__/ \| \_B__/ | |/ \_C__/ | | \___________D__________/
The benefit here is that decoding this in BASIC could be done much easier and faster. Rather than all the multiplication/division needed to shift bits and then combine them into bytes, it could be as simple as this a few ANDs and divides. Here’s a rough example of converting three 8-bit (0-255) input values (A, B and C) into four 6-bit (0-63) output values (O1, O2, O3 and O4).
0 REM 6BIT.BAS
1 REM As proposed by MiaM
10 READ A,B,C
15 REM --XXXXXX of A
20 O1=(A AND &H3F)
25 REM --XXXXXX of B
30 O2=(B AND &H3F)
35 REM --XXXXXX of C
40 O3=(C AND &H3F)
45 REM XX------ of A
50 O4=(A AND &HC0)/4
55 REM XX------ of B
60 O4=O4+(B AND &HC0)/16
65 REM XX------ of C
70 O4=O4+(C AND &HC0)/64
75 PRINT "ENCODED:"
80 PRINT A;B;C,O1;O2;O3;O4
85 A=0:B=0:C=0
90 A=O1+INT(O4 AND &H30)*4
100 B=O2+INT(O4 AND &HC)*16
110 C=O3+INT(O4 AND &H3)*64
120 PRINT "DECODED:"
125 PRINT O1;O2;O3;O4,A;B;C
1000 DATA 111,222,123
Running this program displays:
The three 8-bit input values (111, 222 and 123) are converted into four 6-bit output values, then those four are turned back into three 8-bit values to verify it worked.
The conversion is very simple, since the output values O1, O2 and O3 are just the right 6-bits of the input values A, B, and C, which can be obtained by using AND to mask off the top two bits:
20 O1=(A AND &H3F)
30 O2=(B AND &H3F)
40 O3=(C AND &H3F)
Optimization Note: We could save a few bytes by omitting the parenthesis and the space before the &H3F. Due to how the Color BASIC’s parser works, we need the space between the variables (A, B and C) and the keyword “AND”. That space is what tells the tokenizer we want a variable followed by the keyword AND, versus a variable that starts with “AA”:
A=255 PRINT A AND 4 4 PRINT A AND4 4 PRINT AAND4 0
Above, Color BASIC thinks the third example is a variable called “AAND4” which is truncated to just be “AA” since Color BASIC only cares about the first two characters of a variable name:
A=255 AAND4=42 PRINT AAND4 42 PRINT A AND4 4
Oh the fun bugs that must have caused me back in the day!
But I digress…
The fourth byte is built by doing the opposite AND to get only the top two bits of A, then a divided that by 4 to shift them to the right 2 bits (AA—— to —AA—-), then do the same mask to B and divide by 16 to shift them 4 bits to the right (BB—— to ——BB–) and again for C divided by 64 to shift 6 bits to the right (CC—— to ——CC) and then add them together to make the result (–AABBCC).
50 O4=(A AND &HC0)/4 60 O4=O4+(B AND &HC0)/16 70 O4=O4+(C AND &HC0)/64
I think I did a poor job explaining that. But here it is visually:
INPUT (three 8-bit values): A: aaAAAAAA B: bbBBBBBB C: ccCCCCCC OUTPUT (four 6-bit values): O1: --AAAAAA O2: --BBBBBB O3: --CCCCCC O4: --aabbcc
Maybe that helps.
Doing it this was can be done faster and with less code, I think. Some benchmarking needs to be done to see if AND is faster than addition for combining the values, and the O4 line can just be made as one thing without the intermediate line numbers and steps:
50 O4=(A AND &HC0)/4+(B AND &HC0)/16+(C AND &HC0)/64
We will want to make the decoder as small as possible, since if we save 100 bytes doing BASE-64 over HEX and the decoder takes more than 100 bytes it defeats the purpose.
Maybe we can figure out this “base 90” concept in a future article, as well.
To be continued…
You could probably save a byte by trying 4ANDA instead of A AND4. THere sure aren’t any basic keywords that start with a 4 and a decimal number can’t contain letters.
I’d say that it’s better to optimize for size than speed re the decoder. Therefore I suggest using decimal values instead of hex values even if they are slower. Hex 3F = decimal 63 which should be shorter.
My idea for the decoder would be to have an inner loop doing something like this
100 FOR I=startaddr TO endaddr STEP 3
100 READ A
110 FOR J=0TO2
120 READ B
130 A=A*4
140 POKEI+J,(63ANDB)+(192ANDA)
150 NEXT J
160 NEXT I
but of course more compact.
You could fiddle around with replacing STEP 3 with a multiplication by 3 in the poke statement and then have the start/end addresses divided by three (and if you need a start address that isn’t evenly divisible by three just change the inner loop to 1-3 or 2-4 instead of 0-2). That would generate slower code but maybe save some space. STEP should take as much space as * but you might end up with smaller numbers for the start/end address when they are divided by 3. If you end up with an address that divided by three is slightly above 9999 you could set the inner loop to something like 7 TO 9. I assume that there isn’t any requirement for the machine code program to reside at around address 10000 though. Perhaps this idea might help if you use hex encoding for the start/end address numbers though?
First, AND comment is brilliant. I never thought about reversing the parameters to get rid of the space. Only recently did I start seeing C code that does the checks backwards, like “if (5 == count)”, and while I hate how goofy that looks, and still am not doing it myself, it prevents the accidental use of “=” instead of “==” from even compiling since you can’t assign anything to a constant.
But I need to look at your example and unwind what it’s doing. I’ll draw me some pictures!
And you should have seen my “stream of consciousness” last night as I was writing next part, stumbling in to issues with the DATA statement and how to include type-able characters like QUOTE, COMMA, COLON, etc. A fun time was had by all!
Swap of the AND came to my mind also.;)
I wouldn’t give up speed for space by using decimal values. They are much slower. Or these constants are defined as variables early, especially for rhe one with 2 or more digits. But in this case a space is needed to overcome the tokenizer deficiency (I wonder, that CBM’s BASIC V2 tokenizer is much better in this regard even they have the same MS BASIC roots).
I have a TheVIC20. Is that CBM V2? I plan to revisit my benchmark series on the VIC as well. And all these years and I’d never thought about reverting the constant.
Yes, VIC-20 and C64 have BASIC V2 (are the identical beside some aspects of the screen editor).