So far, we’ve taken a brute force PRINT program and turned it in to a less-brute force program that did the same thing using DATA statements:
0 ' LOGIKER8.BAS
10 CLS
15 CH=32:PRINTTAB(6);
20 READ A:IF A=-1 THEN 220
25 IF A=0 THEN PRINT:GOTO 15
30 PRINT STRING$(A,CH);
35 IF CH=32 THEN CH=42 ELSE CH=32
40 GOTO 20
50 DATA 5,1,7,1,0
60 DATA 5,2,5,2,0
70 DATA 5,3,3,3,0
80 DATA 5,4,1,4,0
90 DATA 1,17,0
100 DATA 2,15,0
110 DATA 3,13,0
120 DATA 4,11,0
130 DATA 5,9,0
140 DATA 4,11,0
150 DATA 3,13,0
160 DATA 2,15,0
170 DATA 1,17,0
180 DATA 5,4,1,4,0
190 DATA 5,3,3,3,0
200 DATA 5,2,5,2,0
210 DATA 5,1,7,1,0
215 DATA -1
220 GOTO 220
All of this in an effort to try to print out this image:
While there are still many BASIC optimizations we could do (removing spaces, combining lines even further, renumbering by 1, etc.), those would apply to any version of the code we create. Instead of doing that, let’s look at some other ways we can represent this data.
When the Atari VCS came out in 1977 (you younguns may only know it as the 2600, but it didn’t get that name until 1982 — five years after its release later), it required clever tricks to make games run in only 1K or 2K of ROM and with just 128 bytes of RAM.
The game Adventure was quite the challenge, since it had multiple screens representing different mazes, castles and areas.
Atari’s Greatest Hits on an old iPad
Each screen was represented by only 21 bytes of ROM! If you follow that link, you can read more about my efforts to understand how this worked. Here is an example of how the castle room was represented:
;Castle Definition
CastleDef:
.byte $F0,$FE,$15 ;XXXXXXXXXXX X X X R R R RRRRRRRRRRR
.byte $30,$03,$1F ;XX XXXXXXX RRRRRRR RR
.byte $30,$03,$FF ;XX XXXXXXXXXXRRRRRRRRRR RR
.byte $30,$00,$FF ;XX XXXXXXXXRRRRRRRR RR
.byte $30,$00,$3F ;XX XXXXXX RRRRRR RR
.byte $30,$00,$00 ;XX RR
.byte $F0,$FF,$0F ;XXXXXXXXXXXXXX RRRRRRRRRRRRRR
There are three bytes to represent each line. Three bytes would only be able to represent 24 pixels (8 bits per byte), and the ASCII art shows the screen width is actually 40. Those three bytes cannot represent the entire row of pixels.
In fact, 4-bits of that isn’t used. Each set of three bytes represents halfa row (20 bits out of the 24 the three bytes represent). Look at the first entry:
.byte $F0,$FE,$15 ;XXXXXXXXXXX X X X R R R RRRRRRRRRRR
If you turn those bytes into binary, you get this pattern:
The Atari drew the first 8-bits from least significant bit to most. the second 8-bits from most significant to least, then the third from least significant to most. That makes it look like this, matching the ASCII art (skipping the unused 4-bits):
000011111111111010101000
XXXXXXXXXXX X X X
To represent a full screen, the Atari had a trick that would mirror or duplicate the other half of the screen. In the case of the castle, the right side was a mirror image. In the case of certain mazes, the data was duplicated.
Looking at our image here, since it is symmetrical, we could certainly use the same trick and only store half of the image.
Since the image is 17×17 (an odd number, so there is a halfway row and column), we’d actually need to just draw to that halfway row/column, then reverse back through the data.
We should be able to take our existing data and crop it down from this, which represents the full image:
50 DATA 5,1,7,1,0
60 DATA 5,2,5,2,0
70 DATA 5,3,3,3,0
80 DATA 5,4,1,4,0
90 DATA 1,17,0
100 DATA 2,15,0
110 DATA 3,13,0
120 DATA 4,11,0
130 DATA 5,9,0
140 DATA 4,11,0
150 DATA 3,13,0
160 DATA 2,15,0
170 DATA 1,17,0
180 DATA 5,4,1,4,0
190 DATA 5,3,3,3,0
200 DATA 5,2,5,2,0
210 DATA 5,1,7,1,0
…to this, which represents the top left quarter-ish of the image:
50 DATA 5,1,4,0 ' X '
60 DATA 5,2,3,0 ' XX '
70 DATA 5,3,2,0 ' XXX '
80 DATA 5,4,1,0 ' XXXX '
90 DATA 1,9,0 ' XXXXXXXXX'
100 DATA 2,8,0 ' XXXXXXXX'
110 DATA 3,7,0 ' XXXXXXX'
120 DATA 4,6,0 ' XXXXXX'
130 DATA 5,5,0 ' XXXXX'
That represents all the data up to the center row/column, and that seems to be a considerable savings in code space (removing eight lines).
But how do we draw that forward, then in reverse? There is no way to back up when using the READ command, so we’d have to remember what we just did. For a general purpose “compress 1-bit image” routine it would be more complex, but since we know the image we are going to produce, we can make an assumption:
The image never has more than three transitions (space, asterisk, space) in a line.
No line entry has more than 4 numbers total.
Knowing that, we could simply save up to three numbers in variables, so we would print them out A B C and then C B A. We won’t even need the zeros now, since we can read A,B,C and act on them (stopping if C is 0).
Neat!
A quick bit of trial and error gave me this code that will print the top half of the image:
0 ' LOGIKER10.BAS
10 CLS
15 CH=32:PRINTTAB(6);
20 READ A:IF A=-1 THEN 220
21 PRINT STRING$(A,32);
22 READ B,C
25 IF C=0 THEN PRINT STRING$(B*2-1,42);STRING$(A,32):GOTO 15
30 PRINT STRING$(B,42);STRING$(C*2-1,32);STRING$(B,42)
40 GOTO 15
50 DATA 5,1,4
60 DATA 5,2,3
70 DATA 5,3,2
80 DATA 5,4,1
90 DATA 1,9,0
100 DATA 2,8,0
110 DATA 3,7,0
120 DATA 4,6,0
130 DATA 5,5,0
215 DATA -1
220 GOTO 220
It creates this:
I can now say “we are halfway there!”
But now I have another issue to solve. How do I back up? There is no way to READ data in reverse. It looks like I’m going to need to load all those numbers in to memory so I can reverse back through them.
The design is one row taller than will fit on the CoCo’s 32×16 text screen, but it would easily fit on the 40 or 80 column screen of the CoCo 3. For this article, I am going to stick with the standard text screen and just let it scroll one row off the top of the screen. When I have something figured out, it might only require modifying the centering code to display on the 40/80 column screen.
Let there be code!
At this stage, the design is being centered using the TAB command. Putting a “TAB(7)” at the start of each string takes up 3 bytes of programming space. It seems “TAB(” is tokenized, then there is the 3 character, followed by the “)” character. I had thought using PRINT@ might save some space, but the “@” takes a byte, then the screen position numbers follow it, and a comma is required. “PRINT@7,” takes the same amount of code space as “PRINTTAB(7)” so no savings there.
The biggest savings is going to come from eliminating the repeated use of the “* characters in the strings. Since the entire image is made up of spaces or asterisks, it could be represented by data that says how many spaces then how many asterisks then how many spaces, etc.
Here is what the image looks like centered to 32-columns:
The first line has 11 spaces, 1 asterisks, 7 spaces, then 1 asterisk.
The second line has 11 spaces, 1 asterisks, 5 spaces, then 2 asterisks.
And so on… The first line could be represented in DATA to tell how many repeating groups of spaces and asterisks there are such as 11,1,7,1,0 (using 0, to indicate the end of the line).
For something this small, I could have just counted up the spaces and asterisks myself, but it’s more fun to spend ten minutes writing a BASIC program to do what would have taken me 60 seconds to do manually.
I converted the pattern in to DATA statements and created a program that would scan each line looking for repeated characters. When the character changes, it prints how many of the previous repeating characters it saw and resets the counter.
Rather than explain how the program works, like I often try to do, I just submitted this code to the OpenAI ChatGPT system and let it do the work for me:
This BASIC program reads in a series of strings, one at a time, and outputs the number of times each character appears in the string.
Here is a brief explanation of the code:
Lines 10-16: This is the main loop of the program, where each string is read in, processed, and output.
Line 11: This line initializes the LC$ variable to the first character of the string and the DC variable to 1.
Lines 12-14: This is a FOR loop that iterates through each character of the string.
Line 13: This line compares the current character with the last character (stored in LC$). If they are the same, it increments the DC variable. If they are different, it outputs the value of DC and then resets DC to 1 and LC$ to the current character.
Line 15: This line outputs the value of DC after the loop has finished processing the string.
Lines 50-220: These are the data lines that contain the strings to be processed.
Line 220: This is the sentinel value that indicates the end of the data.
Note that this code uses a number of built-in BASIC functions, such as LEFT$, MID$, and LEN, which may not be available in all versions of BASIC. You may need to modify the code to use equivalent functionality if it is not available in the version of BASIC you are using.
– chat.openai.com
Seriously, folks. You have to go play with ChatGPT. It’s amazingly fun.
OpenAI ChatGPT Digression
After showing this program to ChatGPT, I asked “is there a way to do it better?” It responded with:
There are a few ways that this routine could be improved. Here are a few suggestions:
Use a DO loop instead of a GOTO statement to control the main loop. This would make the code easier to read and maintain.
Use a SELECT CASE statement to handle the different cases in line 13 instead of an IF statement. This would make the code more concise and easier to read.
Use a DICTIONARY or MAP data structure to store the counts of each character instead of using separate variables for each character. This would make the code more efficient and allow it to handle an arbitrary number of characters without having to hard-code variables for each one.
Add error-checking to the code to handle cases where the input data is invalid (e.g. empty strings, strings with invalid characters, etc.). This would make the code more robust and prevent it from crashing or producing incorrect results in these cases.
Here is an example of how the code could be modified to incorporate some of these changes:
– chat.openai.com
The code it provided was not something that would work on ancient Microsoft BASIC, but it was interesting to see:
DO
READ A$
IF A$="END" THEN EXIT DO
' Initialize the character counts to 0
CHAR_COUNTS = DICTIONARY()
' Loop through each character of the string
FOR i = 1 TO LEN(A$)
ch = MID$(A$, i, 1)
IF ch = " " THEN CONTINUE ' Skip whitespace
' Increment the count for this character
IF CHAR_COUNTS.EXISTS(ch) THEN
CHAR_COUNTS[ch] = CHAR_COUNTS[ch] + 1
ELSE
CHAR_COUNTS[ch] = 1
END IF
NEXT
' Output the counts for each character
FOR ch = "A" TO "Z"
PRINT CHAR_COUNTS[ch]; ",";
NEXT
PRINT "0"
LOOP
50 DATA" * *"
60 DATA" ** **"
70 DATA" *** ***"
80 DATA" **** ****"
90 DATA"*****************"
100 DATA" ***************"
110 DATA" *************"
120 DATA" ***********"
130 DATA" *********"
140 DATA" ***********"
150 DATA" *************"
160 DATA" ***************"
170 DATA"*****************"
180 DATA" **** ****"
190 DATA" *** ***"
200 DATA" ** **"
210 DATA" * *"
220 DATA "END"
I have no idea what version of BASIC that code would work on, and the program it wrote doesn’t really look like it’s doing the same thing, but still neat. In a future article (provided they continue to give us access to this AI), I plan to share some of the interesting things it can do with BASIC programs (including writing them from scratch).
But I digress…
Driving Miss Data
When I run the program, it starts printing out rows of numbers. Since there are seventeen rows of the graphic, it prints out seventeen lines of numbers. I needed to pause the program mid-printing so I can write down the numbers before they scroll off. (A fancier way would have been to have the program write out an ASCII BASIC program with the line numbers and DATA statements already in it which I could load back later. I think I showed how I do that in an earlier article somewhere on this site.)
Now that we have numbers for our DATA statements, a new version of the program could be written to generate the image using them. I will make use of the STRING$ that creates a string of a specific character of a specific length…
0 ' LOGIKER6.BAS
10 CLS
15 CH=32:PRINTTAB(7);
20 READ A:IF A=-1 THEN 220
25 IF A=0 THEN PRINT:GOTO 15
30 PRINT STRING$(A,CH);
35 IF CH=32 THEN CH=42 ELSE CH=32
40 GOTO 20
50 DATA 4,1,7,1,0
60 DATA 4,2,5,2,0
70 DATA 4,3,3,3,0
80 DATA 4,4,1,4,0
90 DATA 17,0
100 DATA 1,15,0
110 DATA 2,13,0
120 DATA 3,11,0
130 DATA 4,9,0
140 DATA 3,11,0
150 DATA 2,13,0
160 DATA 1,15,0
170 DATA 17,0
180 DATA 4,4,1,4,0
190 DATA 4,3,3,3,0
200 DATA 4,2,5,2,0
210 DATA 4,1,7,1,0
215 DATA -1
220 GOTO 220
Obviously those data statements could be combined in to fewer lines, but for this version I wanted them to match the same line number the original PRINT was on. You can easily compare the results:
50 DATA" * *"
50 DATA 4,1,7,1,0
60 DATA" ** **"
60 DATA 4,2,5,2,0
70 DATA" *** ***"
70 DATA 4,3,3,3,0
80 DATA" **** ****"
80 DATA 4,4,1,4,0
Before I show you the results, can you see the flaw in my program?
I’ll give you a hint… Line 170.
Close but no cigar
My program assumes each line starts with a space, so the first value will be printed as spaces, then the next value as asterisks, and so on. This causes a problem when it gets to the row that is entirely the asterisk it reads the first number and prints it as spaces, giving me this incorrect result:
I can think of several ways to solve this:
Use a different value other than 0 for “end-of-line” and make 0 mean “nothing to print, just switch to the astrisk”. That would change line 170 to be “DATA 0,17,X” (where “X” is the new end-of-line marker. This would probably require a new bit of IF logic to handle.
Make each group of data two bytes that specifies the character to print, and how many. Printing 17 asterisks would be “17,42”. Printing four spaces would be “4,32”. This would make the program logic simpler, but would double the size of the data. Depending on how much smaller the logic is, this might be a winner. (And I can think of optimizations to that as well, such as using 0 and 1 for the data to print and just printing “32+X*10” so it prints either 32 (if the value is 0) or 42 (if the value is 1). This is normally how I would have started, but I was trying to make the data as small as possible.
I could just encode the leading spaces at the start of each line rather than using TAB(7). By doing this, every line would start with a space. This would work for this specific challenge, but not be flexible for patterns that don’t start with a space.
For now, let’s make a quick change and try #3 by simply adding 7 to the first number in each DATA statement, and adding a 7 to line 170 which is the row that doesn’t have a space at the start. I think it would look like this:
0 ' LOGIKER7.BAS
10 CLS
15 CH=32
20 READ A:IF A=-1 THEN 220
25 IF A=0 THEN PRINT:GOTO 15
30 PRINT STRING$(A,CH);
35 IF CH=32 THEN CH=42 ELSE CH=32
40 GOTO 20
50 DATA 11,1,7,1,0
60 DATA 11,2,5,2,0
70 DATA 11,3,3,3,0
80 DATA 11,4,1,4,0
90 DATA 7,17,0
100 DATA 8,15,0
110 DATA 9,13,0
120 DATA 10,11,0
130 DATA 11,9,0
140 DATA 10,11,0
150 DATA 9,13,0
160 DATA 8,15,0
170 DATA 7,17,0
180 DATA 11,4,1,4,0
190 DATA 11,3,3,3,0
200 DATA 11,2,5,2,0
210 DATA 11,1,7,1,0
215 DATA -1
220 GOTO 220
Running this program produces the desired results! But, it has a drawback:
The data size grew. Not only did we add “7,” (two bytes) to line 170, but eleven other lines went from a 1-digit value to a 2-digit value. This means our data grew by 13 bytes. If we saved 13 bytes in the decoding routine, this is a win. If we did not, it is not an acceptable fix.
When I load the previous version of the program in to the XRoar emulator and PRINT MEM, it shows 22425 free. When I do the same with this version, I get 22416 — less memory free, so a larger program. This is bad, but the previous version is still missing the code to handle that line 170.
Perhaps, instead of adding 7 to each line to center on the screen, each line could just add 1 (so it doesn’t create two-digit values) and we can use TAB(6). That would look like this:
0 ' LOGIKER8.BAS
10 CLS
15 CH=32:PRINTTAB(6);
20 READ A:IF A=-1 THEN 220
25 IF A=0 THEN PRINT:GOTO 15
30 PRINT STRING$(A,CH);
35 IF CH=32 THEN CH=42 ELSE CH=32
40 GOTO 20
50 DATA 5,1,7,1,0
60 DATA 5,2,5,2,0
70 DATA 5,3,3,3,0
80 DATA 5,4,1,4,0
90 DATA 1,17,0
100 DATA 2,15,0
110 DATA 3,13,0
120 DATA 4,11,0
130 DATA 5,9,0
140 DATA 4,11,0
150 DATA 3,13,0
160 DATA 2,15,0
170 DATA 1,17,0
180 DATA 5,4,1,4,0
190 DATA 5,3,3,3,0
200 DATA 5,2,5,2,0
210 DATA 5,1,7,1,0
215 DATA -1
220 GOTO 220
Doing a PRINT MEM on that one shows 22421, so it is four bytes larger than the original, and still smaller than the “add 7” version. Perhaps that is good enough for now?
Combining all the lines to make a smaller program would look like this:
0 ' LOGIKER9.BAS
10 CLS
15 CH=32:PRINTTAB(6);
20 READ A:IF A=-1 THEN 220 ELSE IF A=0 THEN PRINT:GOTO 15
30 PRINT STRING$(A,CH);:IF CH=32 THEN CH=42 ELSE CH=32
40 GOTO 20
50 DATA 5,1,7,1,0,5,2,5,2,0,5,3,3,3,0,5,4,1,4,0,1,17,0,2,15,0,3,13,0,4,11,0,5,9,0,4,11,0,3,13,0,2,15,0,1,17,0,5,4,1,4,0,5,3,3,3,0,5,2,5,2,0,5,1,7,1,0,-1
220 GOTO 220
Better! But we can make it more better.
In the next installment, we will do something that I learned from studying the Atari 2600’s Adventure program…
Starting with that un-optimized version, I will change it to work on the CoCo 1/2/3’s 32-column screen by adjusting it to be properly centered on that display.
Unfortunately, this design is 17 rows tall, and the CoCo’s standard display is only 16. It won’t fit:
We should still be able to enter the challenge by having the program print this pattern, even if it scrolls off the screen a bit. To get one extra line there, we can get rid of the line feed at the end of the final PRINT statement in line 210 by adding a semi-colon to the end:
210 PRINT" * *";
And so it begins…
And so it begins
The goal is to make this as small as possible. There were many ways to approach last year’s Christmas tree challenge, and you can read about the results and a follow-up with suggestions folks gave to save a byte or two.
A simple thing is to remove the spaces at the front and replace them with the TAB() command:
Although this only looks like it saves a character per line (“TAB(8)” versus “seven spaces”), the code itself will be smaller since the TAB command tokenizes down to one (or maybe two?) bytes.
Also, the ending quote is not needed if it’s the last thing on a line, so they could be removed:
But, each line number consumes 5-bytes on it’s own, so a better way to save space would be to pack the lines together. Each line you eliminate saves five bytes. That would become pretty unreadable though, but let’s do it anyway:
This could still be made better, since the text lines were kept under the input buffer limitation size, but when you enter that line, BASIC compresses it (tokenizes keywords like PRINT, TAB and GOTO) making it take less space. You can then sometimes EDIT the line, Xtend to the end and type a few more characters.
That may or may not be allowed for the Logiker challenge. And since I want to provide code here you could copy and then load in to an emulator, I’ll keep it to the limit of what you could type in.
In the next installment, I’ll see if my brane can figure out a way to generate this code using program logic rather than brute-force PRINT statements.
In my day job, we have a device that needs data sent to it with the bits reversed. For example, if we were sending an 8-bit value of 128, that bit pattern is 10000000. The device expects the high bit first so we’d send it 00000001.
In one system, we do an 8-bit bit reversal using a lookup table. I suppose that one needed it to be really fast.
In another (using a faster PIC24 chip with more RAM, flash and CPU speed), we do it with a simple C routine that was easy to understand.
I suppose this breaks down to four main approaches to take:
Smallest Code Size – for when ROM/flash is at a premium, even if the code is a confusingf mess.
Smallest Memory Usage – for when RAM is at a premium, even if the code is a confusing mess.
Fastest – for when speed is the most important thing, even if the code is a confusing mess.
Clean Code – easiest to understand and maintain, for when you don’t want code to be a confusing mess.
In our system, which is made up of multiple independent boards with their own CPUs and firmware, we do indeed have some places where code size is most important (because we are out of room), and other places where speed is most important.
When I noticed we did it two different ways, I wondered if there might be even more approaches we could consider.
I did a quick search on “fastest way to reverse bits in C” and found a variety of resources, and wanted to point out this fun one:
At that section of this lengthy article are a number of methods to reverse bits. Two of them make use of systems that support 64-bit math and do it with just one line of C code (though I honestly have no understanding of how they work).
Just in case you ever need to do this, I hope this pointer is useful to you.
The C programming language has a few standard library functions that deal with strings, namely strcpy() (string copy) and strcat() (string concatenate).
Microsoft BASIC has similar string manipulation features built in to the language. For example, to copy an 8 character “string” in to a buffer (array of chars):
In BASIC, you do not need to allocate space for individual strings. Color BASIC allows doing whatever you want with a string provided it is 255 characters or less, and provided the total string space is large enough. By default, Color BASIC reserves 200 bytes for string storage. If you wanted to strcpy() “12345678” to a string variable, you would just do:
BUFFER1$="12345678"
Yes, that’s legal, but Color BASIC only recognizes the first two characters of a string name, so in reality, that is just like doing:
BU$="12345678"
If you need more than the default 200 bytes, the CLEAR command will reserve more string storage. For example, “CLEAR 500” or “CLEAR 10000”.
“CLEAR 500” would let you have five 100 character strings, or 500 one character strings.
And, keep in mind, strings stored in PROGRAM MEMORY do not use this space. For example, if you reduced string space to only 9 bytes, then tried to make a 10 byte string direct from BASIC:
CLEAR 9
A$="1234567890"
?OS ERROR
The wonderful “?OS ERROR” (Out of String Space).
BUT, if strings are declared inside PROGRAM data, BASIC references them from within your program instead of string memory:
…you would get a 48 character string (8 characters, added 6 times).
In C, using strcat(buffer, buffer) with the same buffer has a doubling effect each time, just like A$=A$+A$ does in BASIC each time.
And, adding a bunch of strings together like…
A$=A$+A$+A$+A$+A$+A$+A$+A$ '64 characters
…could also be done in three doubling steps like this:
A$=A$+A$:A$=A$+A$:A$=A$+A$ ' 64 characters
Two different ways to concatenate strings together to make a longer string. Which one should we use?
Must … Benchmark …
In my code, I just added my 8 character A$ up 8 times to make a 64 character string. Then I started thinking about it. And here we go…
Using my standard benchmark program, we will declare an 8 character string then add it together 8 times to make a 64 character string. Over and over and over, and time the results.
0 REM strcat1.BAS
5 DIM TE,TM,B,A,TT
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 A$="12345678"
40 A$=A$+A$+A$+A$+A$+A$+A$+A$
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A:END
This produces an average of 1235.
Now we switch to the doubling three times approach:
0 REM strcat2.BAS
5 DIM TE,TM,B,A,TT
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 A$="12345678"
40 A$=A$+A$:A$=A$+A$:A$=A$+A$
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A:END
This drops the time down to 888!
Adding separately three times versus add together eight times is a significant speed improvement.
Based on what I learned when exploring string theory (and being shocked when I realized how MID$, LEFT$ and RIGHT$ worked), I believe every time you do a string add, there is a new string created:
“A$=A$+A$+A$+A$+A$+A$+A$+A$” creates eight strings along the way.
A year or two ago, I ran across some C code at my day that finally got me to do an experiment…
When I was first using a modem to dial in to BBSes, it was strictly a text-only interface. No pictures. No downloads. Just messages. (Heck a physical bulletin board at least would let you put pictures on it! Maybe whoever came up with the term BBS was just forward thinking?)
The first program I ever had that sent a program over the modem was DFT (direct file transfer). It was magic.
Later, I got one that used a protocol known as XMODEM. It seems like warp speed compared to DFT!
XMODEM would send a series of bytes, followed by a checksum of those bytes, then the other end would calculate a checksum over the received bytes and compare. If they matched, it went on to the next series of bytes… If it did not, it would resend those bytes.
Very simple. And, believe it or not, checksums are still being used by modern programmers today, even though newer methods have been created (such as CRC).
Checking the sum…
A checksum is simple the value you get when you add up all the bytes of some data. Checksum values are normally not floating point, so they will be limited to a fixed range. For example, an 8-bit checksum (using one byte) can hold a value of 0 to 255. A 16-bit checksum (2 bytes) can hold a value of 0-65535. Since checksums can be much higher values, especially if using an 8-bit checksum, the value just rolls over.
For example, if the current checksum calculated value is 250 for an 8-bit checksum, and the next byte being counted is a 10, the checksum would be 250+10, but that exceeds what a byte can hold. The value just rolls over, like this:
250 + 10: 251, 252, 253, 254, 255, 0, 1, 2, 3, 4
Thus, the checksum after adding that 10 is now 4.
Here is a simple 8-bit checksum routine for strings in Color BASIC:
0 REM CHKSUM8.BAS
10 INPUT "STRING";A$
20 GOSUB 100
30 PRINT "CHECKSUM IS";CK
40 GOTO 10
100 REM 8-BIT CHECKSUM ON A$
110 CK=0
120 FOR A=1 TO LEN(A$)
130 CK=CK+ASC(MID$(A$,A,1))
140 IF CK>255 THEN CK=CK-255
150 NEXT
160 RETURN
Line 140 is what handles the rollover. If we had a checksum of 250 and the next byte was a 10, it would be 260. That line would detect it, and subtract 255, making it 4. (The value starts at 0.)
The goal of a checksum is to verify data and make sure it hasn’t been corrupted. You send the data and checksum. The received passes the data through a checksum routine, then compares what it calculated with the checksum that was sent with the message. If they do not match, the data has something wrong with it. If they do match, the data is less likely to have something wrong with it.
Double checking the sum.
One of the problems with just adding (summing) up the data bytes is that two swapped bytes would still create the same checksum. For example “HELLO” would have the same checksum as “HLLEO”. Same bytes. Same values added. Same checksum.
A good 8-bit checksum.
However, if one byte got changed, the checksum would catch that.
A bad 8-bit checksum.
It would be quite a coincidence if two data bytes got swapped during transfer, but I still wouldn’t use a checksum on anything where lives were at stake if it processed a bad message because the checksum didn’t catch it ;-)
Another problem is that if the value rolls over, that means a long message or a short message could cause the same checksum. In the case of an 8-bit checksum, and data bytes that range from 0-255, you could have a 255 byte followed by a 1 byte and that would roll over to 0. A checksum of no data would also be 0. Not good.
Checking the sum: Extreme edition
A 16-bit or 32-bit checksum would just be a larger number, reducing how often it could roll over.
For a 16-bit value, ranging from 0-65535, you could hold up to 257 bytes of value 255 before it would roll over:
255 * 257 = 65535
But if the data were 258 bytes of value 255, it would roll over:
255 * 258 = 65790 -> rollover to 255.
Thus, a 258-byte message of all 255s would have the same checksum as a 1-byte message of a 255.
To update the Color BASIC program for 16-bit checksum, change line 140 to be:
140 IF CK>65535 THEN CK=CK-65535
Conclusion
Obviously, an 8-bit checksum is rather useless, but if a checksum is all you can do, at least use a 16-bit checksum. If you were using the checksum for data packets larger than 257 bytes, maybe a 48-bit checksum would be better.
Or just use a CRC. They are much better and catch things like bytes being out of order.
But I have no idea how I’d write one in BASIC.
One more thing…
I almost forgot what prompted me to write this. I found some code that would flag an error if the checksum value was 0. When I first saw that, I thought “but 0 can be a valid checksum!”
For example, if there was enough data bytes that caused the value to roll over from 65535 to 0, that would be a valid checksum. To avoid any large data causing value to add up to 0 and be flagged bad, I added a small check for the 16-bit checksum validation code:
if ((checksum == 0) && (datasize < 258)) // Don't bother doing this.
{
// checksum appears invalid.
}
else if (checksum != dataChecksum)
{
// checksum did not match.
}
else
{
// guess it must be okay, then! Maybe...
}
But, what about a buffer full of 00s? The checksum would also be zero, which would be valid.
Apparently, I had planned to write about it even more than that. I found this unpublished source code. Writing a version in Color BASIC wasn’t enough. Apparently I tried writing it in C#, too:
// 3X+1
using System;
public class Program
{
public static void Main()
{
while (true)
{
Int32 x = 0;
Console.WriteLine();
Console.WriteLine("Enter number:");
x = Int32.Parse(Console.ReadLine());
while (true)
{
Console.Write(x);
Console.Write(" ");
if (x == 1) break;
if ((x & 1) == 1) // Odd
{
x = x * 3 + 1;
}
else // Even
{
x = x / 2;
}
}
}
}
}
So, if you’re in to that kind of thing (C# is available for Windows, Mac OS X and Linux), you can give that a try and tell me how I should have written it.
2022-08-30 – Corrected a major bug in the Get8BitHexStringPtr() routine.
“Here we go again…”
Last week I ran out of ROM space in a work project. For each code addition, I have to do some size optimization elsewhere in the program. Some things I tried actually made the program larger. For example, we have some status bits that get set in two different structures. The code will do it like this:
We have code like that in dozens of places. One of the things I had done earlier was to change that in to a function. This was primarily so I could have common code set fault bits (since each of the four different boards I work with had a different name for its status structures). It was also to reduce the amount of lines in the code and make what they were doing more clear (“clean code”).
During a round of optimizing last week, I noticed that the overhead of calling that function was larger than just doing it manually. I could switch back and save a few bytes every time it was used, but since I still wanted to maintain “clean code”, I decided to make a macro instead of the function. Now I can still do:
setFault (FAULT_BIT);
…but under the hood it’s really doing a macro instead:
…but from looking at the PIC24 assembly code, that’s much larger. I did end up using it in large blocks of code that conditionally decided which fault bit to set, and then I sync the long status at the end. As long as the overhead of “this = that” is less than the overhead of multiple inline instructions it was worth doing.
And keep in mind, this is because I am 100% out of ROM. Saving 4 bytes here, and 20 bytes there means the difference between being able to build or not.
Formatting Output
One of the reasons for the “code bloat” was adding support for an LCD display. The panel, an LCD2004, hooks up to I2C vie a PCF8574 I2C I/O chip. I wrote just the routines needed for the minimal functionality required: Initialize, Clear Screen, Position Cursor, and Write String.
The full libraries (there are many) for Arduino are so large by comparison, so often it makes more sense to spend the time to “roll your own” than port what someone else has already done. (This also means I do not have to worry about any licensing restrictions for using open source code.)
I created a simple function like:
LCDWriteDataString (0, 0, "This is my message.");
The two numbers are the X and Y (or Column and Row) of where to display the text on the 20×4 LCD screen.
But, I was quickly reminded that the PIC architecture doesn’t support passing constant string data due to “reasons”. (Harvard architecture, for those who know.)
To make it work, you had to do something like:
const char *msg = "This is my message";
LCDWriteDataString (0, 0, msg);
…or…
chr buffer[19];
memcpy (buffer, "This is my message");
LCDWriteDataString (0, 0, msg);
…or, using the CCS compiler tools, add this to make the compiler take care of it for you:
#device PASS_STRINGS=IN_RAM
Initially I did that so I could get on with the task at had, but as I ran out of ROM space, I revisited this to see which approach was smaller.
From looking at the assembly generated by the CCS compiler, I could tell that “PASS_STRINGS=IN_RAM” generated quite a bit of extra code. Passing in a constant string pointer was much smaller.
So that’s what I did. And development continued…
Then I ran out of ROM yet again. Since I had some strings that needed formatted output, I was using sprintf(). I knew that sprintf() was large, so I thought I could create my own that only did what I needed:
In my particular example, all I was doing is printing out an 8-bit value as HEX, and printing out a 16-bit value as a decimal number. I did not need any of the other baggage sprintf() was bringing when I started using it.
The above routine maintains a static character buffer of 3 bytes. Two for the HEX digits, and the third for a NIL terminator (0). I chose to do it this way rather than having the user pass in a buffer pointer since the more parameters you pass, the larger the function call gets. The downside is those 3 bytes of variable storage are reserved forever, so if I was also out of RAM, I might rethink this approach.
If you are wondering why I do a strcpy() with a constant string, then use const pointers for strcat(), that is due to a limitation of the compiler I am using. Their implementation of strcpy() specifically supports string constants. Their implementation of strcat() does NOT, requiring me to jump through more hoops to make this work.
Even with all that extra code, it still ends up being smaller than linking in sprintf().
And, for printing out a 16-bit value in decimal, I am sure there is a clever way to do that, but this is what I did:
Since I know the value is limited to what 16-bits can old, I know the max value possible is 65535.
I initialize my five-digit string with “00000”. I start with a temporary value of 10000. If the users value is larger than that, I decrement it by that amount and increase the first digit in the string (so “0” goes to “1”). I repeat until the user value has been decremented to be less than 10000.
Then I divide that temporary value by 10, so 10000 becomes 1000. I move my position to the next character in the output string and the process repeats.
Eventually I’ve subtracted all the 10000s, 1000s, 100s, 10s and 1s that I can, leaving me with a string of five digits (“00000” to “65535”).
I am sure there is a better way, and I am open to it if it generates SMALLER code. :)
And that’s my tale of today… I needed some extra ROM space, so I got rid of sprintf() and rolled my own routines for the two specific types of output I needed.
But this is barely scratching the surface of the things I’ve been doing this week to save a few bytes here or there. I’d like to revisit this subject in the future.