A year or two ago, I ran across some C code at my day job that finally got me to do an experiment…
When I was first using a modem to dial in to BBSes, it was strictly a text-only interface. No pictures. No downloads. Just messages. (Heck, a physical bulletin board at least would let you put pictures on it! Maybe whoever came up with the term BBS was just forward thinking?)
The first program I ever had that sent a program over the modem was DFT (direct file transfer). It was magic.
Later, I got one that used a protocol known as XMODEM. It seemed like warp speed compared to DFT!
XMODEM would send a series of bytes, followed by a checksum of those bytes, then the other end would calculate a checksum over the received bytes and compare. If they matched, it went on to the next series of bytes… If they did not, the sender would resend those bytes.
Very simple. And, believe it or not, checksums are still being used by modern programmers today, even though newer methods have been created (such as CRC).
Checking the sum…
A checksum is simply the value you get when you add up all the bytes of some data. Checksum values are stored as fixed-size integers, so they are limited to a fixed range. For example, an 8-bit checksum (using one byte) can hold a value of 0 to 255. A 16-bit checksum (2 bytes) can hold a value of 0-65535. Since the sum of the bytes can easily exceed that range, especially with an 8-bit checksum, the value just rolls over.
For example, if the current checksum calculated value is 250 for an 8-bit checksum, and the next byte being counted is a 10, the checksum would be 250+10, but that exceeds what a byte can hold. The value just rolls over, like this:
250 + 10: 251, 252, 253, 254, 255, 0, 1, 2, 3, 4
Thus, the checksum after adding that 10 is now 4.
Here is a simple 8-bit checksum routine for strings in Color BASIC:
0 REM CHKSUM8.BAS
10 INPUT "STRING";A$
20 GOSUB 100
30 PRINT "CHECKSUM IS";CK
40 GOTO 10
100 REM 8-BIT CHECKSUM ON A$
110 CK=0
120 FOR A=1 TO LEN(A$)
130 CK=CK+ASC(MID$(A$,A,1))
140 IF CK>255 THEN CK=CK-256
150 NEXT
160 RETURN
Line 140 is what handles the rollover. If we had a checksum of 250 and the next byte was a 10, it would be 260. That line would detect it, and subtract 256, making it 4. (The value starts at 0, so a byte has 256 possible values and the amount to subtract is 256, not 255.)
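For comparison, here is a rough C sketch of the same 8-bit routine. The function name checksum8 is just something I made up for illustration. In C, an unsigned 8-bit variable wraps around modulo 256 on its own, so there is no need for an explicit rollover test like the one in the BASIC version.

```c
#include <stddef.h>
#include <stdint.h>

/* Add up every byte. The uint8_t sum wraps modulo 256 automatically,
 * so 250 + 10 ends up as 4 without any extra rollover check. */
uint8_t checksum8(const unsigned char *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return sum;
}
```

Feeding it the two bytes 250 and 10 gives 4, matching the rollover example above.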
The goal of a checksum is to verify data and make sure it hasn’t been corrupted. You send the data and checksum. The receiver passes the data through a checksum routine, then compares what it calculated with the checksum that was sent with the message. If they do not match, the data has something wrong with it. If they do match, the data is less likely to have something wrong with it.
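The send-and-compare idea could be sketched in C something like this. The packet layout (data bytes followed by one checksum byte) and the names build_packet and verify_packet are my own assumptions for illustration, not how any particular protocol does it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simple 8-bit additive checksum; the sum wraps modulo 256. */
uint8_t checksum8(const unsigned char *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return sum;
}

/* Sender (hypothetical layout): copy the data, then append its checksum.
 * 'out' must have room for len + 1 bytes. Returns the packet size. */
size_t build_packet(const unsigned char *data, size_t len, unsigned char *out)
{
    memcpy(out, data, len);
    out[len] = checksum8(data, len);
    return len + 1;
}

/* Receiver: recompute the checksum over the data bytes and compare it
 * with the checksum byte that arrived at the end of the packet. */
bool verify_packet(const unsigned char *packet, size_t packet_len)
{
    if (packet_len == 0) {
        return false; /* no room for a checksum byte */
    }
    return checksum8(packet, packet_len - 1) == packet[packet_len - 1];
}
```

A match only means nothing was detected. It does not prove the data is good.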
Double checking the sum.
One of the problems with just adding (summing) up the data bytes is that two swapped bytes would still create the same checksum. For example “HELLO” would have the same checksum as “HLLEO”. Same bytes. Same values added. Same checksum.

However, if one byte got changed, the checksum would catch that.

It would be quite a coincidence if two data bytes got swapped during transfer, but I still wouldn’t use a checksum on anything where lives were at stake if it processed a bad message because the checksum didn’t catch it ;-)
Another problem is that if the value rolls over, that means a long message or a short message could cause the same checksum. In the case of an 8-bit checksum, and data bytes that range from 0-255, you could have a 255 byte followed by a 1 byte and that would roll over to 0. A checksum of no data would also be 0. Not good.
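Both weaknesses are easy to show with a few lines of C. This is only a sketch, and checksums_collide is a made-up helper name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simple 8-bit additive checksum; the sum wraps modulo 256. */
uint8_t checksum8(const unsigned char *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return sum;
}

/* True when the two buffers produce the same 8-bit checksum. */
bool checksums_collide(const unsigned char *a, size_t a_len,
                       const unsigned char *b, size_t b_len)
{
    return checksum8(a, a_len) == checksum8(b, b_len);
}
```

With this, "HELLO" and "HLLEO" collide (same bytes, different order), while "HELLO" and "HELLP" do not (one byte changed). The two bytes 255 and 1 collide with an empty buffer, since both come out as 0.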

Checking the sum: Extreme edition
A 16-bit or 32-bit checksum would just be a larger number, reducing how often it could roll over.
For a 16-bit value, ranging from 0-65535, you could hold up to 257 bytes of value 255 before it would roll over:
255 * 257 = 65535
But if the data were 258 bytes of value 255, it would roll over:
255 * 258 = 65790 -> rollover to 254 (65790 - 65536).
Thus, a 258-byte message of all 255s would have the same checksum as a 1-byte message of a 254.
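A 16-bit version in C, again just a sketch with a made-up name (checksum16), makes the arithmetic easy to check. The uint16_t sum wraps modulo 65536.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit additive checksum: the sum wraps modulo 65536. */
uint16_t checksum16(const unsigned char *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}
```

Running it over 257 bytes of 255 returns 65535. Running it over 258 bytes of 255 returns 254, because 65790 - 65536 = 254.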
To update the Color BASIC program for 16-bit checksum, change line 140 to be:
140 IF CK>65535 THEN CK=CK-65536
Conclusion
Obviously, an 8-bit checksum is rather useless, but if a checksum is all you can do, at least use a 16-bit checksum. If you were using the checksum for data packets larger than 257 bytes, maybe a 32-bit checksum would be better.
Or just use a CRC. They are much better and catch things like bytes being out of order.
But I have no idea how I’d write one in BASIC.
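In C, though, a basic CRC is not much code. Here is a minimal bit-at-a-time sketch of a 16-bit CRC using polynomial 0x1021 with a starting value of 0, the variant commonly called CRC-16/XMODEM. The function name is my own, and real code would often use a lookup table for speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 with polynomial 0x1021 and initial value 0
 * (commonly known as CRC-16/XMODEM). Processes one bit at a time. */
uint16_t crc16_xmodem(const unsigned char *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        /* Bring the next byte into the top of the 16-bit register. */
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

The standard test string "123456789" should come out as 0x31C3. Unlike the simple sum, "HELLO" and "HLLEO" produce different values here.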
One more thing…
I almost forgot what prompted me to write this. I found some code that would flag an error if the checksum value was 0. When I first saw that, I thought “but 0 can be a valid checksum!”
For example, if there were enough data bytes to cause the value to roll over from 65535 to 0, that would be a valid checksum. To avoid flagging large data whose bytes happen to add up to 0, I added a small check to the 16-bit checksum validation code:
if ((checksum == 0) && (datasize < 258)) // Don't bother doing this.
{
// checksum appears invalid.
}
else if (checksum != dataChecksum)
{
// checksum did not match.
}
else
{
// guess it must be okay, then! Maybe...
}
But, what about a buffer full of 00s? The checksum would also be zero, which would be valid.
Conclusion: Don’t error check for a 0 checksum.
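A validation routine without the special case might look something like this sketch. The names checksum16 and checksum_matches are made up for illustration. The only test is whether the calculated value matches the one that was sent, and 0 is treated like any other value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit additive checksum: the sum wraps modulo 65536. */
uint16_t checksum16(const unsigned char *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* No special case for 0: a mismatch is the only error condition. */
bool checksum_matches(const unsigned char *data, size_t datasize,
                      uint16_t checksum)
{
    return checksum16(data, datasize) == checksum;
}
```

A buffer full of 00s with a checksum of 0 passes, as it should. Change any one byte and it fails.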
Better yet, use something better than a checksum…
Until next time…


