Category Archives: Uncategorized

Color BASIC overflow bug – same as Commodore’s?

I just saw a tweet from Robin @ 8-Bit Show And Tell concerning a bug in Commodore BASIC that existed in the PET, C64 and VIC-20.

VAL() takes a string and converts it in to a floating point numerical variable. The value of “1E39” is a number in scientific notation, and this appears to cause a problem.

In Microsoft BASIC, the notation “1E39” represents the number 1 multiplied by 10 raised to the power of 39. This is also known as scientific notation, where the “E” indicates the exponent to which the base (10 in this case) is raised. So, “1E39” is equal to 1 * 10^39, which is an extremely large number:

1E39 = 1 * 10^39 = 1000000000000000000000000000000000000000

This number has 39 zeros after the 1, making it a very large value.

– ChatGPT

I was curious to see what the CoCo’s Color BASIC did, so I tried it…

It appears that port of BASIC to the 6809 also ported over this bug. Anyone want to take a look at the source and see what the issue is?

…to be continued, maybe…

Color BASIC program line info dump program

These days, I feel like I am regularly saying “I’ve learned more this week about X than I learned in Y years of using it back in the 1980s!”.

This is another one of those.

Each line of a Color BASIC program is tokenized (changing keywords like PRINT to a one or two byte token representing them) and then stored as follows:

  • 2-Bytes – Address in memory where next line starts
  • 2-Bytes – Line number (0-63999)
  • n-Bytes – Tokenized program line.
  • 1-Byte – Zero (0), indicating the end of the line

The four byte header and the 1 byte zero terminator mean that each line has an overhead of 5-bytes. You can see this by printing free memory and then adding a line that has a one byte token, such as “REM” or “PRINT”:

Above, you see the amount of memory decreases by 6 bytes after adding a line. That’s five bytes for the overhead, and one byte for the “REM” token.

The BASIC program starts in memory at a location stored in memory locations 25 and 26. You can see this by typing:

PRINT PEEK(25)*256+PEEK(27)

There are other such addresses that point to where variables start (directly after the program), and where string memory is. Here is an example program from an earlier article I wrote that shows them all. (The comments explain what each location is.)

0 ' BASINFO3.BAS

10 ' START OF BASIC PROGRAM
20 ' PEEK(25)*256+PEEK(26)

30 ' START OF VARIABLES
40 ' PEEK(27)*256+PEEK(28)

50 ' START OF ARRAYS
60 ' PEEK(29)*256+PEEK(30)

70 ' END OF ARRAYS (+1)
80 ' PEEK(31)*256+PEEK(32)

90 ' START OF STRING STORAGE
100 ' PEEK(33)*256+PEEK(34)

110 ' START OF STRING VARIABLES
120 ' PEEK(35)*256+PEEK(36)

130 ' TOP OF STRING SPACE/MEMSIZ
140 ' PEEK(39)*256+PEEK(40)

150 ' USING NO VARIABLES
160 PRINT "PROG  SIZE";(PEEK(27)*256+PEEK(28))-(PEEK(25)*256+PEEK(26)),;
170 PRINT "STR SPACE";(PEEK(39)*256+PEEK(40))-(PEEK(33)*256+PEEK(34))
180 PRINT "ARRAY SIZE";(PEEK(31)*256+PEEK(32))-(PEEK(29)*256+PEEK(30)),;
190 PRINT " STR USED";(PEEK(39)*256+PEEK(40))-(PEEK(35)*256+PEEK(36))
200 PRINT " VARS SIZE";(PEEK(29)*256+PEEK(30))-(PEEK(27)*256+PEEK(28)),;
210 PRINT " FREE MEM";(PEEK(33)*256+PEEK(34))-(PEEK(31)*256+PEEK(32))

I thought it might be interesting to write a BASIC program that displays information on each line of the BASIC program. That information would include:

  • Start address of the line
  • Address of the next line
  • Line number of the line

Here is what I came up with. It can use generic PRINT in lines 40 and 70 (for Color BASIC) or a nicer formatted PRINT USING (for Extended Color BASIC) in lines 50 an 80.

0 'BASINFO.BAS
1 REM BASINFO.BAS
2 REMBASINFO.BAS
10 PRINT " ADDR NADDR LINE# SIZ"
20 L=PEEK(25)*256+PEEK(26)
30 NL=PEEK(L)*256+PEEK(L+1)
40 'PRINT L;NL;
50 PRINT USING"##### #####";L;NL;
60 IF NL=0 THEN END
70 'PRINT PEEK(L+2)*256+PEEK(L+3);NL-L
80 PRINT USING" ##### ###";PEEK(L+2)*256+PEEK(L+3);NL-L
90 L=NL:GOTO 30

For this program, as shown, running on a virtual 32K Extended Color BASIC CoCo in the XRoar emulator, I see:

The first column (ADDR) is the address of the BASIC line in memory. After that is the address of where the next line begins (NADDR), and it will match the address shown at the start of the following line. The third column is the line number (LINE#), and last is the size of the line (SIZ) which includes the tokenized line AND the terminating zero byte at the end of it.

The final line has a “next address” of zero, indicating the end of the file.

At the start of the program I included three comments:

0 'BASINFO.BAS
1 REM BASINFO.BAS
2 REMBASINFO.BAS

In the output of the program, you see them described as:

 ADDR NADDR LINE# SIZ
 9729  9747     0  18  <- [0 'BASINFO.BAS]
 9747  9765     1  18  <- [1 REM BASINFO.BAS]
 9765  9782     2  17  <- [2 REMBASINFO.BAS]

You can see that the length of lines 0 and 1 are both 18, even though one looks like it should be shorter. In this case, the apostrophe (‘) abbreviation for REM seems to take as much space as “REM ” (with a space after it). This is because the apostrophe is encoded as a “:REM” (colon then REM). Alex Evans recently reminded me of this. This behavior would allow you to use it at the end of a line like this:

10 LINE INPUT A$'ASK FOR USERNAME

…instead of having to do:

10 LINE INPUT A$:REM ASK FOR USERNAME

But don’t do either! REMs at the end of the line can be the worst place to have REMs, since BASIC will have to scan past them to get to the next line, even if they are after a GOTO. This makes them slower. (Reminder to self: do an article on this since I’ve learned more since I original covered the topic in one of my Benchmarking BASIC articles…)

But I digress…

If you wanted to run this on your own program, you could do so by making this routine load at a high line of BASIC (higher than any lines you might be using), then you could save it as ASCII (SAVE”BASINFO”,A) and then use MERGE”BASINFO” (from disk) to bring those lines in to your program.

63000 PRINT " ADDR NADDR LINE# SIZ":L=PEEK(25)*256+PEEK(26)
63001 NL=PEEK(L)*256+PEEK(L+1):PRINT USING"##### #####";L;NL;:IF NL=0 THEN END ELSE PRINT USING" ##### ###";PEEK(L+2)*256+PEEK(L+3);NL-L:L=NL:GOTO 63001

Now you could do RUN 63000 to see what your program looks like. (The highest line number Color BASIC allows is 63999 so you could change that to 63998 and 63999 if you wanted absolutely the most line numbers available for your program ;-)

You could also add “IF L=63000 THEN END” somewhere and have it stop when it hits that routine.

What use is this?

For an upcoming article, I expect to use a version of this code to “prove” something as it relates to BASIC and the length of lines.

But, it might also be fun to generate some statistics — longest line, shortest line, a graph of the different line lengths, etc.

Until next time…

CoCo 3 and the 1988 Houston Boat Show

I graduated high school in 1987. Even though the CoCo 3 had come out the year before, I had remained with my CoCo 2. Sub-Etha Software co-founder, Terry, got his CoCo 3 first. I remember him asking me questions that I could not answer because he had lots of new features I never had seen.

By 1988 I had my own CoCo 3. I don’t recall when I got it, but it had to be in 1987 since I was writing CoCo 3 programs in January. One such program (or programs) was to display video titles. My father was producing a video which would be running at the Callaway Boatworks booth at the 1988 Houston Boat Show.

A few years ago, when I was going through 400+ floppy disks to archive them to a CoCoSDC, I found this disk but it had sector errors. While I could RUN some of the programs, many would not load due to disk errors.

Since then, I discovered a hard drive copy of the disk I had made to my KenTon RGB-DOS drive system. This image was intact! I wanted to go through the titles “some day” and see what all I had done.

“Some day” happened last week. I used the toolshed “decb” utility to pull each BASIC program off the disk image and convert it to ASCII. I then looked through all of them in Visual Studio Code on my Mac. Certain programs would daisy-chain to other programs, using a RUN”NEXTPROG” command at the end. Some paused for a key (at a black screen) before drawing the titles. The BREAK key (and CoCo 3 ON BRK command) was used to skip to the next program (why did I do it that way?).

I was able to come up with a list of two segments of daisy-chained titles, and then the rest were just one-off titles on their own. I recorded the two sequences, and all the separate images, and posted the video to YouTube:

1988 Houston Boat Show graphics done on a CoCo 3 in BASIC.

Some internet searching shows that Callaway Boatworks no longer exists. A few others in the video have since disappeared from the market, but the Houston Boat Show continues to this day.

I wrote them to see if they could provide a vendor list from 1988. I did not expect a response, but got one! They sent me a scan of the exhibitors from that year’s show, and I can now locate the two spots that Callaway Boatworks had that year.

A huge thank you to Lynette M at the boat show for taking time to get me this information. My father passed away a few years ago, so I did not have him to ask about these things.

More to come…

Color BASIC Attract Screen – part 5

See also: part 1, part 2, part 3, part 4, unrelated, and part 5.

In part 4 of this series, Jason Pittman provided several variations of creating the attract screen:

Jason Pittman variation #1

If those four corners bother you, then my attempt will really kick in that OCD when you notice how wonky the colors are moving…

Jason Pittman
10 CLS0:C=143:PRINT@268,"ATTRACT!";
20 FOR ZZ=0TO1STEP0:FORX=0TO15:POKEX+1024,C:POKEX+1040,C:POKE1535-X,C:POKE1519-X,C:POKE1055+(32*X),C:POKE1472-(32*X),C:GOSUB50:NEXT:GOSUB50:NEXT
50 C=C+16:IF C>255 THEN C=143
60 RETURN

Jason Pittman variation #2

Also, another option using the substrings might be to fill the sides by printing two-character strings on the 32nd column so that a character spills over to the first column of the next line:

Jason Pittman
10 CLS 0:C=143:OF=1:CH$=""
20 FOR X=0TO40:CH$=CH$+CHR$(C):GOSUB 90:NEXT
30 FOR ST=0TO1STEP0
40 PRINT@0,MID$(CH$,OF,31):GOSUB 120
50 FORX=31TO480STEP32:PRINT@X,MID$(CH$,OF,2);:GOSUB 120:NEXT
60 PRINT@481,MID$(CH$,OF,30);:GOSUB120
70 NEXT
80 REM ADVANCE COLOR
90 C=C+16:IF C>255 THEN C=143
100 RETURN
110 REM ADVANCE OFFSET
120 OF=OF+2:IF OF>7 THEN OF=OF-8
130 RETURN

Jason Pittman variation #3

One more try at O.C.D-compliant “fast”:

Jason Pittman
10 DIM CL(24):FORX=0TO7:CL(X)=143+(X*16):CL(X+8)=CL(X):CL(X+16)=CL(X):NEXT
20 CLS0:FORXX=0TO1STEP0:FORYY=0TO7:FORZZ=1TO12:READPO,CT,ST,SR:FOR X=SRTOSR+CT-1:PO=PO+ST:POKE PO,CL(X+YY):NEXT:NEXT:RESTORE:NEXT:NEXT
180 REM POSITION,COUNT,STEP,START
190 DATA 1024,8,1,0,1032,8,1,0,1040,8,1,0,1048,6,1,0,1055,8,32,6,1311,6,32,6,1535,8,-1,4,1527,8,-1,4,1519,8,-1,4,1511,6,-1,4,1504,8,-32,2,1248,6,-32,2

The #3 variation using DATA statements is my favorite due to its speed. Great work!

The need for speed: Some assembly required.

It seems clear that even the fastest BASIC tricks presented so far are still not as fast as an attract screen really needs to be. When this happens, assembly code is the solution. There are also at least two C compilers for Color BASIC that I need to explore, since writing stuff in C would be much easier for me than 6809 assembly.

Shortly after part 4, I put out a plea for help with some assembly code that would rotate graphical color blocks on the 32 column screen. William “Lost Wizard” Astle answered that plea, so I’ll present the updated routine in his LWASM 6809 compiler format instead of as an EDTASM+ screen shot in the original article.

* lwasm attract32.asm -fbasic -oattract32.bas --map

    org $3f00

start
    ldx #1024   X points to top left of 32-col screen
loop
    lda ,x+     load A with what X points to and inc X
    bpl skip    if not >128, skip
    adda #16    add 16, changing to next color
    ora #$80    make sure high gfx bit is set
    sta -1,x    save at X-1
skip
    cmpx #1536  compare X with last byte of screen
    bne loop    if not there, repeat
    sync        wait for screen sync
    rts         done

    END

The code will scan all 512 bytes of the 32-column screen, and any byte that has the high bit set (indicating it is a graphics character) will be incremented to the next color. This would allow us to draw our attract screen border one time, then let assembly cycle through the colors.

How it works:

  • The X register is loaded with the address of the top left 32-column screen.
  • The A register is loaded with the byte that X points to, then X is incremented.
  • BPL is used to skip any bytes that do not have the high bit set. This optimization was suggested by William. An 8-bit value can be treated as an unsigned value from 0-255, or as a signed value of -127 to 128. A signed byte uses the high bit to indicate a negative. Thus, a positive number would not have the high bit set (and is therefor not in the 128-255 graphics character range).
  • If the high bit was set, then 16 is added to A.
  • ORA is used to set the high bit, in case it was in the final color range (240-255) and had 16 added to it, turning it in to a non-graphics block. Setting the high bit changes it from 0-16 to 128-143.
  • The modified value is stored back at one byte before where X now points. (This was another William optimization, since originally I was not incrementing X until after the store, using an additional instruction to do that.)
  • Finally, we compare X to see if it has passed the end of screen memory.
  • If it hasn’t, we do it all again.
  • Finally, we have a SYNC instruction, that waits for the next screen interrupt. This is not really necessary, but it prevents flickering of the screen if the routine is being called too fast. (I’m not 100% sure if this should be here, or at the start of the code.)

The LWASM compiler has an option to generate a BASIC program full of DATA statements containing the machine code. You can then type that program in and RUN it to get this routine in memory. The command line to do this is in the first comment of the source code above.

10 READ A,B
20 IF A=-1 THEN 70
30 FOR C = A TO B
40 READ D:POKE C,D
50 NEXT C
60 GOTO 10
70 END
80 DATA 16128,16147,142,4,0,166,128,42,6,139,16,138,128,167,31,140,6,0,38,241,19,57,-1,-1

The program loads at $3f00 (16128), meaning it would only work on a 16K+ system. There is no requirement for that much memory, and it could be loaded anywhere else (even on a 4K system). The machine code itself is only 20 bytes. Since the code was written to be position independent (using relate branch instructions instead of hard-coded jump instructions), you could change where it loads just by altering the first two numbers in the DATA statement (start address, end address).

For instance, on a 4K CoCo, memory is from 0 to 4095. Since the assembly code only uses 20 bytes, one could load it at 4076, and use CLEAR 200,4076 to make sure BASIC doesn’t try to overwrite it. However, I found that the SYNC instruction hangs the 4K CoCo, at least in the emulator I am using, so to run on a 4K system you would have to remove that.

Here is the BASIC program modified for 4K. I added a CLEAR to protect the code from being overwritten by BASIC, changed the start and end addresses in the data statements, and altered the SYNC code to be an RTS (changing SYNC code of 19 to a 57, which I knew was an RTS because it was the last byte of the program in the DATA statements). This means it is wasting a byte, but here it is:

5 CLEAR 200,4076
10 READ A,B
20 IF A=-1 THEN 70
30 FOR C = A TO B
40 READ D:POKE C,D
50 NEXT C
60 GOTO 10
70 END
80 DATA 4076,4095,142,4,0,166,128,42,6,139,16,138,128,167,31,140,6,0,38,241,57,57,-1,-1

Using the code

Lastly, here is an example that uses this routine. I’ll use the BASIC loader for the 32K version, then add Jason’s variation #1 to it, modified by renaming it to start at line 100, and removing the outer infinite FOR Z loop so it only draws once. I’ll then add a GOTO loop that just executes this assembly routine over and over.

5 CLEAR 200,16128
10 READ A,B
20 IF A=-1 THEN 70
30 FOR C = A TO B
40 READ D:POKE C,D
50 NEXT C
60 GOTO 10
70 GOTO 100
80 DATA 16128,16147,142,4,0,166,128,42,6,139,16,138,128,167,31,140,6,0,38,241,19,57,-1,-1
100 CLS0:C=143:PRINT@268,"ATTRACT!";
120 FORX=0TO15:POKEX+1024,C:POKEX+1040,C:POKE1535-X,C:POKE1519-X,C:POKE1055+(32*X),C:POKE1472-(32*X),C:GOSUB150:NEXT:GOSUB150
130 EXEC 16128:GOTO 130
150 C=C+16:IF C>255 THEN C=143
160 RETURN

And there you have it! An attract screen for BASIC that uses assembly so it’s really not a BASIC attract screen at all except for the code that draws it initially using BASIC.

I think that about covers it. And, this routine also looks cool on normal 32-column VDG graphics screens, too, causing the colors to flash as if there is palette switching in use. (You can actually palette switch the 32-column screen colors on a CoCo 3.)

Addendum: WAR by James Garon

On 7/2/2022, Robert Gault posted to the CoCo list a message titled “Special coding in WAR“. He mentioned some embedded data inside this BASIC program. You can download it as a cassette or disk image here:

https://colorcomputerarchive.com/search?q=War+%28Tandy%29&ww=1

You can even go to that link and click “Play Now” to see the game in action.

I found this particularly interesting because this BASIC program starts with one of the classic CoCo attract screens this article series is about. In the program, the author did two tricks: One was to embed graphics characters in a PRINT statement, and the other was to embedded a short assembly language routine in a string that would cycle through the screen colors, just like my approach! I feel my idea has been validated, since it was already used by this game in 1982. See it in action:

https://colorcomputerarchive.com/test/xroar-online/?machine=cocous&basic=RUN%22WAR%22%5cr&cart=rsdos&disk0=/unzip%3Ffile%3DDisks/Games/War%20(Tandy).zip/WAR.DSK

And if you are curious, the code in question starts at line 60000. I did a reply about this on the CoCo mailing list as I dug in to what it is doing. That sounds like it might make a part 6 of this series…

Until next time…

MacInvaders09 by Jamie Cho

In 1994, Sub-Etha Software released Invaders09 – a Space Invaders-style game for the CoCO 3.

The game was written in 6809 assembly, and ran under Microware OS-9 Level 2.

Jamie Cho took on the task of porting the game to the MM/1, a “CoCo 4” system that ran OS-9/68000.

Years later, he ported the MM/1 version to run on classic MacOS (on the original 68000 series of processors).

This led to the game being re-ported to the Mac OS for PowerPC, then Intel x86, and finally Apple’s ARM-based M1 series of processors.

Invaders 09 was originally written in 6809 assembly language for the Tandy Color Computer 3 running OS-9 Level 2. I ported it to C on the MM/1 running OS-9 68K sometime around 1995 or so and eventually to the Mac in the early 2000s or so. This means the game has successfully run more or less natively on 5 different platforms – the 6809, 68K, PowerPC, x86 and ARM. This also explains the kind of weird way the bitmaps are drawn to the screen…

– Jamie Cho on his GitHub page.

Downloading MacInvaders09

You can download the source code for the current release from his GitHub:

https://github.com/jamieleecho/MacInvaders09

If you just want to download the binary and play it, he has that available here:

https://github.com/jamieleecho/MacInvaders09/releases/tag/1.0.6

(Note, this leads to the current 1.0.6 build. If that link doesn’t work, check his main page for a later release.)

Running MacInvaders09

After opening the archive file, you will see the “MacInvaders09.app” application. If you try to run it, you will get this warning:

Click OK on that box to dismiss it. Go in to “System Preferences”, then in to the “Security & Privacy” section. It will look like this:

You can then click “Open Anyway” to allow this program to run. You should then see the same warning box, but now you can “Open” the program to run it.

Playing MacInvaders09

The game will open in a tiny size, matching its 1994 CoCo 3 release:

You can go full screen if you actually want to see it on a modern sized monitor:

Differences from the CoCo 3 original

Jamie’s port is more of an update/rewrite than a straight port. Trying to port 6809 assembly to C doesn’t make a lot of sense. Instead, the game graphics were brought over, and likely some of the logic. There are some significant differences:

  1. The laser shots were enhanced to vertical lines instead of just dots. Nice.
  2. There are new sound effects added. Also nice.
  3. The Power level seems to be missing. (I’m not sure if the game increases the number of simultaneous shots you get as levels progress.)
  4. There is no joystick support.
  5. As far as I know, the undocumented “cheat mode” is not implemented, nor is the “zoop mode” which made the game play too fast on the CoCo 3.
  6. The Invader graphics are upside down from the original. You will notice they have some dots at the top of them. In the CoCo 3 original, I looked for a “hit” by seeing is my bullet dot encountered a non-empty screen byte. I put those dots there just to give it a target at the bottom of the invader. I didn’t like the way they looked, but I didn’t know a better way to do it at the time.
  7. The UFO in the original would always drop a bomb the moment it was above the player, forcing the player to NEVER sit still as the UFO passes. In this version, the UFO seems to drop bombs in a more random pattern.
  8. The UFO bombs can be stopped by the shields. In the original, the UFO bomb would go THROUGH the shields, forcing the player to not hide under shields. I always through it was unfair that the player could just sit in one spot and nothing could hurt them there.
  9. The player is allowed to move while shooting. In the original, any time you fired the laser, your movement was stopped. This prevented the player from doing “run and shoot” moves, making it a bit harder since you couldn’t take a pot shot as you scooted across the screen.
  10. The damage taken to the shields is different. In the original, the blocks kind of pushed outwards, with the first hit taking a small dent out of the shield, and another hit in that location making the dent wider. This version takes longer bits out of the shield, matching the longer laser bolts.
  11. I think the Invaders may move down at a slower rate. It seems much easier to clear them all out and have one left that is still very high, but I’d have to play them side by side to really compare that.

I am quite impressed and honored that Jamie has taken time to do these ports. I can’t express how thrilled I was the first time I saw this on an MM/1, and even more so on an Apple Macintosh.

Thank you, Jamie Cho, for your efforts to make defending the Earth a cross-platform activity!

CoCo and 16-bits

When dealing with bits in Color BASIC, we have AND, OR and NOT. Unfortunately, we can really only use these on values 15-bits or less. For example, here is a table represent various 8-bit values in the range of 0-255:

Dec    Hex   Binary
-----  ----  --------
    0    00  00000000
    1    01  00000001
    2    02  00000010
    4    04  00000100
    8    08  00001000
   16    10  00010000
   32    20  00100000
   64    40  01000000
  128    80  10000000
  255    FF  11111111

We have no problem using 8-bit values with standard Color BASIC. Here is my routine that will print out the bits of any 8-bit value:

0 REM 8BITS.BAS
10 DIM BT(7):FOR BT=0 TO 7:BT(BT)=2^BT:NEXT
20 INPUT "VALUE     ";Z
30 GOSUB 500:GOTO 20
500 REM SHOW Z AS BINARY
510 FOR BT=7 TO 0 STEP-1
520 IF Z AND BT(BT) THEN PRINT "1"; ELSE PRINT "0";
530 NEXT
540 PRINT Z:RETURN

Here is a program using that routine that will print out a similar table:

0 REM 8BITTABL.BAS
10 DIM BT(7):FOR BT=0 TO 7:BT(BT)=INT(2^BT):NEXT
20 PRINT "DEC    HEX   BINARY"
30 PRINT "-----  ----  --------"
40 FOR I=0 TO 7:Z=INT(2^I)
50 GOSUB 100
60 NEXT
70 Z=255:GOSUB 100
80 END

100 REM PRINT TABLE ENTRY
110 PRINT USING"#####    ";Z;
120 IF Z<&H10 THEN PRINT "0";
130 PRINT HEX$(Z);"  ";
140 GOSUB 500
150 RETURN

500 REM SHOW Z AS BINARY
510 FOR BT=7 TO 0 STEP-1
520 IF Z AND BT(BT) THEN PRINT "1"; ELSE PRINT "0";
530 NEXT
540 PRINT:RETURN

When I started experimenting with bits like this, I tried to modify my routine to work with 16-bit values. It did not work:

0 REM 8BITTABL.BAS - DOES NOT WORK!
10 DIM BT(15):FOR BT=0 TO 15:BT(BT)=INT(2^BT):NEXT
20 PRINT "DEC    HEX   BINARY"
30 PRINT "-----  ----  ----------------"
40 FOR I=0 TO 15:Z=INT(2^I)
50 GOSUB 100
60 NEXT
70 Z=255:GOSUB 100
80 END

100 REM PRINT TABLE ENTRY
110 PRINT USING"#####  ";Z;
120 IF Z<&H10 THEN PRINT "0";
121 IF Z<&H100 THEN PRINT "0";
122 IF Z<&H1000 THEN PRINT "0";
130 PRINT HEX$(Z);"  ";
140 GOSUB 500
150 RETURN

500 REM SHOW Z AS BINARY
510 FOR BT=15 TO 0 STEP-1
520 IF Z AND BT(BT) THEN PRINT "1"; ELSE PRINT "0";
530 NEXT
540 PRINT:RETURN

A bit of investigation revealed that AND could not operate on values greater than 32767 (&H3FFF in hex):

I did not understand why, but I expected it has something to do with integer values being treated as signed values, as if this was an INT16 (−32768 to +32767 range) rather than a UIN16 (0-65535 range).

rflberg to the rescue

I had recently posted a series of YouTube videos discussing bits in Color BASIC. My most recent one showed a program I wrote that demonstrated AND, OR and NOT operations:

The program I demonstrated looked like this:

0 REM ANDOR.BAS
10 DIM BT(7):FOR BT=0 TO 7:BT(BT)=INT(2^BT):NEXT
20 INPUT "VALUE     ";V
30 PRINT "(A/O/N)";
40 A$=INKEY$:IF A$="" THEN 40
50 IF A$="A" THEN M=0:PRINT "AND";:GOTO 90
60 IF A$="O" THEN M=1:PRINT "OR ";:GOTO 90
70 IF A$="N" THEN M=2:PRINT "NOT":GOTO 100
80 SOUND 1,1:GOTO 40
90 INPUT O
100 PRINT:PRINT "    ";:Z=V:GOSUB 500
110 IF M=0 THEN PRINT "AND ";:Z=O:GOSUB 500:Z=V AND O:PRINT "    ";:GOSUB 500
120 IF M=1 THEN PRINT "OR  ";:Z=O:GOSUB 500:Z=V OR O:PRINT "    ";:GOSUB 500
130 IF M=2 THEN PRINT "NOT ";:Z=NOT V:GOSUB 500
140 PRINT:GOTO 20

500 REM SHOW Z AS BINARY
510 FOR BT=7 TO 0 STEP-1
520 IF Z AND BT(BT) THEN PRINT "1"; ELSE PRINT "0";
530 NEXT
540 PRINT:RETURN

In the video I explain how it works, somewhat, but you will notice it works only on 8-bit values. Because I did not know a way to make it work.

However, in the comments, use rflberg left a few comments:

IF you want to see the full bits change the program to this:

10 DIM BT(15):FOR BT=0 TO 15:BT(BT)=2^BT:NEXT
501 IF Z<0 THEN PRINT”1″; ELSE PRINT”0″;
510 FOR BT=14 TO 0 STEP -1

rflberg (via YouTube)

I was intrigued. The modifications did not work for me, but a few additional comments help me understand the intent:

-1 is actually 1111111111111111 and 255 is 0000000011111111. It computes numbers -32768 to 32767. Negative numbers the most significant bit is a 1 and positive numbers is a 0.

-32768 is 1000000000000000 and 32767 is 0111111111111111

rflberg (via YouTube)

I experimented with this for awhile last night, and now I think I understand it. AND, NOT and OR allow you to pass in 0 to 32677 just fine. But, you can also pass in -32768 to -1 as well! It seems to be using the high bit (bit 15) to indicate a negative value. The explanation was to simply use negative values to make AND, NOT and OR see that bit.

The code modification would work if I passed in 0-32767 for the normal 15-bit range then -32768 to 1 to represent the high-bit range. I should be able to modify my routine to do this automatically.

I could use standard bit values for bits 0 to 14 (my BT array values of 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, and 16384, just like in the earlier table), and then have a special case for bit 15 — a value of -32768 — which I would have in the array as BT(15)=-32768.

Then, in the print bit routine I could check to see if the value was greater than 32767, and turn it in to a negative number by subtracting 65536. (i.e., 32767 would be fine, but 32768 would turn in to -32768).

Since I print out the integer value after the bit display, I decided to make a temporary (altered) variable Z2, and retain the user’s intended Z value. This means I could pass in 32768 and it would print 32768, but would be really using -32768.

I ended up with a minor modification to my program, giving me this routine that will display the bits of any 16-bit value (0-65535):

0 REM 16BITS.BAS
1 REM WORKS THANKS TO rflberg
10 DIM BT(15):FOR BT=0 TO 14:BT(BT)=INT(2^BT):NEXT:BT(15)=-32768

20 INPUT "VALUE     ";Z
30 GOSUB 500:GOTO 20
500 REM SHOW Z AS BINARY
505 IF Z>32767 THEN Z2=Z-65536 ELSE Z2=Z
510 FOR BT=15 TO 0 STEP-1
520 IF Z2 AND BT(BT) THEN PRINT "1"; ELSE PRINT "0";
530 NEXT
540 PRINT Z;Z2:RETURN

Using this updated routine, I modified my table printing program to handle 16-bits:

0 REM 8BITTABL.BAS
1 REM WORKS THANKS TO rflberg
10 DIM BT(15):FOR BT=0 TO 14:BT(BT)=INT(2^BT):NEXT:BT(15)=-32768
20 PRINT "DEC    HEX   BINARY"
30 PRINT "-----  ----  ----------------"
40 FOR I=0 TO 15:Z=INT(2^I)
50 GOSUB 100
60 NEXT
70 Z=65535:GOSUB 100
80 END

100 REM PRINT TABLE ENTRY
110 PRINT USING"#####  ";Z;
120 IF Z<&H10 THEN PRINT "0";
121 IF Z<&H100 THEN PRINT "0";
122 IF Z<&H1000 THEN PRINT "0";
130 PRINT HEX$(Z);"  ";
140 GOSUB 500
150 RETURN

500 REM SHOW Z AS BINARY
505 IF Z>32767 THEN Z2=Z-65536 ELSE Z2=Z
510 FOR BT=15 TO 0 STEP-1
520 IF Z2 AND BT(BT) THEN PRINT "1"; ELSE PRINT "0";
530 NEXT
540 PRINT:RETURN

Tada! Thanks for those great YouTube comments, I now have a workaround to doing bit detection on all 16 bits. Thank you very much, rflberg!

3X+1 in C#

For my day job, I do embedded C programming for PIC24 compilers and some Windows C programming in something called LabWindows. Lately, I’ve been touching some C# stuff, so I decided to revisit last night’s 3X+1 program by converting it to C#.

You can compile and run it online here: https://www.onlinegdb.com/online_csharp_compiler

// 3X+1

using System;
					
public class Program
{
	public static void Main()
	{
		while (true)
		{
			Int32 x = 0;

			Console.WriteLine();
			Console.Write("STARTING NUMBER? ");
			x = Int32.Parse(Console.ReadLine());
			
			while (true)
			{
				Console.Write(x);
				Console.Write(" ");
				
				if (x == 1) break;
				
				if ((x & 1) == 1) // Odd
				{
					x = x * 3 + 1;
				}
				else // Even
				{
					x = x / 2;
				}
			}
		}
	}
}

Compressing BASIC DATA with Base-64 – part 2

See also: part 1, part 2, part 3 and part 4.

Today we will explore writing a standard base-64 converter in BASIC, and then see if we can make a smaller and faster (and nonstandard) Color-BASIC-specific one.

When we last left off, we were looking at ways to get as much encoded data on to a DATA statement as possible. Instead of using integer numbers (base-10) or hex values (base-16), we began exploring if we could increase the base and use more typeable character to encode the data.

Although it seems we could create a weird base-90 format using every typeable character except for quote (which we’d need to start a DATA line else we couldn’t use comma), the decoder would be much larger and have to do much more work, and we actually wouldn’t benefit since we really need numbers that round to specific numbers of bits:

  • Base-8 (octal) values can be represented by 3-bits (111). (Extended BASIC supports octal when you use &Oxx or just &xx.)
  • Base-16 (hexadecimal) values can be represented as 4-bits (1111). (Extended BASIC supports hexadecimal when you use &Hxx.)
  • Base-32 values would be represented as 5-bits (11111).
  • Base-64 values would be represented as 6-bits (111111).
  • Base-128 values would be represented as 7-bits (111111).

As you can see, a base-90 value isn’t a large enough range to give us an extra bit over base-64. We need to use bases that are nice multiples of the power of 2. Because of this, we’ll ignore a made-up base-90 and look at something a bit more standard, such as base-64 encoding.

Pump up the base

As previously discussed, natively, you can represent a number in a DATA statement as a base-10 value, or a hexadecimal value. Both of these are the value 32:

100 DATA 32,&H20

BASIC will READ them the same way, though hex values are much faster for BASIC to read and parse. Using native hex values like “&H20” is the fastest way to load DATA, but it is also the largest since every value has two extra characters (“&H”) in front.

A recent tip was given by Shaun Bebbington about how you can represent zero just by leaving it out between commas. It saves space, and the parse gets zero from this faster than if you put a zero there:

100 DATA 8,6,7,5,3,,9

But since we are trying to get as much DATA in there as possible, we don’t want to separate numbers by commas. We can pack all the 2-digit hex values together in a string then read that entire string and parse out the individual 2-digit hex values. That is more work, and slower, but gets more data per DATA line. Here are the values 0 to 15 in hex (00 to 0f):

100 DATA 000102030405060708090A0B0C0D0E0F

As previously demonstrated, this is the most efficient way to store HEX values. Even when we pad a low 0-15 value to make it two digits (1 represented by 01), it stills saves space over comma delimited values since no commas are used.

But each hex value is wasting 50% of the bits it takes to represent it. HEX values of 0-15 could be represented by four bits (0000 to 1111). We are storing them as one 8-bit character and thus achieving 50% storage efficiency.

We can do better by using a higher base-x value that can use those wasted bits. We want the highest value we can represent with typeable characters, which is 64 (since the next higher would be 128 and we don’t have a way to type 128 different characters on the CoCo).

Base-64

The standard Base-64 encoding uses the following 64 characters to represent values of 0 to 63:

ABCDEFGHIJKLMNOPQRSTUVWZYZabcdefghijklmnopqrstuvwxyz01234567890+/

Each base-64 character needs 6-bits to be represented (000000-111111).

Representing values that way only wastes 2 bits per character, rather than 4-bits like hex base-16 does:

ASCII HEX Chars.:    ASCII Base-64 Chars.:
      0    15              0    63
     "0"   "F"            "A"   "/"
     /       \            /       \
xxxx0000  xxxx1111   xx000000   xx111111

But, converting to and from base-64 is much trickier. Hex base-16 is as simple as this:

  • Hex F0” -> F is 15 which is 1111 in binary. 0 is 0000 in binary. Thus the first character becomes the left four bits, and the second character becomes the right four bits. Super easy. Barely an inconvenience. Two ASCII bytes represent one byte of data.

But for base-64, we are dealing with 6-bits, and two of those won’t fit into an 8-bit byte. Instead, four base-64 6-bit values are merged together to make a 3-byte 24-bit value.

  • Base-64 “ABCD” (xx000000 xx000001 xx000010 xx000011) -> A is 0 which is 000000 in binary. B is 1 which is 000001 in binary. C is 2 which is 000010 in binary. D is 3 which is 000011 in binary. These values are merged together (removing the unused 2-bits in each one) and stored in 3 bytes as:
+- Byte 1 --+- Byte 2 --+- Byte 3 --+
| 000000|00 | 0001|0000 | 10|000011 |
| \__A__/\___B___/ \___C___/ \_D__/
  • Byte 1 contains 6 bits of base-64 value A and 2 bits of base-64 value B.
  • Byte 2 contains 4 bits of base-64 value B and 4-bits of base-64 value C.
  • Byte 3 contains 2 bits of base-64 value C and 6 bits of base-64 value D.

Well that’s a mess. Moving bits around like that is super easy under languages like C, but a bit more work in BASIC.

Encode this!

We will start with encoding a simple ASCII string into base-64 using a web tool:

https://www.base64encode.org

If you go to that link, you can type something in and then encode it into base-64. I typed:

Greetings from Sub-Etha Software! Do you know where your towel is?

And that gets encoded into this:

R3JlZXRpbmdzIGZyb20gU3ViLUV0aGEgU29mdHdhcmUhIERvIHlvdSBrbm93IHdoZXJlIHlvdXIgdG93ZWwgaXM/

Each character represents a 6-bit (0-63) value which we will have to combine into 8-bit values and decode.

An easy way to decode the characters used by base-64 encoding is with a string:

10 Z$="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

We can use Extended BASIC’s INSTR() function to match a character from the encoded string with a character in that string, and the position it is found in will the the value it represents (well, minus 1, since INSTR returns a base-1 value).

Here is an example that will display the bytes of the encoded string:

0 REM base64-1.bas
10 Z$="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
20 READ A$:PRINT A$
30 FOR A=1 TO LEN(A$)
40 PRINT INSTR(Z$,MID$(A$,A,1))-1;
50 NEXT
1000 REM BASE-64 DATA
1010 DATA R3JlZXRpbmdzIGZyb20gU3ViLUV0aGEgU29mdHdhcmUhIERvIHlvdSBrbm93IHdoZXJlIHlvdXIgdG93ZWwgaXM/

Running that shows me this:

Displaying base-64 values.

If A is 0, then R should be 17, and that is what it prints first. Now we know we can get the values for each character in a base-64 encoded string.

Next we have to turn four 6-bit base-64 values into three bytes (24-bits). I am not sure what a good way to do this is, so I’ll just brute-force it and see how that works out.

First, I know that I need four base-64 values to make my 3 8-bit values, so I’ll modify my loop to skip every four values, and then add an inner loop to process the individual four base-64 values.

Inside that inner loop it will process the next four base-64 6-bit values and convert them into 3 8-bit values.

Here is what I came up with:

0 REM base64.bas
5 POKE65395,0
10 Z$="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
20 READ A$
30 FOR A=1 TO LEN(A$) STEP 4
35 REM GET 4 6-BIT VALUES
40 FOR B=0 TO 3:B(B)=INSTR(Z$,MID$(A$,A+B,1))-1
50 IFB(B)<0 THEN B(B)=0
60 NEXT
65 REM CONVERT TO 3 8-BIT
70 C1=INT(B(0)*INT(2^2)) OR INT(B(1)/INT(2^4))
80 C2=(B(1) AND &HF)*INT(2^4) OR B(2)/INT(2^2)
90 C3=(B(2) AND &H3)*INT(2^6) OR B(3)
100 PRINT CHR$(C1);CHR$(C2);CHR$(C3);
110 NEXT
120 END
1000 REM BASE-64 DATA
1010 DATA R3JlZXRpbmdzIGZyb20gU3ViLUV0aGEgU29mdHdhcmUhIERvIHlvdSBrbm93IHdoZXJlIHlvdXIgdG93ZWwgaXM/

I figured out all the 6-bit to 8-bit stuff (lines 70-90) with alot of trial and error, so I expect there is a faster and easier way to do this. But, then end results is a program that will print out the expected message, albeit really slowly.

A successful, but slow, decode of a base-64 encoded message.

One unexpected problem was with the powers of two — (2^2) and such. They produce rounding errors which caused some bits to be lost. I had to use INT() round them. That took me hours to figure out, but it’s just part of the inaccuracies of floating point values, especially limited ones like a 1970s BASIC used.

PROBLEM: Since the goal here is to put more data in DATA statements, the base-64 decode routine needs to be small. If it is 100 bytes larger than just using HEX, you have to save 100 bytes in DATA before you break even. The routine I give is not small and not fast. It would probably not be useful in the 10 LINE contest I mentioned. Maybe one of you can help improve it.

Now that we have a simple base-64 decoder, the next step will be making an encoder to turn DATA statement values into a base-64 string.

Until next time…