VIC-20: Sky-Ape-Er code dissection – part 4

See also: part 1, part 2, part 3, part 4 or part 5 (with more coming).

Hey hey! It’s another VIC-20 Tuesday!

I have completed my code walk-through of one of my earliest computer programs, a Donkey Kong-inspired VIC-20 game called Sky-Ape-Er. But, the version I presented was not the only version of the game I created. I have dozens of saved copies of this game in various stages of completion, but one in particular stood out. It used completely different graphics.

Sky-Ape-Er: The Prototype

As a reminder, here is what the graphics were later changed to in the release version:

The ape was completely different, and the platform graphics were meant to resemble the ones used in the arcade game Donkey Kong. But why were there pinwheels? I have so many questions for my junior high self.

When I first uncovered these tapes I was unaware of how many variations of my programs were on them. When I started this article on my Sky-Ape-Er game, I discovered that the earlier version with different graphics was also using some different code — most notably in how it read the keyboard input.

I thought it might be fun to look at the programming choices I originally made, and speculate on why I changed them.

Sky-Ape-Er: The Mystery

The first thing I want to discuss is a mystery I am currently trying to solve. My VIC-20 games that used custom character sets seem to come in three forms:

DATA STATEMENTS

The BASIC program reads the character set from DATA statements and POKEs it into memory. This is what the Sky-Ape-Er INSTRUCTIONS program does. I believe these character sets may have been designed by the Eight by Eight Create program I previously mentioned. If true, I don’t envy my junior high self having to manually copy down the numbers to paper and type them in to my own program later. From the documentation:

“Once you have created, designed and examined enough characters, you can copy their associated numbers on paper to be used in any programs you make.”

Eight by Eight Create instruction from January 1983 Creative Computing magazine (Volume 9, Number 1), page 270.

Yipes.

LOADABLE CHARACTER SET

A standalone binary “program” of the custom character set that can be loaded into memory, presumably at the address of where the character data goes. I have found several programs called things like CHARS and TNTCH that do not have any BASIC code in them. I suspected these were character set data (especially TNTCH which was on the tape after the main program TNT) but I had no idea how to use them. I was finally able to see what was inside by importing them into the CBM prg Studio‘s characters set editor. CBM prg Studio is a Windows integrated development environment (IDE) for making Commodore programs in BASIC or assembly. It has some great features and is worth checking out.

VIC-20 Factory TNT character set in CBM prg Studio.

This let me see that these were indeed character sets, though my first attempt to import them had all the graphics off by a few lines. I needed to use an offset (bytes to skip in the file) of 2 for the characters to load properly. That told me that whatever type of file this was had some 2 byte header at the start (perhaps memory location where to load the data?).

ALL-IN-ONE PROGRAM WITH CHARACTER SET

And this is the mystery! My early prototype version of Sky-Ape-Er was just one program, and it loaded up with the custom character set. There was no font in DATA statements. There was no pre-loader that did it. It just loaded and “just worked.” I have no idea how I created this, nor do I know why, for the release version, I change it to use two programs and DATA statements.

But I have theories.

Dissecting the data

Thanks to suggestions from a VIC-20 group on Facebook and the Denial Commodore forum, I looked at the contents of the mystery SKY-APE-ER program file.

Using a free hex editor, I opened the file and looked for the end of the BASIC program. I could tell it ended around byte 2612 because the last line was a “SYS xxxxx” command that would reboot the VIC-20. The xxxxx numeric value was visible as plain text in the tokenized BASIC file, so it was easy to spot.

VIC-20 .prg file in a HEX editor.

After this was a bunch more data. Somewhere in there must be the character set. But where? I decided to try opening the entire program file in the character set editor and using the 2612 offset where the BASIC program ended.

VIC-20 CBM prg Studio importing a .prg to find the embedded charset data.

Doing this showed garbage between the BASIC program and character data, but scrolling down let me visibly see where the font data began.

VIC-20 CBM prg Studio trying to find where character set data is in a .prg file.

I now knew that approximately 58 characters (each character is 8×8, so 8 bytes per) into the file was the start of the font data. A little math (which was hard) and some trial and error (which was easy) and I came up with 3073 as the offset to use from the stat of the .prg to where my custom characters were. I imported using that value and got this:

VIC-20 character set data imported from a .prg file.

Tada!

If I knew what the font data was to begin wish (from DATA statements), I could have just scanned the HEX file looking for those values. But I didn’t, so I couldn’t.

Now I have a BASIC file for the game, as well as a character set file in the CBM prg Studio editor. But how did I combine them together in the first place?

Where does the data go from here?

The clue is in these POKEs found on the first line of the program:

POKE45,56:POKE46,26:POKE51,0:POKE52,28:POKE55,0:POKE56,28:CLR

They reminded me of similar POKEs in Color BASIC that track where the program starts in memory as well as where variables and strings go. I expected CBM BASIC would be similar, so I went searching for a VIC-20 memory map.

I found this one archived on Bo Zimmerman’s site. He’s the guy behind the incredible Zimodem firmware that lets you wire up a WiFi serial modem for under $10.

http://www.zimmers.net/cbmpics/cbm/vic/memorymap.txt

I want to do a deep dive into this later, but for now, here are what those POKEs are doing:

*002D-002E 45-46 Pointer: Start of Variables
*0033-0034 51-52 Pointer: String storage (moving down)
*0037-0038 55-56 Pointer: Limit of memory

The “*” notes “Useful memory locations” in the memory map. I agree. I seem to be changing where variables and strings start, as well as where the end of memory is on startup.

Why was I changing the start of variables, the end of string storage, and limiting the end of BASIC? I have a theory, which parallels something I’ve done on the CoCo.

In Color BASIC, we use the CLEAR command to allocate more string space (“CLEAR 500” for 500 bytes for strings). It looks like CBM BASIC doesn’t do that, and allows strings to use as much memory as is available (the memory between the end of the BASIC program + variable arrays, and the limit of memory).

CLEAR can also limit how much memory BASIC can use (“CLEAR 200,&H3F00”). That’s useful when you are wanting to use some of that memory for machine language and don’t want BASIC to overwrite it. I am betting POKE 51/52 is like CLEAR x,XXXX.

VIC-20 Memory Map

To better visualize this, let’s take a quick look at where the 5K of RAM in the VIC-20 is located.

   0 -> +------------------------------+
....    | 1K of System Memory          |
1024 -> +------------------------------+ <- 1023
        | 3K Expansion RAM (cartridge) |
4096 -> +------------------------------+ <- 4095
        | User BASIC Area (3583 bytes) |
7680 -> +------------------------------+ <- 7679
        | Screen Memory (512 bytes)    |
        +------------------------------+ <- 8191

Hey, look at that! The memory range used by BASIC (4096-7679) is the “3583 BYTES FREE” value shown on the startup screen:

VIC-20 startup screen showing 3583 bytes free.

Notice the 3K gap (1024-4095) which is where the 3K RAM expansion cartridge goes if you have one. I never did, though I did have the Super Expander cartridge which gave extra memory as well as enhanced graphics and sound commands.

Side Note: When memory expansion cartridges are plugged in, more memory becomes available and some things shift around. But for this discussion, we will talk only about the stock 5K VIC-20. It was only in recent years that I learned the VIC-20 was a 5K computer. I’d always thought it was 4K. That now makes the weird 3K memory expansion make more sense, since that would boost it to a nice even 8K. But I digress…

Now let’s zoom in on just the memory BASIC is using:

43/44 -> +---------------+
         | BASIC program |
45/46 -> +---------------+
         | Variables     |
47/48 -> +---------------+  
         | Arrays        |
         +---------------+ <- 49/50 End of Arrays
         |               |
         |               |
51/52 -> |---------------+  
         | Strings       |
         +---------------+ <- 55/56 Mem Limit (7679)

When a new numeric variable is added, it goes into the Variables section, which grows larger downward. When a new array is added, it goes into the Arrays area (and likely the entries there point to the Variable) and it grows larger downward. When a string is added, it gets an entry in the Variable section (“A$”) which has a pointer into the actual string content in the Strings section, which grows upwards.

So why was I changing the start of variables, the end of string storage, and limiting the end of BASIC? I believe I was making BASIC think the program was larger than it really was so it would SAVE out (and thus LOAD back later) the program PLUS some custom character data. When the program would run, it would need to reset the pointers to be at the actual end of the BASIC program, and limit memory so BASIC did not write over the character data.

In order to explain this, we need to look at how the VIC-20 custom characters worked.

How the VIC-20 custom characters worked

The VIC-20 character set was 4K of data stored in ROM starting at 0x8000:

8000-83FF 32768-33791 Upper case and graphics
8400-87FF 33792-33815 Reversed upper case and graphics
8800-8BFF 33816-35839 Upper and lower case
8C00-8FFF 35840-36863 Reversed upper and lower case

Each character was 8 pixels wide (one byte) by 8 pixels tall (8 bytes total). There is room for 512 characters in that 4K. Normal printable ASCII characters are 0-127, so it looks like the Commodore PETASCII was similar, with 128 special Commodore characters per bank (128 characters * 8 bytes per character = 1024 bytes).

On power up, the VIC’s video chip is programmed to use the first of those four 1K blocks of ROM for its character set. There is a register with four bits that can be changed to select which of those four ROM blocks it uses, or point it to four 1K RAM blocks in RAM. By loading a character set in to one of those RAM areas and setting the register, the VIC will now display the custom character set rather than the one built in to the ROM. Here are the important four bits:

9005 36869 bits 0-3 start of character memory (default = 0)
                     bits 4-7 is rest of video address (default= F)
                     BITS 3,2,1,0 CM starting address
                                  HEX   DEC
                     0000   ROM   8000  32768
                     0001         8400  33792
                     0010         8800  34816
                     0011         8C00  35840
                     1000   RAM   0000  0000
                     1100         1000  4096
                     1101         1400  5120
                     1110         1800  6144
                     1111         1C00  7168

Memory location 36869 can be one of these 8 values:

  • 0xF0 / 240 / 11110000 – Use 8000-83FF (Upper case and graphics)
  • 0xF1 / 241 / 11110001 – Use 8400-87FF (Reversed upper case and graphics)
  • 0xF2 / 242 / 11110010 – Use 8800-8BFF (Upper and lower case)
  • 0xF3 / 243 / 11110011 – Use 8C00-8FFF (Reversed upper and lower case)
  • 0xfc / 252 / 11111100 – Use 1000-13FF RAM area #1
  • 0xfd / 253 / 11111101 – Use 1400-17FF RAM area #2
  • 0xfe / 254 / 11111110 – Use 1800-1BFF RAM area #3
  • 0xff / 255 / 11111111 – Use 1C00-1FFFF RAM area #4

The four RAM locations all are within the 4K that is used by BASIC and screen memory:

  • 0x1000 – 0x1dff – 3583 bytes used by BASIC programs.
  • 0x1e00 – 0x1fff – 512 bytes used by screen memory.

We can’t use RAM area #1 for characters because that is where our BASIC program is. If we kept our BASIC program and all its variables very small (1K, 0x1000-0x13FF), we could use area #2. But, it makes more sense to use area #4 and give as much memory as possible to BASIC.

In my programs I see POKE 36869,255 and POKE 36869,240. The first POKE makes the video chip start using characters in RAM starting at 0x1c00 (bit pattern 1111). This means a BASIC program and all its variables can’t be any larger than 3072 bytes (0x1000-0x1bff). The second poke switches the characters back to using the standard ROM location for uppercase and graphics characters (bit pattern 0000).

My Sky-Ape-Er INSTRUCTIONS program would READ character data and then POKE it into memory starting at 0x1c00 (7168). It then did POKE 36869,255 to start using them. To display a normal text screen, or at the end of the program, it would POKE 36869,240 to get back to the ROM character set. (This is the part I actually mostly remembered.)

I do want to point out that the character RAM area #4 overlaps with the screen memory:

  • 0x1c00 – 0x1fff – Character RAM area #4.
  • 0x1e00 – 0x1fff – 512 bytes used by screen memory.

This tells me that you really only have 0x1c00 to 0x1dff (7168-7679) for custom characters. That’s 512 bytes, and at 8 bytes per letter, there is only room for 64 custom characters. Assembly language programs that did not need BASIC could move things around and use one of the other blocks in its entirety, but this last block shares memory with the screen so not all of it can be used for character data.

Character sets

The PETSCII character set starts at zero with an “@” symbol, followed by the alphabet characters “A-Z” (1-26), then various punctuation and symbols (27-47), then numbers (48-57), then more punctuation and symbols (59-63). This covers the basic characters and uppercase alphabet just like standard ASCII does for characters 32-96. This means being limited to just 64 characters is not bad at all.

And, as I was working through this, I discovered why my Factory TNT game is not displaying the score correctly! I did not realize it also contained updated number characters (48-57) that I was not properly loading during my “restoration” of the game:

VIC-20 Factory TNT character set also remaps the numbers.

I’m so glad I am writing this article and figured that out. It was driving me mad!

But I digress…

Stay on target… Stay on target…

This means that you could have a BASIC program from 0x1000 to 0x1bff followed by custom character data from 0x1c00 to 0x1dff. If you had the characters in memory for your BASIC program to use, and you SAVEd your BASIC program, those custom characters would NOT be saved with it since BASIC doesn’t know anything about them.

But … you could lie to BASIC and tell it your BASIC program actually ENDS at 0x1dff, then when you SAVE it should write out the entire range of memory (0x1c00 to 0x1dff) thinking it’s just one large BASIC program…

Then, when you loaded it back it, BASIC would start loading it into memory at 0x1c00 and keep going until it got to the end of the program. You now have loaded memory that is part BASIC, and part character set!

But, if you tried to RUN it, you wouldn’t get very far because BASIC would think there is no memory left for variables. You would need to un-lie to BASIC and tell it where the program really ends, and do so without using varaibles.

That is what my six POKEs were apparently doing.

In part 5, we’ll see if this theory works.

Until next time…

2 thoughts on “VIC-20: Sky-Ape-Er code dissection – part 4

  1. MiaM

    Did you try to simply do a LOAD “”,1,1 for the CHARS and TNTCH and similar files? That will load them at whatever starting address they were saved at, so you don’t have to use CBM PRG Studio (unless you want to)

    Reply
    1. Allen Huffman Post author

      No, I did not remember that when I was first trying it. I must have known it in 1982, though. If I had re-learned that earlier, it would have saved some time. Thanks for the reminder. I will try it on TNTCH and CHARS to verify.

      Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.