In this article, I want to talk a bit about how Color BASIC variables are stored in memory. This will help explain what the VARPTR (“variable pointer”) command is used for, and also be useful in the future when I get around to exploring how the Extended BASIC GET and PUT graphics commands work.
What’s in a (variable) name?
In Color BASIC, variable names (both string and numeric) start with a letter and may be followed by a second letter or a digit. Valid variable names are:
- A to Z
- A0 to A9 through Z0 to Z9.
- AA to ZZ
String variables follow the same convention, just with a $ after the name (A$, AB$, F5$, etc.).
What’s in a (long variable) name?
While BASIC does allow you to type longer variable names, only the first two characters are used. Thus, a variable such as:
…will actually let you…
…and see that the value is 1. However, any other variable starting with the first two characters (“LO”) would overwrite this variable (since it would be the same variable).
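If you want to try this yourself, a minimal sketch might look like the following. The names LOW and LOX are made up for illustration; all that matters is that they share the first two characters (“LO”).

```basic
10 ' LOW AND LOX ARE HYPOTHETICAL NAMES; ONLY "LO" IS SIGNIFICANT
20 LOW=1
30 PRINT LOW
40 ' LOX IS THE SAME VARIABLE AS LOW, SO THIS OVERWRITES IT
50 LOX=2
60 PRINT LOW
```

The first PRINT shows 1 and the second shows 2, because both names refer to the variable “LO”.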
Each time you declare a new variable, you will see memory decrease by seven bytes:
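One way to measure this from inside a program is sketched below. M1 and M2 are names I picked for the example. They are declared up front with DIM so that creating them does not get counted in the measurement.

```basic
10 ' DECLARE THE MEASURING VARIABLES FIRST
20 DIM M1,M2
30 M1=MEM
40 ' CREATE ONE NEW NUMERIC VARIABLE
50 A=1
60 M2=MEM
70 ' DIFFERENCE IS THE COST OF THE NEW VARIABLE
80 PRINT M1-M2
```

If the seven-bytes-per-variable observation holds, this should print 7.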
Without looking at the Color BASIC Unravelled BASIC ROM disassembly, we can speculate that two of those bytes are probably for the variable name, with the other five used for … whatever it takes to have a variable. (And with looking at those books, we can see exactly how this works. But exploring is more fun, so let’s do that instead…)
Thanks for the (variable) memory.
The Microsoft BASIC found in the Radio Shack Color Computer has an interesting command called VARPTR. It returns the memory address of a specified variable. In an earlier series on interfacing BASIC with assembly, I discussed VARPTR in some detail, but all we need to know here is that a standard numeric variable is stored as five bytes of memory.
Thus, if you use VARPTR to get the location of a numeric variable, you can print the five bytes at that location and see the raw data that BASIC uses to represent that floating point value:
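A short sketch of that idea follows. P and I are helper names chosen for the example, and the address printed on your machine will almost certainly differ from mine.

```basic
10 DIM P,I
20 A=1
30 ' P HOLDS THE ADDRESS OF THE FIVE BYTES FOR A
40 P=VARPTR(A)
50 PRINT "ADDRESS:";P
60 FOR I=0 TO 4
70 PRINT PEEK(P+I);
80 NEXT
```

For A=1, the five bytes should match the values discussed in this article: 129, 0, 0, 0, 0.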
Above, we see that the numeric variable “A” was located in memory at 9868 at that moment in time. The five bytes that represent “A=1” appear to be 129, 0, 0, 0, 0. All numbers in Color BASIC are floating point values, with the bytes representing an Exponent, Mantissa and Sign. I’d try to explain, but then my head would explode. (Though I do plan to figure it out and write about it at some point.)
For now, let’s just go with “numbers take up five bytes” and move on.
String me along.
String variables (variable names that end in $) are a bit different. The text of the string is stored elsewhere in string memory, and the five bytes that VARPTR points you to contain information on how to get to where the actual string lives:
Above you see the five bytes are 7, 0, 127, 248 and 0. The first byte is the length of the string. In this case, “ABCDEFG” is 7 bytes long, so the first byte is 7. The next byte is not used for a string, and is always 0. The third and fourth bytes are the memory location where the string text is actually stored. The fifth byte is not used and is always 0.
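Here is a small sketch that uses those bytes to find and print the string text. P, AD and I are helper names for this example. The address you see will vary, and I believe that when a string is assigned from a quoted literal inside a program line, the pointer may refer to the text inside the program itself rather than to string memory, so treat the address as "wherever the text currently is."

```basic
10 DIM P,AD,I
20 A$="ABCDEFG"
30 P=VARPTR(A$)
40 ' BYTE 0 IS THE LENGTH
50 PRINT "LENGTH:";PEEK(P)
60 ' BYTES 2 AND 3 ARE THE ADDRESS OF THE TEXT (HIGH BYTE FIRST)
70 AD=PEEK(P+2)*256+PEEK(P+3)
80 PRINT "ADDRESS:";AD
90 ' READ THE TEXT BACK ONE BYTE AT A TIME
100 FOR I=0 TO PEEK(P)-1
110 PRINT CHR$(PEEK(AD+I));
120 NEXT
```

This should report a length of 7 and then print ABCDEFG straight out of memory.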
See my earlier article for more examples of this.
What’s in a (variable) name (revisited)?
We know that each variable takes up 7 bytes of memory, and that VARPTR returns the location of the 5 bytes that are the actual variable. So where is the name? Directly before the 5 bytes. If you look at the two bytes before the VARPTR address, you will see the name bytes:
The 65 is the ASCII character for “A”, and since the name was only one character long, the second byte is 0. If the variable had been named “AA”, those two bytes would be 65 65.
And, if the variable is a string, the second byte will have the high bit set (128 added to it).
If the variable had been AA$, those two bytes would have been 65 193 (65+128).
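To see the name bytes for yourself, something like this sketch should work. P is a helper name for the example, and it is declared first so nothing moves after VARPTR is taken.

```basic
10 DIM P
20 A=1:AA$="X"
30 ' TWO BYTES BEFORE THE VARPTR ADDRESS ARE THE NAME
40 P=VARPTR(A):PRINT PEEK(P-2);PEEK(P-1)
50 P=VARPTR(AA$):PRINT PEEK(P-2);PEEK(P-1)
```

Going by the rules above, the first line of output should be 65 0 and the second should be 65 193.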
Hip, hip array!
Arrays are another variable type that I had not explored until shortly before writing this article. The DIM command can be used to declare an array of numbers (or strings). Array indexes are base-0 (meaning elements count from 0 on up), so the number you give DIM is the highest index, not the count. If you wanted ten entries, you could DIM A(9):
10 DIM A(9)
20 FOR I=0 TO 9
30 A(I)=RND(100)
40 NEXT
50 FOR I=0 TO 9
60 PRINT I,A(I)
70 NEXT
The above code would declare room for 10 entries in the A array, then load each entry (0 to 9) with a random number. It then does another loop to print the contents of all ten array entries, showing the random numbers that were stored there.
When I first looked at this, I noticed declaring an array seemed to take 7 bytes of memory plus 5 bytes per entry, versus a normal variable taking just 7 bytes (2 bytes for the name plus 5 bytes of the number).
Above, you see we started with 22823 bytes free. We declared DIM A(0) to hold one entry and memory went down to 22811 — 12 bytes. Then doing a CLEAR to erase all variables and declaring the A array to hold two entries consumed 17 bytes. We know that 2 bytes is the variable name, and each value would take 5 bytes, so it looks like the array has 2 bytes for the name, 5 bytes for “something special to an array”, then 5-bytes for each entry.
CLEARing memory then declaring the A array to hold three entries (0-2) confirmed this, since it now took 22 bytes (7 bytes plus 15 for the three numbers).
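The same measurement can be done from a program. This is only a sketch: M1 and M2 are names I made up, and they are declared before the first MEM so they do not affect the result.

```basic
10 DIM M1,M2
20 M1=MEM
30 ' THREE ENTRIES: A(0), A(1), A(2)
40 DIM A(2)
50 M2=MEM
60 PRINT M1-M2
```

With three entries, this should print 22 (7 bytes plus 15 for the three numbers). Changing line 40 to DIM A(0) or DIM A(1) should give 12 and 17.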
But VARPTR still ends up pointing to something that looks just like a normal variable – 5 bytes that represent that variable or string. Here’s that normal variable again:
And here’s the same, but for an array variable:
So where are the other bytes? Just like the name, they are just before the value that VARPTR returns. To see them, we’d look at the 7 bytes just before what VARPTR returns.
A WARNING ABOUT VARPTR AND ARRAYS
The BASIC program is stored in memory, followed by variables, and then arrays. At the very end of memory is string storage. If you get a VARPTR to an array, and then create a new variable, the new variable will be inserted in variable memory, and array storage will be relocated. This can and will render the VARPTR value incorrect! To avoid this, make sure you declare any variables you plan to use before doing the VARPTR!
In this example, notice the use of DIM to declare “I”, since “I” will be used in a FOR/NEXT loop AFTER the VARPTR is obtained. If it had not been pre-declared, the VARPTR would have been printed, then “I” would have been dynamically created, causing arrays to be moved up in memory. Everything would have been off by the 7 bytes that got inserted by allocating the new variable “I”.
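A quick way to watch the relocation happen is sketched here. X is just a throwaway variable name for the demonstration.

```basic
10 DIM A(0)
20 PRINT VARPTR(A(0))
30 ' CREATING A NEW VARIABLE PUSHES ARRAY STORAGE UP
40 X=1
50 PRINT VARPTR(A(0))
```

The second address should be 7 higher than the first, which is exactly the size of the new variable that got inserted.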
65 is the ASCII value for “A”, the name of our variable. Just like normal variables, if the array is a string, the second byte of the name will have the high bit set, which adds 128 to it. DIM AA(0) would be 65 65, but DIM AA$(0) would be 65 193 (65+128).
- Byte 1 – first letter of variable name
- Byte 2 – second letter of variable name (add 128 if it’s a string)
Next is a 0, then 12. These values increased based on how large the array was, going up by 5 each time a new element was added. With only one entry, 12 seemed to be the size of the 7-byte header, plus a 5-byte variable. Therefore…
- Bytes 3 and 4 – memory used by the array, from start of header to end of array data.
The next byte seems to increase based on the number of DIMensions. DIM A(0) would show 1. DIM A(0,0) would show two. Therefore…
- Byte 5 – number of dimensions. DIM A(0)=1, DIM A(0,0)=2, DIM A(0,0,0)=3
The next two bytes seemed to change based on the size of the array — as in, DIM A(0) would give 1, and DIM A(99) would give 100:
- Bytes 6 and 7 – number of elements in the array.
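Putting that together, here is a sketch that dumps the 7 header bytes of a one-entry array. I and P are helper names, declared in the same DIM so that nothing is created after VARPTR is taken.

```basic
10 DIM I,P,A(0)
20 P=VARPTR(A(0))
30 ' THE HEADER IS THE 7 BYTES JUST BEFORE THE FIRST ELEMENT
40 FOR I=P-7 TO P-1
50 PRINT PEEK(I);
60 NEXT
```

Based on the byte layout described above, I would expect 65 0 0 12 1 0 1: the name, a size of 12, one dimension, and one element.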
And then I realized this information was only correct for a single dimension array, such as DIM A(9).
What happens with a multi-dimensioned array, like DIM A(9,9)? Each dimension gets another two bytes added to the header. DIM A(0) would have a 7 byte header. DIM A(0,0) would have a 9 byte header. DIM A(0,0,0) would have an 11 byte header, and so on.
Here is what I see when I do DIM A(1,2,3):
We see three sets of two bytes each — 0 4, 0 3, and 0 2. That corresponds to our DIM A(1,2,3) since using base-0 that is 2 (0-1), 3 (0-2) and 4 (0-3). But, it’s in reverse order.
That tells us that after the 2-byte name, the 2-byte size, and the 1-byte number of dimensions, there are 2 bytes for each dimension containing the number of elements in that dimension, in the reverse of the order they were declared.
- Bytes 6 and 7, 8 and 9, 10 and 11, etc. – number of elements in each dimension, in reverse order for some reason.
Good to know. Using that above example, maybe it looks like this:
DIM A(1,2,3)
NAME    SIZE    #D   third   second  first   data....
[x][x]  [x][x]  [x]  [x][x]  [x][x]  [x][x]  [.....][.....]
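To check that layout, here is a sketch that reads the dimension count and the per-dimension sizes back out of the header. The names I, P, H and ND are mine, and the header length of 5+2*ND is the assumption taken from the description above.

```basic
10 DIM I,P,H,ND,A(1,2,3)
20 ' HEADER LENGTH: 2 NAME + 2 SIZE + 1 COUNT + 2 PER DIMENSION
30 ND=3:H=5+2*ND
40 P=VARPTR(A(0,0,0))
50 PRINT "DIMENSIONS:";PEEK(P-H+4)
60 ' SIZES START RIGHT AFTER THE COUNT BYTE, HIGH BYTE FIRST
70 FOR I=0 TO ND-1
80 PRINT PEEK(P-2*ND+2*I)*256+PEEK(P-2*ND+2*I+1)
90 NEXT
```

For DIM A(1,2,3) this should report 3 dimensions and then 4, 3 and 2, matching the reversed order seen above.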
The order of the data bytes was the next thing I wanted to figure out. If it was just a single dimension array, it would be simple. But with multiple dimensions I was curious what order they were stored.
I did a test where I made a two dimensional array, each holding two elements. This 2×2 array would hold four numbers. By checking the address of each entry, I was able to determine the order:
This showed me that the elements are stored with the first dimension stepping first, which surprised me because the “how big is each dimension” entries in the header are in reverse order.
I tested with a three dimensional array, each with two elements. This 2x2x2 array could hold eight elements. Again I saw that the first dimension was stored, then the second, and so on:
With two elements per dimension, the order of the indexes looked quite like binary counting: 000, 100, 010, 110, 001, 101, 011, 111. That’s the opposite of how binary counting is normally written, where the right-most digit changes fastest. But it was still interesting to see the familiar pattern.
Here is the program I used to show the VARPTR of each dimension in the array:
0 ' showdims.bas
10 SZ=1
20 DIM DV,I,J,K,A(SZ,SZ,SZ)
30 ' 0=SCREEN, -2=PRINTER
40 DV=0
50 PRINT #DV,"ENTRIES:";(SZ+1)^3
60 FOR K=0 TO SZ
70 FOR J=0 TO SZ
80 FOR I=0 TO SZ
90 PRINT #DV,"VARPTR(A(";
100 PRINT #DV,USING ("# # #");I;J;K;
110 PRINT #DV,")",;
120 PRINT #DV,VARPTR(A(I,J,K))
130 NEXT:PRINT#DV
140 NEXT
150 NEXT
Since I could only fit all the entries from a small array on the 32-column screen’s 16 lines, and only a few more on a CoCo 3’s 24-line 40/80-column screen, I made the program support outputting to the printer. Using XRoar’s “print to a file” feature, I was able to capture larger dimensions:
Here is a 3x3x3 “cube” dimension:
ENTRIES: 27
VARPTR(A(0 0 0) 10060
VARPTR(A(1 0 0) 10065
VARPTR(A(2 0 0) 10070
VARPTR(A(0 1 0) 10075
VARPTR(A(1 1 0) 10080
VARPTR(A(2 1 0) 10085
VARPTR(A(0 2 0) 10090
VARPTR(A(1 2 0) 10095
VARPTR(A(2 2 0) 10100
VARPTR(A(0 0 1) 10105
VARPTR(A(1 0 1) 10110
VARPTR(A(2 0 1) 10115
VARPTR(A(0 1 1) 10120
VARPTR(A(1 1 1) 10125
VARPTR(A(2 1 1) 10130
VARPTR(A(0 2 1) 10135
VARPTR(A(1 2 1) 10140
VARPTR(A(2 2 1) 10145
VARPTR(A(0 0 2) 10150
VARPTR(A(1 0 2) 10155
VARPTR(A(2 0 2) 10160
VARPTR(A(0 1 2) 10165
VARPTR(A(1 1 2) 10170
VARPTR(A(2 1 2) 10175
VARPTR(A(0 2 2) 10180
VARPTR(A(1 2 2) 10185
VARPTR(A(2 2 2) 10190
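The 3x3x3 listing above suggests a simple rule: with N entries per dimension, element (I,J,K) sits 5*(I+N*J+N*N*K) bytes past element (0,0,0). The sketch below tests that guess against what VARPTR actually returns. The names B, N and E are mine, and the formula is my reading of the listing rather than anything taken from the ROM disassembly.

```basic
10 SZ=2
20 DIM I,J,K,B,N,E,A(SZ,SZ,SZ)
30 ' B IS THE ADDRESS OF THE FIRST ELEMENT, N THE ENTRIES PER DIMENSION
40 B=VARPTR(A(0,0,0)):N=SZ+1
50 FOR K=0 TO SZ:FOR J=0 TO SZ:FOR I=0 TO SZ
60 ' COUNT ANY ELEMENT WHOSE ADDRESS DOES NOT MATCH THE GUESS
70 IF VARPTR(A(I,J,K))<>B+5*(I+N*J+N*N*K) THEN E=E+1
80 NEXT:NEXT:NEXT
90 PRINT "MISMATCHES:";E
```

If the rule is right, this reports zero mismatches. The addresses in the listing fit: A(1,0,0) is 5 bytes past the start, A(0,1,0) is 15, and A(0,0,1) is 45.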
And here is a 10x10x10:
ENTRIES: 1000
VARPTR(A(0 0 0) 10060
VARPTR(A(1 0 0) 10065
VARPTR(A(2 0 0) 10070
VARPTR(A(3 0 0) 10075
VARPTR(A(4 0 0) 10080
VARPTR(A(5 0 0) 10085
VARPTR(A(6 0 0) 10090
VARPTR(A(7 0 0) 10095
VARPTR(A(8 0 0) 10100
VARPTR(A(9 0 0) 10105
VARPTR(A(0 1 0) 10110
VARPTR(A(1 1 0) 10115
VARPTR(A(2 1 0) 10120
VARPTR(A(3 1 0) 10125
VARPTR(A(4 1 0) 10130
VARPTR(A(5 1 0) 10135
VARPTR(A(6 1 0) 10140
VARPTR(A(7 1 0) 10145
VARPTR(A(8 1 0) 10150
VARPTR(A(9 1 0) 10155
VARPTR(A(0 2 0) 10160
VARPTR(A(1 2 0) 10165
VARPTR(A(2 2 0) 10170
VARPTR(A(3 2 0) 10175
VARPTR(A(4 2 0) 10180
VARPTR(A(5 2 0) 10185
VARPTR(A(6 2 0) 10190
VARPTR(A(7 2 0) 10195
VARPTR(A(8 2 0) 10200
VARPTR(A(9 2 0) 10205
...just kidding...
I really did start with 1000 lines of output in this article. I get paid per line ;-)
If you get bored, maybe you can figure out a program that would dump the bytes for each array element.
So how big is it?
Here is a short program which will calculate how much room an array will take. If it is a string array, it will be this size plus the size of all the bytes in the string data portion.
0 ' dimsize.bas
10 INPUT "NUMBER OF DIMENSIONS";ND
20 M=1
30 FOR I=0 TO ND-1
40 PRINT "ENTRIES FOR DIM";I;
50 INPUT J
60 M=M*J
70 NEXT
80 M=5+(2*ND)+(M*5)
90 PRINT "MEMORY USED:";M
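As a sanity check, this sketch compares the formula against what MEM reports for DIM A(1,2,3), which has 2, 3 and 4 entries in its three dimensions. M1 and M2 are names I made up, declared first so they are not part of the measurement.

```basic
10 DIM M1,M2
20 M1=MEM
30 DIM A(1,2,3)
40 M2=MEM
50 PRINT "MEASURED:";M1-M2
60 ' 5 FIXED BYTES + 2 PER DIMENSION + 5 PER ENTRY
70 PRINT "FORMULA:";5+2*3+(2*3*4)*5
```

Both lines should show 131 if the header layout worked out above is correct.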
And that is about all I have to say on VARPTR. For now. Until I started writing about it, I knew nothing about how it worked with arrays. I had used it a few times to pass in a string to an assembly language routine, but that was the extent of my knowledge.
Let me know what I got wrong.
Until next time…