The VAL overflow bug and scientific notation

A few months back, 8-bit Show and Tell tweeted this:

I was curious if the bug existed in Color BASIC, so I tried it and tweeted back confirming ours had the same issue.

Recently, he posted this deep dive video about the bug, mentioning my reply and others (including CoCoist Tim Lindner) that showed this bug on various other systems:

At the time of the original tweet, I wrote a blog post about the bug (that is the screen shot he shows in his video). I thought I’d add a follow-up to my post, with a bit more detail.

The bug explained…

William Astle commented on my original post explaining what was going on, similar to the explanation Robin gave in a tweet reply.

The issue is that VAL puts a NUL at the end of the string then calls the regular number parser, which can then bail out unceremoniously. While VAL itself does restore the original byte after the string, that code path does not execute when an error occurs parsing the number. The NUL is required so the parser knows where to stop interpreting bytes.

– William Astle, 8/17/2023

Let’s dive in a bit more from the CoCo side… First, just in case you aren’t a BASIC programmer, the VAL keyword is designed to convert a string to a numeric value. For instance, you can do this:

A$="1234"
A=VAL(A$)

Above, A$ is a string variable containing the characters “1234” and A is a numeric variable holding 1234. I often see this used in Extended Color BASIC when the LINE INPUT command reads a string that is then converted to a number:

10 LINE INPUT "AGE: ";A$
20 A=VAL(A$)

But I digress…

LINE INPUT is a better form of INPUT, but it only works with string variables. If you were to type letters in to a LINE INPUT, then run those through VAL, they should evaluate as 0. So type in “42” and VAL(A$) gives you 42. Type in “BACON” and VAL(A$) gives you 0. If you had just used INPUT “AGE”;A and typed non-numbers, it would respond with “?REDO” and go back to the input prompt.

Second, let’s make the bug easier to see by clarifying this “1E39” scientific notation thing. The bug has nothing to do with using scientific notation. It happens when the number is too big: parsing it triggers the Overflow Error (?OV ERROR), which aborts VAL partway through converting the string to a number.

Scientific Notation

“1E39” is the number 1 followed by 39 zeros. It appears BASIC is happy to print out the full number if it is short enough, but at some point starts showing it in scientific notation. I found that 1 followed by 8 zeros (100000000) is fine, but 9 zeros switched over to scientific notation:

And it does that even if you just try to use a number like “1000000000” directly:

I guess I had never used numbers that large during my Color BASIC days. ;-)

You may notice it prints “1E9” back as “1E+09”. You can use “1E+09” or “1E+9” as well, and it does the same thing. If you leave out the “+”, it is assumed. The sign is there because the exponent can also be negative, which is how fractional numbers are represented. In the case of +9, the decimal point moves nine places to the right. “1E5” takes “1.0” and moves the decimal point five places to the right, giving “100000.0”.

If you use a “-“, you are moving the decimal that many places left. “1E-1” takes “1.0” and moves the decimal one spot left, producing “.1”. It appears you cannot print as many values that way before it turns in to scientific notation:

And, printing those values directly shows something similar:

I guess I had never used numbers that small during my Color BASIC days. ;-)

This made me wonder if the VAL bug would happen if a value was too small, but it seems at some point the number just becomes zero, so no error occurs. (Maybe William will chime in with some more information on this. I was actually expecting a similar “Underflow” error, but I don’t think we have a ?UF ERROR in Color BASIC ;-)

For fun, I wondered if this was truly considered zero. In C, comparing a floating point variable against a specific floating point value can cause issues. For example:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    float a = 902.1;

    /* The constant 902.1 is a double, so a is widened before comparing. */
    if (a == 902.1)
    {
        printf ("a is 902.1\n");
    }
    else
    {
        printf ("a is NOT 902.1\n");
    }

    return EXIT_SUCCESS;
}

I have discussed this here in the past, but if you run that, it will print “a is NOT 902.1”. This is because 902.1 is not a value that a 32-bit C floating point variable can exactly represent, so the float holds the nearest value it can, and that no longer matches the more precise double constant 902.1 in the comparison. I expect this could also be the case in Color BASIC, so I wanted to do a quick check and see if “1E-39” (which shows as zero) really was 0:

IF 0=1E-39 THEN PRINT "YES"

That printed “YES” so I will just assume at a certain point, BASIC floating point values just turn in to zero.

But I digress… Again.

The point is, it’s a bug with the number being too large, so even if you do this, you can cause the overflow:

10 A=VAL("1000000000000000000000000000000000000000"):REM SHOW BUG
20 PRINT A

Above, that 40 character number (1 with the decimal point 39 places to the right) is just too large, and it will cause the ?OV ERROR.

Strings in String Memory

In my String Theory series, I dove in to how strings work on the CoCo. The important bit is there is reserved string memory for strings, for things like INPUT/LINE INPUT, and string manipulation like MID$, LEFT$, etc. There are also “constant” strings that exist in the program code itself.

If you assign a string directly from BASIC (not in a line number of a program), it will go in to string memory:

A$="THIS IS IN STRING MEMORY"
PRINT A$
THIS IS IN STRING MEMORY

But, if that is in a program, BASIC just makes an entry for “A$” and points it to the spot in the BASIC program where the quoted text exists:

10 A$="THIS IS IN PROGRAM MEMORY"
20 PRINT A$
RUN
THIS IS IN PROGRAM MEMORY

That is what causes the problem with VAL. BASIC overwrites the closing quote in the BASIC program itself with a 0 and, because the overflow error interrupts VAL, never restores it. The BASIC “LIST” command shows the line up until it sees a 0, then stops. The rest of the line is still in memory, but is now invisible to LIST. If you try to run the program after it gets “corrupted”, it will error out on the VAL line since it is missing the closing quote:

However, the code is still there. If you know where the BASIC program starts in memory, and ends in memory, you can use PEEK/PRINT to see the contents. Memory locations 25/26 are the start of the BASIC program, and locations 27/28 are the start of variables which are stored directly after the program, so something like this would do it:

Much like Robin did in his video with a machine language monitor, we can look through that dump for the bytes that make up “1E39” (the quote is 34, “1” is 49, “E” is 69, “3” is 51 and “9” is 57). That byte sequence of 34, 49, 69, 51 and 57 shows up in the second line, followed by a zero where the closing quote (34) used to be. After that zero is a 41, which is the “)” that used to close VAL(“1E39”), then a 58 (the “:” colon), a 130 (the byte for the “REM” token) and a 32 (a space). Next, 83, 72, 79 and 87 spell “SHOW”, followed by another 32 space, then 66, 85 and 71 for “BUG”, and finally a real 0 marking the end of the line.

If I knew the byte that is now a 0, I could just POKE it back to 34 and restore the program, just like Robin did on his Commodore 64.

FOR A=PEEK(25)*256+PEEK(26) TO PEEK(27)*256+PEEK(28):PRINT A,PEEK(A):NEXT

That would start printing memory locations and I could quickly BREAK the program when I see the 0 I am looking for show up.

I believe the zero at 9744 is the one after “1E39”, and I can POKE a 34 back into that location to restore the program:

Now, if only Color BASIC did that after an ?OV ERROR! Although we did get an updated BASIC in 1986 for the Color Computer 3, it was just patches on top of the old Microsoft BASIC to add new CoCo 3 features.

Avoiding the VAL bug

Which brings me to this… If the string to parse was in string memory, changing that final byte and not changing it back would be no problem because strings all end with a 0 in memory anyway! There is nothing to corrupt.

To force a variable to be in string memory, you can add +”” when you declare it, like this:

10 A$="THIS IS IN STRING MEMORY"+""

Since BASIC has to combine those two strings, it makes a new string in string memory and copies the “THIS IS IN STRING MEMORY” string and the “” string there. (It is not smart enough to notice that adding “” changes nothing, which is good because it lets us do things like this.)

10 A=VAL("1E39"+""):REM NO BUG HERE
20 PRINT A

And that is a simple way to work around this bug. Since the bug only affects hard-coded strings in program memory, it should be easy to avoid just by not using values too large for variables :)

And if you are inputting them, the INPUT is going in to string memory and you will still get an ?OV ERROR (crashing the program) but at least the program would not get corrupted:

10 PRINT "TYPE 1 AND 39 ZEROS":
20 INPUT A

Have fun…
