Color BASIC “DATA” quirk.

I was in the middle of writing more on my CoCo Base-64 encoding series and stumbled upon some weirdness with the DATA command. Consider this silly program:

0 REM baddata.bas
10 READ A$:IF A$="" THEN END
15 PRINT A$;:GOTO 10
20 DATA ":"
30 DATA HELLO
40 DATA ""

This will print:

:HELLO

I know I could have just used DATA ":HELLO", but stay with me on this.

If you try to combine lines like this:

0 REM baddata.bas
10 READ A$:IF A$="" THEN END
15 PRINT A$;:GOTO 10
20 DATA ":":DATA HELLO
40 DATA ""

…you get this:

[Screenshot: Color BASIC DATA quirk.]

When I get a moment, I’ll have to look at the ROM disassembly and see what is going on.

Until then…

9 thoughts on “Color BASIC “DATA” quirk.”

  1. MiaM

    IIRC this works the same way on the 6502 versions of Microsoft Basic. At least the execution part, not sure about the tokenizer part.

    It looks like a bug but is probably an intended feature, or somewhere in between.

    Compare the REM statement which behaves the same way in that you can’t have a colon and additional statements.

    The tokenizer storing DATA as a token rather than as the individual letters D A T A surely is a bug, at least if the execution part is intentional.

    No matter how we see it, it is a discrepancy between the tokenizer and the execution part. Perhaps their internal documentation wasn’t good enough, so when it got ported to the 6809 the execution part and the tokenizer part were implemented in different ways?

  2. RogelioP

    Looks like the parsing of the combined DATA line in the second example goes to la-la land. The DATA statement token is decimal 134, so when it is read on that line after the colon it is treated as the character code for graphics character 134, which is what gets printed before HELLO.

    Interesting bug. I would probably never have seen it, as I usually never pack more than one DATA statement per line in my BASIC coding shenanigans :-)
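
The token observation above can be pictured at the byte level. This is a minimal Python sketch, assuming only the token value 134 quoted in the comment. The function `crunch_data_keywords` is hypothetical and far simpler than the real ROM tokenizer (it turns every unquoted DATA keyword into one byte), and it does not model how READ ends up looking at that byte.

```python
# Hypothetical model of a tokenized line. Only the DATA token value (134)
# comes from the comment above; everything else is illustrative.
DATA_TOKEN = 134  # 0x86, per the comment

def crunch_data_keywords(text):
    """Replace each DATA keyword found outside quotes with the one-byte token."""
    out = bytearray()
    in_quotes = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            in_quotes = not in_quotes
        if not in_quotes and text.startswith("DATA", i):
            out.append(DATA_TOKEN)  # keyword collapses to a single byte
            i += 4
            continue
        out.append(ord(ch))  # everything else is stored as its character code
        i += 1
    return bytes(out)

line = crunch_data_keywords('DATA ":":DATA HELLO')
print(list(line))
```

In this model the byte right after the colon separator is 134, followed by a space and HELLO. If those bytes are taken as string data instead of as a new statement, character 134 would show up ahead of HELLO.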

    1. Allen Huffman Post author

      See my post in the Facebook CoCo group… (10 DATA "PASSWORD":REM a password) and many other variations won’t work. It seems nothing can follow DATA on a line other than more DATA.

    2. Johann Klasek

      CBM BASIC (C64, 6502 based) does not have this bug/behavior, even though it shares the same MS BASIC roots. It seems the tokenizer for the CoCo’s Extended BASIC is a completely new implementation (introducing a lot of strange things). It seems to me that the handling of quoted strings is kind of wrong.

        1. xotmatrix

          Both examples behave the same in Applesoft BASIC, first reading a one-byte string (a colon) and then a 5-byte string (“HELLO”), then a zero length string ending the loop.

          Atari BASIC is a bit different. Strings are not expected to be quoted, so they are read completely, quotes and all.

          This:
          20 DATA ":":DATA HELLO
          … is read as:
          ":":DATA HELLO

          Because of this, the loop cannot terminate, because the last string read is a pair of quotes, not an empty string. Another difference is that string memory must be allocated before use. For example, DIM A$(10) reserves 10 characters. Trying to assign more than that cuts off the end of the string.
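
The Atari behaviour described above can be modelled in a few lines. This is a rough Python sketch, not Atari BASIC itself. The helper names `atari_data_items` and `assign_dimmed` are made up for illustration, and splitting items on commas only is an assumption about Atari BASIC that the comment does not state.

```python
def atari_data_items(statement_text):
    """Split the text after DATA into items.

    Assumption: only commas separate items, so quotes and colons are
    ordinary characters and the statement runs to the end of the line.
    """
    return statement_text.split(",")

def assign_dimmed(value, dim_length):
    """Model assigning to a string DIMed to dim_length characters:
    anything past the reserved length is cut off."""
    return value[:dim_length]

# Line 20 from the combined example: the whole tail is one item.
items_20 = atari_data_items('":":DATA HELLO')

# Line 40: the item is a pair of quote characters, not an empty string,
# so a test like A$="" never becomes true.
items_40 = atari_data_items('""')

# With DIM A$(10), only the first 10 characters survive.
stored = assign_dimmed(items_20[0], 10)

print(items_20, items_40, stored)
```

Under these assumptions line 20 yields a single 14-character item, and line 40 yields two quote characters, which is why the loop never sees an empty string.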

          1. Allen Huffman Post author

            Interesting on the DIM usage. Was there no string array concept? I noticed CBM BASIC seems to allocate strings until it can’t, versus Color BASIC having 200 bytes by default and you have to set the limit.

  3. Pingback: Exploring Atari VCS/2600 Adventure – part 4 | Sub-Etha Software

  4. Pingback: DATA problems revisited | Sub-Etha Software
