Hacking the Color BASIC PRINT command – part 4

See Also: part 1, part 2, part 3, part 4, part 5 and more to come…

One thing that bugs me about this series is the title. We are not actually hacking the PRINT command. Instead, we are changing the CHROUT character output routine. Close enough, I guess, since PRINT uses this routine…

Now let’s take a tangent to my “Attract Screen” series of posts from a few years ago, specifically the final installment:

Color BASIC Attract Screen – part 6

In that series, I was playing with ways to create the classic CoCo attract screens with the colored blocks that went around the screen:

War by James Garon (title screen)

In that article, Robert Gault had pointed me to this specific “WAR!” program sold by Radio Shack. It had embedded stuff in a PRINT command. I dissected those lines. I found some contained embedded CHR$ graphics characters, and another contained an assembly language routine. Very cool BASIC hacking.

In BASIC, there is an “uncrunch” routine that converts tokenized BASIC to full-text output for the LIST command. LIST will not have a good time LISTing such programs. Any byte with the high bit set (a value of 128-255) is seen as a token, and the uncrunch routine takes over to convert that one (or two) byte sequence to a keyword.

Instead of showing a CHR$(128) graphics block, LIST will show the token keyword that is 128.
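
As a rough illustration of why this happens, here is a small Python model (not CoCo code) of a detokenizer that expands every byte with the high bit set, whether or not it sits inside quotes. The TOKENS table is a stand-in, not the ROM's table: $84 = ELSE comes from the ROM listing later in this post, while $80 = FOR and $87 = PRINT are my assumptions about the Color BASIC token numbering.

```python
# Toy model of a stock LIST: any byte >= 128 is expanded as a token.
# ASSUMPTION: 0x80 = FOR and 0x87 = PRINT are assumed token values;
# 0x84 = ELSE is taken from the ROM listing in this post.
TOKENS = {0x80: "FOR", 0x84: "ELSE", 0x87: "PRINT"}

def stock_list(line_bytes):
    """Expand a tokenized line to text, ignoring quotes entirely."""
    out = []
    for b in line_bytes:
        if b & 0x80:
            out.append(TOKENS.get(b, "!"))  # this model shows unknown tokens as "!"
        else:
            out.append(chr(b))
    return "".join(out)

# PRINT "<graphics block 128>" stored as: PRINT token, quote, 128, quote
line = bytes([0x87, 0x22, 0x80, 0x22])
listed = stock_list(line)
print(listed)
```

Under those assumptions, the graphics block inside the quotes comes out as a keyword instead of a character.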

This led me to wonder if I could create a patch for BASIC that would allow it to properly LIST lines with embedded characters like this. And I did. Here is the test program I wrote back then and completely forgot about until a few days ago.

UPPERCASE stuff is taken from the Color BASIC ROM code. You will see I had to clone most of the token parsing code in my program so I could modify its behavior.

* lwasm uncrunch.asm -fbasic -ouncrunch.bas --map
*
* 0.00 2022-07-04 allenh - initial klunky version.
*

* Allow LIST to display graphics characters inside of quoted strings.

RVEC24  equ $1A6     UNCRUNCH BASIC LINE RAM hook

COMVEC  EQU $0120    Some BASIC locations we need.
LINBUF  EQU $02DC
SKP2    EQU $8C
LBUFMX  EQU 250

    org $3f00

init
    lda RVEC24      get op code
    sta savedrvec   save it
    ldx RVEC24+1    get address
    stx savedrvec+1 save it

    lda #$7e        op code for JMP
    sta RVEC24      store it in RAM hook
    ldx #newcode    address of new code
    stx RVEC24+1    store it in RAM hook
    rts             done

newcode
* UNCRUNCH A LINE INTO BASIC'S LINE INPUT BUFFER
LB7C2
    clr     AREWEQUOTED
    *JSR    >RVEC24     HOOK INTO RAM
    LEAS    2,S         Remove JSR from stack
    LEAX    4,X         MOVE POINTER PAST ADDRESS OF NEXT LINE AND LINE NUMBER
    LDY     #LINBUF+1   UNCRUNCH LINE INTO LINE INPUT BUFFER
LB7CB
    LDA     ,X+         GET A CHARACTER
    LBEQ    LB820       BRANCH IF END OF LINE

    * Check for quote/unquote
    cmpa    #34         Is A a quote character?
    bne     quotedone

togglequote
    tst     AREWEQUOTED
    bne     quoteoff
quoteon
    inc     AREWEQUOTED
    bra     quotedone
quoteoff
    clr     AREWEQUOTED Toggle quote mode off.

quotedone
    tst     AREWEQUOTED
    beq     notquoted

quoted
    * If we are quoted, just store whatever it is.
    lda     -1,x

    CMPY    #LINBUF+LBUFMX  TEST FOR END OF LINE INPUT BUFFER
    BCC     LB820   BRANCH IF AT END OF BUFFER
    *ANDA   #$7F    MASK OFF BIT 7
    STA     ,Y+     * SAVE CHARACTER IN BUFFER AND
    CLR     ,Y      * CLEAR NEXT CHARACTER SLOT IN BUFFER
    BRA     LB7CB   GET ANOTHER CHARACTER

notquoted
    lda     -1,x

    LBMI    LB7E6   BRANCH IF IT'S A TOKEN
    CMPA    #':     CHECK FOR END OF SUB LINE
    BNE     LB7E2   BRNCH IF NOT END OF SUB LINE
    LDB     ,X      GET CHARACTER FOLLOWING COLON
    CMPB    #$84    TOKEN FOR ELSE?
    BEQ     LB7CB   YES - DON'T PUT IT IN BUFFER
    CMPB    #$83    TOKEN FOR REMARK?
    BEQ     LB7CB   YES - DON'T PUT IT IN BUFFER
    FCB     SKP2    SKIP TWO BYTES
LB7E0
    LDA     #'!     EXCLAMATION POINT
LB7E2
    BSR     LB814   PUT CHARACTER IN BUFFER
    BRA     LB7CB   GET ANOTHER CHARACTER
* UNCRUNCH A TOKEN
LB7E6
    LDU     #COMVEC-10  FIRST DO COMMANDS
    CMPA    #$FF    CHECK FOR SECONDARY TOKEN
    BNE     LB7F1   BRANCH IF NON SECONDARY TOKEN
    LDA     ,X+     GET SECONDARY TOKEN
    LEAU    5,U     BUMP IT UP TO SECONDARY FUNCTIONS
LB7F1
    ANDA    #$7F    MASK OFF BIT 7 OF TOKEN
LB7F3
    LEAU    10,U    MOVE TO NEXT COMMAND TABLE
    TST     ,U      IS THIS TABLE ENABLED?
    BEQ     LB7E0   NO - ILLEGAL TOKEN
LB7F9
    SUBA    ,U      SUBTRACT THE NUMBER OF TOKENS FROM THE CURRENT TOKEN NUMBER
    BPL     LB7F3   BRANCH IF TOKEN NOT IN THIS TABLE
    ADDA    ,U      RESTORE TOKEN NUMBER RELATIVE TO THIS TABLE
    LDU     1,U     POINT U TO COMMAND DICTIONARY TABLE
LB801
    DECA            DECREMENT TOKEN NUMBER
    BMI     LB80A   BRANCH IF THIS IS THE CORRECT TOKEN
* SKIP THROUGH DICTIONARY TABLE TO START OF NEXT TOKEN
LB804
    TST     ,U+     GRAB A BYTE
    BPL     LB804   BRANCH IF BIT 7 NOT SET
    BRA     LB801   GO SEE IF THIS IS THE CORRECT TOKEN
LB80A
    LDA     ,U      GET A CHARACTER FROM DICTIONARY TABLE
    BSR     LB814   PUT CHARACTER IN BUFFER
    TST     ,U+     CHECK FOR START OF NEXT TOKEN
    BPL     LB80A   BRANCH IF NOT DONE WITH THIS TOKEN
    BRA     LB7CB   GO GET ANOTHER CHARACTER
LB814
    CMPY    #LINBUF+LBUFMX  TEST FOR END OF LINE INPUT BUFFER
    BCC     LB820   BRANCH IF AT END OF BUFFER
    ANDA    #$7F    MASK OFF BIT 7
    STA     ,Y+     * SAVE CHARACTER IN BUFFER AND
    CLR     ,Y      * CLEAR NEXT CHARACTER SLOT IN BUFFER
LB820
    RTS

* Unused at the moment.

savedrvec   rmb 3   call regular RAM hook
    rts             just in case...

AREWEQUOTED rmb 1

    end     $3f00

And here is a BASIC loader for this routine:

10 READ A,B
20 IF A=-1 THEN 70
30 FOR C = A TO B
40 READ D:POKE C,D
50 NEXT C
60 GOTO 10
70 EXEC 16128
80 DATA 16128,16290,182,1,166,183,63,163,190,1,167,191,63,164,134,126,183,1,166,142,63,24,191,1,167,57,127,63,167,50,98,48,4,16,142,2,221,166,128,16,39,0,121,129,34,38,13,125,63,167,38,5,124,63,167,32,3,127,63,167,125,63,167,39,14,166,31,16,140
90 DATA 3,214,36,91,167,160,111,164,32,214,166,31,16,43,0,21,129,58,38,13,230,132,193,132,39,198,193,131,39,194,140,134,33,141,48,32,187,206,1,22,129,255,38,4,166,128,51,69,132,127,51,74,109,196,39,231,160,196,42,246,171,196,238,65,74,43,6,109
100 DATA 192,42,252,32,247,166,196,141,6,109,192,42,248,32,141,16,140,3,214,36,6,132,127,167,160,111,164,57,16294,16294,57,-1,-1
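
The loader's DATA format is a start address, an end address, then one byte for each address in that range, repeated until a -1,-1 pair. Here is a rough Python model of lines 10-60 to show the flow. The sample data is cut down for illustration: the first range holds only the first three bytes of the real DATA, and the memory is just a dictionary.

```python
def load(data, mem):
    """Model of BASIC lines 10-60: POKE each range until start is -1."""
    it = iter(data)
    while True:
        start, end = next(it), next(it)      # 10 READ A,B
        if start == -1:                      # 20 IF A=-1 THEN 70
            return
        for addr in range(start, end + 1):   # 30 FOR C = A TO B
            mem[addr] = next(it)             # 40 READ D:POKE C,D

# Abbreviated sample: three bytes at 16128, the lone RTS at 16294, then the end marker.
sample = [16128, 16130, 182, 1, 166, 16294, 16294, 57, -1, -1]
mem = {}
load(sample, mem)
```

Splitting the data into ranges is what lets the loader skip over reserved bytes (the rmb areas) without POKEing anything into them.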

If you RUN that, line 70 installs the patch with EXEC 16128 (which is &H3F00), and you can now LIST and see characters embedded in strings:

I bring this up now because 1) I forgot to post about it back in 2022, and 2) because I think I want to do something similar with my cursor movement PRINT patch. Ideally, you should be able to LIST a program and see the original characters in it, and just have them move around when the program is running and PRINTs those characters. This matches how the VIC-20 worked with embedded characters inside a PRINT:

Screenshot from the JavaScript Vic20 Emulator: https://www.mdawson.net/vic20chrome/vic20.php

Since I do not remember how this worked, I thought we could go through this program and see what it does.

* lwasm uncrunch.asm -fbasic -ouncrunch.bas --map
*
* 0.00 2022-07-04 allenh - initial klunky version.
*

* Allow LIST to display graphics characters inside of quoted strings.

RVEC24  equ $1A6     UNCRUNCH BASIC LINE RAM hook

COMVEC  EQU $0120    Some BASIC locations we need.
LINBUF  EQU $02DC
SKP2    EQU $8C
LBUFMX  EQU 250

The start is very similar to the consmove.asm code presented in the first parts of this series. RVEC24 is the RAM hook for the UNCRUNCH routine. Microsoft made this available so future BASICs (like Extended, Disk, etc.) could support LISTING their new tokens, I assume.

LINBUF is the BASIC input buffer. This is where things go when you are typing at the OK prompt.

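SKP2 is a trick Microsoft used in the ROM as a shortcut to skip two bytes. It is $8C, the opcode for “CMPX #” (compare X with a 16-bit immediate value). When the CPU falls into that byte, it treats the next two bytes as the value to compare against rather than as an instruction, so the two-byte instruction that follows is skipped and only the condition codes change. Here it skips the LDA #'! at LB7E0, so a plain colon goes into the buffer instead of being replaced by an exclamation point. It costs one byte where a BRA would cost two. Can anyone explain this better (or more accurately)? Leave a comment if you can.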
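
To see the skip in action, here is a small Python sketch that decodes the five bytes the loader's DATA holds for FCB SKP2 / LDA #'! / BSR LB814 (140,134,33,141,48). The decoder only knows these three opcodes and their 6809 sizes. It is a model for illustration, not an emulator.

```python
# Minimal decoder for just the opcodes involved; sizes are for the 6809.
OPCODES = {
    0x8C: ("CMPX #$", 3),  # compare X with a 16-bit immediate
    0x86: ("LDA #$", 2),   # load A with an 8-bit immediate
    0x8D: ("BSR $", 2),    # branch to subroutine, 8-bit offset
}

def decode(code, start):
    """Return the instructions seen when entering `code` at offset `start`."""
    seen, pc = [], start
    while pc < len(code):
        name, size = OPCODES[code[pc]]
        operand = code[pc + 1:pc + size].hex().upper()
        seen.append(name + operand)
        pc += size
    return seen

code = bytes([0x8C, 0x86, 0x21, 0x8D, 0x30])  # FCB SKP2 / LDA #'! / BSR LB814

fall_through = decode(code, 0)  # falling in from the colon checks
via_label = decode(code, 1)     # branching to LB7E0 (illegal token)
print(fall_through)
print(via_label)
```

Entering at the $8C swallows the LDA as an operand, so A still holds the colon when the BSR runs. Entering one byte later, at LB7E0, loads the exclamation point first.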

And lastly there is a define for the max size of the line buffer – 250 bytes. I have not looked into why it is 250 instead of, say, 255, but this is as many characters as you can type before input stops and you can only backspace or press ENTER.

    org $3f00

init
    lda RVEC24      get op code
    sta savedrvec   save it
    ldx RVEC24+1    get address
    stx savedrvec+1 save it

    lda #$7e        op code for JMP
    sta RVEC24      store it in RAM hook
    ldx #newcode    address of new code
    stx RVEC24+1    store it in RAM hook
    rts             done

This is the code that installs the RAM hook. It should look close to what I have in the previous routine, just with a different vector.
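
The install step can be pictured with a short Python model that uses a dictionary as memory. RVEC24 comes from the listing, and the newcode address of $3F18 comes from the loader's DATA (142,63,24 is ldx #$3F18). The starting contents of the hook, three RTS bytes, are my assumption about how an unused hook looks.

```python
RVEC24 = 0x01A6      # UNCRUNCH RAM hook, from the listing
NEWCODE = 0x3F18     # address of `newcode`, from the loader's DATA
JMP_EXTENDED = 0x7E  # 6809 opcode for JMP to a 16-bit address

def install_hook(mem, hook, target):
    """Save the 3 bytes at `hook`, then overwrite them with JMP target."""
    saved = [mem[hook], mem[hook + 1], mem[hook + 2]]
    mem[hook] = JMP_EXTENDED
    mem[hook + 1] = target >> 8    # the 6809 is big-endian: high byte first
    mem[hook + 2] = target & 0xFF
    return saved

# ASSUMPTION: an unused hook holds three RTS ($39) bytes before patching.
mem = {RVEC24: 0x39, RVEC24 + 1: 0x39, RVEC24 + 2: 0x39}
saved = install_hook(mem, RVEC24, NEWCODE)
patched = [mem[RVEC24 + i] for i in range(3)]
```

The saved bytes are what savedrvec holds in the assembly, which is what you would jump through later to chain to whatever was in the hook before.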

Now we get to the new code. For this I needed to duplicate some code in the BASIC ROM. The code in UPPERCASE represents code I brought in from the Unravelled disassembly listing. The ROM code has no provision for doing something different if it is a quoted string, so I needed to use the original code, with some extra code around it that turns the detokenizer on or off depending on whether we are listing something within quotes.

Things are about to get messy…

newcode
* UNCRUNCH A LINE INTO BASIC'S LINE INPUT BUFFER
LB7C2
    clr     AREWEQUOTED
    *JSR    >RVEC24     HOOK INTO RAM
    LEAS    2,S         Remove JSR from stack
    LEAX    4,X         MOVE POINTER PAST ADDRESS OF NEXT LINE AND LINE NUMBER
    LDY     #LINBUF+1   UNCRUNCH LINE INTO LINE INPUT BUFFER
LB7CB
    LDA     ,X+         GET A CHARACTER
    LBEQ    LB820       BRANCH IF END OF LINE

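This code is mostly from the BASIC ROM except for that first “clr”. I reserve a byte of memory to use as a flag for when we are or are not parsing something in quotes. If a quote is seen, that flag is set. It stays set until another quote is seen or the end of line is reached. The commented-out JSR is the line in the original ROM that jumps to this hook. I kept it in for clarity, but it is not used in my function. Because the ROM got here with that JSR, its return address is still on the stack, so the LEAS 2,S throws it away. When my copy finishes, the final RTS then returns to whoever called the ROM’s uncrunch routine instead of dropping back into the ROM’s own copy of this loop.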

Now we have my custom code. If we are not at the end of line, I check for quotes and turn the flag on or off:

    * Check for quote/unquote
    cmpa    #34         Is A a quote character?
    bne     quotedone

togglequote
    tst     AREWEQUOTED
    bne     quoteoff
quoteon
    inc     AREWEQUOTED
    bra     quotedone
quoteoff
    clr     AREWEQUOTED Toggle quote mode off.
quotedone
    tst     AREWEQUOTED
    beq     notquoted

This flag will be used in a moment. I clone some of the ROM code that outputs the un-crunched line but will bypass the un-crunch for items within quotes.
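
Here is the toggle logic as a tiny Python model, recording the state of the flag at the moment each character gets stored. The function name is made up for this sketch.

```python
def quoted_flags(text):
    """For each character, report whether the flag is set when it is stored."""
    quoted = False               # clr AREWEQUOTED at the start of the line
    flags = []
    for ch in text:
        if ch == '"':
            quoted = not quoted  # same effect as the inc/clr pair
        flags.append(quoted)     # tst AREWEQUOTED picks the path
    return flags

flags = quoted_flags('A"BC"D')
print(flags)
```

One thing the model makes visible: the opening quote is handled by the quoted path and the closing quote by the not-quoted path. Both paths store the quote character, so it does not matter.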

quoted
    * If we are quoted, just store whatever it is.
    lda     -1,x

    CMPY    #LINBUF+LBUFMX  TEST FOR END OF LINE INPUT BUFFER
    BCC     LB820   BRANCH IF AT END OF BUFFER
    *ANDA   #$7F    MASK OFF BIT 7
    STA     ,Y+     * SAVE CHARACTER IN BUFFER AND
    CLR     ,Y      * CLEAR NEXT CHARACTER SLOT IN BUFFER
    BRA     LB7CB   GET ANOTHER CHARACTER

For quoted items, A is loaded with a character. Then the ROM code runs but I commented out the thing that would mask off bit 7. Tokens have that bit set, but I wanted to leave high-bit bytes alone.
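
The difference between the two store paths is small enough to model in a few lines of Python. This is a sketch, with a list standing in for LINBUF. The full-buffer test follows the CMPY/BCC pair, with Y starting at LINBUF+1. The ROM path needs the mask because, as the lookup loop in the listing shows, the last letter of each keyword in the dictionary has bit 7 set as an end marker.

```python
LBUFMX = 250  # buffer limit from the listing

def store(buf, byte, quoted):
    """Model of the two store paths; returns False once the buffer is full."""
    if len(buf) >= LBUFMX - 1:  # CMPY #LINBUF+LBUFMX / BCC, Y starts at LINBUF+1
        return False
    if not quoted:
        byte &= 0x7F            # ANDA #$7F: the ROM path clears bit 7
    buf.append(byte)            # STA ,Y+ (the CLR ,Y terminator is implied)
    return True

rom_path, patched_path = [], []
store(rom_path, ord("R") | 0x80, quoted=False)  # last letter of a keyword
store(rom_path, 0x80, quoted=False)             # what the mask would do to a graphics byte
store(patched_path, 0x80, quoted=True)          # graphics byte kept as-is
```

With the mask commented out in the quoted path, a graphics byte such as 128 reaches the buffer unchanged.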

This batch of code is more Color BASIC ROM code I duplicated here after that first “lda” line:

notquoted
    lda     -1,x

    LBMI    LB7E6   BRANCH IF IT'S A TOKEN
    CMPA    #':     CHECK FOR END OF SUB LINE
    BNE     LB7E2   BRNCH IF NOT END OF SUB LINE
    LDB     ,X      GET CHARACTER FOLLOWING COLON
    CMPB    #$84    TOKEN FOR ELSE?
    BEQ     LB7CB   YES - DON'T PUT IT IN BUFFER
    CMPB    #$83    TOKEN FOR REMARK?
    BEQ     LB7CB   YES - DON'T PUT IT IN BUFFER
    FCB     SKP2    SKIP TWO BYTES
LB7E0
    LDA     #'!     EXCLAMATION POINT
LB7E2
    BSR     LB814   PUT CHARACTER IN BUFFER
    BRA     LB7CB   GET ANOTHER CHARACTER

* UNCRUNCH A TOKEN
LB7E6
    LDU     #COMVEC-10  FIRST DO COMMANDS
    CMPA    #$FF    CHECK FOR SECONDARY TOKEN
    BNE     LB7F1   BRANCH IF NON SECONDARY TOKEN
    LDA     ,X+     GET SECONDARY TOKEN
    LEAU    5,U     BUMP IT UP TO SECONDARY FUNCTIONS
LB7F1
    ANDA    #$7F    MASK OFF BIT 7 OF TOKEN
LB7F3
    LEAU    10,U    MOVE TO NEXT COMMAND TABLE
    TST     ,U      IS THIS TABLE ENABLED?
    BEQ     LB7E0   NO - ILLEGAL TOKEN
LB7F9
    SUBA    ,U      SUBTRACT THE NUMBER OF TOKENS FROM THE CURRENT TOKEN NUMBER
    BPL     LB7F3   BRANCH IF TOKEN NOT IN THIS TABLE
    ADDA    ,U      RESTORE TOKEN NUMBER RELATIVE TO THIS TABLE
    LDU     1,U     POINT U TO COMMAND DICTIONARY TABLE
LB801
    DECA            DECREMENT TOKEN NUMBER
    BMI     LB80A   BRANCH IF THIS IS THE CORRECT TOKEN
* SKIP THROUGH DICTIONARY TABLE TO START OF NEXT TOKEN
LB804
    TST     ,U+     GRAB A BYTE
    BPL     LB804   BRANCH IF BIT 7 NOT SET
    BRA     LB801   GO SEE IF THIS IS THE CORRECT TOKEN
LB80A
    LDA     ,U      GET A CHARACTER FROM DICTIONARY TABLE
    BSR     LB814   PUT CHARACTER IN BUFFER
    TST     ,U+     CHECK FOR START OF NEXT TOKEN
    BPL     LB80A   BRANCH IF NOT DONE WITH THIS TOKEN
    BRA     LB7CB   GO GET ANOTHER CHARACTER
LB814
    CMPY    #LINBUF+LBUFMX  TEST FOR END OF LINE INPUT BUFFER
    BCC     LB820   BRANCH IF AT END OF BUFFER
    ANDA    #$7F    MASK OFF BIT 7
    STA     ,Y+     * SAVE CHARACTER IN BUFFER AND
    CLR     ,Y      * CLEAR NEXT CHARACTER SLOT IN BUFFER
LB820
    RTS

* Unused at the moment.

savedrvec   rmb 3   call regular RAM hook
    rts             just in case...

AREWEQUOTED rmb 1

    end     $3f00
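
The token lookup at LB7F1 through LB80A is the densest part, so here is a Python model of it for primary tokens only. Each table is reduced to a (count, dictionary) pair, and the keyword lists are made-up, shortened examples, not the real ROM tables. The second table stands in for one that a later BASIC would add.

```python
def make_dictionary(words):
    """Pack keywords the way the listing expects: last char has bit 7 set."""
    out = bytearray()
    for w in words:
        raw = w.encode("ascii")
        out += raw[:-1] + bytes([raw[-1] | 0x80])
    return bytes(out)

def lookup(tables, token):
    """Model of LB7F1..LB80A: find the keyword for a primary token byte."""
    n = token & 0x7F                    # ANDA #$7F
    for count, dictionary in tables:    # LEAU 10,U: step to the next table
        if count == 0:
            return "!"                  # table not enabled: illegal token
        n -= count                      # SUBA ,U
        if n >= 0:
            continue                    # BPL: token lives in a later table
        n += count                      # ADDA ,U: index within this table
        pos = 0
        while n > 0:                    # DECA / BMI: skip n keywords
            while dictionary[pos] < 0x80:
                pos += 1                # TST ,U+ / BPL: find the marked last char
            pos += 1
            n -= 1
        word = []
        while True:                     # copy chars up to the bit-7 marker
            word.append(chr(dictionary[pos] & 0x7F))
            if dictionary[pos] & 0x80:
                return "".join(word)
            pos += 1
    return "!"

# HYPOTHETICAL tables for illustration only.
tables = [
    (3, make_dictionary(["FOR", "GO", "REM"])),
    (2, make_dictionary(["DEL", "EDIT"])),
    (0, b""),
]
names = [lookup(tables, t) for t in (0x80, 0x82, 0x83, 0x84, 0x85)]
print(names)
```

Because the loop walks every enabled table in turn, a token that belongs to a later table is still found, as long as that table is registered.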

Looking at this now, I expect I had to duplicate all this ROM code because I need to stay in the un-crunch loop for the complete line. If you look at the ROM code and see a better approach, please leave a comment.

I also think this may be wrong, since I do not see it calling the original vector when complete. This may not work for tokens added by Extended or Disk BASIC. We will have to fix that at some point.

For now, I will just leave this here as an example, and then figure out how I could do something similar with my cursor move codes so they LIST showing the codes, and only move around the screen when PRINTing them.

This should be interesting… The way I patched CHROUT, it has no idea if the character is from LIST or PRINT. I will have to figure out how to solve that, next.

Until then…

One thought on “Hacking the Color BASIC PRINT command – part 4”

  1. William Astle

    By duplicating the uncrunch code, it should work just fine for ECB and Disk Basic tokens. But it won’t work for Coco3 tokens. That’s a less straight forward problem but you can hack around it.
