Word wrapping in Extended Color BASIC

2015/1/2 Update: Please see my follow-up article for two improved versions of this.

Last night, I began writing a program on my 1980s Radio Shack Color Computer. I am planning on going through hundreds of old floppy diskettes and copying them to disk image files.  The program I began writing will help automate this task.

I am using a new bit of retro hardware called the CoCoSDC. It acts like a floppy drive controller, but instead of writing to round pieces of magnetic plastic, it writes that data out to disk image files on an SD card. You can learn more about CoCoSDC in another series of articles I have been writing.

CoCoSDC is a brilliant piece of design work in that it detects and adjusts for the various models of Color Computer (1, 2 or 3) it is plugged into. I decided my program should do the same, so my goal is to have one BASIC program that can be loaded on an original 1980 Color Computer 1 or 2, using its 32 column display, or run on a 1986 Color Computer 3, using the 40 or 80 column text screen. There are a few challenges I needed to solve:

  • The program should work on an original Radio Shack TRS-80 Color Computer or an early model TRS-80 CoCo 2, which had a 32 column display with an uppercase-only character set. (On the Motorola 6847 VDG chip used by these machines, lowercase letters were represented by inverse uppercase characters.)
  • The program should take advantage of the later model Tandy Color Computer 2s that had an enhanced VDG chip that could display true lowercase. This option should still be available for earlier model CoCos which may have an aftermarket lowercase kit (like my old grey CoCo 1 has).
  • The program should work on a Tandy Color Computer 3, and let the user select between 32, 40 or 80 column display. This would allow it to work when hooked up to a crappy old TV display, or to show nice 80 column text for users who have a good monochrome composite monitor or RGB-A color monitor.

I quickly designed and wrote a functional prototype, missing only a few features I had in mind. I started out by hard coding a variable for the screen width so I could center text and such. For example:

WD = 32
A$ = "CENTER THIS!"
PRINT TAB(WD/2-LEN(A$)/2);A$

By making that PRINT line a subroutine, I could call it anywhere I wanted to print centered text:

10 WD = 32 'SET SCREEN WIDTH
20 CLS 'CLEAR SCREEN
30 A$="IT WORKS!":GOSUB 1000
40 END
1000 REM PRINT CENTERED A$
1010 PRINT TAB(WD/2-LEN(A$)/2);A$
1020 RETURN

CoCo centered text.

XRoar emulator acting as a Radio Shack Color Computer 1.

I also made a routine to display a horizontal row of dashes so I could use that with nicely formatted menus.
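The dash routine itself is not shown here, so the following is only a minimal sketch of how one might look, assuming Extended Color BASIC's STRING$ function and the same WD variable. The line number 2000 and the menu title are made up for illustration. The trailing semicolon is there because a row of exactly WD characters already moves the cursor to the next line on the 32 column screen.

```basic
10 WD=32 'SET SCREEN WIDTH
20 CLS
30 GOSUB 2000
40 A$="MAIN MENU":PRINT TAB(WD/2-LEN(A$)/2);A$
50 GOSUB 2000
60 END
2000 REM PRINT A ROW OF DASHES (HYPOTHETICAL ROUTINE)
2010 PRINT STRING$(WD,"-");
2020 RETURN
```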

Eventually, I needed to display some verbose text to the user, and knew I would need to make a routine to handle wrapping long lines. It’s easy to hard code PRINT statements when you know your screen width, but when it can vary, we can have the program do it for us.

Earlier in 2014 I implemented such a word wrap routine in C for a Raspberry Pi program I was designing. Unfortunately, porting that code to BASIC wouldn’t be possible since it took advantage of too many C features that have no BASIC equivalents. Instead, I had to design a new one.

Also, I wanted the option to pass in strings with upper and lower case and have them converted to uppercase. The only design restriction I enforced was to try not to do any string rebuilding. Interpreted BASICs have their own type of garbage collection (like you might find in Java and other modern languages) that does all kinds of memory shifting when you add two strings together. My program could be faster if it avoided that. (Not that it really matters, but good habits learned later in my programming career are difficult to break even when I am pretending it’s 1983.)

Here is the first version I came up with:

2015/1/5 Update: Please see my next article for two improved versions of this.

10 CLS
15 INPUT"SCREEN WIDTH [32]";WD:IF WD=0 THEN WD=32
20 INPUT"UPPERCASE ([0]=NO, 1=YES)";UC
25 TM=TIMER
30 A$="This is a string we want to word wrap. I wonder if I can make something that will wrap like I think it should?":GOSUB 1000
35 PRINT"TIME TAKEN:"TIMER-TM
999 END
1000 REM WORD WRAP
1001 '
1002 'IN : A$=MESSAGE
1003 '     UC=1 UPPERCASE
1004 '     WD=SCREEN WIDTH
1005 'OUT: LN=LINES PRINTED
1006 'MOD: ST, EN
1007 '
1010 IF A$="" THEN PRINT:LN=1:RETURN
1015 IF UC>0 THEN FOR ST=1 TO LEN(A$):EN=ASC(MID$(A$,ST,1)):IF EN<97 OR EN>122 THEN NEXT ST ELSE MID$(A$,ST,1)=CHR$(EN-32):NEXT ST
1020 LN=0:ST=1
1025 EN=LEN(A$)
1030 IF MID$(A$,ST,1)=" " AND ST<EN THEN ST=ST+1:GOTO 1030
1035 IF EN-ST>WD THEN FOR EN=ST+WD TO ST STEP-1:IF MID$(A$,EN,1)<>" " THEN NEXT EN ELSE EN=EN-1
1040 IF EN=ST THEN EN=ST+WD-1
1045 PRINT MID$(A$,ST,EN-ST+1);
1050 LN=LN+1
1055 IF EN-ST+1<WD THEN PRINT
1060 ST=EN+1:IF ST<LEN(A$) THEN 1025
1065 RETURN

Note: This is not my normal production programming style. I created this example for better readability, but in the future I will share some of the tricks we used at Sub-Etha to optimize BASIC programs to take up less memory and run faster. (They won’t be very readable…)

Word wrap, lowercase.

Word wrap, uppercase.

The subroutine at line 1000 expects the string to be in A$, the width of the display in WD, and the variable UC can be set to 1 to force the output to be in uppercase (which is slower). When it returns, LN will contain how many lines were printed. I also print out the TIMER so I can compare the speed of the routine, and see how much slower it is with uppercase conversion.

The screen shots show the wrapped output with and without uppercase conversion. You can see that converting a few lines of text the way I did it took about five times longer.

There is one extra feature I should mention. One thing that bothers me about many word-wrap routines I have seen is that they ignore the final column of a line: if you print a line that is exactly 40 characters long to a 40 column display, they usually wrap the last word even though it would have fit. They do this because, at least on all the systems I have experience with, printing to the final column moves the cursor to the start of the next line, and then the PRINT command finishes and adds its own carriage return, leaving a blank line. Here is an example of printing three 32 column lines on the CoCo:

10 CLS
20 PRINT"12345678901234567890123456789012"
30 PRINT"12345678901234567890123456789012"
40 PRINT"12345678901234567890123456789012"

You get something like this:

Printing without semicolon.

BASIC does not know that a line was skipped before it adds its own carriage return. You can prevent BASIC from adding a carriage return by adding a semicolon to the end of the PRINT line:

10 CLS
20 PRINT"12345678901234567890123456789012";
30 PRINT"12345678901234567890123456789012";
40 PRINT"12345678901234567890123456789012";

Printing with semicolon.

Most word wrap routines I have seen just don’t bother dealing with this, and never use the final character of the line. For my routine, I wanted a line that was exactly the length of the screen width to fit, so I always add the semicolon, then in line 1055 I print a carriage return if the line was shorter than the screen width (and thus BASIC didn’t already move the cursor to the next line).
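Pulled out of the word wrap routine, that idea can be sketched on its own. This is only an illustration, not the code from the listing: the line number 2000 is arbitrary, and it assumes A$ is never longer than WD and that the display wraps the cursor after the final column, as the 32 column screen does.

```basic
10 CLS:WD=32
20 A$="12345678901234567890123456789012":GOSUB 2000
30 A$="SHORT LINE":GOSUB 2000
40 A$="NO BLANK LINE ABOVE":GOSUB 2000
50 END
2000 REM PRINT A$ AS ONE SCREEN LINE (ASSUMES LEN(A$)<=WD)
2010 PRINT A$;
2020 IF LEN(A$)<WD THEN PRINT
2030 RETURN
```

The full-width line relies on the screen to move the cursor, while the shorter lines get an explicit carriage return.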

Little things like this make the code bulkier than it needs to be.

Now I turn things over to you… This can be done much better. How would you implement a word wrap? And note there are several goals that would require different code:

  • Code size may be most important.
  • Execution speed may be most important.
  • You might want to use as few variables as possible.
  • You might want to avoid string manipulation.
  • You might not be able to alter any variables passed in (thus, if the user passes you A$, you are not allowed to change A$).

For me, I wanted to avoid string manipulation (for speed) and use fewer variables. Without that goal, I could have written the word wrap routine more easily by making a copy of the string the user passed in and manipulating the copy. This is why my uppercase conversion makes changes to the string the user passed in. If I had the goal of not modifying what the user passed in, I would have had no choice other than making a copy in another string variable.
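For the uppercase step alone, there is a middle ground worth sketching: convert each character as it is printed, so A$ is never changed and no second string variable is needed. This is only a rough illustration that does no word wrapping. The variables I and C and the line number 3000 are made up for this example, and it will likely be slower than printing whole slices with MID$.

```basic
10 A$="Mixed Case Text":UC=1
20 GOSUB 3000
30 PRINT A$ 'A$ IS UNCHANGED
40 END
3000 REM PRINT A$, UPPERCASED IF UC<>0, WITHOUT CHANGING A$
3010 IF A$="" THEN PRINT:RETURN
3020 FOR I=1 TO LEN(A$)
3030 C=ASC(MID$(A$,I,1))
3040 IF UC<>0 AND C>96 AND C<123 THEN C=C-32
3050 PRINT CHR$(C);
3060 NEXT I
3070 PRINT
3080 RETURN
```

The range test in line 3040 only touches the codes for lowercase a through z (97 to 122), so punctuation passes through unchanged.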

The choice is yours. Send in your best attempt and explain your goal. Here are the requirements:

  • A$ will be set to the string the user wishes to display.
  • WD will be set to the screen width.
  • UC will be 0 if the string is to be displayed as-is, or not 0 to convert to uppercase.
  • On return, LN will be the count of how many lines were displayed.

If you need a quick and easy emulator to run it in, check out XRoar. I have tips on how to get it running over at CoCopedia:

http://www.cocopedia.com/wiki/index.php/Using_XRoar

Have fun!

P.S. This example requires Extended Color BASIC. As it turns out, while the 1980 Color BASIC supports MID$(), you cannot use it to modify a string. Thus, A$=MID$(B$,1,1) would work, but MID$(B$,1,1)="N" would not. My first CoCo had Extended Color BASIC so I never had to live without the extra features.
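Here is a small sketch of the two forms side by side, using made-up values. Under Extended Color BASIC it should leave A$ holding Y and B$ holding NES. On plain Color BASIC, going by the note above, line 30 would be expected to stop with an error.

```basic
10 B$="YES"
20 A$=MID$(B$,1,1) 'FUNCTION FORM: READS ONE CHARACTER
30 MID$(B$,1,1)="N" 'STATEMENT FORM: REPLACES ONE CHARACTER
40 PRINT A$;" ";B$
```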