More BASIC word wrap versions

Welcome to another exciting installment of text word wrapping in Color BASIC! This time, I will share another word wrap submission, then compare all four versions in speed and code size.

Be sure to check out part 1, part 2, and part 3.

I originally shared some BASIC code I was writing to do word wrapping of text. My program needed to run on 32, 40 or 80 columns on the CoCo, and I did not want to hard code versions of every screen for each screen width.

CoCo/MC-10 BASIC wiz Jim Gerrie passed along his three-line version, and now CoCoSDC designer Darren Atkinson shares some more tweaks. He writes:

Hi Allen,

Here is a submission for your Word Wrap article. It’s not my own design. I just took the liberty of making a couple changes to Jim Gerrie’s code:

1. It will now use the last column in a line.
2. It runs a bit faster.

By focusing on speed, the trade-off is somewhat larger code size and the allocation of a few more variables.

– Darren

His update looks like this:

0 CLEAR200:DIMCC,VP,C1,LN,SP:GOTO10
1 C1=1:CC=WD+2:VP=VARPTR(M$):VP=PEEK(VP+2)*256+PEEK(VP+3)-1:LN=LEN(M$)-1:SP=32
2 CC=CC-1:IFCC<LN ANDPEEK(VP+CC)<>SP ANDCC>C1 THEN2ELSEC2=CC-C1:IFCC=C1 THENC2=WD:CC=C1+WD-1
3 PRINTMID$(M$,C1,C2);:C1=CC+1:CC=C1+WD:IFC2<>WD ORLN+1<WD THENPRINT
4 IFC1<LN THEN2ELSERETURN
10 CLS
30 INPUT"SCREEN WIDTH [32]";WD
40 IF WD=0 THEN WD=32
50 INPUT"UPPERCASE ([0]=NO, 1=YES)";UC
60 TIMER=0:TM=TIMER
70 PRINT "SHORT STRING:"
80 M$="This should not need to wrap.":GOSUB 1
90 PRINT "LONG STRING:"
100 M$="This is a string we want to word wrap. I wonder if I can make something that will wrap like I think it should?":GOSUB 1
110 PRINT "WORD > WIDTH:"
120 M$="123456789012345678901234567890123 THAT WAS TOO LONG TO FIT BUT THIS IS EVEN LONGER ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234 SO THERE.":GOSUB 1
130 PRINT"TIME TAKEN:"TIMER-TM
140 END
Word-wrap by Darren Atkinson.

His version is four lines of actual code (1-4) and uses a few more variables. He is using a technique I had not seen before: rather than checking characters with MID$(), he gets the address of the string data using VARPTR() and then PEEKs memory directly. I will have to benchmark this and see the time differences. His version clocks in with a count of 192.
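Here is a minimal sketch of how that benchmark could be set up. It counts the spaces in a string twice, once with MID$() and once with PEEK, and prints the TIMER count for each. The address math is copied from Darren's line 1 (bytes 2 and 3 of what VARPTR returns hold the address of the string data); the variable names L, AD, N and I and the test string are just picks for illustration, and no timings are claimed here.

```basic
10 A$="THIS IS A STRING WE WANT TO WORD WRAP."
20 L=LEN(A$)
30 REM AD+1 IS THE ADDRESS OF THE FIRST CHARACTER
40 VP=VARPTR(A$):AD=PEEK(VP+2)*256+PEEK(VP+3)-1
50 REM COUNT SPACES USING MID$
60 TIMER=0:N=0
70 FOR I=1 TO L:IF MID$(A$,I,1)=" " THEN N=N+1
80 NEXT
90 PRINT "MID$:";N;"SPACES, TIME";TIMER
100 REM COUNT SPACES USING PEEK
110 TIMER=0:N=0
120 FOR I=1 TO L:IF PEEK(AD+I)=32 THEN N=N+1
130 NEXT
140 PRINT "PEEK:";N;"SPACES, TIME";TIMER
```

Both loops should report the same number of spaces, so any difference in the TIMER values comes from MID$() versus PEEK. The address is fetched right after the string is assigned and before any other string work, since BASIC can move string data around.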

Let’s see how all four versions stack up, from fastest to slowest.

  1. My version 2 (LEFT$/RIGHT$): 107
  2. My version 1 (MID$): 114
  3. Darren Atkinson’s four line version: 192
  4. Jim Gerrie’s three line 3rd version: 248

That takes care of speed, but what about memory usage? In order to do a true apples-to-apples comparison, I need to alter my versions so they use the same starting line number as Jim’s and Darren’s. Jim moves subroutines to the start of the program so they are found quicker. He also numbered by 1 as a space-saving step (GOSUB1 takes three bytes less than GOSUB1000). Since my versions are too long to fit before the start of the test code at line 10, I am going to make the test code start at 100, and GOSUB to line 1 for the word wrap routine. The new test program will look like this:

100 CLS
110 INPUT"SCREEN WIDTH [32]";WD
120 IF WD=0 THEN WD=32
130 INPUT"UPPERCASE ([0]=NO, 1=YES)";UC
140 TIMER=0:TM=TIMER
150 PRINT "SHORT STRING:"
160 A$="THIS SHOULD NOT NEED TO WRAP.":GOSUB 1
170 PRINT "LONG STRING:"
180 A$="This is a string we want to word wrap. I wonder if I can make something that will wrap like I think it should?":GOSUB 1
190 PRINT "WORD > WIDTH:"
200 A$="123456789012345678901234567890123 THAT WAS TOO LONG TO FIT BUT THIS IS EVEN LONGER ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234 SO THERE.":GOSUB 1
210 PRINT"TIME TAKEN:"TIMER-TM
220 END

Line 0 will have any needed CLEAR command (for extra string space), and could use DIM to pre-declare any variables used later. Then it should GOTO 100 to start the test routine. The word wrap routine will start at line 1.
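A minimal sketch of such a line 0, modeled on the one in Darren's listing. The DIM part is optional, and the variables listed (the ones my version 1 uses) are only an example:

```basic
0 CLEAR200:DIM ZS,ZE,ZC:GOTO100
```

Leaving out the DIM gives just 0 CLEAR200:GOTO100.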

Jim provided me with an existing routine he had, so he altered the test program to use M$ for the message to wrap. In order to keep all cases the same, I have changed Jim’s (and Darren’s) test programs to use A$. Now all four word wrap routines will be as similar as possible.

I also removed the DIMs for now, since I wanted everything in the .BAS file to be the same except for the word wrap routine.

My version 1 (MID$):

1 REM WORD WRAP V1
2 '
3 'IN : A$=MESSAGE
4 ' UC=1 UPPERCASE
5 ' WD=SCREEN WIDTH
6 'OUT: LN=LINES PRINTED
7 'MOD: ZS, ZE, ZC
8 '
9 LN=1
10 IF A$="" THEN PRINT:RETURN ELSE ZS=1
11 IF UC>0 THEN FOR ZE=1 TO LEN(A$):ZC=ASC(MID$(A$,ZE,1)):IF ZC<96 THEN NEXT ELSE MID$(A$,ZE,1)=CHR$(ZC-32):NEXT
12 ZE=LEN(A$)
13 IF ZE-ZS+1<=WD THEN PRINT MID$(A$,ZS,ZE-ZS+1);:IF ZE-ZS+1<WD THEN PRINT:RETURN
14 FOR ZE=ZS+WD TO ZS STEP-1:IF MID$(A$,ZE,1)<>" " THEN NEXT:ZC=0 ELSE ZE=ZE-1:ZC=1
15 IF ZE<ZS THEN ZE=ZS+WD-1
16 PRINT MID$(A$,ZS,ZE-ZS+1);
17 IF ZE-ZS+1<WD THEN PRINT
18 LN=LN+1
19 ZS=ZE+1+ZC
20 GOTO 12

My version 2 (LEFT$/RIGHT$):

1 REM WORD WRAP V2
2 '
3 'IN : A$=MESSAGE
4 ' UC=1 UPPERCASE
5 ' WD=SCREEN WIDTH
6 'OUT: LN=LINES PRINTED
7 'MOD: ZC, ZE, ZS
8 '
9 LN=1
10 IF A$="" THEN PRINT:RETURN
11 IF UC>0 THEN FOR ZS=1 TO LEN(A$):ZC=ASC(MID$(A$,ZS,1)):IF ZC<96 THEN NEXT ELSE MID$(A$,ZS,1)=CHR$(ZC-32):NEXT
12 ZE=LEN(A$)
13 IF ZE<=WD THEN PRINT A$;:IF ZE<WD THEN PRINT:RETURN
14 FOR ZE=WD+1 TO 1 STEP-1:IF MID$(A$,ZE,1)<>" " THEN NEXT:ZP=0 ELSE ZE=ZE-1:ZP=1
15 IF ZE=0 THEN ZE=WD
16 PRINT LEFT$(A$,ZE);
17 IF ZE<WD THEN PRINT
18 LN=LN+1
19 A$=RIGHT$(A$,LEN(A$)-ZE-ZP)
20 GOTO 12

Jim Gerrie’s three line version:

1 C1=1:CC=WD+1
2 CC=CC-1:ON-(MID$(A$,CC,1)<>""ANDMID$(A$,CC,1)<>" "ANDCC>C1)GOTO2:C2=CC-C1:IFCC=C1 THENC2=31:CC=C1+WD-2
3 PRINTMID$(A$,C1,C2):C1=CC+1:CC=C1+WD:ON-(C1<=LEN(A$))GOTO2:RETURN

Darren Atkinson’s four line version:

1 C1=1:CC=WD+2:VP=VARPTR(A$):VP=PEEK(VP+2)*256+PEEK(VP+3)-1:LN=LEN(A$)-1:SP=32
2 CC=CC-1:IFCC<LN ANDPEEK(VP+CC)<>SP ANDCC>C1 THEN2ELSEC2=CC-C1:IFCC=C1 THENC2=WD:CC=C1+WD-1
3 PRINTMID$(A$,C1,C2);:C1=CC+1:CC=C1+WD:IFC2<>WD ORLN+1<WD THENPRINT
4 IFC1<LN THEN2ELSERETURN

To determine program size, I will PRINT MEM before loading, then load the full test program and delete line 0 (the CLEAR/GOTO) and everything from line 100 on (the test program). I will print MEM again and subtract the second number from the first.
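As a sketch, the steps above could be typed in direct mode like this. The file name WRAP is made up, LOAD assumes Disk BASIC (use CLOAD for tape), and DEL needs Extended Color BASIC:

```basic
NEW
PRINT MEM
LOAD"WRAP"
DEL 0
DEL 100-
PRINT MEM
```

Write down both MEM values and subtract by hand. Storing them in variables would use up some of the very memory being measured.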

Here they are again, from smallest code size to largest:

  1. Jim Gerrie’s three line version: 176 bytes
  2. Darren Atkinson’s four line version: 236 bytes
  3. My version 2 (LEFT$/RIGHT$): 500 bytes
  4. My version 1 (MID$): 542 bytes

Now the order is nearly reversed, with Jim’s and Darren’s versions substantially smaller than mine. But my versions have those big REM statements and plenty of spaces and lines that could be combined. Mine also return the number of lines printed (LN), and have the code for uppercase conversion and a check at the start to make sure we aren’t passed an empty string. If an empty string is passed to Jim’s, it gives “?FC ERROR IN 2”. Darren’s just returns. I will have to inspect his code and see if that was intentional. I should probably add an empty string to the test case.
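A minimal sketch of that extra test case, assuming the renumbered test program above (the line numbers 202 and 204 are just picks that fit between lines 200 and 210):

```basic
202 PRINT "EMPTY STRING:"
204 A$="":GOSUB 1
```

With these two lines added, a routine that cannot handle an empty string will stop with an error before TIME TAKEN is printed.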

I will remove the uppercase (UC) and line count (LN) code from mine, and try to pack things together. I want to keep the empty string check since error checking is good and since it seems Darren’s does this.

My optimized version 1 (MID$) now looks like this:

1 IFA$=""THENPRINT:RETURNELSEZS=1
2 ZE=LEN(A$):IFZE-ZS+1<=WD THENPRINTMID$(A$,ZS,ZE-ZS+1);:IFZE-ZS+1<WD THENPRINT:RETURN
3 FORZE=ZS+WD TOZS STEP-1:IFMID$(A$,ZE,1)<>" "THENNEXT:ZC=0ELSEZE=ZE-1:ZC=1
4 IFZE<ZS THENZE=ZS+WD-1
5 PRINTMID$(A$,ZS,ZE-ZS+1);:IFZE-ZS+1<WD THENPRINT
6 ZS=ZE+1+ZC:GOTO2

My optimized version 2 (LEFT$/RIGHT$) now looks like this:

1 IFA$=""THENPRINT:RETURN
2 ZE=LEN(A$):IFZE<=WD THENPRINTA$;:IFZE<WD THENPRINT:RETURN
3 FORZE=WD+1TO1STEP-1:IFMID$(A$,ZE,1)<>" "THENNEXT:ZP=0ELSEZE=ZE-1:ZP=1
4 IFZE=0THENZE=WD
5 PRINTLEFT$(A$,ZE);:IF ZE<WD THENPRINT
6 A$=RIGHT$(A$,LEN(A$)-ZE-ZP):GOTO2

Now let’s see how things stack up.

  • My version 1 (MID$): size 240 bytes / speed 107
  • My version 2 (LEFT$/RIGHT$): size 199 bytes / speed 99

Wow. By removing extra functionality, REMs and spaces, and packing lines together, I went from 542 to 240 bytes for version 1, and from 500 to 199 for version 2. For speed, version 1 went from 114 to 107, and version 2 went from 107 to 99.

Out of string space!

…but I should point out that, with the default 200 bytes BASIC reserves for string space, my second version crashed with an “?OS ERROR” (out of string space). I had to CLEAR255 to make it work. The number has to be large enough to hold the biggest string being passed in, plus the overhead for the string manipulation using LEFT$/RIGHT$. To really know how much memory a BASIC program takes up, we need to consider its variable usage too.

And now the new rankings for code size (smallest to largest):

  1. Jim Gerrie’s three line 3rd version: 176 bytes
  2. My version 2 (LEFT$/RIGHT$): 199* bytes (see note below)
  3. Darren Atkinson’s four line version: 236 bytes
  4. My version 1 (MID$): 240 bytes

…and for speed:

  1. My version 2 (LEFT$/RIGHT$): 99 (was 107)
  2. My version 1 (MID$): 107 (was 114)
  3. Darren Atkinson’s four line version: 192
  4. Jim Gerrie’s three line 3rd version: 248

NOTE: Since my version 2 required me to do a CLEAR255 for it to run, that means it actually took up 254 bytes (code, plus the extra 55 bytes allocated past the default 200 for variables).

It is tricky to calculate variable usage since the sizes of the strings passed in are a factor, and maybe it would have run with CLEAR 210 or CLEAR 220 to save a bit more. To crash-proof a program, you need to make sure the CLEAR covers the maximum string usage in the worst case. Here, we’re not that concerned.

In conclusion, right now, if every byte of program space matters (such as in Jim’s case when programming in 4K), his is best. For overall speed, my version 2 is best (but it uses the most memory between code and variable storage). Even my version 1 (with no extra string space required) is still faster than the next fastest, Darren’s.

Can we do better? Can you do better? Let’s find out.

I will make a .DSK image (that works in the XRoar emulator, and should be loadable via CoCoSDC on a real CoCo) as soon as I figure out how to post one in WordPress.

To be continued… Probably…

6 thoughts on “More BASIC word wrap versions”

  1. Pingback: Interfacing assembly with BASIC via DEFUSR, part 3 | Sub-Etha Software

  2. P. Ingerson

    Hi. I know I’m replying to a really old post, so you’ll probably never see this. But you might be able to make your two routines more efficient by changing line 5 to

    5 PRINT whatever;TAB(0);

    That should automatically take care of moving to the next line without any blank lines.

    (OTOH I’m not familiar with the CoCo at all, but that trick works in other versions of BASIC, so should work here too.)

  3. P. Ingerson

    Interesting. In most BASICs ;TAB(0); will move the print position to the start of the next line but only if it’s not already at the start of a line (and just leave it where it is if it’s already there.) That way you don’t need a separate IF…THEN PRINT statement to control the line feeds. The TAB has already put you where you need to be.

    That works in most BASICs. I’m not sure why it doesn’t work here, but maybe the CoCo does it differently.
