Color BASIC string abuse – part 2

See also: part 1 and part 2.

In the first part, the following Color BASIC program was shared:

5 CLEAR 100*255
10 MX=99
20 DIM A$(MX):AL=0:DL=-1
30 SZ=RND(255):SZ=255
40 A$=STRING$(SZ,"X")
50 GOSUB 1000:GOTO 30
999 END

1000 REM
1001 REM ADD NEW STRING
1002 REM
1010 IF AL=DL THEN GOSUB 2000
1020 GOSUB 3000:IF Z<1024 THEN GOSUB 2000
1030 PRINT "ADDING";LEN(A$)"BYTES AT";AL;Z
1040 A$(AL)=A$
1050 AL=AL+1:IF AL>MX THEN AL=0
1060 IF DL=-1 THEN DL=0
1070 RETURN

2000 REM
2001 REM DELETE OLD STRING
2002 REM
2010 PRINT ,"DELETING";DL
2020 A$(DL)=""
2030 DL=DL+1:IF DL>MX THEN DL=0
2040 GOSUB 5000
2050 RETURN

3000 REM
3001 REM GET FREE STRING SPACE
3002 REM
3010 Z=(PEEK(&H23)*256+PEEK(&H24))-(PEEK(&H21)*256+PEEK(&H22))
3020 RETURN

5000 REM
5001 REM PAUSE
5002 REM
5010 PRINT"[PAUSE]";
5020 IF INKEY$="" THEN 5020
5030 PRINT STRING$(7,8);:RETURN

The program allocates enough string space to hold 100 strings of 255 bytes each. It then starts adding line after line until it detects string memory is getting low. When that happens, the oldest line is deleted (set to “”). The process continues…

The “gut feeling” was that this program should have been able to hold 100 full-sized strings, but since it also uses a temporary A$ (a 255-character string created to be the line we add to the array), it seemed logical that it would have to start purging lines maybe a few entries before the end.

But instead, it starts deleting lines at around entry 47. And the memory usage being printed out shows it drops by 510 bytes each time instead of 255.

510 is an interesting number. That is 255 * 2. That makes it seem like each time we add a 255 byte string, we are using twice that memory.

And we are!

Strings may live again to see another day

The key to what is going on is in the main loop starting at line 30. We create a new A$ and set it to a string of 255 X’s. That string has to be stored somewhere, so it goes in to string memory. Then we do the GOSUB and add a string to the array, which copies our A$ in to the array A$(AL) entry.

When we go back to 30 for the next round, we create A$ again. The old copy of A$, in string memory, is deallocated, and a new A$ is created at the current string memory position. BASIC does not check whether the old string space is large enough to be reused. It just moves on to a new allocation of string space.

It looks like this… And note that strings fill from the end of the memory and move lower. Let’s say we have 16 bytes of reserved string memory (just to keep things fitting on this screen).

FRETOP points to where reserved string memory begins. MEMSIZ is the end of string memory. If we had done CLEAR 16, then FRETOP would be MEMSIZ-16.

STRTAB is where the next string will be added. It looks like this:

FRETOP                                     MEMSIZ
 |                                            |
[.][.][.][.][.][.][.][.][.][.][.][.][.][.][.][.]
                                              |
                                           STRTAB
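
Line 3010 of the listing measures the gap between those two pointers. Here is a rough Python sketch of the same arithmetic, for anyone who wants to poke at it away from the CoCo. The `memory` dictionary and the byte values in it are made up for illustration; on real hardware they would come from PEEK.

```python
# Zero-page addresses taken from line 3010 of the listing.
FRETOP_ADDR = 0x21  # start of reserved string memory
STRTAB_ADDR = 0x23  # where the next string will be added


def peek_word(memory, addr):
    """Combine two consecutive bytes, high byte first, like PEEK(a)*256+PEEK(a+1)."""
    return memory[addr] * 256 + memory[addr + 1]


def free_string_space(memory):
    """Bytes still unused between FRETOP and STRTAB (the Z of line 3010)."""
    return peek_word(memory, STRTAB_ADDR) - peek_word(memory, FRETOP_ADDR)


# Hypothetical snapshot: FRETOP = 0x7FEF and STRTAB = 0x7FFF,
# which is what a CLEAR 16 with nothing stored yet might look like.
memory = {0x21: 0x7F, 0x22: 0xEF, 0x23: 0x7F, 0x24: 0xFF}
print(free_string_space(memory))
```

With those made-up values the gap is 16 bytes, matching the empty diagram above.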

Let’s say we create a 4-byte A$=STRING$(4,"X") (“XXXX”) that we plan to add to an array. It will be stored in string memory:

FRETOP                                     MEMSIZ
 |                                            |
[.][.][.][.][.][.][.][.][.][.][.][.][X][X][X][X]
                                  |
                               STRTAB

Later in the code, we assign that to the array, such as A$(0)=A$. Now A$(0) gets a copy of A$, and it looks like this (using lowercase ‘x’ to represent the copy of A$ that was put in A$(0)):

FRETOP                                     MEMSIZ
 |                                            |
[.][.][.][.][.][.][.][.][x][x][x][x][X][X][X][X]
                      |
                   STRTAB

When we go back to do this again, a new A$ is created, and it gets stored next. The old string data is still there, but A$ has been updated to point to the new entry.

FRETOP                                     MEMSIZ
 |                                            |
[.][.][.][.][X][X][X][X][x][x][x][x][X][X][X][X]
          |
       STRTAB

…and so on. As you can see, the way this loop was written, it is creating a new A$ every time through, copying it to the array (a new entry for that) and so on. That is why we see 510 each time through instead of just 255.
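
The same walk-through can be replayed with a toy model. This is only a Python sketch of the behavior described above, not how the ROM is written: the `StringSpace` class and its method names are invented here, and it deliberately never reuses or compacts anything.

```python
class StringSpace:
    """Toy model: strings are carved from the high end downward, nothing is reused."""

    def __init__(self, size):
        self.cells = ["."] * size
        self.strtab = size  # index just past the next free cell; moves toward 0

    def store(self, text):
        """Place text just below everything stored so far; return its start index."""
        if len(text) > self.strtab:
            raise MemoryError("out of string space (real BASIC would garbage-collect here)")
        start = self.strtab - len(text)
        self.cells[start:self.strtab] = list(text)
        self.strtab = start
        return start

    def free(self):
        """Bytes left between the start of string memory and STRTAB."""
        return self.strtab

    def picture(self):
        return "".join(f"[{c}]" for c in self.cells)


space = StringSpace(16)
space.store("XXXX")  # A$=STRING$(4,"X")
space.store("xxxx")  # A$(0)=A$ stores a copy
space.store("XXXX")  # next pass: a brand-new A$; the old one is now garbage
print(space.picture())
print("free bytes:", space.free())
```

The printed row matches the last diagram: three 4-byte strings are stored, only two of them are still in use, and just 4 bytes remain free.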

Now, if the string was short, we could have done A$="XXXXXXXXXXXXX". If we did that, the string would exist in program space and not in string memory. But we wanted a 255-byte string, and you can’t type a line that long in BASIC, so STRING$() was used instead, which requires putting the string in string memory.

However, since in THIS version we are just using the same 255 character A$ over and over again, let’s make one change so we don’t create it every time through the loop. Just change the GOTO 30 in line 50 to be GOTO 50:

30 SZ=RND(255):SZ=255
40 A$=STRING$(SZ,"X")
50 GOSUB 1000:GOTO 50

Now the program will create one A$, which is stored in string memory just once, and then loop over and over just making new entries in the array.

That small change will instantly change the results to be more like we might have expected. Now we get all the way to entry 94 before any string deleting happens:

And, from looking at that screen, each number is dropping by 255 bytes as we expected.

By the time it reached line 94, it saw that there were fewer than 1024 bytes of free string space left. 1024 bytes would have held another four 255-byte strings, meaning it actually had enough memory to get to line 98, just one line short of our max of 99 before it rolls over. And that one remaining slot is where the initial 255-byte A$ is stored.
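
Those entry numbers fall straight out of the arithmetic. The sketch below is a Python model under two assumptions of mine: every string is exactly 255 bytes, and no garbage collection kicks in before the first purge. The function name `first_purge` and its parameters are hypothetical.

```python
def first_purge(total, threshold, size, new_temp_each_pass):
    """Return the array index at which the free-space check first fails.

    Assumes no garbage collection before that point, so free space only shrinks.
    """
    free = total
    if not new_temp_each_pass:
        free -= size          # A$ is built once, before the loop
    index = 0
    while True:
        if new_temp_each_pass:
            free -= size      # line 40: a fresh A$ on every pass
        if free < threshold:  # line 1020: IF Z<1024 THEN GOSUB 2000
            return index
        free -= size          # line 1040: A$(AL)=A$ stores a copy
        index += 1


TOTAL = 100 * 255  # CLEAR 100*255

print(first_purge(TOTAL, 1024, 255, new_temp_each_pass=True))   # original GOTO 30 loop
print(first_purge(TOTAL, 1024, 255, new_temp_each_pass=False))  # GOTO 50 version
```

It prints 48 and then 95: the original loop trips the 1024-byte check while adding entry 48 (right after entry 47), and the GOTO 50 version trips it while adding entry 95 (right after entry 94).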

Tada! Mystery solved.

But wait! There’s more…

The reason I chose 1024 as a threshold was to allow for other temporary string use in the program. Things like LEFT$, MID$, STRING$ all make temporary strings. When you add two strings together it creates a third string that combines the first two. Be sure to check out my string theory article for more details on this; I learned quite a bit when researching it. I also learned that some things required strings when I did not expect them to. Fun reads. Helps put you to sleep.

If I modify line 1020 to check for 255 bytes remaining instead of 1024, then re-run, I get this:

…and that is as perfect as it gets. Array is filled with strings 0-98, plus the temporary string, which is a total of 100 strings of 255-bytes each — and that is how much memory we set aside with CLEAR!

Now how much would you pay?

And because this program is self-aware when it comes to knowing how much string space is there, it can actually operate with much less string space. It will just delete old strings sooner.

You can change the CLEAR in line 5 to something smaller, like 2000, and it will still work. But 2000/255 is about 7.8, so there is only room for seven 255-byte strings: the A$ plus six array entries. I expect it would DELETE every 6 lines. Let’s try…

Bingo! After lines 0-5 (six lines) it deleted an old one, then since everything was now full, it had to delete every time it added something new.
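
To see why it settles into deleting on every pass, here is a Python sketch of the bookkeeping. It rests on assumptions that are not spelled out above: line 1020 still uses the 255-byte threshold from the previous experiment, BASIC compacts the live strings (garbage collection) when a new string does not fit, and the array wrap-around at MX is ignored. The `run` function is made up for this illustration.

```python
from collections import deque


def run(total, threshold, size, passes):
    """Return the pass numbers on which an old entry had to be deleted."""
    free = total - size      # one reusable A$ is stored up front
    live = deque()           # sizes of array entries still in use, oldest first
    purged_on = []
    for n in range(passes):
        if free < threshold and live:  # line 1020 check
            live.popleft()             # line 2020: A$(DL)="" leaves garbage behind
            purged_on.append(n)
        if free < size:
            # Assumed garbage collection: live strings get packed together,
            # so everything that is not live becomes free again.
            free = total - size - sum(live)
        if free < size:
            raise MemoryError("out of string space")
        free -= size                   # line 1040 stores the new copy
        live.append(size)
    return purged_on


print(2000 // 255)
print(run(total=2000, threshold=255, size=255, passes=10))
```

It prints 7 and then [6, 7, 8, 9]: passes 0-5 fit without trouble, and from pass 6 onward each new entry only fits once the oldest one has been given up.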

And the point is…?

Well, let’s just say I wish I knew about this back in 1983 when I wrote my cassette-based Bulletin Board System, *ALLRAM*.

I always wanted to write version 2.0.

Until next time…

3 thoughts on “Color BASIC string abuse – part 2”

William Astle September 14, 2022 at 12:51 pm

An experiment that might be interesting: try assigning STRING$(…) directly to the array entries and see when you run out of string space. “LET” may be smart enough to not create a copy of an anonymous string when assigning to a variable. (An anonymous string is one where the string descriptor lives on the string stack. You get anonymous strings as the return value of string manipulation operations such as STRING$, concatenation, etc. There’s another concept worth diving into.)

1. Allen Huffman (post author) September 14, 2022 at 12:55 pm
  
  Intriguing. I’ll investigate.
  
2. Johann Klasek October 19, 2022 at 5:05 am
  
  Actually “LET” is that smart. In case the string points into the program text (constant string) or the descriptor resides on the Temporary String Descriptor Stack, the descriptor is copied directly into the variable. In the latter case the descriptor is pulled from the TSDS if it is on top.
  

Sub-Etha Software

"In Support of the CoCo and OS-9 since 1990!"
