This program demonstrates Color BASIC’s string garbage collection. As strings are manipulated, string memory used will continue to grow until BASIC decides to clean up the string space and move things around a bit.
The program creates a string array and fills each entry up with string data. Using code that checks how much string space is remaining, it will delete old strings if string memory is low.
Watching it run reveals some curious things…
The Program
On an Extended or Disk BASIC system, you will need to do a PCLEAR 1 to get enough memory to run it.
- Initialization
- Line 5 – We CLEAR enough string space to hold 100 lines at their largest size of 255 characters. Note that when you use DIM to set an array size, it is base-0. Doing a DIM A$(99) gives you A$(0) to A$(99) — 100 elements. Thus, the CLEAR uses 100, but the DIM in the next line uses 99.
- Line 10 – MX represents the maximum number of lines in the array.
- Line 20 – DIM A$(MX) creates an array of 0-99 entries (100). AL is set to 0, and will be the line (array entry) we will ADD next. DL will be used to track the next line (array entry) we need to delete. DL is set to -1 for “nothing to delete yet.”
- Main Loop
- Line 30 – SZ is set to the length of the string we want to add. It has an unused RND at the start, but then SZ is hard-coded to 255. The program was designed to test random lengths of strings, but for this demo, we will be overriding that and making every string the maximum size.
- Line 40 – GOSUB 1000 goes to a routine that will ADD a line. Then we GOTO 30 to create a new line and do it again.
- Add New String subroutine
- Line 1010 checks to see if the ADD line has caught up to the DELETE line. If it has, we GOSUB 2000 to handle deleting the delete line.
- Line 1020 – GOSUB 3000 will return the amount of free string space in Z. If Z is less than 104, GOSUB 2000 is called to delete a string.
- Line 1030 – It prints how long of a string is being added to which line, followed by the current free string space before the add.
- Line 1040 – The string is added to the A$ array at position AL.
- Line 1050 – AL (add line) is incremented by 1, moving it to the next line of the array. If AL is larger than the MX (max array entries), it will wrap around to entry 0.
- Line 1070 – Returns.
- Delete Old String subroutine
- Line 2010 – Print which line is about to be deleted (set to “”).
- Line 2020 – The A$ entry at position DL is set to “”, releasing the string memory it was using.
- Line 2030 – DL (delete line) is incremented by one. If DL is larger than MX (max array entries), it will wrap around to entry 0.
- Line 2040 – GOSUB 5000 is a pause routine so we can see what just happened.
- Get Free String Space routine
- Line 3010 – Z is calculated as the difference between the FRETOP (start of string storage) memory location and the STRTAB (start of string variables) in the reserved string area. This gives us how many bytes of unused string memory is available.
- Line 2030 – Returns.
- Pause subroutine
- Line 510 – Print the message “[PAUSE]” with no carriage return at the end.
- Line 5020 – If INKEY$ returns nothing (no key pressed), keep checking.
- Line 5030 – Prints a string of 7 CHR$(8)s (backspace character) which will erase over the “[PAUSE]” prompt.
Based on the intent, one might think that running this would fill strings up to around entry 100, and then it would start deleting the old string
But is that what it will do?
Let’s run it and find out.
5 CLEAR 100*255 10 MX=99 20 DIM A$(MX):AL=0:DL=-1 30 SZ=RND(255):SZ=255 40 A$=STRING$(SZ,"X") 50 GOSUB 1000:GOTO 30 999 END 1000 REM 1001 REM ADD NEW STRING 1002 REM 1005 IF AL=DL THEN GOSUB 2000 1006 GOSUB 3000:IF Z<1024 THEN GOSUB 2000 1010 PRINT "ADDING";LEN(A$)"BYTES AT";AL;Z 1020 A$(AL)=A$ 1030 AL=AL+1:IF AL>MX THEN AL=0 1040 IF DL=-1 THEN DL=0 1050 RETURN 2000 REM 2001 REM DELETE OLD STRING 2002 REM 2010 PRINT ,"DELETING";DL 2020 A$(DL)="" 2030 DL=DL+1:IF DL>MX THEN DL=0 2035 GOSUB5000 2040 RETURN 3000 REM 3001 REM GET FREE STRING SPACE 3002 REM 3010 Z=(PEEK(&H23)*256+PEEK(&H24))-(PEEK(&H21)*256+PEEK(&H22)):RETURN 5000 REM 5001 REM PAUSE 5002 REM 5010 PRINT"[PAUSE]"; 5020 IF INKEY$="" THEN 5020 5030 PRINT STRING$(7,8);:RETURN
Running the Program
After a PCLEAR 1 and RUN, the program starts filling lines and we can see string memory going down. When it gets to line 47, there is only 1275 bytes of string space left.
And if we check those values, the difference between each one isn’t the 255 we might have expected. We clearly added a new 255 byte string, but memory went from 7905 to 7395 (510 bytes less) and continued down to 1785 to 1275 (510 bytes less). It appears each time we add a 255 byte string, it takes 510 bytes of string memory.
As we get to line 47, we must have had less than 1024 bytes of string memory left and the DELETE LINE subroutine is called. It deletes the oldest line, which is currently line 0. That should free up 255 bytes of string memory.
After pressing a key, we see the next few entries:
After deleting 0, memory has gone from 1275 to 756 (5120 bytes), so DELETE is called again. This time it deletes line 1.
We press a key and let it scroll a bit more until the next DELETE:
Here we see that, at some point, some string cleanup was done and our free string space grew larger. The trend of reducing by 510 bytes for each ned 255 byte string added continues…
And the process repeats, until eventually we roll over at line 99.
From here on, things get a bit more predictable, with a DELETE happening almost every line — though sometimes every two lines.
Eventually things settle in to a pattern of basically DELETING for every line ADDED.
Is it broken? Is it just poorly implemented? Or is it behaving exactly like it is supposed to?
Tune in next time for the exciting conclusion…
It’s definitely working as intended. But it’s definitely not obvious why it behaves the way it does.
This normal behavior. The first assignment to A$ takes 255 bytes on the string heap, the assignment to the array creates again a new string of 255 bytes giving 510 bytes allocation on the string heap for each “new” string (producing 255 garbage each time).
Each assignment is always a copy by value of a string. A$(x) and A$ must not point to the same string (string do not have a reference count).
The GC gets rid of all lost A$ content from due to the repeated assignments. The array strings and the last A$ assigned value remains active and are kept on the string heap.
In the final stage, where each iteration need 510 bytes, but the deletion condition (<1024) strikes every time. Each deletion frees 255 bytes and after two iterations the free 1020 bytes are consumed and triggers the GC, which frees the garbage from A$ and the two array elements giving 1020 bytes free space again.
Why not assigning to A$ just one time (with a 255 byte string) and let the loop just assign its value to a new array element (as long SZ is a constant)? This should consume only 255 bytes for each iteration.
Or assign directly to the array element (line 40 …. A$(AL) = STRING(…))
You get a gold star! Yes.
There are many things we did as beginner programmers that were doing really nasty things under the hood, and we were completely unaware of them. Like:
10 PRINT”HELLO”:REM NEVER ADD COMMENTS AT THE END OF A LINE
Lots of things we did caused much slower code than doing a simple change.
Years ago, when I was porting my ALLRAM BBS to Arduino, I learned how string stuff worked, and realized I could have used some simply tricks to save me alot of code and make the program’s memory use far more efficient and faster by having the BASIC ROM do the work for me. That’s what I am alluding to in this article, but I won’t have that series as part of #SepTandy. But the test code I have is cool and feels like magic to me.
Pingback: CoCo Disk BASIC disk structure – part 1 | Sub-Etha Software