2020-03-31 – Corrected the spelling of Art’s last name. Sorry about that!
Well, you can disregard my previous article. I misunderstood the process Steve Ostrom was describing. I have a follow-up planned, but wanted to share Steve’s response when I explained what I thought the process was:
… the probability puzzle is a little more complex. Here is the idea: Put 4 trays in front of you, one for each color. Pull one marble and place it in the proper tray. Pull a second marble. Place in its tray. Continue pulling marbles until one of the trays contains 3 marbles. Stop there. That’s iteration #1. Replace all the marbles into the bag. Start again. Continue 100,000 times !! Now calculate the odds for each tray that it will be the tray that gets to 3 first.
Steve Ostrom
That’s a bummer, because I let mine run all weekend and got these results:
The correct results from an incorrect simulation.
If my simulation had been correct, when a bag has five red, four blue, three white and three black marbles, the odds of pulling out three of the same color would have been:
Red – 62.7% of the matches
Blue – 24.7% of the matches
White / Black – 6.2% of the matches
What I forgot to do is include a count of the number of times to pull marbles out of the bag! Without that, I can’t even show the results of my incorrect simulation.
Please disregard.
I will be writing a “correct” simulation and sharing the results soon*.
* soon [so͞on] ADVERB in or after a short time.
Oxford Dictionary definition of “soon”
Okay, fine. Hopefully soon. Soon for me, anyway. Maybe tomorrow. Or next week.
Math and Art
Others have responded on this and I plan to share their efforts and results as well. One of the most notable was seeing the legendary Art Flexser join in. Mr. Flesxer was responsible for the most popular DISK BASIC replacement the Color Computer ever had – ADOS. Back then I wanted ADOS so bad, but the thought of replacing a ROM chip seemed beyond my skills. “Wish I knew then, what I know now!” For those unfamiliar, ADOS is described in the CoCo FAQ as follows:
A-DOS was developed by Art Flexser. It came in three versions, ADOS for the CoCo 1 and 2, and ADOS 3 & Extended ADOS 3 for the CoCo 3. It was 100% compatible with RS-DOS if you didn’t need to patch Disk BASIC, and added features to RS-DOS, noteably 40 and 80 track drive support. ADOS came on a disk, and could be loaded into the CoCo, or you could customize ADOS, program an EPROM, and use the EPROM as your disk ROM, therefore booting your CoCo with ADOS. This was a neat, because many users then set their CoCos to boot with the 80 column screen. It also ran the CoCo at double-speed, even during disk and printer I/O, featured auto line numbering, arrow scroll through listings, auto edit of errors, macros, etc. Extended ADOS 3 added things like parellel printer output (assuming you had the right hardware), wildcard filenames, and a RAMdisk. This was arguably the most popular modified RS-DOS used with the CoCo.
Description of Art Flexser’s ADOS from the Color Computer FAQ at CoCopedia.com
But I digress… Art implemented this puzzle using a modern basic, QB64. According to the website:
QB64 is a modern extended BASIC programming language that retains QBasic/QuickBASIC 4.5 compatibility and compiles native binaries for Windows, Linux, and macOS.
NOTE: I got the objective of the marble puzzle wrong, but I’d already written this version. I’ll post a follow-up with the actual puzzle later.
Recently in the Color Computer Facebook group, Steve Ostrom wrote a BASIC program to solve a probability puzzle. He wrote:
You have 15 marbles in a bag. 5 are red, 4 are blue, 3 are white and 3 are black. You pull out one marble at a time, without putting it back, until you have 3 marbles of the same color. What are the odds of getting 3 red marbles, of getting 3 blue marbles, of getting 3 white marbles or of getting 3 black marbles.
Steve Ostrom (March 21, 2020 on the CoCo Facebook Group)
I mentioned this in my article about the BASIC RND command recently, but did not address the actual puzzle Steve presented. Today, I will try.
If my understanding of the question is correct, you have a bag with 15 marbles:
5 red marbles
4 blue marbles
3 white marbles
3 black marbles
You pull three marbles out of the bag. What are the odds that all three are red, blue, white or black? Since there are more red marbles than any other, grabbing three red marbles must be more likely than any other color. Since there are more blue marbles than white or black marbles, grabbing three blue ones must be more likely than white or black. And, with white and black marbles having the same count, they should have the same probability.
There’s probably a very simple math formula to solve this, but it’s more more fun to write a program to actually try it!
Simulating a Bag of Marbles
I remember making some card programs in BASIC as a teenage. You could create an array representing all the cards, and then shuffle them using the RND random function. For example:
10 REM shuffle.bas
20 REM CREATE ALPHABET ARRAY
30 DIM A$(25)
40 FOR A=0 TO 24
50 A$(A)=CHR$(65+A)
60 NEXT
70 REM DISPLAY ARRAY
80 GOSUB 190
90 REM SWAP EACH ONE RANDOMLY
100 FOR A=0 TO 25
110 S=RND(26)-1
120 SV$=A$(S)
130 A$(S)=A$(A)
140 A$(A)=SV$
150 NEXT
160 REM DISPLAY ARRAY
170 GOSUB 200
180 END
190 REM DISPLAY ARRAY
200 FOR A=0 TO 25
210 PRINT A$(A);
220 NEXT
230 PRINT
240 RETURN
Starting at line 30, this program creates a string array of 26 items, each containing a letter of the alphabet:
A$(0)=”A” A$(1)=”B” …etc…
In line 80, it calls a subroutine that will PRINT out the array.
Starting at line 100, it shuffles the array by going through each entry and swapping it with a randomly selected entry.
In line 170, it calls the subroutine to display the array again.
Color BASIC Shuffle
My first thought was to create an array of marbles and then shuffle them. The program would then randomly pull one out of the virtual bag (array) to see what it got.
But then I realized that would be silly and wasteful since we are dealing with randomly selecting an item.
For cards, you need to shuffle them because you get the “next out” card (the one on the top of the deck, unless you are cheating and bottom dealing). But, if you fan open a deck and say “pick a card, any card” the deck doesn’t have to be shuffled since you are randomly picking one. For a magic trick, though, we want the deck to be shuffled because if they were all in order, one could easily look at the remaining cards and determine which one was missing.
But I digress…
If we used a non-sorted array of alphabetical letters:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
…and we are going to randomly select one, the order of the items in the array should not matter. We might randomly select 16, 4 and 22. Or 1, 9 and 14. Or 26, 2 and 7.
I don’t need to randomize the marbles in a marble bag at all.
Visual Programming
With that out of the way, I next thought about a way to visualize the process of pulling random marbles out of a bag. Since we are dealing with colors, and I am doing this on a Color Computer, I thought maybe I’d use the screen and have color characters represent each of the marble colors.
10 CLS
20 REM DISPLAY MARBLES
30 REM 5 RED=191, 4 BLUE=175, 3 WHITE=207, 3 BLACK=128
40 PRINT@0,STRING$(5,191);STRING$(4,175);STRING$(3,207);STRING$(3,128)
CoCo Marbles
I could then use PEEK to randomly select one of the colors on the screen. Since I need to pick three, and can’t pick the same one twice, I could use POKE to reset the one I Just picked to some value representing “already picked.”
And, to track which ones I picked, I would also POKE the value to another spot on the screen. I could then visually see the process run as a marble was “picked” from the top row (changed to a different character) and “moved” to somewhere else (the color block appearing in a new location).
Then, after picking three marbles, I could look at what I picked and see if they all matched. If they did, I could increment a counter for each color. If all three were red, I would increment a Red Count variable. If all three were black, I’d increment a Black Count variable.
Along with four individual counters (for red, blue, white and black), I’d also have a Total Count. I could get statistics by dividing the individual counts into that total. For example, if there were 100 total counts, and the Blue Count was 15, then picking three blue ones happened 15 out of 100 times – 15%.
I could then reset the marbles and do it again, and let this program run as long as it took for a pattern to emerge.
According to Steve, here is what I should expect:
I ran 100,000 iterations (took most of the night) and here are the rounded percentages I found. Red = 42.5%, blue = 27.5%, white and black are both 15.0%.
Steve Ostrom
Marble Madness
Keeping my Optimizing Color BASIC series in mind, I am going to present to you an “easy to read but slower to run” version of what I came up with. It is heavily comment, and thus heavily slowed down.
10 ' marble.bas
20 '
30 ' INITIALIZE STATS
40 '
50 RC=0:BC=0:WC=0:KC=0:TC=0
60 CLS
70 '
80 ' DISPLAY MARBLES
90 ' 5 RED=191, 4 BLUE=175, 3 WHITE=207, 3 BLACK=128
100 '
110 PRINT@0,STRING$(5,191);STRING$(4,175);STRING$(3,207);STRING$(3,128)
120 '
130 ' GRAB THREE MARBLES
140 '
150 FOR MD=1088 TO 1090
160 '
170 ' RANDOM MARBLE LOCATION
180 '
190 ML=1024+RND(15)-1
200 '
210 ' GRAB MARBLE
220 '
230 V=PEEK(ML)
240 '
250 ' IF MARBLE USED, TRY AGAIN
260 '
270 IF V=0 THEN 190
280 '
290 ' MARK MARBLE REMOVED
300 '
310 POKE ML,0
320 '
330 ' PUT MARBLE IN HAND
340 '
350 POKE MD,V
360 NEXT
370 '
380 ' CHECK FIRST GRAB
390 '
400 V=PEEK(1088)
410 '
420 ' DOES IT MATCH NEXT TWO?
430 '
440 IF V=PEEK(1089) AND V=PEEK(1090) THEN 500
450 '
460 ' NO, START OVER
470 '
480 GOTO 110
490 '
500 ' ALL THREE THE SAME!
510 '
520 ' IF RED, INC RED COUNT
530 '
540 IF V=191 THEN RC=RC+1
550 '
560 ' IF BLUE, INC BLUE COUNT
570 '
580 IF V=175 THEN BC=BC+1
590 '
600 ' IF WHITE, INC WHITE COUNT
610 '
620 IF V=207 THEN WC=WC+1
630 '
640 ' IF BLACK, INC BLACK COUNT
650 '
660 IF V=128 THEN KC=KC+1
670 '
680 ' INC TOTAL COUNT
690 '
700 TC=TC+1
710 '
720 ' PRINT COUNT AND PERCENTAGE
730 '
740 PRINT @128,"RED :";RC,RC/TC
750 PRINT "BLUE :";BC,BC/TC
760 PRINT "WHITE:";WC,WC/TC
770 PRINT "BLACK:";KC,KC/TC
780 GOTO 110
When it runs, it looks like this:
Color BASIC marble bag simulator in action.
The top line represents the avialable marbles in the bag. As one is chosen, I POKE it to 0, which appears as an inverted @ character.
Below are the three picked marbles. It looks like I forgot to erase the “chosen” line between cycles, so it’s showing three marbles when above only two have actually been selected. This does not impact the outcome, since I choose three, which overwrites the previous three, before looking to see if they match.
As you can see, I am quite a bit off from Steve’s 42% / 22% / 15% / 15% results. I need to let it run much longer.
Or, perhaps I just need to optimize this program and dramatically speed it up.
Recently, Steve Ostrom posted to the Facebook CoCo group a puzzle he was trying to work out:
It’s been a few years since I turned on my Coco, even though it’s still all set up. I came across an interesting probability problem a few days ago, so I fired up the Coco and wrote a program to calculate the approximate answers using a random number generator. I’ll work on the real probability calculations maybe when I get even more bored. So, here’s the problem: You have 15 marbles in a bag. 5 are red, 4 are blue, 3 are white and 3 are black. You pull out one marble at a time, without putting it back, until you have 3 marbles of the same color. What are the odds of getting 3 red marbles, of getting 3 blue marbles, of getting 3 white marbles or of getting 3 black marbles. I ran 100,000 iterations (took most of the night) and here are the rounded percentages I found. Red = 42.5%, blue = 27.5%, white and black are both 15.0%. Anyone else want to try? :-) Three cheers for the Coco and the ability to write quick programs !!
Steve Ostrom (March 21, 2020 on the CoCo Facebook Group)
This started a discussion on the CoCo Mailing List about just how random the RND() function is in Color BASIC. This reminds me of one of my earliest computer encounters back around 1979-1980.
I was living in Mequite, Texas (near Dallas) and my next door neighbor’s mom worked for Texas Instruments. They had plenty of TI stuff at home (watches and such) as well as a TI-99 home computer. Due to the year, it must have been the original TI-99/4.
His mom has typed in a BASIC program that played a card game. My friend Kris explained that he knew all the cards it would pick because it did the same ones each time. My memory seems to recall him trying to explain that the computer did not do random numbers, so his mother programmed the card sequence. Looking back, maybe what he was explaining was that it did the same numbers each time.
He was, of course, very good at this card game because of this.
Years later, I was shown an amazing magic trick as a friend talked to me over the phone and had me type in things into my CoCo and he told me what the result would be. Turn on a CoCo (or load an emulator, even the online JS Mocha or Xroar versions), and type this in:
FOR A=1 TO 10:PRINT RND(0):NEXT
JS Mocha RND.
Xroar Online RND.
What do you get if you try that on a real CoCo 1, 2 or 3?
If I recall, the RND function was meant to be used like this:
CoCo 3 BASIC manual entry on RND.
I am quite sure he just had me type “PRINT RND(100)” and he could tell me the results each time.
The pseudo random number generator in Color BASIC will do the same sequence every time, so after a power up, any game that relied on RND would be as predictable as my friend’s TI-99/4 card game was.
You can “seed” the random number generator by giving it a negative number:
CoCo RND before and after seed.
CoCo RND after seed.
As you can see, the random number generator will reset and generate the same sequence of numbers based on the negative value you give it.
Ever play Mine Sweeper on Windows? It had an option to enter a number and replay the same game. I am guessing this just seeded the Microsoft random number generator in the same way, and allowed the randomly generated playfield to be the same each time you gave it the same seed number.
Seems we could do that on the CoCo too and have a random game that you could let others try, using the same sequence of randomness.
I did not understand this when I was first getting in to the CoCo, but I knew you were supposed to seed the generator doing something like:
A=RND(-TIMER)
In Extended Color BASIC, TIMER was an incrementing variable that started at 0 on power up, and then counted up 60 times a second (the screen refresh interrupt time) until it rolled over at 65535 (about 18 minutes).
Since the time it took to turn on a computer, load a program and type run would vary, using the negative value of the timer gave a different random seed each time. I suppose if you had a BOOT rom that would automatically start up and run something (like RGB-DOS’s AUTOEXEC.BAS), maybe it would be the same amount of time each time, but beyond that, this was a good way to make the predictable RND function a bit less predictable.
As suggested on the mailing list, you can “see” the randomness by plotting out the values, such as doing something like this:
…then let it run for awhile. If it was truly random, eventually it should set every single dot on the screen. But, pseudo random number generators have patterns. There is a level of predictability in them which has been used by hackers to exploit “secure” Internet connections for many years.
Randomly plotting points on a CoCo PMODE 4 screen.
Interesting. I never would have thought to try that back in the 1980s.
Random Benchmarking
And, lastly, benchmarks! How fast is decimal versus hex for plotting on the screen?
Hey hey! 18.61 seconds! I sure wish I knew all this back when I was writing BASIC programs in the 80s…
Oh, and FOR/NEXT is faster than a GOTO since it does not have to scan through lines to find where it is going. I think if I was going to let this run for a few hours, I might do this:
10 PMODE 4,1:SCREEN 1,1:PCLS
11 X=256:Y=192:O=1
15 FOR A=1 TO 2 STEP 0
20 PSET(RND(X)-O,RND(Y)-O):NEXT
That should be marginally faster than the GOTO 20 was.
I’ll tell you how it turns out tomorrow.
Tomorrow
Well, it looks like RND does hit every location from 0 to 255. Here is what my screen looked like after running overnight:
Randomly plotting points on a CoCo PMODE 4 screen, the next morning.
I guess I need a better test to show the frequency that each value comes up. Sorta like Steve’s original project of randomly pulling colored marbles from a bag.
10 REM MS-EGG.BAS
20 P$="ABCDEFGHIJKLMNOPQRSTUVWXYZ......!"
30 FOR L=&HBFEF TO &HBFE6 STEP -1
40 V=PEEK(L) AND &H3F
50 C$=MID$(P$,V,1)
60 PRINT HEX$(L);" ";HEX$(PEEK(L));" -> ";HEX$(V),C$
70 NEXT
Now I’ve finally seen it with my own eyes.
Color BASIC Easter Egg
Line 20 is just enough of the PETASCII characters that would get POKEd to the screen memory on the Commodore for our hidden message. (A is POKE x, 1 … B is POKE x,2, etc.)
Line 30 loops through the memory locations in the Color BASIC ROM where the easter egg is stored, just after a numeric table. And stored in reverse.
Like 40 uses PEEK to get the byte at that location, then uses AND to mask off the top 2 bits, which were added (“randomly?”) to obfuscate the message so you couldn’t see in the ROM binary.
Line 50 gets the corresponding character from the P$ that matches the value of the PEEK’d byte.
Line 60 prints out the memory location, then the original value stored there, then the “mask off the top two bits” value, and then the character it represents.
…neat, huh?
BASIC Code
For those curious, here is what those bytes look like in the ROM. This screen shot is from the Color BASIC Unravelled book.
Unused “MICROSOFT!” hidden in the Color BASIC ROMs.
My original “nicer to read, slower to run” version took 25.28 seconds to run. Here are some things that can speed it up.
RENUMber by 1
Renumbering the program by 1 can save program memory (since “GOTO 77” takes less space than “GOTO 1500”, and if the program is large enough, you get a slight speed increase since it’s also less work to parse shorter line numbers. But this program is too small for this to matter.
Remove comments / avoid comments at the end of lines
Removing comments, in general, is a good way to make a program smaller (and thus, a bit faster since BASIC will have less lines to skip). But, as I found out, comments at the end of lines are a real problem, since BASIC cannot just skip the entire line like it can if the line starts with a comment:
10 X=42:REM SET X TO THE ANSWER.
10 REM SET X TO THE ANSWER. 20 X=42
The second version is faster, since even though BASIC has to scan an extra line, it can immediately skip it, while the first example must be parsed, character-by-character, to find the end of that line. But for this example, this did not help. It would have, if there were more GOTOs to earlier line numbers, since they would have to parse through a few lines with end comments, which would be slower. Since our looping was done with a FOR/NEXT, we didn’t have to do that.
Remove unnecessary spaces
Spaces are great for readability, but bad for code size and speed. There are only a few times when a space is needed, such as when BASIC can’t tell if the next character is a keyword or part of a variable:
10 IF A=B THEN 100
BASIC must have the space between the variable and the keyword THEN, else the parser isn’t sure if the variable is “B” or “BT” followed by the characters HEN. BUT, if it’s a number, it can figure it out:
10 IFA=42THEN100
Removing all other spaces reduces the amount of code BASIC has to parse through. My program had plenty of spaces. Removing them in this tiny program only reduced it to 25.15 seconds, but it can have a bigger impact on larger programs.
Combine lines
This one is easy. The less lines BASIC has to scan through, the quicker it will run. But, there are exceptions to this. Any time BASIC has to do more work at the end of a line, such as checking an ELSE, if that condition is not satisfied, BASIC has to scan forward character-by-character to get to the end of the line:
10 IF X=42 THEN 100 ELSE PRINT "I DON'T KNOW THE ANSWER"
Above, if X is 42, the ELSE code will not be ran. BUT, before the program can find 100, it has to scan through all the rest of the bytes in that line to find the end of it, then it continues scanning a line at a time looking for 100. Based on line lengths, sometime it might be faster to do:
10 IF X=42 THEN 100 20 IF X<>42 THEN PRINT "I DON'T KNOW THE ANSWER"
Combining lines in the program reduced the speed slightly to 25.13 seconds. The more code, the more you save by doing this (and it makes the program smaller, since every line number takes up space).
NEXT without variable
Unless you are doing really weird stuff with FOR/NEXT loops, it’s faster to leave off the variable in the NEXT. For example:
10 FOR A=1 TO 10:NEXT A
…will be slower than:
10 FOR A=1 TO 10:NEXT
NEXT without a variable will go to the most recent FOR anyway, so this works fine:
10 FOR A=1 TO 10 20 FOR B=1 TO 20 30 PRINT A,B 40 NEXT 50 NEXT
For this program, just removing the variables from NEXT A and NEXT Z increased the speed to 24.66 seconds.
Use HEX instead of decimal
It takes much more work to convert base-10 (i.e., “normal”) decimal numbers than it does to convert base-16 (hex) values. Every time a number is encountered, time can be saved by simply changing the number to a HEX — even though a hex digit has to start with “&H”. It will make the program larger, since 15 takes less bytes than &HF. For this example, it reduced time down to 21.66 seconds!
Pre-render strings
If you have to PRINT a generated string more than once, you are better off pre-rendering that string first so it only gets generated one time. This includes using things like STRING$, CHR$, etc. For example:
10 PRINT CHR$(128);"HELLO";CHR$(128)
Each time BASIC has to print that, it allocates a new string a few times and copies things over. See my string theory article for more details. For this program, I use STRING$() to generate a repeating block of characters. But, since it’s in a loop to print the block, it generates the same string over and over for each line of the block. Instead of this:
10 FOR A=1 TO 10 20 PRINT STRING$(10,128) 30 NEXT A
…I could generate that string ahead of time and just print it in the loop:
5 LN$=STRING$(10,128) 10 FOR A=1 TO 10 20 PRINT LN$ 30 NEXT A
In this program, doing this trick reduces the time to 20.98 seconds! String are slow, you see…
The time savings so far…
Combining all of these (the ones that matter, at least), reduces time taken down to 18.45 seconds – saving 6.83 seconds overall.
Here is what it looks like, with the stuff outside of the benchmark timing loop left alone:
If we’d done any GOTO/GOSUBing to earlier lines, that would have to scan through those lines with ending REMarks, so cleaning all that up would be useful. If this was a larger program, RENUMbering by 1s might also squeak out a bit more speed.
But wait! There’s more!
I originally wrote this follow-up the evening that I wrote the original article. However, several folks have contributed their own optimizations. Let’s see what they did…
First, we go to the comments of the original article…
George Phillips – inner loop optimization
My current best effort is going full strength reduction on the inner loop. Replace lines 120 to 170 with:
The STRING$ requires Extended Color Basic which starts at a baseline of 20.4 seconds and those changes get it down to 13.
George Phillips
George pre-rendered the string, removed spaces, and vastly sped things up by simplifying the math. On Xroar, it reports 16.15 – faster than any of my efforts. Nice job!
I took his updated code and converted the integers to HEX and that dropped it to 15.1 seconds! Even faster.
Jason Pittman – hex and start row calculation
Interestingly, changing the blue square value on line 130 to hex (“&HAF”) saves about 11.5% for me (27.55 to 24.33). Then, moving the row start calculation out of the print statement takes it down to 22.87.
120 FOR A=32 TO H*32 STEP 32 130 PRINT@P+A,STRING$(W,&HAF) 140 NEXT A
Jason Pittman
Ah, the magic of hex! Jason did a similar trick with the row calculation. I tried Jason’s updates, and it was 20.28 on my Xroar. George is still in the lead.
Adam – DIM, reductions and variable substituion
So I was able to cut out 5 seconds doing the following: – DIM all variables – Adding variables for constants (2, 32, etc.) for lines 130, 160. – Removing the math in lines 10, 20. – Removing the variables after NEXT. – Removing all remarks.
No change to the original algorithm or PRINT line.
Coco3 (6309): Original code = 27.15s. Optimized code 22.13s. With high speed poke: 11.05s. :D
I’m intrigued with the post that mentioned using HEX values. I may have to try that.
Adam
Ah! Using a variable rather than a constant can be significantly faster. I didn’t think about that, even though I’ve previously benchmarked and tested that. He adds:
I just put the STRING$ statement into a variable as suggested by George Phillips. That took out another 2 seconds! I also ran RENUM 1,,1. Run time is down from 27.2 s to 20.1 s.
L$=STRING$(W,C) FORA=1TO H PRINT@P+A*B,L$ NEXT
Adam
It looks like George has gotten us all thinking. I don’t have Adam’s code to test, but I am going to experiment with replacing the constants with variables and see what that does.
Ciaran Anscomb – uh… not really sure!
The author of Xroar chimes in:
So I cheated and read the comments on your page and combined some of those techniques (and replace /2 with *1/2, forgot about that…).
Ciaran reorganized the code, placing initialization at the end. This is a habbit I need to get into since GOTO/GOSUB to an earlier line number have to start at the FIRST line of the program and scan forward EVERY TIME. Code that only needs to run once, if placed at the top, slows down every GOTO to an earlier line number.
I just ran this and got 13.13 seconds!
There is alot to dissect here. Numbers are replaced with variables. Stuff is pre-calculated. And he’s using a variable for the end of a FOR/NEXT, for some reason. I’m going to have to dig in to that one.
Twice as fast, indeed!
I feel like there were a few more responses, but I can’t locate them at the moment, so maybe there will be a second attempt article down the line.
Until next time, kudos to all of you for giving me new things to consider — especially Ciaran with that amazing improvement. Very impressive.
“Let’s build a worksheet to calculate these values.”
This is a quote that will stay with me. It comes from a video response from Paul Fiscarelli that demonstrates his solution for adding gravity to the bouncing ball demo. He suggests pre-calculating the positions so nothing has to be calculated at run-time.
In part 3 of my bouncing ball series, I discussed the idea of using DATA for the positions rather than doing math simply because it would be faster. But, the math still has to be done somewhere to generate the values for those DATA statements.
But math is hard.
Fortunately, Paul provides the math.
And, he does it with style, using real equations that calculate an object in free-fall. Impressive! His video explains how the formulas work, and then he uses an Excel spreadsheet to do all the math, and then fills it up with locations representing each Y position of the ball, all designed to perfectly fit the 32 column screen. That’s something my brute-force simulated gravity attempt did not do.
He then uses some fancy Excel formulas to make it generate all the data together as numbers separated by commas, ready to be copy-and-pasted into an editor with the rest of the BASIC program.
Screen shot from Paul Viscarellis’ YouTube video showing turning Excel spreadsheet data into BASIC DATA statements.
Well done, Paul! I’m impressed!
Your demo sent me back to my benchmarking to find out which was faster:
Array or DATA?
Paul’s recalculated data is stored in DATA statements, such as:
DATA 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,-1
During the initialization code, he loads the data up in an array, such as:
FOR I=0 TO 20:READ Y(I):NEXT
He then has all his Y positions available just by using an index counter (he re-purposes my YM Y-movement variable) that loops through them. In his example, he places a -1 at the end of the DATA statement, and uses that as a dynamic way to indicate the end of the data. When he’s retrieving his Y value from the array, he checks for that to know when he needs to start back at the beginning of the array:
YM=YM+1:Y=Y(YM):IF Y<0 THEN YM=0:Y=Y(YM)
In part 8 of my Optimizing Color BASIC series, I looked into arrays and discovered they were significantly slower than non-array variables. Surely an array is faster than the calculating that gravity equation real-time, but it it faster than just READing each item each time?
READ/RESTORE versus Arrays
I present a new benchmark… which is faster? Using READ/RESTORE, or using pre-loaded arrays? Let’s fine out!
First, the test program is going to load 20 values into an array, then in the benchmark loop it’s going to just assign the next array value to a variable and increment an index variable until it gets to the end of the data. It will then reset the index to 0 and start over. This is a scaled-down version of Paul’s demo looping through the array values.
0 REM ARRAYS.BAS
6 DIMZ(20),Y,X:FORA=0TO19:READZ(A):NEXT:X=0
5 DIM TE,TM,B,A,TT
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
40 Y=Z(X):X=X+&H1:IFX>&H14 THENX=0:GOTO40
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A/60:END
100 DATA 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,-1
Running this in the Xroar emulator gives 12.65 seconds.
Next I removed the array and just do a READ each time, looking for a -1 which indicates we need to RESTORE back to the start of the data. It looks like this:
0 REM RESTORE.BAS
5 DIM TE,TM,B,A,TT
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 READ Z:IFZ=-1THENRESTORE:GOTO30
40 Y=Z
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A/60:END
100 DATA 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,-1
This produces 10.51 seconds! A bit faster. But it’s not quite the same.
In the READ/RESTORE example, I have to keep checking for -1. In the array version, there is an index value (needed for the array) that can be checked. Is it faster to do that, rather than looking for -1?
0 REM RESTIDX.BAS
5 DIM TE,TM,B,A,TT
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 READ Z:X=X+&H1:IFX>&H14 THENRESTORE:X=&H0:GOTO30
40 Y=Z
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A/60:END
100 DATA 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,-1
Yipes! The one slows down to 14.15 seconds. It seems incrementing a variable, comparing it, then resetting it adds a lot of overhead. It seems just checking for -1 is the faster way.
Let’s make it a tad bit faster.
In part 6 of my BASIC series, I looked at using READ on integers, hex, and strings. I found HEX to be almost twice as fast. I’ll also replace the -1 with some HEX value we won’t be using. Let’s try that here:
0 REM DATAHEX.BAS
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 READ Z:IFZ=&HFF THENRESTORE:GOTO30
40 Y=Z
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A/60:END
100 DATA &1,&H2,&H3,&H4,&H5,&H6,&H7,&H8,&H9,&HA,&HB,&HC,&HD,&HE,&HF,&H10,&H11,&H12,&H13,&H14,&HFF
9.27 seconds!!!
It looks like using Paul’s technique to pre-calculate all the gravity values, and then reading them dynamically from DATA statements in the demo, should give us a great speed boost even over my simple attempt to do things like “Y=Y+.25”.
Thanks, Paul, for providing us with a way to do great “real” gravity without having to do a bit of math…on the CoCo. Hopefully I can dissect those equations and replicate it.
Paul’s Video
This is the first time I have ever enjoyed watching a presentation on Excel. I hope you enjoy it, too. Here’s Paul’s video:
Mind blown. Jerry Stratton has created SuperBASIC for the TRS-80 Color Computer. It processes a text file written in a more modern-looking scripting language and turns it into Color BASIC.
It adds keywords like loop, switch/case, sub, etc. that then get turned into BASIC that does the same thing.
Here is a simple Color BASIC program that will scale blue box on the screen from small to large then back, going through the scaling processes 100 times.
0 REM scale.bas
10 SW=32/4 ' SCALE WIDTH
20 SH=16/3 ' SCALE HEIGHT
30 SM=.1 ' SCALE INC/DEC
40 S=.5 ' SCALE FACTOR
70 TM=TIMER:FOR Z=1 TO 100
80 W=INT(SW*S)
90 H=INT(SH*S)
100 P=15-INT(W/2)+(7-INT(H/2))*32
110 CLS
120 FOR A=1 TO H
130 PRINT@P+A*32,STRING$(W,175)
140 NEXT A
150 S=S+SM
160 IF H<1 OR H>15 THEN SM=-SM:S=S+(SM*2)
170 NEXT Z
180 ' 60=NTSC 50=PAL
190 PRINT:PRINT (TIMER-TM)/60;"SECONDS"
After this runs, it will report the approximate number of seconds it took. It does this by resetting the TIMER at the start, then printing the current TIMER value divided by 60 (since the CoCo timer is based on the NTSC video interrupt that happens 60 times a second).
NOTE: If you run this on a PAL system, you will need to change the 60 to a 50 in line 190. (edit: thanks, George P., for catching my typo.)
On the Xroar emulator running on my Mac it reports 25.25 seconds.
Color BASIC scaling demo.
Your challenge, should you decide to accept it, is to take this code and make it run faster.
Rules
You must leave the basic algorithm intact (the SW, SH, S and SH stuff with all the math). You can rename variables, change the representation of values, speed up PRINTing, etc. but the core program flow should remain the same.
For bonus points, you are welcome to rewrite the program (in BASIC) to improve upon the algorithm in any way that makes sense, provided it achieves the same results (including the 1 to 100 benchmark loop).
There are some very (very!) simple things that can be done to dramatically improve the speed to his code.
Feel free to share your efforts in the comments. If you post your code, be sure to post the resulting time, too.
0 REM gravity.bas
10 CLS
20 X=1:Y=1:XM=1:YM=1
30 PRINT@P," ";:P=X+Y*&H20:PRINT@P,"O";
50 X=X+XM:IF X<&H1 OR X>&H1E THEN XM=-XM
60 Y=Y+YM:IF Y<&H1 OR Y>&HE THEN YM=-YM
80 GOTO 30
…how would you add simulated gravity to the bounce? When I was a teen, I did this on my CoCo 3. I forget how I did it, but here is what I tried tonight:
0 REM gravity.bas
10 CLS
20 X=1:Y=1:XM=1:YM=.25
30 PRINT@P," ";:P=X+INT(Y)*&H20:PRINT@P,"O";
50 X=X+XM:IF X<&H1 OR X>&H1E THEN XM=-XM
60 Y=Y+YM:IF Y<&H1 OR Y>&HE THEN YM=-YM:Y=Y+YM70 YM=YM+.25
80 GOTO 30
But on a text screen, the “jump” is large enough when it’s near the bottom that it never actually hits the bottom of the screen. In the CoCo 3’s high-resolution screen, this wasn’t an issue. With only 16 horizontal positions, it’s quite limited.
I’m sure there’s a real clever way to do this. Any thoughts?
It seems any time I touch BASIC these days, it turns into a benchmarking session to see if I can do something faster.My Jim Gerrie-inspired bouncing ball program has taken quite a tangent, and today it not going to change that.
MC-10 has its advantages
As previously discussed, PRINT@ seems to be the fastest way to put characters on the screen. But what if you want something that’s not just text? In Jim’s MC-10 demo, he uses the semi graphics characters in his ball. The MC-10’s BASIC allowed you to type those characters similarly to how Commodore computers let you type in their PETASCII characters. The excellent MC-10 Javascript Emulator has this image showing the keyboard layout:
MC-10 keyboard layout (image from MC-10.com).
If you look at the keys, you will see some contain graphics blocks next to the letter (Q and a solid block, F and checkerboard, etc.). You can generate them with SHIFT-Letter. You also see some keywords above the keys which you could generate by doing CONTROL-Letter. This let them type in graphics characters in a PRINT statement:
MC-10.com emulator showing how to “type” demographics characters.
Advantage MC-10. We have no way to do that on the CoCo.
PRINT CHR$()
So how do we print the graphics characters? We use CHR$() which will print whatever character we tell it to. For example, letter “A” is ASCII 65. We could type:
PRINT CHR$(65)
…and it would print the letter A.
Our graphics characters start at 128 and go to 255, looping the same basic shapes through the 8 available colors (color + black). We can see them all by typing:
FOR A=128 TO 255:PRINT CHR$(A);:NEXT A
Using PRINT CHR$() to print the CoCo semi graphics characters.
If I knew which characters to use to draw a ball, I could print them using CHR$(). Unfortunately, I don’t. I have no idea where my old CoCo “quick guide” is from the 1980s that listed them all. Fortunately, Simon Jonassen has a website that lets us design semi graphics:
Using my previous text ball for reference, I want to make a semi graphics ball that is 10×7 characters (so it appears round on the 32×16 4:3 aspect ratio display…). Using Simon’s tool, I came up with this:
http://cocobotomy.roust-it.dk/sgedit/
It’s not a great ball, but it gives me something to start with.
On the bottom right of this web page are buttons to spit out the assembly, BASIC or CSV “code” to display this. But, it’s the whole screen, and looks like this:
10 CLEAR2000:DIMT,A:CLS
20 FORT=1024TO1535:READA:POKET,A:NEXT
100 A$=INKEY$:IFA$="" THEN100
1000 DATA 128,161,166,172,172,172,172,169,162,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1010 DATA 161,168,128,128,128,128,128,128,164,162,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1020 DATA 170,128,128,128,128,128,128,128,128,165,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1030 DATA 170,128,128,128,128,128,128,128,128,165,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1040 DATA 170,128,128,128,128,128,128,128,128,165,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1050 DATA 164,162,128,128,128,128,128,128,161,168,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1060 DATA 128,164,169,163,163,163,163,166,168,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1070 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1080 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1090 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1100 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1110 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1120 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1130 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1140 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
1150 DATA 128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128,128
That program, when ran, would draw the VDG semi graphics screen. But I only want the ball at the top left, so I should be able to find the values for that.
Side Note: Hey, Simon! I don’t think the CLEAR 2000 is necessary. That’s only used for strings. And since you aren’t using A$ for anything (you don’t DIM it either, I notice), you could do 100 IF INKEY$=”” THEN 100 instead and eliminate that variable. Also, generating the values as HEX would make it draw the screen faster. (Heh, force-of-habit when I write these articles. Simon is one of the most amazing CoCo programmers out there, and in one of my earlier articles, he contributed enhancements to my attempts at assembly code. This is about as “helpful” as I could ever be for someone as talented as Simon.)
There seems to be 16 DATA statements, each containing 32 values. Thus, the first seven DATAs look to be the first seven lines of the screen, and the first 10 values of each of those should be the 10 values for my ball. This gives me the following values:
1000 DATA 128,161,166,172,172,172,172,169,162,128
1010 DATA 161,168,128,128,128,128,128,128,164,162
1020 DATA 170,128,128,128,128,128,128,128,128,165
1030 DATA 170,128,128,128,128,128,128,128,128,165
1040 DATA 170,128,128,128,128,128,128,128,128,165
1050 DATA 164,162,128,128,128,128,128,128,161,168
1060 DATA 128,164,169,163,163,163,163,166,168,128
Those are the numbers I’d use to PRINT CHR$() that ball. I’ll first try it like this:
Using P as the starting PRINT@ (0 for the top left of the screen), each line will print the ten CHR$() values of the ball, then the next line will print at “P+32”, making it the next line down, and so on.
0 REM BALLVDG.BAS
5 DIM TE,TM,B,A,TT
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 PRINT@P+0,CHR$(128);CHR$(161);CHR$(166);CHR$(172);CHR$(172);CHR$(172);CHR$(172);CHR$(169);CHR$(162);CHR$(128);
31 PRINT@P+32,CHR$(161);CHR$(168);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(164);CHR$(162);
32 PRINT@P+64,CHR$(170);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(165);
33 PRINT@P+96,CHR$(170);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(165);
34 PRINT@P+128,CHR$(170);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(165);
35 PRINT@P+160,CHR$(164);CHR$(162);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(128);CHR$(161);CHR$(168);
36 PRINT@P+192,CHR$(128);CHR$(164);CHR$(169);CHR$(163);CHR$(163);CHR$(163);CHR$(163);CHR$(166);CHR$(168);CHR$(128);
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A:END
When this runs, you can SEE it PRINT the ball character-by-character! This is very slow. My benchmark took “forever” to run, reporting 26133 (at 60 ticks per second, that’s over 7 minutes to draw that 1001 times). If you take 26133 / 1001 (I really need to change that loop to be 0 to 999) you get 26.10 “ticks” per time. Divide that by 60 (per tick) you get .43. So it’s taking almost half a second to draw seven lines of ten characters each using CHR$() for each character. (Plus overhead of PRINT@ and such).
We need our ball to bounce faster than that.
We have discussed how switching from decimal to hex will speed things up, so let’s try that:
This looks a tiny bit faster. The benchmark reports 14790 (which breaks down to .24 seconds each time). That’s almost twice as fast (well, .24 to .44) but still not fast enough.
The only other thing we could do would be to remove all the in-between semicolons, since they aren’t actually needed to print the characters side-by-side (except for the last one, since we don’t want it to clear the rest of the screen line):
This removes the extra time it takes BASIC to parse (and skip) NINE semicolons on each line (times ten lines). That adds up, but removing them only increases the benchmark to 14542 — barely measurable.
I don’t see a faster way to do this using PRINT CHR$() over and over and over and over again.
Definition of insanity…
To avoid all the time it takes for BASIC to parse each CHR$() over and over and over and over, we can do that just once and store the result in a string and print that string instead. I’ll use one-letter string names, unique for each line, for speed. It would be “easier” to use an array (like BL$(7) or something) but I’ve previously explored that and found array access to be slower.
Since this is a demo, we’ll do it the silly way, like this:
Now we can just print those strings and, instead of BASIC having to parse and generate seventy CHR$()s each time, it will just have to look up and print seven strings. This should be much faster!
I made some room in my BENCH.BAS program to fit this strings in at the top, and it now looks like this:
The benchmark now reports 3200! That’s a huge improvement from the original 26133. I think this is about as fast as we can get… at least using strings.
But, for this demo, I wanted to have several frames of animation. If I make a bunch of strings, that’s going to slow things down since there will be more strings to search through. Also, I’ll run out of single-character variables names and might have to do things like A1$, A2$, A3$, etc. for the first frame, and B1$, B2$, B3$, etc. for the second frame, and so on. While this would STILL be faster than doing a bunch of PRINT CHR$(), and I expect still faster than using arrays like F$(0) through F$(6).
If I was less impatient, I’d test that, and even try a double dimension array like F$(frame, line) which would be really easy and look really nice … but would probably be be slower than anything but PRINT CHR$()s…
Note to Self: Write an article about using multi-dimensioned arrays to do simple animation in BASIC.
But I’m impatient, so now let’s circle back to the start of this post where I mentioned the MC-10 being able to embed characters in its PRINT statements directly without needing variables.
We can’t type those graphics characters on the CoCo, but if we don’t mind cheating a bit, we can modify the contents of a program so that the quoted string contains special graphic characters. It makes lines that do that impossible to edit, and print (on most printers), but if it’s just a demo (or some program you write for people to JUST run and not mess with), it works.
Self-modifying BASIC code
Suppose we had a print statement like this:
10 PRINT "**********";
…and we wanted to change those ten asterisks to be a graphics character. If we knew where they were located in BASIC memory, we could POKE the values and change them from a 42 (ASCII for ‘*’) to something else, like a 128 for the solid black block graphics character.
As a kid, I did this in a pretty brute-force way, blindly PEEKing through program memory and changing values to what I wanted them to be. This sometimes had dangerous side effects if the value I was PEEKing for (like 42) appeared as part of a BASIC keyword token or something not inside the quoted string.
So, let’s try to be a bit smarter and scan through the program memory but only look for values between quotes.
If I recall, memory locations 25 and 26 contain the start of the BASIC program. You can get that address like this:
PRINT PEEK(25)*256+PEEK(26)
The end of the program is at 27 and 28:
PRINT PEEK(27)*256+PEEK(28)
It should be super easy (barely an inconvenience) to make a program that PEEKs memory between that range and looks for a 34 (the quote character) and then will start substituting our source character (42, the asterisk) for our target character (128, a black block).
10 PRINT "**********"
20 END
100 ST=PEEK(25)*256+PEEK(26)
110 EN=PEEK(27)*256+PEEK(28)
120 QF=0 'QOUTE FOUND FLAG
130 FOR A=ST TO EN
140 IF PEEK(A)=32 THEN IF QF=1 THEN QF=0 ELSE QF=1
150 IF QF=1 THEN IF PEEK(A)=42 THEN POKE A,128
160 NEXT
If you load this program and RUN it, it will print a line of ten asterisks.
Then if you RUN 100, it will do the search/replace looking for asterisks between quotes and changing them to 128s (black block).
One that is done, RUN again and you see it now prints a row of ten black blocks!
A self-modifying CoCo BASIC program.
From this point on, that line 10 has been forever changed. If you LIST it, you will see weirdness. In this case, line 10 says:
10 PRINT "FORFORFORFORFORFORFORFORFORFOR"
…because apparently that byte of 128 is the token for the keyword FOR.
If you try to EDIT LINE 10, BASIC turns the tokens back into ASCII text for you to edit. Thus, as soon as you edit, you are changing that line to say “FORFORFORFOR…” instead of the graphics characters.
Thus, don’t edit a line after you do this trick!
I think this is a good stopping point for today. My goal here is to switch from my ball print ASCII “X” characters to graphics blocks, and retain the speed of raw PRINTs rather than using a ton of variables that have to be looked up each time — and the more variables, the slower that lookup gets.
But there’s more ASCII fun to be had before I start doing this.