The top article on this site for the past 5 or so years has been a simple C tidbit about splitting 16-bit values into 8-bit values. Because of this, I continue to drop small C things here in case they might help when someone stumbles upon them.
Today, I’ll mention some redundant, useless code I always try to add.
I seem to recall that older specifications for the C programming language did not guarantee variables would be initialized to 0. I am not even sure if the current specification defines this, since one of the compilers I use at work has a specific proprietary override to enable this behavior.
You might find that this code prints non-zero on certain systems:
int i;
printf ("i = %d\n", i);
Likewise, trying to print a buffer that has not been initialized might produce non-empty data.
Because of this, it’s a good habit to always initialize variables with at least something:
int i=0;
char message[42];
...
memset (message, 0x0, sizeof(message));
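As a minimal sketch of that habit (the helper names allZero and initDemo are made up for illustration), the block below starts an int and two buffers from a known state and then checks them. For what it is worth, the C standard only zero-initializes objects with static storage duration; a plain local like "int i;" has an indeterminate value until something is stored in it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if every byte in buf[0..len-1] is zero, otherwise 0. */
int allZero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        if (buf[i] != 0)
        {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical demo: three ways to start from a known state. */
void initDemo(void)
{
    int  i = 0;                  /* explicit initializer */
    char message[42];            /* indeterminate until cleared below */
    char zeroed[42] = { 0 };     /* remaining elements are zeroed too */

    memset (message, 0x0, sizeof(message));

    assert(i == 0);
    assert(allZero(message, sizeof(message)));
    assert(allZero(zeroed, sizeof(zeroed)));
}
```

The "= { 0 }" form does the clearing at the point of declaration, so there is no window where the buffer exists but has not been cleared yet.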
Likewise, when setting variables in code, it is also a good idea to always set an expected result and NOT rely on any previous initialization. For example:
int result = -1;
if (something == 1)
{
result = 10;
}
else if (something == 2)
{
result = 42;
}
else
{
result = -1;
}
Above, you can clearly see that in the case none of the something values are met, it defaults to setting “result” to the same value it was just initialized to.
This is just redundant, wasteful code.
And you should always do it, unless you absolutely positively need those extra bytes of code space.
It is quite possible that at some point this code could be copy/pasted elsewhere, without the initialization. On first compile, the coder sees the undeclared “result” and just adds “int result;” at the top of the function. If the final else with “result = -1;” wasn’t there, the results could be unexpected.
The reverse of this is also true. If you know you are coding so you ALWAYS return a value and never rely on initialized defaults, it would be safe to just do “int result;” at the top of this code. But, many modern compilers will warn you of “possibly uninitialized variables.”
Because of this, I always try to initialize any variable (sometimes to a value I know it won’t ever use, to aid in debugging — “why did I suddenly get 42 back from this function? Oh, my code must not be running…”).
And I always try to have a redundant default “else” or whatever to set it, instead of relying on “always try.”
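Here is a small sketch of both habits together, wrapped in a function so it can be checked. The function name lookupResult and the RESULT_UNSET sentinel are assumptions for illustration, not anything from real code.

```c
#include <assert.h>

/* Sentinel value the function should only return for unknown input. */
#define RESULT_UNSET (-1)

int lookupResult(int something)
{
    int result = RESULT_UNSET;   /* initialized, even though every path sets it */

    if (something == 1)
    {
        result = 10;
    }
    else if (something == 2)
    {
        result = 42;
    }
    else
    {
        result = RESULT_UNSET;   /* redundant on purpose */
    }

    return result;
}

/* Hypothetical usage check. */
void lookupDemo(void)
{
    assert(lookupResult(1) == 10);
    assert(lookupResult(2) == 42);
    assert(lookupResult(99) == RESULT_UNSET);
}
```

If the if/else chain is ever pasted somewhere without the initializer, the final else still guarantees a defined value.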
Another day, another issue with a C compiler of questionable quality…
Consider this bit of C code, which lives in an infinite main loop and is designed to do something every now and then (based on the status of a variable being toggled by a timer interrupt):
while (timeToDoSomething == true)
{
timeToDoSomething = false;
// Do something.
}
The program in question was trying to Do Something every 25 ms. A timer interrupt was toggling a boolean to true. The main loop would check that flag and if it were true it would set it back to false then handle whatever it was it was supposed to handle.
While this would have worked with “while”, it would really be better as an “if” — especially if the code to handle whatever it was supposed to handle took longer than 25ms causing the loop to get stuck.
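A rough sketch of the “if” approach follows. The interrupt is faked with an ordinary function, and fakeTimerInterrupt, mainLoopPass, and timesHandled are hypothetical names. The volatile qualifier is my addition: a flag shared with an interrupt handler is normally declared that way so the compiler re-reads it each time.

```c
#include <assert.h>
#include <stdbool.h>

/* Set by the (faked) timer interrupt, cleared by the main loop. */
volatile bool timeToDoSomething = false;
unsigned int  timesHandled = 0;

/* Stand-in for the real 25 ms timer interrupt handler. */
void fakeTimerInterrupt(void)
{
    timeToDoSomething = true;
}

/* One pass of the main loop: handle at most one tick, then return. */
void mainLoopPass(void)
{
    if (timeToDoSomething == true)
    {
        timeToDoSomething = false;
        timesHandled++;          /* Do something. */
    }
}

/* Hypothetical usage check. */
void timerFlagDemo(void)
{
    fakeTimerInterrupt();
    mainLoopPass();
    assert(timeToDoSomething == false);
    assert(timesHandled >= 1);
}
```

With “if”, each pass handles one tick at most and then falls through to the rest of the main loop, even if another tick arrives while the work is being done.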
Thus, it was changed to an “if”, but a typo left the old while still in the code:
//
while (timeToDoSomething == true)
if (timeToDoSomething == true)
{
timeToDoSomething = false;
// Do something.
}
Since things were taking longer than 25ms, the new code was still getting stuck in that loop — and that’s when the while (which was supposed to be commented out) was noticed.
The while without braces or a semicolon after it generated no compiler warning. That seemed wrong, but even GCC with full error reporting won’t show a warning.
Because C is … C.
Curly braces! Foiled again.
In C, it is common to see code formatted using whitespace like this:
if (a == 1)
printf("One!\n");
That is fine, since it is really just doing this:
if (a == 1) printf("One!\n");
…but is considered poor coding style these days because many programmers seem to be used to languages where indentation actually means something, as opposed to C, where whitespace is whitespace. Thus, you frequently find bugs where someone has added more code like this:
if (a == 1)
printf("One!\n");
DoSomething ();
printf("Done.\n");
Above, it feels like it should execute three things any time a is 1, but to C, it really looks like this:
if (a == 1) printf("One!\n");
DoSomething ();
printf("Done.\n");
Thus, modern coding standards often say to always use curly braces even if there is just one thing after the if:
if (a == 1)
{
printf("One!\n");
}
With the braces in place, adding more statements within the braces would work as expected:
if (a == 1)
{
printf("One!\n");
DoSomething ();
printf("Done.\n");
}
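To make the difference measurable, here is a sketch that swaps the three statements for a counter (runWithoutBraces and runWithBraces are made-up names). It counts how many of the three “things” actually run for a given value of a.

```c
#include <assert.h>

/* No braces: only the first statement is guarded by the if. */
int runWithoutBraces(int a)
{
    int ran = 0;

    if (a == 1)
        ran++;      /* printf("One!\n");    guarded     */
    ran++;          /* DoSomething ();      NOT guarded */
    ran++;          /* printf("Done.\n");   NOT guarded */

    return ran;
}

/* Braces: all three statements are guarded. */
int runWithBraces(int a)
{
    int ran = 0;

    if (a == 1)
    {
        ran++;      /* printf("One!\n");  */
        ran++;      /* DoSomething ();    */
        ran++;      /* printf("Done.\n"); */
    }

    return ran;
}

/* Hypothetical usage check. */
void bracesDemo(void)
{
    assert(runWithoutBraces(0) == 2);   /* two statements ran anyway */
    assert(runWithBraces(0) == 0);      /* nothing ran */
}
```

Both versions run all three statements when a is 1; they only disagree when a is anything else, which is exactly when the bug tends to go unnoticed.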
This is something that was drilled into my brain at a position I had many years ago, and it makes great sense. And, the same thing should be said about using while. But while has its own quirks. Consider these two examples:
// This way:
while (1);
// That way:
while (1)
{
}
They do the same thing. One uses a semicolon to mark the end of the stuff to do, and the other uses curly braces around the stuff to do. That’s the key to the code at the start of this post:
while (timeToDoSomething == true)
if (timeToDoSomething == true)
{
timeToDoSomething = false;
// Do something.
}
Just like you could do…
while (timeToDoSomething == true) printf("I am doing something");
…you could also write it as…
while (timeToDoSomething == true)
{
printf("I am doing something");
}
So when the “if” got added after the “while”, it was legit code, as if the user was trying to do this:
while (timeToDoSomething == true)
{
if (timeToDoSomething == true)
{
timeToDoSomething = false;
// Do something.
}
}
Since while can be followed by braces or a statement, it can also be followed by a statement using just braces.
The compiler can’t easily warn about needing a brace, since it is not required to have braces. But if braces were required, that would catch the issues mentioned here with if and while blocks.
Code that looks like it should at least generate a warning is completely valid and legal C code, and that same code can be formatted in a way that makes it clear(er):
while (timeToDoSomething == true)
    if (timeToDoSomething == true)
    {
        timeToDoSomething = false;
        // Do something.
    }
Whitespace makes things look pretty, but lack of it can also make things look wrong. Or correct when they aren’t.
I suppose the soapbox message of today is just to use braces. That wouldn’t have caught this particular typo (forgetting to comment something out), but it’s probably still good practice…
In a language like C, you often have multiple ways to accomplish the same thing. In general, the method you use shouldn’t matter if the end result is the same.
For example:
// This way:
if (a == 1)
{
function1();
}
else if (a == 2)
{
function2();
}
else
{
unknown();
}
// Versus that way:
switch (a)
{
case 1:
function1();
break;
case 2:
function2();
break;
default:
unknown();
break;
}
Both of those do the same thing, though the code they generate to do the same thing may be different.
We might not care which one we use unless we are needing to optimize for code space, memory usage or execution speed.
Optimizing for these things can be done by trial-and-error testing, but there is no guarantee that the method that worked best on the Arduino 16-bit processor and GCC compiler will be the same for a 64-bit ARM processor and Clang compiler.
If you ever do make a choice like this, just be sure to leave a comment explaining why you did it in case your code ever gets ported to a different architecture or compiler.
Short circuiting
Many things in C are compiler-specific, and are not part of the C standard. Some compilers are very smart and do amazing optimizations, while others may be very dumb and do everything very literally. Here is an example of something I encountered during my day job that may or may not help others.
I had code that was intended to adjust power levels, unless any of the four power generators hit a maximum level. It looked like this:
// Version 1: Do nothing if power limit exceeded.
if ((Power1 > Power1Max) ||
(Power2 > Power2Max) ||
(Power3 > Power3Max) ||
(Power4 > Power4Max))
{
// Max power hit. Do nothing.
}
else
{
increasePower ();
}
Having conditionals lead to nothing seems odd. Wouldn’t it make more sense to check to see if we can do the thing we want to do?
// Version 2: Do something if power limit not exceeded.
if ((Power1 <= Power1Max) &&
(Power2 <= Power2Max) &&
(Power3 <= Power3Max) &&
(Power4 <= Power4Max))
{
increasePower ();
}
That looks much nicer. Are there any advantages to one versus the other?
For Version 1, the use of “OR” lets the compiler stop checking the moment any of those conditions is met. If Power1 is NOT above the limit, it then checks to see if Power2 is above the limit. If it is, we are done. We already know that one of these items is above, so no need to check the others. This works great for simple logic like this.
For Version 2, the use of “AND” requires all conditions to be met. If we check Power1 and it is below the limit, we then check Power2. If that one is NOT below, we are done. We know there is no need to check any of the others.
Those sure look the same to me, and Version 2 seems easier to read.
The first example is basically saying “here is why we won’t do something” while the second example is “here is why we WILL do something.”
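One way to see the short circuiting is to count the comparisons. The sketch below wraps each test in a helper that bumps a counter; every name in it (checksMade, isOver, isWithin, mayIncreaseV1, mayIncreaseV2) is hypothetical. It also relies on one detail worth spelling out: the exact opposite of “greater than” is “less than or equal”, so the two versions only agree at the limit itself if the AND form uses <=.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how many comparisons actually ran (illustration only). */
int checksMade = 0;

bool isOver(int power, int max)
{
    checksMade++;
    return power > max;
}

bool isWithin(int power, int max)
{
    checksMade++;
    return power <= max;     /* exact negation of power > max */
}

/* Version 1 style: bail out if ANY generator is over its limit. */
bool mayIncreaseV1(const int power[4], const int max[4])
{
    if (isOver(power[0], max[0]) ||
        isOver(power[1], max[1]) ||
        isOver(power[2], max[2]) ||
        isOver(power[3], max[3]))
    {
        return false;        /* Max power hit. Do nothing. */
    }
    else
    {
        return true;
    }
}

/* Version 2 style: proceed only if ALL generators are within limits. */
bool mayIncreaseV2(const int power[4], const int max[4])
{
    return isWithin(power[0], max[0]) &&
           isWithin(power[1], max[1]) &&
           isWithin(power[2], max[2]) &&
           isWithin(power[3], max[3]);
}

/* Hypothetical usage check: the second generator is over its limit. */
void shortCircuitDemo(void)
{
    const int max[4] = { 100, 100, 100, 100 };
    const int hot[4] = { 10, 200, 30, 40 };

    checksMade = 0;
    assert(mayIncreaseV1(hot, max) == false);
    assert(checksMade == 2);     /* stopped after the second comparison */

    checksMade = 0;
    assert(mayIncreaseV2(hot, max) == false);
    assert(checksMade == 2);     /* same here */
}
```

In both forms the evaluation stops at the first generator that is over its limit, so neither has an inherent advantage in the number of comparisons.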
…but if I convert the printf() and run the same code on an Arduino:
void setup() {
// put your setup code here, to run once:
Serial.begin(9600);
uint16_t val1;
uint16_t val2;
uint32_t result;
val1 = 40000;
val2 = 50000;
result = val1 + val2;
//printf ("%u + %u = %u\n", val1, val2, result);
Serial.print(val1);
Serial.print(" + ");
Serial.print(val2);
Serial.print(" = ");
Serial.println(result);
}
void loop() {
// put your main code here, to run repeatedly:
}
This gives me:
40000 + 50000 = 24464
…and this was the source of a bug I introduced and fixed at my day job recently.
Tha’s wrong, int’it?
I tend to write a lot of code using the GCC compiler since I can work out and test the logic much quicker than repeatedly building and uploading to our target hardware. Because of that, I had “fully working” code that was incorrect for our 16-bit PIC24 processor.
In this case, the addition of “val1 + val2” is being done using native integer types. On the PC, those are 32-bit values. On the PIC24 (and Arduino, shown above), they are 16-bit values.
A 16-bit value can represent 65536 values in the range of 0-65535. If you were to have a value of 65535 and add 1 to it, on a 16-bit variable it would roll over and the result would be 0. In my example, 40000 + 50000 was rolling over 65535 and producing 24464 (which is 90000 – 65536).
You can see this happen using the Windows calculator. By default, it uses DWORD (double word – 32-bit) values. You can do the addition just fine:
You see that 40,000 + 50,000 results in 90,000, which is 0x15F90 in hex. That 0x1xxxx at the start is the rollover. If you switch the calculator into WORD mode you see it gets truncated and the 0x1xxxx at the start goes away, leaving the 16-bit result:
Can we fix it?
The solution is very simple. In C, any time there is addition which might result in a value larger than the native int type (if you know it), you simply cast the two values being added to a larger data type, such as a 32-bit uint32_t:
void setup() {
// put your setup code here, to run once:
Serial.begin(9600);
uint16_t val1;
uint16_t val2;
uint32_t result;
val1 = 40000;
val2 = 50000;
// Without casting (native int types):
result = val1 + val2;
//printf ("%u + %u = %u\n", val1, val2, result);
Serial.print(val1);
Serial.print(" + ");
Serial.print(val2);
Serial.print(" = ");
Serial.println(result);
// With casting:
result = (uint32_t)val1 + (uint32_t)val2;
Serial.print(val1);
Serial.print(" + ");
Serial.print(val2);
Serial.print(" = ");
Serial.println(result);
}
void loop() {
// put your main code here, to run repeatedly:
}
Above, I added a second block of code that does the same add, but casting each of the val1 and val2 variables to 32-bit values. This ensures they will not roll over since even the max values of 65535 + 65535 will fit in a 32-bit variable.
The result:
40000 + 50000 = 24464
40000 + 50000 = 90000
Since I know adding any two 16-bit values can be larger than what a 16-bit value can hold (i.e., “1 + 1” is fine, as is “65000 + 535”, but larger values present a rollover problem), it is good practice to just always cast upwards. That way, the code works as intended, whether the native int of the compiler is 16-bits or 32-bits.
As my introduction of this bug “yet again” shows, it is a hard habit to get into.
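For anyone without a 16-bit target handy, the rollover can be reproduced on a PC. This is only a sketch with made-up function names: addAsIf16BitInt forces the sum back down to 16 bits to imitate what a 16-bit int does, and addWidened is the cast-first version.

```c
#include <assert.h>
#include <stdint.h>

/* Imitates the sum on a target whose int is 16 bits: the result is
   reduced modulo 65536 before it reaches the 32-bit variable. */
uint32_t addAsIf16BitInt(uint16_t val1, uint16_t val2)
{
    return (uint16_t)((uint32_t)val1 + (uint32_t)val2);
}

/* Portable version: widen first, then add. */
uint32_t addWidened(uint16_t val1, uint16_t val2)
{
    return (uint32_t)val1 + (uint32_t)val2;
}

/* Hypothetical usage check. */
void rolloverDemo(void)
{
    assert(addAsIf16BitInt(40000, 50000) == 24464);   /* 90000 - 65536 */
    assert(addWidened(40000, 50000) == 90000);
    assert(addAsIf16BitInt(65535, 1) == 0);           /* rolls over to 0 */
}
```

The widened version gives the same answer whether the native int is 16 or 32 bits, which is the whole point of casting up before the add.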
Once again, oddness from floating point values took me down a rabbit hole trying to understand why something was not working as I expected.
Earlier, I had stumbled upon one of the magic values that a 32-bit floating point value cannot represent in C. Instead of 902.1, a float will give you 902.099976… Close, but it caused me issues due to how we were doing some math conversions.
float value = 902.1;
printf ("value = %f\n", value);
To work around this, I switched these values to double precision floating point values and now 902.1 shows up as 902.1:
double value = 902.1;
printf ("value = %f\n", value);
That example will indeed show 902.100000.
This extra precision ended up causing a different issue. Consider this simple code, which took a value in kilowatts and converted it to watts, then converted that to a signed integer.
That looks simple enough, but the output shows it is not:
kw : 64.600000
watts: 64600.000000
int32: 64599
Er… what? 64.6 multiplied by 1000 displayed as 64600.00000 so that all looks good, but when converted to a signed 32-bit integer, it turned in to 64599. “Oh no, not again…”
I was amused that, by converting these values to float instead of double it worked as I expected:
Apparently, whatever extra precision I was gaining from using double in this case was adding enough extra precision to throw off the conversion to integer.
I don’t know why. But at least I have a workaround.
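A likely explanation, offered as my reading rather than anything established above: 64.6 cannot be stored exactly in binary floating point, the double product ends up a hair below 64600, and a cast to an integer throws the fraction away instead of rounding. printf with %f rounds to six decimals for display, which hides the difference. The sketch below (function names are made up, and it assumes IEEE-754 doubles) shows the truncating cast next to a round-to-nearest alternative.

```c
#include <assert.h>
#include <stdint.h>

/* A plain cast truncates: anything after the decimal point is dropped. */
int32_t wattsTruncated(double kw)
{
    double watts = kw * 1000.0;
    return (int32_t)watts;
}

/* Round to the nearest whole number before converting. */
int32_t wattsRounded(double kw)
{
    double watts = kw * 1000.0;
    return (int32_t)((watts >= 0.0) ? (watts + 0.5) : (watts - 0.5));
}

/* Hypothetical usage check (assumes IEEE-754 double precision). */
void wattsDemo(void)
{
    assert(wattsTruncated(64.6) == 64599);   /* just under 64600, truncated */
    assert(wattsRounded(64.6) == 64600);
}
```

The float version probably “worked” because, with fewer bits available, the product happened to round to exactly 64600. Rounding before the conversion avoids depending on that kind of luck.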
As you can see, three extra bytes were added to the “blob” of memory that contains this structure. This is being done so each element starts on an even-byte address (0, 2, 4, etc.). Some processors require this, but if you were using one that allowed odd-byte access, you would likely get a sizeof() of 7.
Do not rely on processor architecture
To create portable C, you must not rely on how things behave in your environment. The same code can (and will) produce different results in a different environment.
See also: sizeof() matters, where I demonstrated a simple example of using “int” and how it was quite different on a 16-bit Arduino versus a 32/64-bit PC.
Make it smaller
One easy thing to do to reduce wasted memory in structures is to try to group the 8-bit values together. Using the earlier structure example, by simply changing the ordering of values, we can reduce the amount of memory it uses:
You can see an extra byte of padding being added after the third 8-bit value. Just out of curiosity, I moved the third byte to the end of the structure like this:
…but that also produced 8. I believe it is just adding an extra byte of padding at the end (which doesn’t seem necessary, but perhaps memory must be reserved on even byte boundaries and this just marks that byte as used so the next bit of memory would start after it).
Because you cannot ensure how a structure ends up in memory without knowing how the compiler works, it is best to simply not rely on or expect a structure to be “packed” with all the bytes aligned like the code. You also cannot expect that the memory usage is just the values contained in the structure.
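Rather than guessing, you can ask the compiler where it put things. The sketch below uses two made-up structures (Interleaved and Grouped, with invented field names, not the structures from this article) and prints their sizes and member offsets using sizeof() and offsetof().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: 8-bit and 16-bit members alternate. */
typedef struct
{
    uint8_t  a;
    uint16_t b;
    uint8_t  c;
    uint16_t d;
    uint8_t  e;
} Interleaved;

/* Same members, with the 8-bit values grouped together. */
typedef struct
{
    uint8_t  a;
    uint8_t  c;
    uint8_t  e;
    uint16_t b;
    uint16_t d;
} Grouped;

/* What the members add up to if there were no padding at all. */
#define RAW_SIZE (3 * sizeof(uint8_t) + 2 * sizeof(uint16_t))

void printLayouts(void)
{
    /* A structure can never be smaller than the sum of its members. */
    assert(sizeof(Interleaved) >= RAW_SIZE);
    assert(sizeof(Grouped) >= RAW_SIZE);

    printf ("raw members : %u\n", (unsigned)RAW_SIZE);
    printf ("Interleaved : %u (b at %u, d at %u, e at %u)\n",
            (unsigned)sizeof(Interleaved),
            (unsigned)offsetof(Interleaved, b),
            (unsigned)offsetof(Interleaved, d),
            (unsigned)offsetof(Interleaved, e));
    printf ("Grouped     : %u (b at %u, d at %u)\n",
            (unsigned)sizeof(Grouped),
            (unsigned)offsetof(Grouped, b),
            (unsigned)offsetof(Grouped, d));
}
```

The numbers printed are whatever your compiler and target decide, which is the point: check them rather than assume them.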
I do frequently see programmers attempt to massage the structure by adding in padding values, such as:
At least on a system that aligns values to 16-bits, the structure now matches what we actually get. But what if you used a processor where everything was aligned to 32-bits?
It is always best to not assume. Code written for an Arduino one day (with 16-bit integers) may be ported to a 32-bit Raspberry Pi Pico at some point, and not work as intended.
Here’s some sample code to try. You would have to change the printfs to Serial.println() and change how it prints the sizeof() values, but then you could see what it does on a 16-bit Arduino UNO versus a 32-bit PC or other system.
Beyond removing some spaces and a REM statement, here is the smallest I have been able to get my “attract” program:
10 ' ATTRACT4.BAS
20 FOR I=0 TO 3:READ L(I),LD(I),CL(I),CD(I):NEXT:Z=143:CLS 0:PRINT @268,"ATTRACT!";
30 Z=Z+16:IF Z>255 THEN Z=143
40 FOR I=0 TO 3:POKE L(I),Z:L(I)=L(I)+LD(I):FOR C=0 TO 3:IF L(I)=CL(C) THEN LD(I)=CD(C)
50 NEXT:NEXT:GOTO 30
60 DATA 1024,1,1024,1,1047,1,1055,32,1535,-1,1535,-1,1512,-1,1504,-32
(We could reduce it by one line by sticking the DATA statement on the end of line 50, now that I look at it.)
Let’s rewind and look at the original, which used individual variables for each of the moving color blocks:
10 ' ATTRACT.BAS
20 A=1024:B=A+23:C=1535:D=C-23:Z=143
30 AD=1:BD=1:CD=-1:DD=-1
40 CLS 0:PRINT @268,"ATTRACT!";
50 POKE A,Z:POKE B,Z:POKE C,Z:POKE D,Z
60 Z=Z+16:IF Z>255 THEN Z=143
70 A=A+AD
80 IF A=1055 THEN AD=32
90 IF A=1535 THEN AD=-1
100 IF A=1504 THEN AD=-32
110 IF A=1024 THEN AD=1
120 '
130 B=B+BD
140 IF B=1055 THEN BD=32
150 IF B=1535 THEN BD=-1
160 IF B=1504 THEN BD=-32
170 IF B=1024 THEN BD=1
180 '
190 C=C+CD
200 IF C=1055 THEN CD=32
210 IF C=1535 THEN CD=-1
220 IF C=1504 THEN CD=-32
230 IF C=1024 THEN CD=1
240 '
250 D=D+DD
260 IF D=1055 THEN DD=32
270 IF D=1535 THEN DD=-1
280 IF D=1504 THEN DD=-32
290 IF D=1024 THEN DD=1
300 GOTO 50
This was then converted to use an array:
10 ' ATTRACT2.BAS
20 L(0)=1024:L(1)=1024+23:L(2)=1535:L(3)=1535-23
30 Z=143
40 CL(0)=1024:CD(0)=1
50 CL(1)=1055:CD(1)=32
60 CL(2)=1535:CD(2)=-1
70 CL(3)=1504:CD(3)=-32
80 CLS 0:PRINT @268,"ATTRACT!";
90 LD(0)=1:LD(1)=1:LD(2)=-1:LD(3)=-1
100 FOR I=0 TO 3:POKE L(I),Z:NEXT
110 Z=Z+16:IF Z>255 THEN Z=143
120 FOR I=0 TO 3:L(I)=L(I)+LD(I):NEXT
130 FOR L=0 TO 3
140 FOR C=0 TO 3
150 IF L(L)=CL(C) THEN LD(L)=CD(C)
160 NEXT
170 NEXT
180 GOTO 100
And then it was converted to use READ/DATA instead of hard-coding values:
10 ' ATTRACT3.BAS
20 FOR I=0 TO 3
30 READ L(I),LD(I),CL(I),CD(I)
40 NEXT
50 Z=143
60 CLS 0:PRINT @268,"ATTRACT!";
70 Z=Z+16:IF Z>255 THEN Z=143
80 FOR I=0 TO 3
90 POKE L(I),Z
100 L(I)=L(I)+LD(I)
110 FOR C=0 TO 3
120 IF L(I)=CL(C) THEN LD(I)=CD(C)
130 NEXT
140 NEXT
150 GOTO 70
160 ' L,LD,CL,CD
170 DATA 1024,1,1024,1
180 DATA 1047,1,1055,32
190 DATA 1535,-1,1535,-1
200 DATA 1512,-1,1504,-32
Shuffling code around is fun.
But it’s still really slow.
10 PRINT “FASTER”
There are other ways to do similar effects, such as with strings. We could make a string that contained a repeating series of the color block characters, like this:
FOR I=0 TO 7:B$=B$+CHR$(143+16*I):NEXT
Then we could duplicate that 8-character string a few times until we had a string that was twice the length of the 32 column screen:
B$=B$+B$+B$+B$+B$+B$+B$+B$
Then we could make the entire thing move by printing the MID$ of it, like this:
FOR I=1 TO 32
PRINT@0,MID$(B$,33-I,32);
PRINT@480,MID$(B$,I,31);
NEXT
We print one section @0 for the top line, and the other @480 for the bottom line. Unfortunately, using PRINT instead of POKE means if we ever print on the bottom right location, the screen would scroll, so the bottom right block has to be left un-printed (thus, printing 31 characters for the bottom line instead of the full 32). This bothers me so apparently I do have O.C.D. Maybe we can fix that later.
But, it gives the advantage of scrolling ALL the blocks, and is super fast. Check it out:
10 ' ATTRACT5.BAS
20 CLS 0:PRINT @268,"ATTRACT!";
30 FOR I=0 TO 7:B$=B$+CHR$(143+16*I):NEXT
40 B$=B$+B$+B$+B$+B$+B$+B$+B$
50 FOR I=1 TO 32
60 PRINT@0,MID$(B$,33-I,32);
70 PRINT@480,MID$(B$,I,31);
80 NEXT:GOTO 50
That’s not bad, but only gives the top and bottom rows (minus that bottom right location). But, it’s fast!
Since the order of the colors is the same on the top and bottom, we’d really need to reverse the bottom characters to make it look like it’s rotating versus just reversing. Let’s tweak that:
10 ' ATTRACT6.BAS
20 CLS 0:PRINT @268,"ATTRACT!";
30 FOR I=0 TO 7:B$=B$+CHR$(143+16*I)
35 R$=R$+CHR$(255-16*I):NEXT
40 B$=B$+B$+B$+B$+B$+B$+B$+B$
45 R$=R$+R$+R$+R$+R$+R$+R$+R$
50 FOR I=1 TO 32
60 PRINT@0,MID$(B$,33-I,32);
70 PRINT@480,MID$(R$,I,31);
80 NEXT:GOTO 50
That’s a bit better. But getting the sides to work is a bit more work and it will slow things down quite a bit. But let’s try anyway.
Initially, I tried scanning down the sides of the string using MID$, like this:
FOR J=1 TO 14
PRINT@480-32*J,MID$(R$,39-J+I,1);
PRINT@31+32*J,MID$(R$,33-J+I,1);
NEXT
But that was very, very slow. You could see it “paint” the sides. Each time you use MID$, a new string is created (with data copied from the first string). That’s a bunch of memory shuffling just for one character.
Then I thought, since I can’t get the speed up from a horizontal string being PRINTed, it was probably faster to just use CHR$().
I tried that, and it was still too slow.
Benchmark Digression: POKE vs PRINT
This led me back to an earlier benchmark discussion… Since I cannot get any benefit of using PRINT for a vertical column of characters, I could switch to the faster POKE method. This would also allow me to fill that bottom right character block. My O.C.D. approves.
To prove this to myself, again, I did two quick benchmarks — one using PRINT@ and the other using POKE.
0 ' LRBENCH1.BAS
1 ' 4745
10 C=143+16
20 TIMER=0:FOR A=1 TO 1000
30 FOR P=1024 TO 1535 STEP 32
40 POKEP,C
50 NEXT
60 NEXT:PRINT TIMER
0 ' LRBENCH2.BAS
1 ' 6013
10 C=143+16
20 TIMER=0:FOR A=1 TO 1000
30 FOR P=0 TO 511 STEP 32
40 PRINT@P,CHR$(C);
50 NEXT
60 NEXT:PRINT TIMER
Line 1 has the time that it printed for me in the Xroar emulator.
POKE will be the way.
However, there is still a problem: Math.
It just doesn’t add up…
The CoCo screen is 32×16. There are 8 colors. That means those 8 colors can repeat four times along the top of the screen, and four times along the bottom, leaving only 14 on each side going vertical. 32+32+14+14 is 92, which is not evenly divisible by our 8 colors. If we represent them as numbers, they would look like this:
If you start at the top left corner and go across, repeating 12345678 over and over, you end up back at the top left on 4. We have three colors that won’t fit. This means even if I had a nice fast routine for rotating the colors, they would not be evenly balanced using this format.
However…
…if I leave out the four corners, we get 88, and that divides just fine by our 8 colors!
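The arithmetic is easy to double-check. Here it is as a tiny C sketch (C rather than BASIC, purely to verify the numbers; borderCells is a made-up helper):

```c
#include <assert.h>

/* Number of cells in a one-character border around a cols x rows screen.
   Pass skipCorners = 1 to leave the four corner cells out. */
int borderCells(int cols, int rows, int skipCorners)
{
    int cells = (2 * cols) + (2 * (rows - 2));

    if (skipCorners)
    {
        cells -= 4;
    }

    return cells;
}

/* Hypothetical usage check for the 32x16 CoCo text screen. */
void borderDemo(void)
{
    assert(borderCells(32, 16, 0) == 92);
    assert(borderCells(32, 16, 0) % 8 == 4);   /* does not divide by 8 colors */
    assert(borderCells(32, 16, 1) == 88);
    assert(borderCells(32, 16, 1) % 8 == 0);   /* 11 full sets of 8 colors */
}
```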
Thus, the actual O.C.D.-compliant border I want to go for would look like this:
The only problem is … how can this be done fast in BASIC?
To be continued…
Bonus: Show Your Work
Here are the stupid BASIC programs I wrote to make the previous four screens:
0 ' border1.bas
10 CLS:C=113:L=1024
20 ' RIGHT
30 L=1024:D=1:T=31:GOSUB 110
40 ' DOWN
50 L=1087:D=32:T=13:GOSUB 110
60 ' LEFT
70 L=1535:D=-1:T=31:GOSUB 110
80 ' UP
90 L=1472:D=-32:T=13:GOSUB 110
100 GOTO 100
110 ' L=LOC, D=DELTA, T=TIMES
120 POKE L,C
130 C=C+1:IF C>120 THEN C=113
140 IF T=0 THEN RETURN
150 L=L+D:IF L>1023 THEN IF L<1536 THEN 170
160 L=L-D:SOUND 200,1
170 T=T-1:GOTO 120
0 ' border2.bas
10 CLS:C=113:L=1024
20 ' RIGHT
30 L=1025:D=1:T=29:GOSUB 110
40 ' DOWN
50 L=1087:D=32:T=13:GOSUB 110
60 ' LEFT
70 L=1534:D=-1:T=29:GOSUB 110
80 ' UP
90 L=1472:D=-32:T=13:GOSUB 110
100 GOTO 100
110 ' L=LOC, D=DELTA, T=TIMES
120 POKE L,C
130 C=C+1:IF C>120 THEN C=113
140 IF T=0 THEN RETURN
150 L=L+D:IF L>1023 THEN IF L<1536 THEN 170
160 L=L-D:SOUND 200,1
170 T=T-1:GOTO 120
0 ' border3.bas
10 CLS 0:C=143:L=1024
20 ' RIGHT
30 L=1025:D=1:T=29:GOSUB 110
40 ' DOWN
50 L=1087:D=32:T=13:GOSUB 110
60 ' LEFT
70 L=1534:D=-1:T=29:GOSUB 110
80 ' UP
90 L=1472:D=-32:T=13:GOSUB 110
100 GOTO 100
110 ' L=LOC, D=DELTA, T=TIMES
120 POKE L,C
130 C=C+16:IF C>255 THEN C=143
140 IF T=0 THEN RETURN
150 L=L+D:IF L>1023 THEN IF L<1536 THEN 170
160 L=L-D:SOUND 200,1
170 T=T-1:GOTO 120
Awhile back I ported 8-Bit Show and Tell’s “10 PRINT RACER” from Commodore PET to CoCo. I tried to make it a literal port, keeping the code as close as I could to the original. I did, however, mention a few things that could make it faster, taking advantage of things like Extended Color BASIC’s hex values (&H2 is faster to parse than 2, for instance).
The other day, MiaM left a comment on the original article:
It might be faster to use A=ASC(INKEY$) and IF A=4 instead of IF A$=CHR$(4)
– MiaM
Intriguing. In the original Commodore version, the direction was read by using GET A$, and I simply converted that over to A$=INKEY$ for Color BASIC. Here is a look at Robin’s Commodore PET original:
On the Commodore PET, without arrow keys, it used “4” and “6” on the numeric keypad for Left and Right. On the CoCo, I changed that to the Left Arrow key and the Right Arrow key.
The Commodore PET has much less work to do looking for A$="4" versus A$=CHR$(8) on the CoCo (due to all the parsing). I could have made the CoCo use letter keys like “A” for left and “S” for right to get similar performance.
But what MiaM suggests may be faster. Instead of comparing strings like A$=CHR$(8), the suggestion is to use BASIC’s ASC() keyword to return the numeric value of the character, then compare a numeric value rather than a string compare.
Which is faster? A one character string compare, or ASC() and a number compare?
Let’s find out.
Comparing a String to a String
For this, I dug out my old BENCH.BAS benchmarking code and inserted the first method I wanted to test — the way the Commodore PET did it:
5 DIM TE,TM,B,A,TT
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 A$=INKEY$:IF A$="4" THEN REM
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A:END
Comparing A$ to a quoted value in this loop produces 515.
Comparing a String to a CHR$
My conversion changed this to comparing to a CHR$(8) value, like this:
0 REM ascvsstringcompare.BAS
5 DIM TE,TM,B,A,TT
10 FORA=0TO3:TIMER=0:TM=TIMER
20 FORB=0TO1000
30 A$=INKEY$:IF A$=CHR$(8) THEN REM
70 NEXT
80 TE=TIMER-TM:PRINTA,TE
90 TT=TT+TE:NEXT:PRINTTT/A:END
This produces a slower 628. No surprise, due to having to parse CHR$() and the number. I could easily speed up the CoCo port by using quoted characters like “A” for Left and “S” for Right.
But I really wanted to use the arrow keys.
ASC and you shall receive…
The new suggestion is to use ASC. ASC will convert a character to its ASCII value (or PETASCII on a Commodore, I would suppose). For example:
PRINT ASC("A")
65
The cool suggestion was to try using INKEY$ as the parameter inside of ASC(), and skipping the use of a variable entirely. Unfortunately, when I tried it, I received:
?FC ERROR
Function Call error. Because, if no key is pressed, INKEY$ returns nothing, which I suppose would be like trying to do:
PRINT ASC("")
We have been able to use INKEY$ directly in other functions, such as INSTR (looking up a character inside a string), and that works even when passing in “”:
PRINT INSTR("","ABCDE")
0
But ASC() won’t work without a character, at least not in Color BASIC. And, even if we used A$=INKEY$, we can’t pass A$ in to ASC() if it is empty (no key pressed) which means we’d need an extra check like:
30 A$=INKEY$:IF A$<>"" THEN IF ASC(A$)=4 THEN ..
The more parsing, the slower. This produced 539, which isn’t as slow as I expected. It’s slower than doing IF A$="4" but faster than IF A$=CHR$(8). Thus, it would be faster in my CoCo port than my original.
This did give me another thing to try. ASC() allows you to pass in a string that contains more than one character, but it only acts upon the first letter. You can do this:
PRINT ASC("ALLEN TRIED THIS")
65
This means I could always pad the return of INKEY$ with another character so it would either be whatever key the user pressed, or my other character if nothing was pressed. Like this:
30 IF ASC(INKEY$+".")=8 THEN REM
If no key has been pressed, this would try to parse “”+”.”, and give me the ASCII of “.”.
If a key had been pressed, this would parse that character (like “4.” if I pressed a 4).
As I learned when I first started my benchmarking BASIC series, string manipulation is slow. Very slow. So I expect this to be very slow.
To my surprise, it returns 520! Just a smidge slower than the original IF A$="4" string compare! I’m actually quite surprised.
Now, in the actual 10 PRINT RACER game, which is doing lots of string manipulations to generate the game maze, this could end up being much slower if it had to move around other larger strings. But, still worth a shot.
Thank you, MiaM! Neat idea, even if Color BASIC wouldn’t let me do it the cool way you suggested.
Until next time…
Bonus
Numbers versus string compares:
30 IF Z=4 THEN REM
That gives me 350. Even though decimal values are much slower to parse than HEX values, they are still faster than strings.
But, in pure Color BASIC, there is no way to get input from a keypress to a number other than ASC. BUT, you could PEEK some BASIC RAM value that is the key being held down, and do it that way (which is something I have discussed earlier).
My “big maze” program printed 2×2 character blocks along the bottom of the screen until it got to the bottom right of the screen, then the screen will scroll (and an extra PRINT is added to add a second line) and the process resets and repeats.
After William Astle provided some optimizations, it dawned on me that there was another thing we could try. Here is the code in question (removing unneeded lines and adjusting the GOTO as appropriate):
That was a very subtle change that could double (or more, or less) the speed just by not needing to parse over “PRINT:GOTO 70” every time P was NOT greater than 479 (which is most of the time in that loop).
This made me think that perhaps instead of checking for greater than 479 we could adjust the logic and check for less than 480. Something like this, perhaps:
There’s really no reason for this to be any different speed, is there? GOTO (“THEN”) 100 still has to start at the top and move forward, the same as GOTO (“THEN”) 70 would.
But, in the first case, it quickly skips “THEN 70” to hit the “GOTO 100” below, every time the value is not greater than 479.
In the second, every time the value is LESS than 480 it returns to 100 (go to top of program and search forward).
When we last left off, it had been so long since I did any BASIC programming that I found myself wondering why these two sections of BASIC did not perform as I expected:
0 'bigmazebench.bas
100 P=0:TIMER=0:A=0
110 P=0
120 P=P+2:IF P>479 THEN PRINT:GOTO 110
120 A=A+1:IF A >1000 THEN 150
140 GOTO 120
150 PRINT TIMER
200 P=0:TIMER=0:A=0
210 PRINT:P=0
220 P=P+2:IF P>479 THEN 210
230 A=A+1:IF A>1000 THEN 250
240 GOTO 220
250 PRINT TIMER
William Astle once again saw the obvious (just not obvious to me at the time)…
If you have both versions in the same program, the “backwards” jumps will be slower the later in the program they are because they have to do a sequential scan of the program from the beginning to find the correct line number. If you have been running them in the same program, try separating them and running them independently.
– William Astle
Well, duh. Of course. When the block of code starting at line 200 runs, the GOTO 220 has to start at the top of the program and seek past every line to find 220. Much slower compared to how few lines the GOTO 120 has to seek past. Normally my benchmark program is inside a FOR/NEXT loop so there is no line seeking and it behaves the same speed regardless of line number location…
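To put rough numbers on that, here is a deliberately simplified model written in C. It is not how the BASIC ROM is actually coded, just a sketch of the idea: a backward jump scans the line list from the top, and a forward jump is assumed to carry on from the current line. The function name linesScanned and the line lists are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model: how many stored lines get looked at before the
   target line is found. Returns -1 if the line does not exist. */
int linesScanned(const int *lines, size_t count, int currentLine, int targetLine)
{
    size_t start = 0;
    int scanned = 0;

    if (targetLine > currentLine)
    {
        /* Forward jump: skip everything up to and including the current line. */
        while (start < count && lines[start] <= currentLine)
        {
            start++;
        }
    }

    for (size_t i = start; i < count; i++)
    {
        scanned++;
        if (lines[i] == targetLine)
        {
            return scanned;
        }
    }

    return -1;  /* no such line */
}

/* Hypothetical usage check with line numbers loosely based on the listing. */
void gotoScanDemo(void)
{
    const int combined[] = { 0, 100, 110, 120, 140, 150,
                             200, 210, 220, 230, 240, 250 };
    const int alone[]    = { 200, 210, 220, 230, 240, 250 };

    assert(linesScanned(combined, 12, 140, 120) == 4);   /* GOTO 120 */
    assert(linesScanned(combined, 12, 240, 220) == 9);   /* GOTO 220, both blocks loaded */
    assert(linesScanned(alone, 6, 240, 220) == 3);       /* GOTO 220, run on its own */
}
```

In this model the same GOTO 220 looks at three times as many lines when the first block is still sitting in front of it, and that cost is paid on every pass through the loop.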
So let’s try them one at a time. I loaded the program and deleted the line 0 comment, and lines 200 and up (DEL 0 and DEL 200-):
100 P=0:TIMER=0:A=0
110 P=0
120 P=P+2:IF P>479 THEN PRINT:GOTO 110
120 A=A+1:IF A >1000 THEN 150
140 GOTO 120
150 PRINT TIMER
This gives me 762.
Then, loading it again, and deleting everything up to 200 (“DEL -199”):
200 P=0:TIMER=0:A=0
210 PRINT:P=0
220 P=P+2:IF P>479 THEN 210
230 A=A+1:IF A>1000 THEN 250
240 GOTO 220
250 PRINT TIMER
That gives me 1394!
Yep, William’s suggestion of moving the PRINT to the destination line, instead of using “THEN PRINT:GOTO xxx” almost doubled the speed it takes to run through that code.