In my 3X+1 post, I needed to check if a value was odd or even. I did so by using “AND 1” which would test the least significant bit. This works, and is fast, but is limited to values 32767 or lower (15 bits).
Comments from William Astle, RogelioP and John offered corrections and updates to my code. I decided to benchmark a few different methods for detecting if a value was odd or even, and here is what I came up with.
0 REM 3x+1 Benchmarking 5 'GOTO 300 100 ' 0-32767 ONLY 110 TIMER=0 120 FOR A=1 TO 1000 130 IF X AND 1 THEN REM 140 NEXT:PRINT TIMER 200 ' ALL RANGES? 210 TIMER=0 220 FOR A=1 TO 1000 230 IF INT(X/2)=X/2 THEN REM 240 NEXT:PRINT TIMER 300 ' ROGELIO P 310 TIMER=0 320 FOR A=1 TO 1000 330 IF INT(X/2)*2=X THEN REM 340 NEXT:PRINT TIMER 400 ' WILLIAM ASTLE 405 T=0:H=0.5 410 TIMER=0 420 FOR A=1 TO 1000 430 T=X*H:IF T=INT(T) THEN REM 440 NEXT:PRINT TIMER 500 ' DIVIDE TEST 505 T=0:H=2 510 TIMER=0 520 FOR A=1 TO 1000 530 T=X/H:IF T=INT(T) THEN REM 540 NEXT:PRINT TIMER
When I run this program in Xroar on my Raspberry Pi 400, I get:
313 508 506 485 485
As expected, AND is the fastest, so use this if you know your values will be 32767 or lower.
Using INT(X/2)=X/2 was a fraction slower as INT(X/2)*2=X, which I guess makes sense since both do an INT and two math functions.
William’s suggestion of multiplying by .5 instead of dividing by two rang a bell. I believe he (or someone) pointed this out to me a few years ago when I was doing similar benchmarks. A big speed up comes just from putting the value in a variable, but I was surprised to see that dividing by a variable of 2 was the same speed as multiplying by a variable of .5.
What ideas do you have? Anything with math (“/2”) can be sped up by using a variable (“/H”), so there are some improvements just from that. Using HEX values (“&H2”) instead of decimal is also faster, as is removing extra spaces.
But are there better approaches we can use?
The similarity in timing between multiplcation and division is likely down to the implementation in the ROM. Some research shows that in both cases, a bit by bit algorithm is used by the ROM which almost certainly explains the measured speed. Had the ROM used a more optimized multiplication algorithm using the MUL instruction, it would have been faster to multiply. It seems likely the algorithm used for multiplication was originally developed for a CPU with no hardware multiplication at all.
The legend of porting from 6502!
Since this specific example uses /2, that could be a fast ROR. Not that I think BASIC would be that smart. But I recall earlier tests showing *.5 was faster than /2. I need to double check.
The measurement suffers from the fact that X, an undefined variable, which is always 0. To get real operational condition one should better use the loop variable A for testing.
I checked this out on a C64 (more or less the interpreter basis) this shows that the divide test is slower than W. Astle’s variant based on multiplication (which is the better and faster variant). In my test 8.6 s (multiplication) against 9.5 (division).
BTW: The AND 1 gets even faster if the constant 1 is kept in a variable. This would be only fair to do so because the other variants also use H to hold the constant value.
To keep the same initial conditions I would suggest to predeclare variables with DIM to guarantee that they are created always in the same order over all benchmarks. Something like
1 DIM A, T, H
Not a real problem here because these are create in this order by chance. This won’t hold if some other supporting variables are added later …