Updates:
- 2022-08-30 – Corrected a major bug in the Get8BitHexStringPtr() routine.
“Here we go again…”
Last week I ran out of ROM space in a work project. For each code addition, I have to do some size optimization elsewhere in the program. Some things I tried actually made the program larger. For example, we have some status bits that get set in two different structures. The code will do it like this:
shortStatus.faults |= FAULT_BIT;
longStatus.faults |= FAULT_BIT;
We have code like that in dozens of places. One of the things I had done earlier was to change that in to a function. This was primarily so I could have common code set fault bits (since each of the four different boards I work with had a different name for its status structures). It was also to reduce the amount of lines in the code and make what they were doing more clear (“clean code”).
void setFault (uint8_t faultBit)
{
shortStatus.faults |= faultBit;
longStatus.faults |= faultBit;
}
During a round of optimizing last week, I noticed that the overhead of calling that function was larger than just doing it manually. I could switch back and save a few bytes every time it was used, but since I still wanted to maintain “clean code”, I decided to make a macro instead of the function. Now I can still do:
setFault (FAULT_BIT);
…but under the hood it’s really doing a macro instead:
#define setFault(faultBit) { shortStatus.faults |= faultbit; longStatus.faults |= faultBit; }
Now I get what I wanted (a “function”) but retain the code size savings of in-lining each instance.
I also thought that doing something like this might be smaller:
shortStatus.faults |= FAULT_BIT;
longStatus.faults = shortStatus.faults;
…but from looking at the PIC24 assembly code, that’s much larger. I did end up using it in large blocks of code that conditionally decided which fault bit to set, and then I sync the long status at the end. As long as the overhead of “this = that” is less than the overhead of multiple inline instructions it was worth doing.
And keep in mind, this is because I am 100% out of ROM. Saving 4 bytes here, and 20 bytes there means the difference between being able to build or not.
Formatting Output
One of the reasons for the “code bloat” was adding support for an LCD display. The panel, an LCD2004, hooks up to I2C vie a PCF8574 I2C I/O chip. I wrote just the routines needed for the minimal functionality required: Initialize, Clear Screen, Position Cursor, and Write String.
The full libraries (there are many) for Arduino are so large by comparison, so often it makes more sense to spend the time to “roll your own” than port what someone else has already done. (This also means I do not have to worry about any licensing restrictions for using open source code.)
I created a simple function like:
LCDWriteDataString (0, 0, "This is my message.");
The two numbers are the X and Y (or Column and Row) of where to display the text on the 20×4 LCD screen.
But, I was quickly reminded that the PIC architecture doesn’t support passing constant string data due to “reasons”. (Harvard architecture, for those who know.)
To make it work, you had to do something like:
const char *msg = "This is my message";
LCDWriteDataString (0, 0, msg);
…or…
chr buffer[19];
memcpy (buffer, "This is my message");
LCDWriteDataString (0, 0, msg);
…or, using the CCS compiler tools, add this to make the compiler take care of it for you:
#device PASS_STRINGS=IN_RAM
Initially I did that so I could get on with the task at had, but as I ran out of ROM space, I revisited this to see which approach was smaller.
From looking at the assembly generated by the CCS compiler, I could tell that “PASS_STRINGS=IN_RAM” generated quite a bit of extra code. Passing in a constant string pointer was much smaller.
So that’s what I did. And development continued…
Then I ran out of ROM yet again. Since I had some strings that needed formatted output, I was using sprintf(). I knew that sprintf() was large, so I thought I could create my own that only did what I needed:
char buffer[21];
sprintf (buffer, "CF:%02x C:%02x T:%02x V:%02x", faults, current, temp, volts);
LCDWriteDataString (0, 0, buffer);
char buffer[21];
sprintf (buffer, "Fwd: %u", watts);
LCDWriteDataString (0, 1, buffer);
In my particular example, all I was doing is printing out an 8-bit value as HEX, and printing out a 16-bit value as a decimal number. I did not need any of the other baggage sprintf() was bringing when I started using it.
I came out with these quick and dirty routines:
char GetHexDigit(uint8_t nibble)
{
char hexChar;
nibble = (nibble & 0x0f);
if (nibble <= 9)
{
hexChar = '0';
}
else
{
hexChar = 'A'-10;
}
return (hexChar + nibble);
}
char *Get8BitHexStringPtr (uint8_t value)
{
static char hexString[3];
hexString[0] = GetHexDigit(value >> 4);
hexString[1] = GetHexDigit(value & 0x0f);
hexString[2] = '\0'; // NIL terminate
return hexString;
}
The above routine maintains a static character buffer of 3 bytes. Two for the HEX digits, and the third for a NIL terminator (0). I chose to do it this way rather than having the user pass in a buffer pointer since the more parameters you pass, the larger the function call gets. The downside is those 3 bytes of variable storage are reserved forever, so if I was also out of RAM, I might rethink this approach.
I can now use it like this:
const char *msgC = " C:"; // used by strcat()
const char *msgT = " T:"; // used by strcat()
const char *msgV = " V:"; // used by strcat()
char buffer[20];
strcpy (buffer, "CF:"); // allows constants
strcat (buffer, Get8BitHexStringPtr(faults));
strcat (buffer, msgC);
strcat (buffer, Get8BitHexStringPtr(current));
strcat (buffer, msgT);
strcat (buffer, Get8BitHexStringPtr(temp));
strcat (buffer, msgV);
strcat (buffer, Get8BitHexStringPtr(volts));
LCDWriteDataString (0, 1, buffer);
If you are wondering why I do a strcpy() with a constant string, then use const pointers for strcat(), that is due to a limitation of the compiler I am using. Their implementation of strcpy() specifically supports string constants. Their implementation of strcat() does NOT, requiring me to jump through more hoops to make this work.
Even with all that extra code, it still ends up being smaller than linking in sprintf().
And, for printing out a 16-bit value in decimal, I am sure there is a clever way to do that, but this is what I did:
char *Get16BitDecStringPtr (uint16_t value)
{
static char decString[6];
uint16_t temp = 10000;
int pos = 0;
memset (decString, '0', sizeof(decString));
while (value > 0)
{
while (value >= temp)
{
decString[pos]++;
value = value - temp;
}
pos++;
temp = temp / 10;
}
decString[5] = '\0'; // NIL terminate
return decString;
}
Since I know the value is limited to what 16-bits can old, I know the max value possible is 65535.
I initialize my five-digit string with “00000”. I start with a temporary value of 10000. If the users value is larger than that, I decrement it by that amount and increase the first digit in the string (so “0” goes to “1”). I repeat until the user value has been decremented to be less than 10000.
Then I divide that temporary value by 10, so 10000 becomes 1000. I move my position to the next character in the output string and the process repeats.
Eventually I’ve subtracted all the 10000s, 1000s, 100s, 10s and 1s that I can, leaving me with a string of five digits (“00000” to “65535”).
I am sure there is a better way, and I am open to it if it generates SMALLER code. :)
And that’s my tale of today… I needed some extra ROM space, so I got rid of sprintf() and rolled my own routines for the two specific types of output I needed.
But this is barely scratching the surface of the things I’ve been doing this week to save a few bytes here or there. I’d like to revisit this subject in the future.
Until next time…