Category Archives: C Programming

IEEE-754 online tool thing

A coworker passed this site along to me.

I still do not understand how floating point works, but this sure does help visualize what I do not understand.

https://www.h-schmidt.net/FloatConverter/IEEE754.html

C and returning values quickly or safely. But not both.


WARNING: This article contains a C coding approach that many will find uncomfortable.

In my day job as a mild-mannered embedded C programmer, I am usually too busy maintaining what was created before me to be creating something new for others to maintain after me. There was that one time I had two weeks that were very different, and fun, since they were almost entirely spent “creating” versus “maintaining.”

Today’s quick C tidbit is about getting parameters back from a C function. In C, you only get one thing back — typically a variable type like an int or float or whatever:

int GetTheUltimateAnswer()
{
    return 42;
}

int answer = GetTheUltimateAnswer();
printf ("The Ultimate Answer is %d\n", answer);

If you need more than one thing returned, it is common to pass in variables by reference (the address of, or pointer to, the variable in memory) and have the function modify that memory to update the variables:

void GetMinAndMax (int *min, int *max)
{
    *min = 0;
    *max = 100;
}

int min, max;
GetMinAndMax (&min, &max);
printf ("Min is %d and Max is %d\n", min, max);

The moment pointers come in to play, things get very dangerous. But fast.

When passing values in, they get copied in to a new variable:

int variable = 42;

printf ("variable = %d\n", variable);
Function (variable);
printf ("variable = %d\n", variable);

void Function (int x)
{
    x = x + 1;
}

Try it: https://onlinegdb.com/WC3ihCAuj

Above, Function() gets a new variable (called “x” in this case) with the value of the variable that was passed in to the call. The function is like Las Vegas. Anything that happens to that variable inside the function stays inside the function – the variable disappears at the end of the function, while the original variable remains as-was.

C++ changes this, I have learned, so you can pass in variables that can be modified, but I am not a C++ programmer so this post is only about old-skool C.

Pointing to a variable’s memory

By passing in the address of a variable, the function can go to that memory and modify the variable. It will be changed:

int variable = 42;

printf ("variable = %d\n", variable);
Function (&variable);
printf ("variable = %d\n", variable);

void Function (int *x)
{
    *x = *x + 1;
}

Try it: https://onlinegdb.com/Y2Z9WUvFG

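Passing by value can be slower, since a new variable has to be created and the value copied in to it. For a single int the difference is negligible (the address has to be passed, too), but for something large, like a structure, passing by reference just passes an address and the code uses that address, so no copy of the data is made.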

But, using a reference just for speed is dangerous because the function can modify the variable even if you didn’t want it to. Consider passing in a string buffer, which is a pointer to a series of character bytes:

void PrintError (char *message)
{
    printf ("ERROR: %s\n", message);
}

PrintError ("Human Detected");

We do this all the time, but since PrintError() has access to the memory passed in, it could try to modify it. If we passed in a constant string like “Human Detected”, that string would typically be in program memory (though this is not true for Harvard Architecture systems like the PIC and Arduino). At best, an operating system with memory protection would trap that access with an exception and kill the program. At worst, the program would self-modify (which was the case when I learned this on OS-9/6809 back in the late 80s — no memory protection on my TRS-80 CoCo!).

void PrintError (char *message)
{
    message[0] = 42;
}

PrintError ("Human Detected");

Above would likely crash, though if the user had passed in the buffer holding a string, it would just be modified:

void PrintError (char *message)
{
    message[0] = 42;
}

char buffer[80];
strncpy (buffer, "Hello, world!", 80);
printf ("buffer: %s\n", buffer);
PrintError (buffer);
printf ("buffer: %s\n", buffer);

Try it: https://onlinegdb.com/L50JRWYj

And your point is?

My point is — there are certainly times when speed is the most important thing, and it outweighs the potential problems/crashes that could be caused by a bug with code using the pointer. Take for example anything that passes in a buffer:

void UppercaseString (char *buffer)
{
    for (int idx=0; idx<strlen(buffer); idx++)
    {
        buffer[idx] = toupper(buffer[idx]);
    }
}

There are many bad things that could happen here. By using “strlen”, the buffer MUST be a string that has a NIL (‘\0’) byte at the end. This routine could end up trampling through memory uppercasing bytes that are beyond the caller’s string.

It is wise to always add another parameter that is the max size of the buffer:

void UppercaseString (char *buffer, int bufferSize)
{
    for (int idx=0; idx<bufferSize; idx++)
    {
        buffer[idx] = toupper(buffer[idx]);
    }
}

That helps. But it is still up to the compiler to catch the wrong type of pointer being passed in.

int Number = 10;

UppercaseString (&Number, 100);

The compiler should not let you do that, but some may just issue a warning and build it anyway. (This is why I always try to have NO warnings in my code. The more warnings there are, the more likely you will start ignoring them.)

Try #1: Passing by Reference

Suppose we have a function that returns the date and time as individual values (year, month, day, hour, minute and second). Since we cannot get six values back from a function, we first try passing in six variables by reference and having the routine modify them:

void GetDateTime1 (int *year, int *month, int *day,
                   int *hour, int *minute, int *second)
{
    *year = 2023;
    *month = 8;
    *day = 19;
    *hour = 4;
    *minute = 20;
    *second = 0;
}

int year, month, day, hour, minute, second;
GetDateTime1 (&year, &month, &day, &hour, &minute, &second);
printf ("GetDateTime1: %d/%d/%d %02d:%02d:%02d\n",
        year, month, day, hour, minute, second);

That works fine … as long as you know the parameters are “ints” (whatever that is) and not shorts or longs or any other numeric type. This, for example, would be bad:

short year, month, day, hour, minute, second;

GetDateTime1 (&year, &month, &day, &hour, &minute, &second);

Above, we are passing in a short (let’s say that is a 16-bit variable on this system) in to a function that expects an int (let’s say that is a 32-bit signed variable on this system). The function would try to place 32-bits of information at the address of a 16-bit value.

Bad things, as they say, can happen.

Try #2: Passing a structure by reference

Passing in six variable pointers is more work than passing in one, so if we put the values in a structure we could pass in just the pointer to that structure. This has the benefit of making sure the structure is only loaded with values it can handle (unlike passing in an address of something that might be 8, 16, 32 or 64 bits).

typedef struct
{
    int year;
    int month;
    int day;
    int hour;
    int minute;
    int second;
} TimeStruct;

void GetDateTime2 (TimeStruct *timePtr)
{
    timePtr->year = 2023;
    timePtr->month = 8;
    timePtr->day = 19;
    timePtr->hour = 4;
    timePtr->minute = 20;
    timePtr->second = 0;   
}

TimeStruct time;
GetDateTime2 (&time);
printf ("GetDateTime2: %d/%d/%d %02d:%02d:%02d\n",
        time.year, time.month, time.day,
        time.hour, time.minute, time.second);

This should greatly reduce the potential problems since you only have one pointer to screw up, and if you get the type correct (a TimeStruct) the values it contains should be fine, since the compiler checks each assignment to a member. If a member were a “uint8_t” and the code tried to set it to “65535”, you would hopefully get a warning (and only 8 bits of that 16-bit value would be stored, a “loss of precision”).

Try #3: Returning the address of a static

An approach various standard C library functions take is having some fixed memory allocated inside the function as a static variable, and then returning a pointer to that memory. The user doesn’t make it and therefore isn’t passing in a pointer that could be wrong.

TimeStruct *GetDateTime3 (void)
{
    static TimeStruct s_time;
    
    s_time.year = 2023;
    s_time.month = 8;
    s_time.day = 19;
    s_time.hour = 4;
    s_time.minute = 20;
    s_time.second = 0;

    return &s_time;
}

TimeStruct *timePtr;
timePtr = GetDateTime3 ();  
printf ("GetDateTime3: %d/%d/%d %02d:%02d:%02d\n",
       timePtr->year, timePtr->month, timePtr->day,
       timePtr->hour, timePtr->minute, timePtr->second);

This approach is better, since it gets the speed from using a pointer, and the safety of not being able to get the pointer wrong since the function tells you where it is, not the other way around.

BUT … once you have the address of that static memory, you can modify it.

TimeStruct *timePtr;
timePtr = GetDateTime3 ();
timePtr->year = 1969;

In a real Date/Time function (like the one in the C library), those variables are populated with the system time when you call the function, so even if the user changed something like this, it would be set back to what it was the next time it was called. But, I can see where there could be issues with other types of functions that just hold on to memory like this.

Plus, it’s always holding on to that memory whether anyone is using it or not. That is a no-no when working on memory constrained systems like an Arduino with 4K of RAM.

Try #4: Returning a copy of a structure

And now the point of today’s ramblings… I rarely have used this, since it’s probably the slowest way to do things, but … you don’t just have to return a data type like an int or a bool or a pointer. You can return a structure, and C will give the caller a copy of the structure.

TimeStruct GetDateTime4 (void)
{
    TimeStruct time;
    
    time.year = 2023;
    time.month = 8;
    time.day = 19;
    time.hour = 4;
    time.minute = 20;
    time.second = 0;

    return time;
}

TimeStruct time;
time = GetDateTime4 ();    
printf ("GetDateTime4: %d/%d/%d %02d:%02d:%02d\n",
       time.year, time.month, time.day,
       time.hour, time.minute, time.second);

Above is possibly the safest way to return data, since no pointers are used. The caller makes a new structure variable, then the function creates its own structure variable, and the return copies that structure in to the caller’s structure.

Try it: https://onlinegdb.com/F6rR1V-xb

This is slower, and consumes more memory during the process of making all these copies, BUT it’s far, far safer. Even ChatGPT agrees that, if going for “safe” code, this is the better approach.

And, at my day job, I experimented with this and it’s been working very well. It’s about the closest thing C has to “objects”. I even use it for a BufferStruct so I can pass a buffer around without using a pointer (though internally there is a pointer to the buffer memory). It looks something like this:

#include <stdio.h>
#include <string.h>

typedef struct
{
    char buffer[80];
    char bufferSize;
} BufferStruct;

BufferStruct GetBuffer ()
{
    BufferStruct buf;
    
    strncpy (buf.buffer, "Hello, world!", sizeof(buf.buffer));
    buf.bufferSize = strlen(buf.buffer);
    
    return buf;
}

void ShowBuffer (BufferStruct buf)
{
    printf ("Buffer: %s\n", buf.buffer);
    printf ("Size  : %d\n", buf.bufferSize);
}

int main()
{
    BufferStruct myBuffer;
    myBuffer = GetBuffer ();
    ShowBuffer (myBuffer);

    BufferStruct testBuffer;
    strncpy (testBuffer.buffer, "I put this in here",
             sizeof(testBuffer.buffer));
    testBuffer.bufferSize = strlen (testBuffer.buffer);
    ShowBuffer (testBuffer);
    
    return 0;
}

The extra overhead may be a problem if you are coding for speed, but doing this trick (while trying not to think about all the extra work and copying the code is doing) gives you a simple way to pass things around without ever using a pointer. You could even do this:

typedef struct
{
    int year;
    int month;
    int day;
    int hour;
    int minute;
    int second;
} TimeStruct;

// Global time values.
int g_year, g_month, g_day, g_hour, g_minute, g_second;

void SetTime (TimeStruct time)
{
    // Pretend we are setting the clock.
    g_year = time.year;
    g_month = time.month;
    g_day = time.day;
    g_hour = time.hour;
    g_minute = time.minute;
    g_second = time.second;
}

TimeStruct GetTime ()
{
    TimeStruct time;

    // Pretend we are reading the clock.
    time.year = g_year;
    time.month = g_month;
    time.day = g_day;
    time.hour = g_hour;
    time.minute = g_minute;
    time.second = g_second;

    return time;
}

TimeStruct time;

time.year = 2023;
time.month = 8;
time.day = 19;
time.hour = 12;
time.minute = 4;
time.second = 20;
SetTime (time);

...

time = GetTime ();

And now a certain percentage of C programmers who stumble in to this article should be having night terrors at what is going on here.

Until next time…

TIL: You can build C in Microsoft Visual Studio

My first C program for CoCo DISK BASIC.


On this day in history … I built the CMOC compiler and compiled my first C program for non-OS-9 CoCo.

I created this source file:

int main()
{
	char *ptr = 1024;
	while (ptr < 1536) *ptr++ = 128;
	return 0;
}

I compiled it using “cmoc hello.c” and it produces “hello.bin”.

I created a new blank disk image using “decb dskini C.DSK”.

I copied the binary to that disk image using “decb copy hello.bin C.DSK,HELLO.BIN -2”

I booted up the XRoar emulator and mounted that disk image as the first drive.

I did LOADM”HELLO” and then EXEC.

And so it begins…

Reversing bits in C

Checksums and zeros and XMODEM and randomness.


A year or two ago, I ran across some C code at my day job that finally got me to do an experiment…

When I was first using a modem to dial in to BBSes, it was strictly a text-only interface. No pictures. No downloads. Just messages. (Heck a physical bulletin board at least would let you put pictures on it! Maybe whoever came up with the term BBS was just forward thinking?)

The first program I ever had that sent a program over the modem was DFT (direct file transfer). It was magic.

Later, I got one that used a protocol known as XMODEM. It seemed like warp speed compared to DFT!

XMODEM would send a series of bytes, followed by a checksum of those bytes, then the other end would calculate a checksum over the received bytes and compare. If they matched, it went on to the next series of bytes… If it did not, it would resend those bytes.

Very simple. And, believe it or not, checksums are still being used by modern programmers today, even though newer methods have been created (such as CRC).

Checking the sum…

A checksum is simply the value you get when you add up all the bytes of some data. Checksum values are normally not floating point, so they will be limited to a fixed range. For example, an 8-bit checksum (using one byte) can hold a value of 0 to 255. A 16-bit checksum (2 bytes) can hold a value of 0-65535. Since the sum of the bytes can be much higher than that, especially if using an 8-bit checksum, the value just rolls over.

For example, if the current checksum calculated value is 250 for an 8-bit checksum, and the next byte being counted is a 10, the checksum would be 250+10, but that exceeds what a byte can hold. The value just rolls over, like this:

250 + 10: 251, 252, 253, 254, 255, 0, 1, 2, 3, 4

Thus, the checksum after adding that 10 is now 4.

Here is a simple 8-bit checksum routine for strings in Color BASIC:

0 REM CHKSUM8.BAS
10 INPUT "STRING";A$
20 GOSUB 100
30 PRINT "CHECKSUM IS";CK
40 GOTO 10

100 REM 8-BIT CHECKSUM ON A$
110 CK=0
120 FOR A=1 TO LEN(A$)
130 CK=CK+ASC(MID$(A$,A,1))
140 IF CK>255 THEN CK=CK-256
150 NEXT
160 RETURN

Line 140 is what handles the rollover. If we had a checksum of 250 and the next byte was a 10, it would be 260. That line would detect it, and subtract 256, making it 4. (A byte has 256 possible values, since the value starts at 0.)

The goal of a checksum is to verify data and make sure it hasn’t been corrupted. You send the data and checksum. The receiver passes the data through a checksum routine, then compares what it calculated with the checksum that was sent with the message. If they do not match, the data has something wrong with it. If they do match, the data is less likely to have something wrong with it.

Double checking the sum.

One of the problems with just adding (summing) up the data bytes is that two swapped bytes would still create the same checksum. For example “HELLO” would have the same checksum as “HLLEO”. Same bytes. Same values added. Same checksum.

However, if one byte got changed, the checksum would catch that.

It would be quite a coincidence if two data bytes got swapped during transfer, but I still wouldn’t use a checksum on anything where lives were at stake if it processed a bad message because the checksum didn’t catch it ;-)

Another problem is that if the value rolls over, that means a long message or a short message could cause the same checksum. In the case of an 8-bit checksum, and data bytes that range from 0-255, you could have a 255 byte followed by a 1 byte and that would roll over to 0. A checksum of no data would also be 0. Not good.

Checking the sum: Extreme edition

A 16-bit or 32-bit checksum would just be a larger number, reducing how often it could roll over.

For a 16-bit value, ranging from 0-65535, you could hold up to 257 bytes of value 255 before it would roll over:

255 * 257 = 65535

But if the data were 258 bytes of value 255, it would roll over:

255 * 258 = 65790 -> rollover to 254.

Thus, a 258-byte message of all 255s would have the same checksum as a 1-byte message of a 254.

To update the Color BASIC program for 16-bit checksum, change line 140 to be:

140 IF CK>65535 THEN CK=CK-65536

Conclusion

Obviously, an 8-bit checksum is rather useless, but if a checksum is all you can do, at least use a 16-bit checksum. If you were using the checksum for data packets larger than 257 bytes, maybe a 48-bit checksum would be better.

Or just use a CRC. They are much better and catch things like bytes being out of order.

But I have no idea how I’d write one in BASIC.

One more thing…

I almost forgot what prompted me to write this. I found some code that would flag an error if the checksum value was 0. When I first saw that, I thought “but 0 can be a valid checksum!”

For example, if there were enough data bytes to cause the value to roll over from 65535 to 0, that would be a valid checksum. To avoid any large data causing the value to add up to 0 and be flagged bad, I added a small check for the 16-bit checksum validation code:

if ((checksum == 0) && (datasize < 258)) // Don't bother doing this.
{
    // checksum appears invalid.
}
else if (checksum != dataChecksum)
{
    // checksum did not match.
}
else
{
    // guess it must be okay, then! Maybe...
}

But, what about a buffer full of 00s? The checksum would also be zero, which would be valid.

Conclusion: Don’t error check for a 0 checksum.

Better yet, use something better than a checksum…

Until next time…

When there’s not enough room for sprintf…


Updates:

2022-08-30 – Corrected a major bug in the Get8BitHexStringPtr() routine.

“Here we go again…”

Last week I ran out of ROM space in a work project. For each code addition, I have to do some size optimization elsewhere in the program. Some things I tried actually made the program larger. For example, we have some status bits that get set in two different structures. The code will do it like this:

shortStatus.faults |= FAULT_BIT;
longStatus.faults |= FAULT_BIT;

We have code like that in dozens of places. One of the things I had done earlier was to change that in to a function. This was primarily so I could have common code set fault bits (since each of the four different boards I work with had a different name for its status structures). It was also to reduce the number of lines in the code and make what they were doing clearer (“clean code”).

void setFault (uint8_t faultBit)
{
    shortStatus.faults |= faultBit;
    longStatus.faults |= faultBit;
}

During a round of optimizing last week, I noticed that the overhead of calling that function was larger than just doing it manually. I could switch back and save a few bytes every time it was used, but since I still wanted to maintain “clean code”, I decided to make a macro instead of the function. Now I can still do:

setFault (FAULT_BIT);

…but under the hood it’s really doing a macro instead:

#define setFault(faultBit) { shortStatus.faults |= faultBit; longStatus.faults |= faultBit; }

Now I get what I wanted (a “function”) but retain the code size savings of in-lining each instance.

I also thought that doing something like this might be smaller:

shortStatus.faults |= FAULT_BIT;
longStatus.faults = shortStatus.faults;

…but from looking at the PIC24 assembly code, that’s much larger. I did end up using it in large blocks of code that conditionally decided which fault bit to set, and then I sync the long status at the end. As long as the overhead of “this = that” is less than the overhead of multiple inline instructions it was worth doing.

And keep in mind, this is because I am 100% out of ROM. Saving 4 bytes here, and 20 bytes there means the difference between being able to build or not.

Formatting Output

One of the reasons for the “code bloat” was adding support for an LCD display. The panel, an LCD2004, hooks up to I2C via a PCF8574 I2C I/O chip. I wrote just the routines needed for the minimal functionality required: Initialize, Clear Screen, Position Cursor, and Write String.

The full libraries (there are many) for Arduino are so large by comparison, so often it makes more sense to spend the time to “roll your own” than port what someone else has already done. (This also means I do not have to worry about any licensing restrictions for using open source code.)

I created a simple function like:

LCDWriteDataString (0, 0, "This is my message.");

The two numbers are the X and Y (or Column and Row) of where to display the text on the 20×4 LCD screen.

But, I was quickly reminded that the PIC architecture doesn’t support passing constant string data due to “reasons”. (Harvard architecture, for those who know.)

To make it work, you had to do something like:

const char *msg = "This is my message";
LCDWriteDataString (0, 0, msg);

…or…

char buffer[19]; // 18 characters plus the NIL terminator
strcpy (buffer, "This is my message");
LCDWriteDataString (0, 0, buffer);

…or, using the CCS compiler tools, add this to make the compiler take care of it for you:

#device PASS_STRINGS=IN_RAM

Initially I did that so I could get on with the task at hand, but as I ran out of ROM space, I revisited this to see which approach was smaller.

From looking at the assembly generated by the CCS compiler, I could tell that “PASS_STRINGS=IN_RAM” generated quite a bit of extra code. Passing in a constant string pointer was much smaller.

So that’s what I did. And development continued…

Then I ran out of ROM yet again. Since I had some strings that needed formatted output, I was using sprintf(). I knew that sprintf() was large, so I thought I could create my own that only did what I needed:

char buffer[21];
sprintf (buffer, "CF:%02x C:%02x T:%02x V:%02x", faults, current, temp, volts);
LCDWriteDataString (0, 0, buffer);

char buffer[21];
sprintf (buffer, "Fwd: %u", watts);
LCDWriteDataString (0, 1, buffer);

In my particular example, all I was doing is printing out an 8-bit value as HEX, and printing out a 16-bit value as a decimal number. I did not need any of the other baggage sprintf() was bringing when I started using it.

I came out with these quick and dirty routines:

char GetHexDigit(uint8_t nibble)
{
  char hexChar;

  nibble = (nibble & 0x0f);

  if (nibble <= 9)
  {
    hexChar = '0';
  }
  else
  {
    hexChar = 'A'-10;
  }

  return (hexChar + nibble);
}

char *Get8BitHexStringPtr (uint8_t value)
{
    static char hexString[3];

    hexString[0] = GetHexDigit(value >> 4);
    hexString[1] = GetHexDigit(value & 0x0f);
    hexString[2] = '\0'; // NIL terminate

    return hexString;
}

The above routine maintains a static character buffer of 3 bytes. Two for the HEX digits, and the third for a NIL terminator (0). I chose to do it this way rather than having the user pass in a buffer pointer since the more parameters you pass, the larger the function call gets. The downside is those 3 bytes of variable storage are reserved forever, so if I was also out of RAM, I might rethink this approach.

I can now use it like this:

const char *msgC = " C:"; // used by strcat()
const char *msgT = " T:"; // used by strcat()
const char *msgV = " V:"; // used by strcat()

char buffer[21]; // 20 characters plus the NIL terminator

strcpy (buffer, "CF:"); // allows constants
strcat (buffer, Get8BitHexStringPtr(faults));
strcat (buffer, msgC);
strcat (buffer, Get8BitHexStringPtr(current));
strcat (buffer, msgT);
strcat (buffer, Get8BitHexStringPtr(temp));
strcat (buffer, msgV);
strcat (buffer, Get8BitHexStringPtr(volts));

LCDWriteDataString (0, 1, buffer);

If you are wondering why I do a strcpy() with a constant string, then use const pointers for strcat(), that is due to a limitation of the compiler I am using. Their implementation of strcpy() specifically supports string constants. Their implementation of strcat() does NOT, requiring me to jump through more hoops to make this work.

Even with all that extra code, it still ends up being smaller than linking in sprintf().

And, for printing out a 16-bit value in decimal, I am sure there is a clever way to do that, but this is what I did:

char *Get16BitDecStringPtr (uint16_t value)
{
    static char decString[6];
    uint16_t temp = 10000;
    int pos = 0;

    memset (decString, '0', sizeof(decString));

    while (value > 0)
    {
        while (value >= temp)
        {
            decString[pos]++;
            value = value - temp;
        }

        pos++;
        temp = temp / 10;
    }

    decString[5] = '\0'; // NIL terminate

    return decString;
}

Since I know the value is limited to what 16-bits can hold, I know the max value possible is 65535.

I initialize my five-digit string with “00000”. I start with a temporary value of 10000. If the user’s value is greater than or equal to that, I decrement it by that amount and increase the first digit in the string (so “0” goes to “1”). I repeat until the user value has been decremented to be less than 10000.

Then I divide that temporary value by 10, so 10000 becomes 1000. I move my position to the next character in the output string and the process repeats.

Eventually I’ve subtracted all the 10000s, 1000s, 100s, 10s and 1s that I can, leaving me with a string of five digits (“00000” to “65535”).

I am sure there is a better way, and I am open to it if it generates SMALLER code. :)

And that’s my tale of today… I needed some extra ROM space, so I got rid of sprintf() and rolled my own routines for the two specific types of output I needed.

But this is barely scratching the surface of the things I’ve been doing this week to save a few bytes here or there. I’d like to revisit this subject in the future.

Until next time…

C and size_t and 64-bits


Do what I say and nobody gets hurt!

At my day job, I work on a Windows application that supervises high power solid state microwave generators. (The actual controlling part is done by multiple embedded PIC24-based boards, which is a good thing considering the issues Windows has given us over the years.)

At some point, we switched from building a 32-bit version of this application to a 64-bit version. The compiler started complaining about various things dealing with “ints” which were not 64-bits, so the engineer put in #ifdef code like this:

#ifdef _NI_mswin64_
    unsigned __int64 length = 0;
#else
    unsigned int length = 0;
#endif

That took care of the warnings since it would now use either a native “int” or a 64-bit int, depending on the target.

Today I ran across this and wondered why C wasn’t just taking care of things. Code that calls a C library routine returning an “int” should always expect an int, whether that int is 16-bits (like on Arduino), 32-bits or 64-bits on the system. I decided to look in to this, and saw the culprits were things like this:

length = strlen (pathName);

Surely if strlen() returned an int, it should not need to be changed to an “unsigned __int64” to work.

And indeed, C already does take care of this, if you do what it tells you to do. strlen does NOT return an int:

size_t strlen ( const char * str );

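size_t is a special C data type that is “whatever type of number it needs to be.” More precisely, it is the unsigned integer type that sizeof produces, big enough to hold the size of any object on the target, so it is typically 32 bits in a 32-bit build and 64 bits in a 64-bit build. And by simply changing all the #ifdef’d code to actually use the data type the C library call specifies, all the errors go away and the #ifdefs can be removed.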

size_t length = 0;

A better, stricter compiler might have complained about using an “int” to catch something coming back as “size_t.”

Oh wait. It did. We just chose to solve it a different way.

Until next time…

Redundant C variable initialization for redundant reasons.


The top article on this site for the past 5 or so years has been a simple C tidbit about splitting 16-bit values in to 8-bit values. Because of this, I continue to drop small C things here in case they might help when someone stumbles upon them.

Today, I’ll mention some redundant, useless code I always try to add…

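I seem to recall that older specifications for the C programming language did not guarantee variables would be initialized to 0. I am not even sure if the current specification defines this, since one of the compilers I use at work has a specific proprietary override to enable this behavior. (For the record, the C standard does zero-initialize variables with static storage duration, such as globals and statics, but the value of an uninitialized local variable is indeterminate.)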

You might find that this code prints non-zero on certain systems:

int i;

printf ("i = %d\n", i);

Likewise, trying to print a buffer that has not been initialized might produce non-empty data:

char message[32];

printf ("Message: '%s'\n", message);

Because of this, it’s a good habit to always initialize variables with at least something:

int i=0;

char message[42];
...
memset (message, 0x0, sizeof(message));

Likewise, when setting variables in code, it is also a good idea to always set an expected result and NOT rely on any previous initialization. For example:

int result = -1;

if (something == 1)
{
    result = 10;
}
else if (something == 2)
{
    result = 42;
}
else
{
    result = -1;
}

Above, you can clearly see that in the case none of the something values are met, it defaults to setting “result” to the same value it was just initialized to.

This is just redundant, wasteful code.

And you should always do it, unless you absolutely positively need those extra bytes of code space.

It is quite possible that at some point this code could be copy/pasted elsewhere, without the initialization. On first compile, the coder sees the undeclared “result” and just adds “int result;” at the top of the function. If the final else with “result = -1;” wasn’t there, the results could be unexpected.

The reverse of this is also true. If you know you are coding so you ALWAYS return a value and never rely on initialized defaults, it would be safe to just do “int result;” at the top of this code. But, many modern compilers will warn you of “possibly uninitialized variables.”

Because of this, I always try to initialize any variable (sometimes to a value I know it won’t ever use, to aid in debugging — “why did I suddenly get 42 back from this function? Oh, my code must not be running…”).

And I always try to have a redundant default “else” or whatever to set it, instead of relying on “always try.”

Maybe two “always tries” make a “did”?

Until next time…

while and if and braces … oh my.

Curly braces! Foiled again.

In C, it is common to see code formatted using whitespace like this:

if (a == 1)
    printf("One!\n");

That is fine, since it is really just doing this:

if (a == 1) printf("One!\n");

…but is considered poor coding style these days because many programmers seem to be used to languages where indentation actually means something, as opposed to C, where whitespace is whitespace. Thus, you frequently find bugs where someone has added more code like this:

if (a == 1)
    printf("One!\n");
    DoSomething ();
    printf("Done.\n");

Above, it feels like it should execute three things any time a is 1, but to C, it really looks like this:

if (a == 1) printf("One!\n");
DoSomething ();
printf("Done.\n");

Thus, modern coding standards often say to always use curly braces even if there is just one thing after the if:

if (a == 1)
{
    printf("One!\n");
}

With the braces in place, adding more statements within the braces would work as expected:

if (a == 1)
{
    printf("One!\n");
    DoSomething ();
    printf("Done.\n");
}

This is something that was drilled in to my brain at a position I had many years ago, and it makes great sense. And, the same thing should be said about using while. But while has its own quirks. Consider these two examples:

// This way:
while (1);

// That way:
while (1)
{
}

They do the same thing. One uses a semicolon to mark the end of the stuff to do, and the other uses curly braces around the stuff to do. That’s the key to the code at the start of this post:

while (timeToDoSomething == true)
if (timeToDoSomething == true)
{
    timeToDoSomething = false;

    // Do something.
}

Just like you could do…

while (timeToDoSomething == true) printf("I am doing something");

…you could also write it as…

while (timeToDoSomething == true)
{
    printf("I am doing something");
}

So when the “if” got added after the “while”, it was legit code, as if the user was trying to do this:

while (timeToDoSomething == true)
{
    if (timeToDoSomething == true)
    {
        timeToDoSomething = false;

        // Do something.
    }
}

Since while can be followed by braces or a statement, it can also be followed by a statement using just braces.

The compiler can’t easily warn about needing a brace, since it is not required to have braces. But if braces were required, that would catch the issues mentioned here with if and while blocks.

Code that looks like it should at least generate a warning is completely valid and legal C code, and that same code can be formatted in a way that makes it clear(er):

while (timeToDoSomething == true)
    if (timeToDoSomething == true)
    {
        timeToDoSomething = false;

        // Do something.
    }

Whitespace makes things look pretty, but lack of it can also make things look wrong. Or correct when they aren’t.

I suppose the soapbox message of today is just to use braces. That wouldn’t have caught this particular typo (forgetting to comment something out), but it’s probably still good practice…

Until next time…

Sub-Etha Software

"In Support of the CoCo and OS-9 since 1990!"
