Category Archives: C Programming

Code as if someone’s life depended on it.

Message for the day…

A useful technique I use to reduce the number of bugs in my buggy programs is to code as if someone’s life depended on them. When “it doesn’t matter,” it’s easy to avoid adding all the error checking.

But once you start looking at code as if it was mission critical, you start to see things differently and begin adding “useless” error checking. As long as you can afford the extra few moments to add the “useless” error checking, and as long as you have code space to hold the “useless” error checking, and as long as you have enough CPU cycles to run the “useless” error checking, you end up with a program that should be better,

Share and enjoy.

Braces! Foiled again!

Here’s another quick C programming thing.

Consider the following code, which nest two for/next loops together. Notice the variable it uses for each one.

for (int i=0 ; i < 10; i++) // using i
{
    printf("%d\n", i);

    for (int i=0 ; i < 3 ; i++) // using i
    {
        printf ("   %d\n", i);
    }
}

What do you think it will print? Will this even work? Running it produces…

0
   0
   1
   2
1
   0
   1
   2
2
   0
   1
   2
3
   0
   1
   2
4
   0
   1
   2
5
   0
   1
   2
6
   0
   1
   2
7
   0
   1
   2
8
   0
   1
   2
9
   0
   1
   2

In C, you create a new instance (scope) of a variable “i” within the braces. The second for/next declares its own local i, and any references to i within those braces will be that second instance of i.

This is confusing, of course, but it works.

A “bug” could exist if the second scope forgot the “int”:

for (int i=0 ; i < 10; i++) // using i
{
    printf ("%d\n", i);

    for (i=0 ; i < 3 ; i++) // where's the int?
    {
        printf ("   %d\n", i);
    }
}

By forgetting that “int”, it re-uses the first instance of i, and does not do what was expected.

0
   0
   1
   2
4
   0
   1
   2
4
   0
   1
   2
4
   0
   1
   2
4
   0
   1
   2
4
...repeat...

This was based on something recently fixed in some production code at my day job.

So, yeah. Use different variable names and avoid this from ever happening.

Next, revisiting a recent post, also dealing with what braces do, I stumbled upon a much better reason to always put braces around things, even if it’s just one statement.

Suppose you had a simple DEBUG macro (this is not a good one, but pretend it is):

#define DEBUG(x) printf(x);

if (x==1) DEBUG("hello");

Looks fine! But if you decide to make the macro contain multiple statements, using the backslash character like this:

#define DEBUG(x) printf("ERROR!\n"); \ printf(X);

It now breaks. That macro changes the resulting code into something like this:

if (x==1) printf("ERROR!\n");
printf("hello");

Not what was expected. To solve this, the macro creator should have put the statements in braces like:

#define DEBUG(x) { printf("ERROR!\n"); \ printf(X); }

Problem solved. Assuming the macro creator did it that way.

So, to prevent things like this that look fine and work fine from doing neither, just get in the habbit of using braces even for one statement:

if (x==1) { DEBUG("hello"); }

Or just always do them in multiple lines:

if (x==1)
{
   DEBUG("hello");
}

…and you don’t ever have to worry about that again.

Until next time…

Function names and “Clean Code”, part 2

See also: Part 1

Oh, the frustrations of obvious function names that don’t actually help.

LED1_On();
LED1_Off();
...
LED9_On();
LED9_Off();

First, it’s quite obvious that LED1_On() probably turns LED #1 on, and LED1_Off() turns it off. Simple.

But what is LED #1?

For most of my programming career, I have coded this way. You can always tell folks what that LED is in the commands… But then, of course, you don’t always do that everywhere it is used, and then if it is ever changed, all the old comments lie. (Thank you, Clean Code book, for making me fear comments.)

Today, I treat this kind of like how the OS-9 operating system treated devices… There is a device description (the thing that you are trying to access by name) and a device driver (the thing that controls the thing you are actually trying to access). Under OS-9, a hard disk might have been called “/h0” or “/h1”. A disk drive might have been “/d0” or “/d1”. The device drivers may have been some cryptic name with a chip number in it, such as “fd507” or “rb1003”. But that didn’t matter, because the used didn’t have to know and didn’t have to care. “/d0” was probably the main disk drive on any OS-9 system with a disk drive, regardless of what the actual underlying hardware was.

So for coding these days, I tend to write “low level” driver type functions like LED1_On() and LED2_Off(), and then wrap them in some clean code like:

void StatusLEDOn()
{
   LED1_On();
}

void StatusLEDOff()
{
   LED1_Off();
}

That adds a “useless” layer of code, but it keeps things nice and separate. If I see “StatusLEDOn()” in code, I can assume it probably turns on whatever the Status LED is. And if I see that function, I can determine that the Status LED must be LED #1.

And a good optimizer will likely just optimize these “useless” functions away, anyway.

Thanks to that stupid book, now when I see “ReadAI(2)” or “ToggleDO(5)” I just want to dive in and write wrapper functions to make that code look like “GetCurrentTemperature()” or “ToggleStatusLight()”.

I really need to go back and update all my code on GitHub some day…

Function names and “Clean Code”

At a previous job, I was introduced to the concept of “clean code.” As an embedded programmer, where we are often trying to squeeze extra bytes out of already optimized code, many of the principals of clean code do not apply.

But, in general, I like what I have read so far in this recommended book on the subject:

https://amzn.to/2KCtFy1 (affiliate link)

With this in mind, I wanted to ask a question about function names (whether they be in C, Perl, Java, or whatever).

In the past, I would write my functions using verb/noun, like this:

InitUserlog();
InitDisplay();
InitNetworking();

But after my exposure to object oriented programming, I learned about how classes and methods are used:

Userlog.init();
Display.init();
Networking.init();

This approach makes it very clean which class a function (method?) belongs to.

I started doing my C code similarly, letting the noun of the code lead:

UserlogInit();
DisplayInit();
NetworkingInit();

With all the Userlog functions starting with Userlog, I think it makes it clearer where things come from, and helps avoid collisions with outside code (or just code from another engineer on the same team).

With clean code, code should read more like English. Thus, using “InitializeUserlog()” is probably cleaner than “UserlogInit()”. But since you can’t do that with object oriented languages, perhaps writing backwards (Userlog.init(), UserlogInit(), etc.) is accepted due to the limitation of the language.

Indeed, “Userlog.lookupName()” seems less clean than “LookupNameInUserlog()”.

Perhaps I shouldn’t be doing this in C, since I *can* write the functions out more like proper English.

What do you think? Comments appreciated.

C and the dangers of memcpy()

Updates:

  • 2019-08-12 – Added note about using unions.

In the C programming language, memcpy (memory copy) is a function used to copy a range of bytes from one location in memory to another. I have used it often. Today, my boss mentioned something about not liking memcpy() because of all the dangers of using it. I understand that incorrect addresses, bad calculations, and simple omissions from a copy/paste can cause a memcpy() to have bad results. But, what’s the alternative?

He mentioned that he prefers to use C structures to copy data around, as opposed to using memcpy() to copy blocks of data.

Instead of doing this…

char Buffer1[64];
char Buffer2[64];

// Load Buffer1 with something.
strncpy (Buffer1, "Copy this string over", 64);

// Copy Buffer1 into Buffer2.
memcpy (Buffer2, Buffer1, 64);

That’s a simple, straight-forward example that seems hard to mess up. But doing something like this would be impossible to mess up:

typedef struct
{
    char Buffer[64];
} CopyStruct64;

CopyStruct64 Buffer1;
CopyStruct64 Buffer2;

// Load Buffer1 structure with something.
strncpy (Buffer1.Buffer, "Copy this string over", 64);

// Copy Buffer1 structure into Buffer2 structure.
Buffer2 = Buffer1;

Interesting. With properly defined structures, copying a block of data from one place to the other is done using code generated by the compiler.

(Update) And it seems I misspoke. He was actually talking about using unions, not structures. I’ve used unions as unions, but never for something like this. I did a similar test. (I realize there are several ways to declare a union, but I am using typedef so it looks similar to the above structure example.)

typedef union
{
    char Buffer[64];
} CopyUnion64;

CopyUnion64 Buffer1;
CopyUnion64 Buffer2;

// Load Buffer1 structure with something.
strncpy (Buffer1.Buffer, "Copy this string over", 64);

// Copy Buffer1 structure into Buffer2 structure.
Buffer2 = Buffer1;

Using union appears to work the same way, and since my example was not using structure elements, it seems to make more sense to do it that way. (End of Update)

I’m just now wrapping my brain around this, trying to figure out how this approach could be used to avoid using memcpy() and pointers.

What do you think? Am I on the right track to what he was talking about?

Until next time…

Braces! Foiled again!

Just a quick rant, based on how example code from a new compiler I am using is presented.

In C (and similar languages), it is very common to see simple logic presented in one line, such as:

if (AlertLevel == RED) TurnOnSiren();

That is simple to understand, and only takes up one line on the screen or a printout. However, at a previous job, our coding standard forbid that, and always required the use of braces around any actions of an “if”.

if (AlertLevel == RED)
{
   TurnOnSiren();
}

Now, ignoring style preferences of where the braces should go, the use of braces for something like this has some advantages when it comes to preventing potential bugs if this code is changed in the future.

In the case of the compiler examples I have been seeing, they show them without braces, such as:

if (AlertLevel == RED)
   TurnOnSiren();

For simple and short logic, this is no different than my first example — it just places the “what to do” stuff on an indented line after the “if”.

However, a common bug I see occurs when someone expands upon that like this:

if (AlertLevel == RED)
   TurnOnSiren();
   FlashRedLight();

In some languages, where program flow is determined by indentation, this would work. But in C, indentation does not matter. The above code would actually be processed as:

if (AlertLevel == RED) TurnOnSiren();

FlashRedLight();

The red lights would flash every time through, regardless of the status of AlertLevel.

Unintended consequences of not using braces, and using indents to format each statement on its own line.

By always using braces (even on simple one-statement things like this example), you avoid that:

if (AlertLevel == RED)
{
   TurnOnSiren();
   FlashRedLight();
}

Now we have the desired result, as adding new code or function calls inside the braces will work as we intended.

It’s 2019. The C Programming Language came out 47 years ago, and I still occasionally run into bugs caused by the lack of braces.

Thanks to a former job, I no longer make this mistake. I always use braces.

This has been a public service announcement of Sub-Etha Software.

C musing of the day: signed ints

I ran across some code today that puzzled me. It was an infinite loop that used a counter to determine if things took too long. Something like this:

int main()
{
  int count;
  int status;

  count = 0;

  do
  {
    status = GetStatus();

    count++;

  } while( status == 0 );

  if (count < 0)
  {
    printf("Time out! count = %d\n", count);

    return EXIT_FAILURE;
  }

  printf("Done. count = %d\n", count);

  return EXIT_SUCCESS;
}

The code would read some hardware status (“GetStatus() in this example)  and drop out of the do/while loop once it had a value. Inside that loop, it increments a count variable. After done, it would check for “count < 0” and exit with a timeout if that were true.

Count is only incremented, so the only way count could ever be less than zero is if it incremented so high it rolled over. With a signed 8-bit value (int8_t), you count from 0 to 127, then it rolls over to -128 and counts back up to 0.

So with an “int” being a 32-bit value (on the system I am using), the rollover happens at 2147483647.

And that seems like quite a weird value.

But I suppose it it took that long, it certainly timed out.

I think if was going to do it, I would have probably used an unsigned integer, and just specified an appropriate value:

unsigned int count;

...

if (count > 10000)
{
  printf("Time out! count = %u\n", count);
  return EXIT_FAILURE;
}

What says you?

C warnings, %d versus %u and more C fun.

Code cleanup on aisle five…

I recently spent two days at work going through projects to clean up compiler warnings. In GNU C, you can enable options such as “-Wall” (all warnings), “-Wextra” (extra warnings) and “-Werror” (warnings as errors). By doing steps like these, the compiler will scream at you and fail to build code that has warnings in it.

Many of these warnings don’t impact how your code runs. They just ask you “are you sure this is what you are meaning to do?”

For example, if you leave out a “break” in a switch/case block, the compiler can warn you about that:

  x = 1;

  switch( x )
  {
  case 1:
    printf("x is one\n");
    // did I mean to not have a break here?

  case 2:
    printf("x is two\n");
    break;

  default:
    printf("I don't know what X is\n");
    break;
  }

This code would print:

x is one
x is two

…because without the “break” in the “case 1”, the code drops down to the following case. I found several places in our embedded TCP/IP stack where this was being done intentionally, and the author had left comments like “/* falls through below */” to let future observers know their intent. But, with warnings cranked up, it would no longer build for me, even though it was perfectly fine code working as designed.

I found there was a GCC thing you could do where you put in “//no break” as a comment and it removes that warning. I expect that are many more “yes, I really mean to do this” comments GCC supports, but I have not looked in to it yet.

Size (of int) matters

Another issue I would see would be warnings when you used the wrong specifier in a printf. Code might compile fine without warning on a PC, but generate all kinds of warnings on a different architecture where an “int” might be a different size. For example:

int answer = 42;
printf("The answer is %d\n", answer);

On my PC, “%d” can print an “int” type just fine. But, if I had used a “long” data type, it would error out:

long answer = 42;
printf("The answer is %d\n", answer);

This produces this warning/error:

error: format '%d' expects argument of type 'int', but argument 2 has type 'long int' [-Werror=format=]|

You need to use the “l” (long) specifier (“%ld”) to be correct:

long answer = 42;
printf("The answer is %ld\n", answer);

I found that code that compiled without warnings on the PC would not do the same on one of my embedded target devices.

%u versus %d: Fight!

Another warning I had to deal with was printf() and using “%d” versus “%u”. Most code I see always uses %d, which is for a signed value which can be positive or negative. It seems works just fine is you print an unsigned integer type:

unsigned int z;

z = 100;
printf("z is %d\n", z);

Even though the data type for z is unsigned, the value is positive so it prints out a positive number. After all, a signed value can be positive.

But, it is more correct to use “%u” when printing unsigned values. And, here is an example of why it is important to use the proper specifier… Consider this:

#include <limits.h> // for UINT_MAX

unsigned int x;

x = UINT_MAX; // largest unsigned int

printf("x using %%d is %d\n", x);
printf("x using %%u is %u\n", x);

This prints:

x using %d is -1
x using %u is 4294967295

In this case, %d is not giving you what you expect. For a 32-bit int (in this example), ULONG_MAX of 4294967295 is all bits set:

11111111 11111111 11111111 11111111

That represents a -1 if the value was a signed integer, and that’s what %d is told it is. Thus, while %d works fine for smaller values, any value large enough to set that end bit (that represents a negative value for a signed int) will produce incorrect results.

So, yeah, it will work if you *know* you are never printing values that large, but %u would still be the proper one to use when printing unsigned integers… And you won’t get that warning :)

C warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

Trick C question time … what will this print?

#include <stdio.h>
#include <stdlib.h>

int main()
{
  int x;
  unsigned int y;

  x = -1;
  y = 2;

  printf("x = %d\n", x);
  printf("y = %u\n", y);

  if ( x > y )
  {
    printf("x > y\n");
  }
  else if (x < y)
  {
    printf("x < y\n");
  }
  else
  {
    printf("x == y\n");
  }

  return EXIT_SUCCESS;
}

I recently began looking in to various compiler warnings in some code I am using, and I found quite a few warnings about comparing signed and unsigned values:

warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

I thought I could safely ignore these, since it seems plausible to compare a signed value with an unsigned value. A signed value of -42 should be less than an unsigned value of 42, right?

In the above example, it will print the following:

x = -1
y = 2
x > y

Nope. I was wrong. According to C, -1 is greater than 2.

C does something that I either never knew, or knew and have long since forgotten. I guess I generally try to write code that has no warnings at all, so I’ve avoided doing this. And now I know (or re-know) the reason why.

When dealing with mis-matched comparisons, C makes them both unsigned. Thus, “-1” becomes whatever -1 would be for that data type.

char  achar  = -1;
short ashort = -1;
int   aint   = -1;
long  along  = -1;

printf("char  -1 as unsigned: %u\n", (unsigned char)achar);
printf("short -1 as unsigned: %u\n", (unsigned short)ashort);
printf("int   -1 as unsigned: %u\n", (unsigned int)aint);
printf("long  -1 as unsigned: %u\n", (unsigned long)along);

This outputs:

char  -1 as unsigned: 255
short -1 as unsigned: 65535
int   -1 as unsigned: 4294967295
long  -1 as unsigned: 4294967295

Thus, on a PC, an 8-bit signed value of -1 is treated as a 255 when comparing against an unsigned value, and a 16-bit as 65535. It seems an int and long as both 32-bits on my system, but these could all be different on other architectures (on Arduino, and int is 16-bits, I believe).

So, without this warning enabled, any comparison that looks correct might be doing something quite wrong.

Warnings are our friends. Even if we hate them and want them to go away.