C musing of the day: i++ or ++i?

Here’s another short side-musing about C…

At my previous job, I did embedded programming on TI MSP430, PowerPC and Renesas processors. We were in the process of moving to a new system based on ARM architecture. It was during this time that I went from being part of a tiny local team to being part of a much larger group spread out around the world.

But I digress.

On this team were some great minds, and one of them produced a document listing a bunch of optimizations we could do in our code to reduce size and/or improve speed. I wish I had a copy of it as it would be fun to discuss the suggestions here and get some feedback. But for today, I’ll bring up one that was just brought up by a coworker at my current job.

var++ versus ++var

I am very used to writing (and seeing) variable increments written like this:

i++;

It is called a post-increment, I believe. There is another version called pre-increment that looks like this:

++i;

By themselves, the result appears the same. For example:

int i;

i = 0;
printf("i = %d\n", i);
i++;
printf("i = %d\n", i);

That would print 0, then i++ would increment it, then it would print 1.

int i;

i = 0;
printf("i = %d\n", i);
++i;
printf("i = %d\n", i);

That would print 0, then ++i would increment it, then it would print 1.

The reason there is pre and post is for situations where you want to check something at the same time you increment it. If you use “i++” in a check…

i = 0;
if (i++ == 1) printf("This will NOT be printed.\n");

…it checks the value of “i”, then increments it after the check (post-increment, see?). But if you go the other way:

i = 0;
if (++i == 1) printf("This WILL be printed.\n");

…it increments “i” first, then checks the new value of i.
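Here is a minimal sketch of those two checks wrapped in functions so the results can be compared side by side. The function names are made up for this illustration. Each one starts “i” at 0 and returns the result of the comparison (1 for true, 0 for false):

```c
/* Hypothetical helpers, not from any real project. */

int PostIncrementCheck(void)
{
  int i = 0;

  return (i++ == 1); /* compares the OLD value (0), then i becomes 1 */
}

int PreIncrementCheck(void)
{
  int i = 0;

  return (++i == 1); /* i becomes 1 first, then the NEW value is compared */
}
```

PostIncrementCheck() returns 0 and PreIncrementCheck() returns 1, matching the two “if” examples above.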

Behind the scenes, “i++” has to do more work since it has to retain the original value of i for the compare. I think a post-increment might look something like this:

i = 0;    // SET i TO 0
if (
    i++   // SAVE VALUE OF i AS TEMP_i
          // INCREMENT i
    == 1) // COMPARE TEMP_i TO 1

And a pre-increment might look like this:

i = 0;    // SET i TO 0
if (
    ++i   // INCREMENT i
    == 1) // COMPARE i TO 1

If all you want is JUST to increment (or decrement) a variable, it might make sense to always use the pre-increment version (“++i”) so the extra “save this value for later” code is not produced.

BUT, a good compiler optimizer should be able to see that nothing cares about that temporary value and just discard it rather than building it in to your binary.

With a good optimizer, it shouldn’t make any difference. But perhaps it’s better to write it like you mean it since “if (i++ …” means something different than “if (++i …”.

But geez, after twenty years of writing “i++” it sure would be hard to switch.

Does it matter? What if I am writing for an old pre-ANSI K&R compiler?

Comments appreciated.

C musing of the day: signed ints

I ran across some code today that puzzled me. It was an infinite loop that used a counter to determine if things took too long. Something like this:

int main()
{
  int count;
  int status;

  count = 0;

  do
  {
    status = GetStatus();

    count++;

  } while( status == 0 );

  if (count < 0)
  {
    printf("Time out! count = %d\n", count);

    return EXIT_FAILURE;
  }

  printf("Done. count = %d\n", count);

  return EXIT_SUCCESS;
}

The code would read some hardware status (“GetStatus()” in this example) and drop out of the do/while loop once it had a value. Inside that loop, it increments a count variable. After the loop was done, it would check for “count < 0” and exit with a timeout if that were true.

Count is only incremented, so the only way count could ever be less than zero is if it incremented so high it rolled over. With a signed 8-bit value (int8_t), you count from 0 to 127, then it rolls over to -128 and counts back up to 0.

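So with an “int” being a 32-bit value (on the system I am using), the rollover happens at 2147483647. Strictly speaking, overflowing a signed int is undefined behavior in C, so the compiler is not required to wrap around to a negative number at all, even though that is what usually happens on two’s-complement hardware.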

And that seems like quite a weird value.

But I suppose if it took that long, it certainly timed out.

I think if I was going to do it, I would have probably used an unsigned integer, and just specified an appropriate value:

unsigned int count;

...

if (count > 10000)
{
  printf("Time out! count = %u\n", count);
  return EXIT_FAILURE;
}
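
A fuller sketch of that idea might move the limit check inside the loop, so the loop can actually give up. Everything here is an assumption for illustration: WaitForStatus(), the function-pointer parameter, and the two stand-in status functions are invented, and the poll limit is whatever value suits the hardware.

```c
/* Returns 1 once get_status() reports a non-zero status, or 0 if
   max_polls attempts pass without one (a timeout). */
int WaitForStatus(int (*get_status)(void), unsigned int max_polls)
{
  unsigned int count;

  for (count = 0; count < max_polls; count++)
  {
    if (get_status() != 0)
    {
      return 1; /* got a status before the limit */
    }
  }

  return 0; /* timed out */
}

/* Stand-ins for real hardware, only for trying the function out. */
int NeverReady(void)  { return 0; }
int AlwaysReady(void) { return 1; }
```

With these stand-ins, WaitForStatus(NeverReady, 10000u) returns 0 after 10000 polls, and WaitForStatus(AlwaysReady, 10000u) returns 1 on the first poll. No rollover is involved either way.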

What say you?

C warnings, %d versus %u and more C fun.

Code cleanup on aisle five…

I recently spent two days at work going through projects to clean up compiler warnings. In GNU C, you can enable options such as “-Wall” (a large set of common warnings, though despite the name not every warning), “-Wextra” (extra warnings) and “-Werror” (warnings as errors). By doing steps like these, the compiler will scream at you and fail to build code that has warnings in it.

Many of these warnings don’t impact how your code runs. They just ask you “are you sure this is what you are meaning to do?”

For example, if you leave out a “break” in a switch/case block, the compiler can warn you about that:

  x = 1;

  switch( x )
  {
  case 1:
    printf("x is one\n");
    // did I mean to not have a break here?

  case 2:
    printf("x is two\n");
    break;

  default:
    printf("I don't know what X is\n");
    break;
  }

This code would print:

x is one
x is two

…because without the “break” in the “case 1”, the code drops down to the following case. I found several places in our embedded TCP/IP stack where this was being done intentionally, and the author had left comments like “/* falls through below */” to let future observers know their intent. But, with warnings cranked up, it would no longer build for me, even though it was perfectly fine code working as designed.

I found there was a GCC thing you could do where you put in “//no break” as a comment and it removes that warning. I expect there are many more “yes, I really mean to do this” comments GCC supports, but I have not looked in to it yet.
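
As one example of such a marker: as far as I know, GCC’s -Wimplicit-fallthrough check (part of -Wextra since GCC 7) accepts a comment reading “fall through” placed directly before the next case label, and those GCC versions also offer an explicit “__attribute__((fallthrough));” statement. The sketch below uses only the comment form. CountCasesRun() is a made-up function, and whether the comment is honored depends on your compiler and warning level.

```c
/* Hypothetical example: counts how many case bodies run for a given x. */
int CountCasesRun(int x)
{
  int count = 0;

  switch (x)
  {
  case 1:
    count++;
    /* fall through */
  case 2:
    count++;
    break;

  default:
    break;
  }

  return count;
}
```

Here CountCasesRun(1) returns 2 because case 1 deliberately drops in to case 2, while CountCasesRun(2) returns 1.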

Size (of int) matters

Another issue I would see would be warnings when you used the wrong specifier in a printf. Code might compile fine without warning on a PC, but generate all kinds of warnings on a different architecture where an “int” might be a different size. For example:

int answer = 42;
printf("The answer is %d\n", answer);

On my PC, “%d” can print an “int” type just fine. But, if I had used a “long” data type, it would error out:

long answer = 42;
printf("The answer is %d\n", answer);

This produces this warning/error:

error: format '%d' expects argument of type 'int', but argument 2 has type 'long int' [-Werror=format=]|

You need to use the “l” (long) specifier (“%ld”) to be correct:

long answer = 42;
printf("The answer is %ld\n", answer);

I found that code that compiled without warnings on the PC would not do the same on one of my embedded target devices.
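
One way to sidestep the “which specifier matches this type on this platform?” problem is to use the fixed-width types from <stdint.h> together with the matching format macros from <inttypes.h>. This is only a sketch: FormatAnswer() and its static buffer are invented for the example, and it assumes a C99 (or later) compiler and library.

```c
#include <inttypes.h> /* PRId32 */
#include <stdint.h>   /* int32_t */
#include <stdio.h>    /* snprintf */

/* Hypothetical helper: formats a 32-bit value the same way whether
   int32_t happens to be an int or a long on the target. */
const char *FormatAnswer(int32_t answer)
{
  static char buffer[40];

  /* PRId32 expands to the right specifier for int32_t on this platform. */
  snprintf(buffer, sizeof(buffer), "The answer is %" PRId32, answer);

  return buffer;
}
```

FormatAnswer(42) produces “The answer is 42” without a format warning on either kind of target.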

%u versus %d: Fight!

Another warning I had to deal with was printf() and using “%d” versus “%u”. Most code I see always uses %d, which is for a signed value which can be positive or negative. It seems to work just fine if you print an unsigned integer type:

unsigned int z;

z = 100;
printf("z is %d\n", z);

Even though the data type for z is unsigned, the value is positive so it prints out a positive number. After all, a signed value can be positive.

But, it is more correct to use “%u” when printing unsigned values. And, here is an example of why it is important to use the proper specifier… Consider this:

#include <limits.h> // for UINT_MAX

unsigned int x;

x = UINT_MAX; // largest unsigned int

printf("x using %%d is %d\n", x);
printf("x using %%u is %u\n", x);

This prints:

x using %d is -1
x using %u is 4294967295

In this case, %d is not giving you what you expect. For a 32-bit int (in this example), UINT_MAX of 4294967295 is all bits set:

11111111 11111111 11111111 11111111

That represents a -1 if the value was a signed integer, and that’s what %d is told it is. Thus, while %d works fine for smaller values, any value large enough to set that end bit (that represents a negative value for a signed int) will produce incorrect results.

So, yeah, it will work if you *know* you are never printing values that large, but %u would still be the proper one to use when printing unsigned integers… And you won’t get that warning :)

C warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

Trick C question time … what will this print?

#include <stdio.h>
#include <stdlib.h>

int main()
{
  int x;
  unsigned int y;

  x = -1;
  y = 2;

  printf("x = %d\n", x);
  printf("y = %u\n", y);

  if ( x > y )
  {
    printf("x > y\n");
  }
  else if (x < y)
  {
    printf("x < y\n");
  }
  else
  {
    printf("x == y\n");
  }

  return EXIT_SUCCESS;
}

I recently began looking in to various compiler warnings in some code I am using, and I found quite a few warnings about comparing signed and unsigned values:

warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

I thought I could safely ignore these, since it seems plausible to compare a signed value with an unsigned value. A signed value of -42 should be less than an unsigned value of 42, right?

In the above example, it will print the following:

x = -1
y = 2
x > y

Nope. I was wrong. According to C, -1 is greater than 2.

C does something that I either never knew, or knew and have long since forgotten. I guess I generally try to write code that has no warnings at all, so I’ve avoided doing this. And now I know (or re-know) the reason why.

When a signed int is compared against an unsigned int, C converts the signed value to unsigned before comparing. Thus, “-1” becomes whatever -1 would be for that data type.

char  achar  = -1;
short ashort = -1;
int   aint   = -1;
long  along  = -1;

printf("char  -1 as unsigned: %u\n", (unsigned char)achar);
printf("short -1 as unsigned: %u\n", (unsigned short)ashort);
printf("int   -1 as unsigned: %u\n", (unsigned int)aint);
printf("long  -1 as unsigned: %lu\n", (unsigned long)along);

This outputs:

char  -1 as unsigned: 255
short -1 as unsigned: 65535
int   -1 as unsigned: 4294967295
long  -1 as unsigned: 4294967295

Thus, on a PC, an 8-bit signed value of -1 cast to unsigned is 255, and a 16-bit one is 65535. (In an actual comparison against an unsigned int, a char or short is first promoted to int, so its -1 ends up as 4294967295 there, too.) It seems an int and long are both 32 bits on my system, but these could all be different on other architectures (on Arduino, an int is 16 bits, I believe).

So, without this warning enabled, any comparison that looks correct might be doing something quite wrong.
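
If you really do need to compare a signed value against an unsigned one, one common approach is to handle the negative case first and only then compare as unsigned. The two function names below are invented for this sketch. The first spells out, with an explicit cast, what C effectively does on its own; the second compares by actual value.

```c
/* What C effectively does for "x < y" with an int x and unsigned int y:
   x is converted to unsigned first, so -1 becomes a huge number. */
int LessThanAsC(int x, unsigned int y)
{
  return ((unsigned int)x < y);
}

/* Compares by value: any negative x is less than every unsigned y. */
int LessThanByValue(int x, unsigned int y)
{
  return (x < 0) || ((unsigned int)x < y);
}
```

LessThanAsC(-1, 2u) returns 0 (the surprising answer from the example above), while LessThanByValue(-1, 2u) returns 1. Because the casts are explicit, neither function should trip the sign-compare warning.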

Warnings are our friends. Even if we hate them and want them to go away.

const-ant confusion in C.

Updates:

  • 2017-11-30 – Fixing description of MyStructure example. Thanks, Lost Wiz!

Embedded Life

I currently make my living doing embedded C programming. I am not quite sure how to define what “embedded” programming is other than to say: you probably don’t have everything you expect.

You often program on systems without file systems, without gigabytes of RAM and without an operating system to do most of your work for you. For example, at my previous job one of my main platforms (a TI MSP430 processor) had only 10K of RAM and the program was limited to 40K of flash storage. At my current job, I work on several variations of ARM processors, with one configured to give me only 7K of RAM and 36K of program space.

These systems are much closer to an Arduino UNO, which has 32K of flash and 2K of RAM, than a desktop Windows or Linux machine.

Not all embedded systems are this tiny, of course. There are many embedded systems that run Linux, but once you have a full operating system and a file system, the definition of “embedded” seems to be used to just mean “smaller than Windows”.

But I digress…

Over the past six years, I’ve worked on code that was created and maintained by many different programmers before I worked on it. I have learned some cool tricks and also seen some very un-cool tricks (i.e., just wrong). I expect I will be leaving my own cool/un-cool bits of code for future programmers to find.

With that said, there is one item that keeps turning up repeatedly that many of us C programmers don’t seem to really understand because we keep misusing it. And by “we” I include myself.

“const”

There is a keyword in C called “const” which, according to Wikipedia, “indicates that the data is read only.” For example, suppose you wrote a function that accepts a string (actually, pointer to a bunch of characters) like this:

void PrintErrorMessage( char *message )
{
  fputs( "ERROR: ", stderr );
  fputs( message, stderr );
}

In this example, whatever string passed in will be displayed with “ERROR: ” prepended to it.

int main( int argc, char **argv )
{
  PrintErrorMessage("File Not Found");

  return EXIT_SUCCESS;
}

That would display a message to standard error output:

ERROR: File Not Found

But, there is nothing preventing the function from trying* to modify the string that was passed in.

* Key word is “trying”… If that string were embedded in the binary and it was running out of ROM or Flash, attempts to modify it would be rejected by the “hardware can’t do that” exception ;-)

void PrintErrorMessage( char *message )
{
  fputs( "ERROR: ", stderr );

  // Convert message to uppercase
  for (int i=0; i<strlen(message); i++)
  {
    message[i] = toupper(message[i]);
  }
  fputs( message, stderr );
}

The intent here would be to display the error message in uppercase, such as “ERROR: FILE NOT FOUND”. This would work if the string being passed in was modifiable, such as:

char message[80];

strcpy(message, "File Not Found");
PrintErrorMessage( message );

…but after returning from that call, the string would have been modified by the function and would now be “FILE NOT FOUND” in memory. This is fine, if that is the intent of the function, but if you do not intend the string to be modified, you can take steps to prevent the function from being able to do it.

Only read this…

“const” will tell the compiler to not allow code to be built that modifies the variable. You see it used all the time by standard C library functions that take strings, such as puts(), strcpy(), etc.

int puts ( const char * str );

For puts() and similar functions, the use of const disallows modifying that string within the function. In the earlier example, we could make the passed-in string “read only” like this:

void PrintErrorMessage( const char *message )
{
  fputs( "ERROR: ", stderr );

  // Convert message to uppercase
  for (int i=0; i<strlen(message); i++)
  {
    message[i] = toupper(message[i]);
  }

  fputs( message, stderr );
}

With that change made, the compiler now should issue warnings (or errors) on attempts to modify the “message” string inside the function:

error: assignment of read-only location '*(message + (sizetype)((unsigned int)i * 1u))'|

The moment the function tries to modify “message[i]”, it causes a problem, because “const” has told the compiler whatever is passed in should not be modified.

Because of the usefulness of this extra compile-time error checking, const is a good thing to use.

And many of us do.

Incorrectly.

There is a bit of confusion in how const works. In the above example, we pass in the pointer to some memory that contains a string. We do not want the memory that is being passed in to be modified, so we use “const char *message”. According to the “C Gibberish” website, that means:

declare message as pointer to const char

We might also want to prevent the pointer itself from being modified by using “const char const * message” … and that would be incorrect. That is not the proper syntax for “const”:

declare message as pointer to const const char

The confusion comes from const allowing two ways of doing the same thing. Did you know that:

const char

…is the same as:

char const

In C, const really applies to whatever is immediately to its left, like “char is a constant” or “* is a constant”. Only when nothing is to its left (const is the first thing in the declaration) does it apply to what is on its right instead. Since we see that leading form all the time, many of us seem to think that const always describes what comes after it. Which it doesn’t.

To properly declare that the pointer and the data it points to should be read-only, it should be:

char const * const message;

C Gibberish agrees:

declare message as const pointer to const char

We need to learn this double “before or after is the same thing” use, or only use the “after” use and be consistent.

// declare message as const pointer to const char
const char * const message;

is the same as

// declare message as const pointer to const char
char const * const message;
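
To make the combinations concrete, here is a small sketch. The function names and ExampleBuffer are made up for illustration. The commented-out lines are the ones a compiler should reject.

```c
/* A modifiable string to try the functions on. */
char ExampleBuffer[] = "abc";

/* Pointer to const char: the pointer may move, the characters are read-only. */
char SecondChar(const char *text)
{
  text++;             /* allowed: the pointer itself is not const */
  /* *text = 'x'; */  /* rejected: assignment of read-only location */
  return *text;
}

/* Const pointer to char: the characters may change, the pointer may not move. */
char ReplaceFirst(char * const text, char replacement)
{
  text[0] = replacement; /* allowed: the data is not const */
  /* text++; */          /* rejected: the pointer is read-only */
  return text[0];
}

/* Const pointer to const char: neither may change. */
char FirstChar(const char * const text)
{
  return text[0];
}
```

SecondChar("abc") returns 'b', and ReplaceFirst(ExampleBuffer, 'X') changes the buffer to “Xbc” and returns 'X'.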

My most recent encounter with this was in code at work that did something like this:

void Initialize(const MyStructure *config);

They must have intended this to mean “I’m passing in a constant MyStructure pointer which cannot be modified” but, in reality, what they were getting was:

declare config as pointer to const struct MyStructure

They were telling the compiler that the structure being pointed to was read-only and should not be modified. But the function’s purpose was to modify elements of that structure:

config->type = 10;
config->foo = 'a';

Because of this misuse of const, many compiler warnings were generated.

||=== Build: Debug in Const (compiler: GNU GCC Compiler) ===|
main.c||In function 'Initialize':|
main.c|16|error: assignment of member 'type' in read-only object|
main.c|17|error: assignment of member 'foo' in read-only object|
||=== Build failed: 2 error(s), 0 warning(s) (0 minute(s), 0 second(s)) ===|

The fix was to correct the prototype and function to do what was actually intended:

// declare config as const pointer to struct MyStructure
void Initialize(MyStructure * const config);

This now disallows the “config” pointer from being changed, but not the structure it points to. Thus, this would not work:

config++; // Increment config pointer.

…but the intended structure modifications would:

config->type = 10;
config->foo = 'a';
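
Put together, a self-contained sketch of the corrected version might look like this. The layout of MyStructure (an int and a char) and the MakeConfig() helper are assumptions made up for the example, since the real structure is not shown here.

```c
/* Hypothetical layout; the real structure surely has different members. */
typedef struct
{
  int  type;
  char foo;
} MyStructure;

/* declare config as const pointer to struct MyStructure */
void Initialize(MyStructure * const config)
{
  config->type = 10; /* allowed: the structure itself is not const */
  config->foo = 'a';
  /* config++; */    /* rejected: the pointer is read-only */
}

/* Hypothetical helper showing a call site. */
MyStructure MakeConfig(void)
{
  MyStructure config;

  Initialize(&config);

  return config;
}
```

After the call, the returned structure has type set to 10 and foo set to 'a'.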

I’m sad to say I’ve been using “const” incorrectly as long as I can remember using “const.”

For a future article, I may dive in to some deep const-ant confusion I recently found myself in, and see if someone out there can tell me if I am finally doing it correctly or not.

Until then…

Happy Halloween in November!

A few side projects keep me busy during the year. One is doing things for local festivals (show guides, websites, newspaper ads, TV commercials, etc.) and the other is maintaining my haunted house website: www.dmhauntedhouses.com

During October, I visit with all the local haunted attractions to get information for that website. I do video interviews, create custom audio/video effects for them, and other projects. Over the years I have done quite a bit in this area, from building BASIC-Stamp based prop controllers to doing complex DMX lighting/audio show control programs.

For 2018, I am going to start documenting my projects, and making plans available for those who want to recreate them. I also plan on making items available pre-built for those who just want to use and not build.

More to come…

Happy birthday, computer revolution.

Today marks the 40th anniversary of the Radio Shack TRS-80 Model I computer. TRS stood for Tandy/Radio Shack, and the 80 came from the Z-80 processor it uses. The Model I came from Tandy’s belief there would be more than one model ;-)

When the TRS-80 came out in 1977, there were already many kit computers, and a few you could order that were assembled. But, none were widely available at thousands of retail locations across America. The TRS-80 became the first widely available home computer.

And they were made in Texas :-)

Radio Shack was soon outselling Apple, and was truly the number one name in little computers (as the slogan said). Keep in mind, back in the 80s there were more Radio Shacks than McDonalds. It was huge.

Many other firsts would come from Ft. Worth, such as the first IBM PC clone that was under $1000 (the Tandy 1000). It was a historic era in home computing.

But soon the market was flooded with other cheap home computer offerings from companies like Atari, Commodore, and Texas Instruments.

Happy birthday, TRS-80.

All quiet on the Western front…

Things have been very quiet here. I started a new job a few months ago and have been having a blast doing embedded C firmware programming for power-over-ethernet LED light control systems. I am currently working on the CoAP protocol, as mentioned previously.

I have a few articles for this site waiting for me to get back to them:

  • Tiny BBS – A new take on my 1983 *ALLRAM* BBS for the Radio Shack Color Computer. A few years ago, I had ported my old Microsoft BASIC BBS program to Arduino C. I decided to do a new version of the system using things I have learned over the past 34 years. I had worked up a proof-of-concept version earlier this year which had a substantially larger message base in the same memory. I hope to find time to return to this. I think it would be fun to take a CoCo and a $3 WiFi-to-serial adapter and put a micro BBS online ;-)
  • const-ant confusion in C – I have another article in the works that will delve in to the const keyword in C, based on how I’ve been mis-using it most of my programming career. I learned quite a bit about it at a recent job, since we had it defined in our coding style guide. But, many of us there were still using it incorrectly.

But meanwhile, I’ll be chugging away at my day job, working on my Iowa Adventureland amusement park website, and doing various side projects to earn extra income so I can save up for something really cool for my child’s birthday.

To be continued…

More CoAP musings…

Last week I implemented a simple version of CoAP protocol at work, going by the main specification:

https://tools.ietf.org/html/rfc7252

CoAP seems to be similar to how a web browser works with a web server: GET some content, POST something back.

A CoAP server could report back the status of various sensors, and they would be given names similar to a web page. Instead of fetching a web page using HTTP protocol like:

http://www.subethasoftware.com/bikelights

…you would use the CoAP protocol to reference some resource:

coap://127.0.0.1/motionsensor

You can GET, PUT, POST or DELETE, which I am told mimics things web developers are familiar with. Instead of passing around huge HTML text packets using TCP, CoAP sends a very tiny and compact bit of binary data using the smaller and faster UDP protocol.
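
To give a feel for how compact that binary data is: per RFC 7252, every CoAP message starts with a fixed 4-byte header holding a 2-bit version, a 2-bit message type, a 4-bit token length, an 8-bit code and a 16-bit message ID in network byte order. Here is a minimal sketch of pulling those fields apart. The structure, function names and example bytes are my own inventions for illustration, not from any particular CoAP library.

```c
#include <stddef.h> /* size_t */

/* Example header bytes: version 1, Confirmable, no token, GET, ID 0x1234. */
const unsigned char ExampleGet[4] = { 0x40, 0x01, 0x12, 0x34 };

/* Hypothetical structure for the parsed fixed header. */
typedef struct
{
  unsigned int version;      /* always 1 in RFC 7252 */
  unsigned int type;         /* 0=CON, 1=NON, 2=ACK, 3=RST */
  unsigned int token_length; /* 0 to 8 bytes of token follow the header */
  unsigned int code;         /* 0x01 = GET */
  unsigned int message_id;
} CoapHeader;

/* Returns 1 on success, or 0 if the packet is too short for a header. */
int CoapParseHeader(const unsigned char *packet, size_t length,
                    CoapHeader *header)
{
  if (length < 4)
  {
    return 0;
  }

  header->version      = (packet[0] >> 6) & 0x03;
  header->type         = (packet[0] >> 4) & 0x03;
  header->token_length = packet[0] & 0x0F;
  header->code         = packet[1];
  header->message_id   = ((unsigned int)packet[2] << 8) | packet[3];

  return 1;
}

/* Convenience wrapper: the message ID, or 0 if the header is too short. */
unsigned int CoapMessageId(const unsigned char *packet, size_t length)
{
  CoapHeader header;

  return CoapParseHeader(packet, length, &header) ? header.message_id : 0;
}
```

CoapMessageId(ExampleGet, 4) returns 0x1234. A real parser would go on to read the token, options and payload that follow the header.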

CoAP is only a few years old, so many items that I needed to implement were not part of the main specification. I had to consult a second RFC document to learn about the “Observe” option:

https://tools.ietf.org/html/rfc7641

Observe is a mechanism that allows being notified when a resource changes. For instance, if you were monitoring a motion sensor, you might GET the /motionsensor resource, and specify the Observe option. CoAP should respond with the status of the motion sensor, and any time the status changes, send a message with the update.

Fun.

I was able to quickly put together a simple version that could receive and respond to CoAP messages — at least the ones that we’d be needing. However, there are still many features of CoAP I have yet to tackle.

One such feature is outgoing confirmable messages. The challenge with those is that all the important information has to be retained somewhere so it can be re-transmitted if the receiver doesn’t receive :)
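
As a rough sketch of what “retained somewhere” could mean, here is one possible shape for that bookkeeping. RFC 7252 gives default values of 2 seconds for the initial timeout (ACK_TIMEOUT) and 4 for the maximum number of retransmissions (MAX_RETRANSMIT), with the timeout doubling after each retransmission. The structure, the function names and the 64-byte buffer are assumptions for illustration, and the random factor the RFC applies to the initial timeout is left out to keep things short.

```c
#include <stddef.h> /* size_t */

#define COAP_ACK_TIMEOUT_MS 2000UL /* RFC 7252 default ACK_TIMEOUT */
#define COAP_MAX_RETRANSMIT 4      /* RFC 7252 default MAX_RETRANSMIT */

/* Hypothetical record of one outgoing confirmable message. */
typedef struct
{
  unsigned short message_id;
  unsigned char  data[64];    /* copy of the packet, kept for re-sending */
  size_t         length;
  unsigned long  timeout_ms;  /* how long to wait for the ACK this time */
  unsigned int   retransmits; /* how many times it has been re-sent */
} PendingMessage;

/* Call when the current timeout expires without an ACK. Returns 1 if
   the message should be sent again (the timeout has been doubled), or
   0 if it is time to give up. */
int PendingOnTimeout(PendingMessage *pending)
{
  if (pending->retransmits >= COAP_MAX_RETRANSMIT)
  {
    return 0;
  }

  pending->retransmits++;
  pending->timeout_ms *= 2;

  return 1;
}

/* Adds up every wait, from the first send until giving up. */
unsigned long TotalWaitMs(void)
{
  PendingMessage pending = { 0 };
  unsigned long total;

  pending.timeout_ms = COAP_ACK_TIMEOUT_MS;
  total = pending.timeout_ms;

  while (PendingOnTimeout(&pending))
  {
    total += pending.timeout_ms;
  }

  return total;
}
```

With these defaults the waits are 2, 4, 8, 16 and 32 seconds, so TotalWaitMs() returns 62000. On a device with 2K of RAM, even a handful of these records adds up quickly, which may be why the small implementations skip the feature.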

When I first began researching CoAP, I found several CoAP implementations for memory constrained devices like Arduino. Now that I know more about the protocol, I thought I’d revisit them and see how they approached things like Confirmable messages and Observe.

It turns out, many of the features I have been kludging together are just not supported at all by the simple CoAP implementations.

Here is one called microcoap:

https://github.com/1248/microcoap

With a few small changes, I was able to compile it up for a Windows PC using the GCC compiler. It easily allows creating a new endpoint function that is called when the CoAP message is received, with all important information parsed out and presented as elements in a C structure. All one needs to do is deal with the payload (passed along as a pointer to it, and the length) and generate a response packet.

It does not handle Observe or Confirmable messages, but it does have enough framework to quickly parse incoming CoAP messages, run some code, and send back a response. Like many I have looked at, this version also seems to be geared just for listening and responding. If you have a need for CoAP on that level, it’s a good place to start.

Once I get my work project done, I expect to attempt a similar project, just for fun, that will run on an Arduino with 2K of RAM.  (microcoap has 8K of buffers set aside on startup!)

More to come, maybe.

CoAP – Constrained Application Protocol

Has anyone out there done any work with CoAP?

https://tools.ietf.org/html/rfc7252

And the Observe (subscription) extension proposal:

https://tools.ietf.org/html/rfc7641

I am implementing it for my day job for an embedded system. I am doing it from scratch using just the RFC for reference.

I am thinking of doing another implementation (not using any work code, of course) for the Arduino. I’ve found a few attempts to implement it, often with many needed features missing.

Anyone out there a CoAP expert?