C and the case of the missing globals…

Even though I first started learning C using a K&R compiler on a 1980s 8-bit home computer, I still feel like I barely know the language. I learn something new regularly, usually by seeing code someone else wrote.

There are a few common “rules” in C programming that most C programmers I know agree with:

  • Do not use goto, even if it is an intentionally supported part of the language.
  • Do not use globals.

I have seen many cases for both. In the case of goto, I have seen code that would otherwise be very convoluted with nested braces and comparisons solved simply by jumping out of the block of code with a goto. I still can’t bring myself to use goto in C, even though as I type this I feel like I actually did at some point. (Do I get a pass on using that, since it was a silly experiment where I was porting BASIC — which uses GOTO — to C, as literally as possible?)

But I digress…

A case for globals – laziness

Often, globals are used out of sheer laziness. Suppose you have a function that does something, and you don’t want to update every call to it to pass a new parameter. I have been guilty of this when I needed to make a function more flexible and did not have time to update every use of the function to pass in a variable:

void InitializeCommunications ()
{
     InitI2C (g_Kbps);
}

In that case, there would be some global (I put a “g_” prefix on the variable name so it would be easy to spot as a global later) containing a baud rate, and any place that called that function could use it. Changing the global would make subsequent calls to the function use the new baud rate.

Bad example, but it is what it is.
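
A minimal sketch of the two options, assuming a stand-in InitI2C() that only records the rate it was given. The names lastKbps and InitializeCommunicationsAt are hypothetical and exist only for this illustration:

```c
#include <stdio.h>

/* Hypothetical stand-in for the real I2C driver call; it just records the rate. */
static int lastKbps = 0;

static void InitI2C (int kbps)
{
    lastKbps = kbps;
    printf ("I2C initialized at %d kbps\n", kbps);
}

/* The "lazy" version: the rate comes from a global, so no caller changes. */
int g_Kbps = 100;

void InitializeCommunications (void)
{
    InitI2C (g_Kbps);
}

/* The parameter version (hypothetical name): every caller must pass the rate. */
void InitializeCommunicationsAt (int kbps)
{
    InitI2C (kbps);
}
```

With the global version, setting g_Kbps anywhere in the program quietly changes what the next InitializeCommunications() call does. With the parameter version, the rate is visible at every call site.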

A case for globals – speed

I have also resorted to using globals to speed things up. One project I worked on had dozens of windows (“panels”) and the original programmer had created a lookup function to return a panel’s handle based on a passed-in #define value:

int GetHandle (int panelID)
{
   int panelHandle = -1;

   switch (panelID)
   {
      case MAIN_PANEL:
         panelHandle = xxxx;
         break;

      case MAIN_OPTIONS:
         panelHandle = xxxx;
         break;
...etc...

Every function that used a panel would get the handle first by calling that routine:

handle = GetHandle (MAIN_PANEL);

SetPanelColor (handle, COLOR_BLUE); // or whatever

As the program grew, more and more panels were added, and it would take more and more time to look up panels at the bottom of the list. As an optimization I just decided to make all the panel handles global, so any use could just be:

SetPanelColor (g_MainPanel, COLOR_BLUE); // or whatever

This quick-and-dirty change ended up having about a 10% reduction in CPU usage — this thing uses a ton of panel accesses! And it was pretty quick and simple to do.

Desperate times.
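
To make the two call styles concrete, here is a small sketch. The panel IDs, the color constant, the handle values, and the SetPanelColor() stub are all made up for illustration; in the real program the handles would come from the UI library:

```c
#include <stdio.h>

/* Hypothetical panel IDs, color, and handle values, for illustration only. */
enum { MAIN_PANEL, MAIN_OPTIONS };
enum { COLOR_BLUE = 1 };

/* Global handles, assumed to be filled in once when the panels are loaded. */
int g_MainPanel = 11;
int g_MainOptions = 22;

/* Stub standing in for the real UI call; it only reports what it was given. */
int SetPanelColor (int handle, int color)
{
    printf ("panel %d -> color %d\n", handle, color);
    return handle;
}

/* Lookup style: every use pays for a function call and a switch. */
int GetHandle (int panelID)
{
    int panelHandle = -1;

    switch (panelID)
    {
        case MAIN_PANEL:
            panelHandle = g_MainPanel;
            break;

        case MAIN_OPTIONS:
            panelHandle = g_MainOptions;
            break;

        default:
            break;
    }

    return panelHandle;
}
```

Both SetPanelColor (GetHandle (MAIN_PANEL), COLOR_BLUE) and SetPanelColor (g_MainPanel, COLOR_BLUE) reach the same panel. The second form skips the lookup entirely, at the cost of exposing every handle to the whole program.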

An alternative to globals

The main replacement I see for globals is a structure, declared during startup, then passed around by pointer. I’ve seen these called “context” or “runtime” structures. For example, some code I work on creates a big structure of “things” and then any place that needs one of those things accesses it:

InitI2C (runTime.baudRate);

But as you might guess, “runTime” is a global structure so any part of the code could access it (or manipulate it, or mess it up). The main benefit I see of making things a global structure is you have to know what you are doing. If you had globals like this:

// Globals
int index = 0;
int baudRate = 0;

…you might be surprised if you tried to use a local variable “index” or “baudRate” and got it confused with the global. (I actually ran into a bug where there was a global named simply “index”, and some code that was meant to have a local variable called “index” forgot to declare it, so it was always modifying the global index, which was used elsewhere in the code. This simple accident caused a lot of weird problems before it was identified and fixed.)

Prepending something like “g_index” at least makes it clear you are using a global, so you could have a local “index” and not risk messing up the global “g_index”.
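
A small sketch of why the prefix helps. The function SumBelow() is hypothetical and exists only to show a local “index” living alongside a global “g_index”:

```c
int g_index = 5;   /* global: the prefix marks it as shared state */

/* Sums 0..count-1 using its own local "index"; g_index is never touched. */
int SumBelow (int count)
{
    int index;     /* If this declaration were forgotten, there would be no
                      plain global "index" variable to fall back on, so the
                      compiler would reject the code instead of silently
                      modifying shared state. */
    int sum = 0;

    for (index = 0; index < count; index++)
    {
        sum += index;
    }

    return sum;
}
```

The point is that the mistake turns into a compile error rather than a run-time mystery.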

To me, using that global runtime structure is just a slower way to do that, since in the embedded compilers I have tested, accessing a global like “foo.x” is slower than just accessing a global “x”. I have also seen it take more code space, and I had to remove all such references in one tightly constrained product to save just enough bytes to add some needed new code.

Yes, I have run into many situations where a tiny bit of extra memory space or a tiny bit of extra code space made the difference between getting something done, or not.

A cleaner approach?

Ideally, code could pass around a “context” structure, and then nothing could ever access it without specifically being handed it. Consider this:

int main ()
{
   int status = SUCCESS;

   // Allocate our context:
   RunTimeStruct runTime;

   ...

   status = BeginProgram (&runTime);

   return status;
}

int BeginProgram (RunTimeStruct *runTime)
{
    InitializeCommunications (runTime->baudRate);

    int status = DoSomething (runTime);

    return status;
}

The idea seems to be that once you had the runTime structure, you could pass in specific elements to a function (such as the baud rate), or pass along the entire context for routines that needed full access.

This feels like a nice approach to me since passing one pointer in is fast, and it still offers protection when you decide to pass in just one (or a few) specific items to a function. No code can legally touch those variables if it doesn’t have the context structure.
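
Here is a compilable sketch of that idea. The structure fields, the SUCCESS value, and the bodies of InitializeCommunications() and DoSomething() are assumptions made up for illustration; only the overall shape comes from the example above:

```c
#include <stdio.h>

#define SUCCESS 0

/* Hypothetical context structure; the fields are illustrative. */
typedef struct
{
    int baudRate;
    int messagesSent;
} RunTimeStruct;

/* Needs only one value, so it is handed only that value. */
static int InitializeCommunications (int baudRate)
{
    printf ("Initializing at %d baud\n", baudRate);
    return (baudRate > 0) ? SUCCESS : -1;
}

/* Needs full access, so it receives the whole context. */
static int DoSomething (RunTimeStruct *runTime)
{
    runTime->messagesSent++;
    return SUCCESS;
}

int BeginProgram (RunTimeStruct *runTime)
{
    int status = InitializeCommunications (runTime->baudRate);

    if (status == SUCCESS)
    {
        status = DoSomething (runTime);
    }

    return status;
}
```

A caller would declare a RunTimeStruct (for example, in main), fill in baudRate, and pass its address to BeginProgram(). InitializeCommunications() never sees the structure at all, so it cannot modify anything in it.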

But what about globals that aren’t globals?

And now the point of this article. Something I learned from this project was an interesting use of “globals” that were not globals. There were functions that declared static structures, and would return the address of the structure:

RunTimeStruct *GetRunTimeData (void)
{
   static RunTimeStruct runTimeData;

   return &runTimeData;
}

Now any place that needed it could just ask for it:

RunTimeStruct *runTime = GetRunTimeData (); 

runTime->baudRate = 300;

This seems like a hybrid approach. You can never accidentally use them, like you might with just a global “int index” or whatever, but if you do need them, you can get to them without needing a context passed in. It seems like a good compromise between safety and laziness.
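
A self-contained sketch of the pattern, assuming a one-field RunTimeStruct. The helper functions ConfigurePort() and CurrentBaudRate() are hypothetical and only show two unrelated places reaching the same data:

```c
/* Hypothetical one-field version of the structure, for illustration. */
typedef struct
{
    int baudRate;
} RunTimeStruct;

RunTimeStruct *GetRunTimeData (void)
{
    /* Static storage: a single instance that is zero-initialized and
       lives for the whole program, but whose name is only visible here. */
    static RunTimeStruct runTimeData;

    return &runTimeData;
}

/* Two unrelated functions reach the same data without a context argument. */
void ConfigurePort (void)
{
    RunTimeStruct *runTime = GetRunTimeData ();

    runTime->baudRate = 300;
}

int CurrentBaudRate (void)
{
    return GetRunTimeData ()->baudRate;
}
```

Every call to GetRunTimeData() returns the same address, so a value written in ConfigurePort() is what CurrentBaudRate() reads back.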

It also means those functions could easily be adapted to return blocks of thread-safe variables, with a “Release” function at the end. (This is actually how the thread-safe variables work in the LabWindows/CVI environment I use at my day job.)

RunTimeStruct *runTime = GetRunTimeData (); 

runTime->baudRate = 300;

ReleaseRunTimeData ();

What do you do?

Since I like learning, I thought I’d write this up and ask you what YOU do. Show me your superior method, and why it is superior. I’ve seen so many different approaches to passing data around, so share yours in a comment.

Until next time…

13 thoughts on “C and the case of the missing globals…”

  1. Sean Patrick Conner

    I have two stories about this.

    The first one, I have a project that had a lot of global variables, and I too, used the g_ prefix for globals. I got curious as to how they were being used, so I went through, marking each one as const and compiling. I found out that most (90+%?) were only set once, then pretty much read-only from then on. So I decided to move all the globals to a single globals.c file, and rename the ones that were set once to start with c_ (for “constant”).

    Later on, I wanted to better enforce the constant nature with the compiler, so I created a globals.h to declare all global variables; the ones that were named c_ were marked as const in the header file. I then put the code to set the “constants” into globals.c (which does not include globals.h) to set these variables. I found that worked quite well.

    Second story. Over the past two years, I’ve started to avoid globals entirely in my programs, using the strategies you’ve outlined. I started a personal project last week, using no globals, but as I worked on the project, and started using it, I found that making two globals (one an array of virtual disks; the other is the index of the default virtual disk) simplified the program overall. It wasn’t out of laziness as it was very deliberate on my part. And the two variables don’t even start with g_ (but they do follow the main API naming scheme).

    1. Allen Huffman Post author

      I like the c_ prefix idea. Though, if it doesn’t change, maybe better as a #define or something? I may add that to my personal coding standard.

      I also moved a bunch of things to globals.c, which declared them, and globals.h that had the externs. During this task with a coworker, we began making Set and Get functions for every one of them, planning to move them to thread safe variables at some point. But my coworker was a C# object guy so he probably liked the Get Set thing better than globals.

      1. Sean Patrick Conner

        The c_ was for values that could be configured at runtime, either through the config file or command line options, but once set, didn’t change.

        The only good thing I will say about Get*() and Set*() functions is that they can enforce invariants the program requires, but that’s about the only good thing about them.

          1. Sean Patrick Conner

            Well, you can cast away the const, but that’s not a good answer. How I do it is in globals.c it’s declared as non-const:

            int c_port;

            But in globals.h I declare it as:

            extern int const c_port;

            so the rest of the code sees it as const but the initialization code (in globals.c) or the setter (also in globals.c) don’t see the const. And globals.c does not include globals.h. Does that make sense?

          2. Allen Huffman Post author

            I LIKE THIS! Using the online GDB compiler, I made a quick test. The “globals.h” lets code refer to a variable directly, like:

            printf ("%d\n", c_number);

            …but not be able to set it directly. Adding in something like SetNumber() in globals.c lets me do:

            printf ("%d\n", c_number);
            SetNumber(100);
            printf ("%d\n", c_number);

            Cool approach. This allows the setter to do whatever range checking, etc.

            In the past, I’d used a file-static, and offered a Set and Get routine for it. This would work like that, but be faster since you would remove the overhead of the Get.

            Is that why you use this approach?

          3. Sean Patrick Conner

            Oh, and I should say that I always put the const to the right of the type. It makes it easier to reason about what it is that is const. Examples:

            int const c; /* c is constant */
            int const *pc; /* pc points to a const int; pc can change */
            int *const pc; /* pc is a constant pointer to an int; the int can change, pc cannot */
            int const *const pc; /* pc is a constant pointer to a constant int */

            In each case, the const makes the thing to its left (the type, the pointer) constant.

          4. Allen Huffman Post author

            This is what I need to get used to. Being able to write const int or int const and have them mean the same thing is confusing to me. Why did they allow both? Funny to see you mention that here, as I just wrote up a post for next week about const inspired by your comments.

          5. Sean Patrick Conner

            “Is that why you use this approach?”

            I just checked (a few years ago I went ahead and removed all globals in the project) and yes, I did have a few set-type functions. A nice side effect, but not the primary reason I did it. It really was due to 25 out of the 34 globals being set once, then read multiple times; 4 having a set-type function, and 5 being true global variables.

  2. Sean Patrick Conner

    As for the GetHandle() issue you experienced: was the search for a handle a linear search? Because there are any number of ways to keep the overhead down (balanced tree, hash table, sorted array). It sounds like the original programmer just did what was simpler and no one bothered to update it to a faster method.

    1. Allen Huffman Post author

      I expect that is true… just a switch/case probably with a few things originally, then eventually dozens and dozens of things. When we were in a CPU usage crunch, it was a function that was called more than anything else, because every UI element first called it to get the handle before updating the data.

