Old C dog, new C tricks part 3: char *line vs char line[]

You, of course, already knew this. But I learn from your comments, so please leave some. Thanks!

This may be the next “old dog, new trick” I adopt, too.

When I started learning C in the late 1980s, I had a compiler manual (not very useful for learning the language) and a Pocket C Reference book — both for pre-ANSI K&R C. I may have had another “big” C book, but I mostly remember using the Pocket C Book.

Looking back at some of my early code, I find I was declaring “fixed” strings like this:

char version[5]="0.00"; /* Version number... */

Odd. Did I really count the bytes (plus 0 at the end) for every string like that? Not always. I found this one:

char filename[28]="cocofest3.map";

…but I think I remember why 28. In the OS-9/6809 operating system, directory entries were 32 bytes. The first 28 were the filename (yep, back in the 80s there were operating systems with filenames longer than FILENAME.EXT), and then three at the end were the LSN (logical sector number) where the File ID sector was. (More or less accurate.)

I also found arrays:

int *days[] = { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };

But why is that an int??? That must have been a typo/bug. Other places, I did it more correctly:

char *items[] = { /* employee info prompt thingies */
   "Employee :",
   "Min/Week :",
   "Max/Week :",
   "Min/Shift:",
   "Max/Shift:"
};

At some point in my C history, I just started using pointers to strings like this:

char *version = "1.0.42b-delta";

I guess I got tired of [brackets]. I mean, “they work the same way”, don’t they?

void function (char *line)
{
    if (NULL != line)
    {
        printf ("Line: '%s'\n", line);
    }
}

…and…

void function (char line[])
{
    if (NULL != line)
    {
        printf ("Line: '%s'\n", line);
    }
}

…both end up with line pointing to the start of wherever the bytes to that string are in memory. I’ve seen main() done the same ways:

int main (int argc, char *argv[] )

…and…

int main ( int argc, char **argv )

For years, I’ve been doing it the second way, but all my early code was *argv[] so I suspect that is how I learned it from my early K&R C books.

I have no idea why I changed, or when, but probably in the mid-to-late 1990s. I started working for Microware Systems Corporation in Des Moines, Iowa in 1995. This was the first place I used an ANSI-C compiler. In code samples from the training courses I taught, some used “*argv[]” but ones I wrote used “**argv”.

Does it matter?

Not really. But let’s talk about it anyway…

There was a comment left on one of my articles last year that pointed out a difference I had not considered: sizeof

If you have “char *line” you cannot use sizeof() to give you anything but the size of the pointer (“sizeof(line)”) or the size of a character (or whatever data type used) that it points to (“sizeof(*line)”).

If you have “char line[]”, you can get the size of the array (number of characters, in this case) or the size of one of the elements in it:

#include <stdio.h>
#include <stdlib.h> // for EXIT_SUCCESS
int main(void)
{
    char *line1 = "1234567890";
    
    char line2[] = "1234567890";
    
    printf ("sizeof(line1)  = %zu\n", sizeof(line1));
    printf ("sizeof(*line1) = %zu\n", sizeof(*line1));
    printf ("\n");
    printf ("sizeof(line2)  = %zu\n", sizeof(line2));
    printf ("sizeof(*line2) = %zu\n", sizeof(*line2));
    return EXIT_SUCCESS;
}

This produces:

sizeof(line1)  = 8  <- size of a 64-bit pointer
sizeof(*line1) = 1 <- size of a char

sizeof(line2) = 11 <- size of the character array
sizeof(*line2) = 1 <- size of a char

I cannot remember ever using sizeof() on a string constant. You may recall I was surprised it worked when I learned about it a few months ago.

But, now that I am aware, I think I may start moving myself back to where I started and using the [brackets] when I have constant strings. Using sizeof() in the program just embeds a constant value, while strlen() is a function that walks through each byte looking for the end zero, thus adding more code space and more execution time.

If I wanted to copy some constant string into a buffer, I could try these two approaches:

// Copy message into buffer (assumed large enough and zero-filled).
char *line1 = "This is a message.";
strncpy (buffer, line1, strlen(line1)); // strlen: does not count the end zero
printf ("Buffer: '%s'\n", buffer);
char line2[] = "This is a message.";
strncpy (buffer, line2, sizeof(line2)); // sizeof: counts the end zero
printf ("Buffer: '%s'\n", buffer);

And the results are the same:

Buffer: 'This is a message.'
Buffer: 'This is a message.'

I would use the second version since using sizeof(line2) avoids the overhead of strlen() scanning through each byte in the string looking for the end zero. And because sizeof() counts the end zero, that zero gets copied as well.

NOTE: As was pointed out in the comments, strlen() returns the number of characters up to the zero. “Hello” has a strlen() of 5. But sizeof() is the full array of characters including the 0 at the end, so “Hello” would have a sizeof() of 6.

char line[] = "1234567890";
printf ("strlen(line) = %zu\n", strlen(line));
printf ("sizeof(line) = %zu\n", sizeof(line));

This produces:

strlen(line) = 10
sizeof(line) = 11

If you wanted them to be the same, it would be “sizeof(line)-1”.

It’s all fun and games until you pass a parameter…

This “benefit” of sizeof() is not useful if you are passing the string into a function. Inside the function, it is just a pointer to wherever the string is stored:

#include <stdio.h>
#include <stdlib.h> // for EXIT_SUCCESS
#include <string.h>
void function1 (char *line)
{
    printf ("function1():\n");
    printf ("sizeof(line)  = %zu\n", sizeof(line));
    printf ("sizeof(*line) = %zu\n", sizeof(*line));
    printf ("strlen(line)  = %zu\n", strlen(line));
}
void function2 (char line[])
{
    printf ("function2():\n");
    printf ("sizeof(line)  = %zu\n", sizeof(line));
    printf ("sizeof(*line) = %zu\n", sizeof(*line));
    printf ("strlen(line)  = %zu\n", strlen(line));
}
int main(void)
{
    char *line1 = "1234567890";
    printf ("Line 1: '%s'\n", line1);
    function1 (line1);
    function2 (line1);
    
    printf ("\n");
    char line2[] = "1234567890";
    printf ("Line 2: '%s'\n", line2);
    function1 (line2);
    function2 (line2);
    return EXIT_SUCCESS;
}

Above, I create a “*line” pointer to a string then pass it in to two functions. The first expects a *line as the parameter, and the second expects a line[].

Then I do a “line[]” array and pass it to the same two functions.

The results are the same:

Line 1: '1234567890'
function1():
sizeof(line) = 8
sizeof(*line) = 1
strlen(line) = 10
function2():
sizeof(line) = 8
sizeof(*line) = 1
strlen(line) = 10

Line 2: '1234567890'
function1():
sizeof(line) = 8
sizeof(*line) = 1
strlen(line) = 10
function2():
sizeof(line) = 8
sizeof(*line) = 1
strlen(line) = 10

And, if you use a “good compiler,” you may get a warning about doing sizeof() like this:

main.c: In function ‘function2’:
main.c:16:44: warning: ‘sizeof’ on array function parameter ‘line’ will return size of ‘char *’ [-Wsizeof-array-argument]
   16 |     printf ("sizeof(line)  = %zu\n", sizeof(line));

Notice that warning was from function2(), and not from function1(). This is one difference in using “*line” versus “line[]” in the functions. For function1(), no warning is given:

void function1 (char *line)
{
    printf ("function1():\n");
    printf ("sizeof(line)  = %zu\n", sizeof(line));
    printf ("sizeof(*line) = %zu\n", sizeof(*line));
    printf ("strlen(line)  = %zu\n", strlen(line));
}

Since the function takes a “pointer to one or more chars”, doing sizeof() on that pointer makes sense. It is what you asked for. The “C gibberish” website says:

declare line as pointer to char

https://cdecl.org/?q=char+*line

But for the second one…

void function2 (char line[])
{
    printf ("function2():\n");
    printf ("sizeof(line)  = %zu\n", sizeof(line));
    printf ("sizeof(*line) = %zu\n", sizeof(*line));
    printf ("strlen(line)  = %zu\n", strlen(line));
}

…a warning is given about the “sizeof(line)” because it cannot tell us the size of a line[] array: it became a pointer to the character memory when it went into the function. The warning appears because the function parameter was written as “line[]”, which reads like an array:

declare line as array of char

https://cdecl.org/?q=char+line%5B%5D

Doing sizeof() on an “array of char” is valid. But even though the parameter was written as line[], what gets passed into the function is a pointer to the data. I guess this is one of those “I’m sorry, Dave. I’m afraid I can’t do that” moments ;-)

Is this useful? It certainly will let you use sizeof() instead of strlen() on a string if you have direct access to the string variable. But passing strings into functions? Not so much. (Or am I mistaken?)

But I do think I am going to try to go back to using “line[]” for my string declarations. I like retro.

Until next time…

13 thoughts on “Old C dog, new C tricks part 3: char *line vs char line[]”

  1. Blair Leduc

    No! Don’t do that with strncpy()! The length is for the maximum number of bytes to copy, which needs to be at most the size of the destination buffer, not the length of the source!

    Also, if you hit the size to copy limit, the destination will not be null terminated. I always found this resulted in awkward code forcing a terminating null at the end of the destination after calling strncpy(), because you never knew if the destination has a terminating null.

        1. Allen Huffman Post author

          I have to research. I had never heard of it until I ran code through this compiler, and I have been #ifdef’ing code to take advantage of it. strncpy should use destination size, and pray the source has a NIL at the end, and then -1 is using sizeof, yes?

  2. Sean Patrick Conner

    The strncpy() function isn’t for what you think it is. It was originally written to handle filenames for the original Unix file system. On it, directories were special files where each entry was 16 bytes in size, two bytes for the inode, and 14 bytes for the name, padded with NUL bytes. That is the exact use case for which strncpy() was written, and it got stuck into the C89 standard.

    A much better version of both strncpy() and strcat() is the snprintf() function. For a copy: snprintf(buffer,sizeof(buffer),"%s",src). For concatenation, snprintf(buffer,sizeof(buffer),"%s%s%s",s1,s2,s3). If you can’t use snprintf() for any reason (say, a crappy embedded C compiler that doesn’t support C99), you are probably better off writing your own versions of strncpy() and strcat() that take a length parameter, and will always NUL terminate the result. Just don’t call your custom functions strncpy() or strcat().

    1. Allen Huffman Post author

      Wow, that’s some background on strncpy I did not know. Was strcpy designed for something else as well?

      Using snprintf like that is interesting. On some of these systems I have, they provide strcpy but not strNcpy, and leave out a lot of the heavier functions. For instance, printf is missing many of the formatting features. “%x” works, but you may not have “%04x” to pad with leading zeros, for example.

      I think I could wean myself into using snprintf with a bit of practice. I think I’ll use this as an excuse to write a blog post about this, and maybe it can be a New Trick for me. Thank you so much. Your comments are greatly appreciated. I’ve been learning from them!

      1. Sean Patrick Conner

        You can use strncpy() if you have a structure with a fixed length character field that doesn’t need to be NUL terminated, but that’s about all it’s good for. And I, too, have come across limited versions of printf(); I’ve even written a few myself.

        1. Allen Huffman Post author

          The “will not put a 0 at the end if it reaches max size” is something I do not think I knew, or if I did, I had forgotten.

          char line[30];

          strncpy (line, something, 30);

          If no 0 is found by 30 in “something”, you get bad line. I must have known about this, because I found old code where I was doing two steps every time:

          strncpy(dst, src, dstSize);
          dst[dstSize] = '\0'; /* only safe if dst holds dstSize + 1 bytes */

          My sizeof() experiments lately are missing this.

  3. Sean Patrick Conner

    As for using char * vs. char [], I use the former for strings (usually written as char const *), and for passing in a buffer to be filled with a string, char [] and size_t. It’s all about signalling intent.

    1. Allen Huffman Post author

      Help me understand. For constant strings, this:

      const char *errMsg = "?SN ERROR";

      But then you might use this to indicate “this holds an array of characters”:

      char lineBuffer[80];

      status = lineInput(lineBuffer, sizeof(lineBuffer));

      bool lineInput(char line[], size_t lineSize)
      {
      }

      …or something?

      That said, what about non-string buffers:

      uint8_t txBuf[128];
      uint8_t rxBuf[128];

      Same type of style, or do you use the [] method specifically for string/character buffers?

  4. Sean Patrick Conner

    Yes. As for non-string buffers or arrays in general, yes I also pass them using a [] instead of a pointer. Again, signalling intent.
