In C, you can sizeof() a string constant?

Updates:

  • 2024-08-27 – Adding a note about strlen()/sizeof() that was mentioned by Dave in the comments.

I am used to using sizeof() to know the size of a structure, or size of a variable…

typedef struct {
   char a;
   short b;
   int c;
   long d;
} MyStruct;

printf ("sizeof(MyStruct) is %zu\n", sizeof(MyStruct));

MyStruct foo;
printf ("sizeof(foo) is %zu\n", sizeof(foo));

…but every time I re-learn you can use it on strings, I am surprised:

#include <stdio.h>

#define VERSION_STRING __DATE__" "__TIME__

int main()
{
    printf ("Build: %s\n", VERSION_STRING);

    printf ("sizeof(): %zu\n", sizeof(VERSION_STRING));

    return 0;
}

Normally, I see strlen() used, and that works for a string that is in a buffer, or a constant string:

#define VERSION_STRING "1.0.42-beta"
const char versionString[] = "1.0.42-beta";

printf ("strlen(VERSION_STRING) = %zu\n", strlen(VERSION_STRING));

printf ("strlen(versionString) = %zu\n", strlen(versionString));

…but if you know it is a #define string constant, you can use sizeof() instead, and the compiler will replace it with a hard-coded value: the size of that hard-coded string (see the notes below on how that differs from its length). This can make for smaller, faster code, since strlen() has to scan through the string memory looking for the 0 byte at the end, counting along the way.

I wonder how many times I have posted about this over the years.

Additional Notes:

In the comments, Dave added:

sizeof a string literal includes the terminating nul character, so it will be strlen +1.

– Dave

Ah, yes – a very good thing to note. C strings have a 0 byte added to the end of them, so “hello” is really “hello\0”. The standard C string functions like strcpy(), strlen(), etc. look for that 0 to know when to stop.

#include <stdio.h>
#include <stdlib.h> // for EXIT_SUCCESS
#include <string.h> // for strlen()

#define STRING "hello"

int main()
{
    printf ("sizeof(STRING) = %zu\n", sizeof(STRING));

    printf ("strlen(STRING) = %zu\n", strlen(STRING));

    return EXIT_SUCCESS;
}

Output would show:

sizeof(STRING) = 6
strlen(STRING) = 5

So if using sizeof() to memcpy() bytes somewhere without the overhead of a strlen() counting first, you’d really want something like…

memcpy (buffer, STRING, sizeof(STRING)-1);

Until next time…

11 thoughts on “In C, you can sizeof() a string constant?”

  1. Dave

    sizeof a string literal includes the terminating nul character, so it will be strlen +1.

    sizeof "hello" == 6, while strlen("hello") == 5

  2. S. Enevoldsen

    To better remember this realize that arrays are not pointers, and string literals are arrays (that can decay to pointers).

    const char arrayVersion[] = "1.0.42-beta";
    const char* pointerString = "1.0.42-beta";
    printf ("sizeof(arrayVersion) = %d\n", sizeof(arrayVersion));
    printf ("sizeof(pointerString) = %d\n", sizeof(pointerString));

    Outputs

    sizeof(arrayVersion) = 12
    sizeof(pointerString) = 4

    1. Allen Huffman Post author

      You know, I don’t think I realized that. I know at some point, I started using pointers for all my strings, declaring them as char *namePtr = “foo” or whatever. I would intentionally put the “Ptr” at the end to remind me it was a pointer. Seeing this is enough to make me switch back to using “char name[]” just for the purpose of the sizeof(name) being what I’d expect, and not a pointer.

    2. antekone

      To better remember this realize that arrays are not pointers

      Except for situations where arrays are pointers:

      ~/dev/c/tests> cat arrays.c
      #include <stdio.h>

      int test(char arr[]) {
      printf("sizeof arr: %d\n", sizeof(arr));
      }

      int main() {
      char a[] = "x\n";
      test(a);
      }
      ~/dev/c/tests> gcc arrays.c -o arrays && ./arrays
      sizeof arr: 8

      1. S. Enevoldsen

        Yes and no. The array object itself is of course not a pointer. The issue is that while the written syntax of the parameter looks like an array the type is not. In N4917, Section 9.3.4.6, paragraph 5 it explains the type is actually pointer:

        After determining the type of each parameter, any parameter of type “array of T” or of function type T is adjusted to be “pointer to T”.

        So the parameter is actually a pointer and already in the call has the array argument decayed. You can even give the “array” in the parameter a fixed size and it will always output 8.

    1. Allen Huffman Post author

      Correct observation – compiler warning should scream at that, if enabled. A cast to int or similar is what I would do. I don’t think we know what a size_t is other than a number. I had to use %d on some machines I work with, and %ld is on others. Is there a printf parameter that is better than one of those and casting? I only recently learned about %p. I learned C on a K&R pre-ANSI compiler so a lot of this still feels new to me ;)

    1. Allen Huffman Post author

      I have never worked with Unicode. I see there are C escape codes to add them in a C string. I wonder if my embedded C compiler even honors those. I will make a note to explore that.

