You, of course, already knew this. But I learn from your comments, so please leave some. Thanks!
This may be the next “old dog, new trick” I adapt to.
When I started learning C in the late 1980s, I had a compiler manual (not very useful for learning the language) and a Pocket C Reference book — both for pre-ANSI K&R C. I may have had another “big” C book, but I mostly remember using the Pocket C Book.
Looking back at some of my early code, I find I was declaring “fixed” strings like this:
char version[5]="0.00"; /* Version number... */
Odd. Did I really count the bytes (plus 0 at the end) for every string like that? Not always. I found this one:
char filename[28]="cocofest3.map";
…but I think I remember why 28. In the OS-9/6809 operating system, directory entries were 32 bytes. The first 28 were the filename (yep, back in the 80s there were operating systems with filenames longer than FILENAME.EXT), and then three at the end were the LSN (logical sector number) where the File ID sector was. (More or less accurate.)
I also found arrays:
int *days[] = { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
But why is that an int??? That must have been a typo/bug. Other places, I did it more correctly:
char *items[] = { /* employee info prompt thingies */
"Employee :",
"Min/Week :",
"Max/Week :",
"Min/Shift:",
"Max/Shift:"
};
At some point in my C history, I just started using pointers to strings like this:
char *version = "1.0.42b-delta";
I guess I got tired of [brackets]. I mean, “they work the same way”, don’t they?
void function (char *line)
{
if (NULL != line)
{
printf ("Line: '%s'\n", line);
}
}
…and…
void function (char line[])
{
if (NULL != line)
{
printf ("Line: '%s'\n", line);
}
}
…both end up with line pointing to the start of wherever the bytes of that string are in memory. I’ve seen main() done both ways:
int main ( int argc, char *argv[] )
…and…
int main ( int argc, char **argv )
For years, I’ve been doing it the second way, but all my early code was *argv[] so I suspect that is how I learned it from my early K&R C books.
I have no idea why I changed, or when, but probably in the mid-to-late 1990s. I started working for Microware Systems Corporation in Des Moines, Iowa in 1995. This was the first place I used an ANSI-C compiler. In code samples from the training courses I taught, some used “*argv[]” but ones I wrote used “**argv”.
Does it matter?
Not really. But let’s talk about it anyway…
There was a comment left on one of my articles last year that pointed out something different I had not considered: sizeof
If you have “char *line” you cannot use sizeof() to give you anything but the size of the pointer (“sizeof(line)”) or the size of a character (or whatever data type used) that it points to (“sizeof(*line)”).
If you have “char line[]”, you can get the size of the array (number of characters, in this case) or the size of one of the elements in it:
#include <stdio.h>
#include <stdlib.h> // for EXIT_SUCCESS
int main(void)
{
char *line1 = "1234567890";
char line2[] = "1234567890";
printf ("sizeof(line1) = %zu\n", sizeof(line1));
printf ("sizeof(*line1) = %zu\n", sizeof(*line1));
printf ("\n");
printf ("sizeof(line2) = %zu\n", sizeof(line2));
printf ("sizeof(*line2) = %zu\n", sizeof(*line2));
return EXIT_SUCCESS;
}
This produces:
sizeof(line1) = 8 <- size of a 64-bit pointer
sizeof(*line1) = 1 <- size of a char
sizeof(line2) = 11 <- size of the character array
sizeof(*line2) = 1 <- size of a char
I cannot remember ever using sizeof() on a string constant. You may recall I was surprised it worked when I learned about it a few months ago.
But, now that I am aware, I think I may start moving myself back to where I started and using the [brackets] when I have constant strings. Using sizeof() in the program just embeds a constant value, while strlen() is a function that walks through each byte looking for the end zero, thus adding more code space and more execution time.
If I wanted to copy some constant string into a buffer, I could try these two approaches:
// Copy message into buffer (assumed to be big enough and zero-filled).
char buffer[80] = { 0 };
char *line1 = "This is a message.";
strncpy (buffer, line1, strlen(line1)); // strlen: does not copy the end zero
printf ("Buffer: '%s'\n", buffer);
char line2[] = "This is a message.";
strncpy (buffer, line2, sizeof(line2)); // sizeof: includes the end zero
printf ("Buffer: '%s'\n", buffer);
And the results are the same:
Buffer: 'This is a message.'
Buffer: 'This is a message.'
I would use the second version since using sizeof(line2) avoids the overhead of strlen() scanning through each byte in the string looking for the end zero.
NOTE: As was pointed out in the comments, strlen() returns the number of characters up to the zero. “Hello” has a strlen() of 5. But sizeof() is the full array of characters including the 0 at the end, so “Hello” would have a sizeof() of 6.
char line[] = "1234567890";
printf ("strlen(line) = %zu\n", strlen(line));
printf ("sizeof(line) = %zu\n", sizeof(line));
strlen(line) = 10
sizeof(line) = 11
If you wanted them to be the same, it would be “sizeof(line)-1”.
It’s all fun and games until you pass a parameter…
This “benefit” of sizeof() is not useful if you are passing the string in to a function. It just ends up like a pointer to wherever the string is stored:
#include <stdio.h>
#include <stdlib.h> // for EXIT_SUCCESS
#include <string.h>
void function1 (char *line)
{
printf ("function1():\n");
printf ("sizeof(line) = %zu\n", sizeof(line));
printf ("sizeof(*line) = %zu\n", sizeof(*line));
printf ("strlen(line) = %zu\n", strlen(line));
}
void function2 (char line[])
{
printf ("function2():\n");
printf ("sizeof(line) = %zu\n", sizeof(line));
printf ("sizeof(*line) = %zu\n", sizeof(*line));
printf ("strlen(line) = %zu\n", strlen(line));
}
int main(void)
{
char *line1 = "1234567890";
printf ("Line 1: '%s'\n", line1);
function1 (line1);
function2 (line1);
printf ("\n");
char line2[] = "1234567890";
printf ("Line 2: '%s'\n", line2);
function1 (line2);
function2 (line2);
return EXIT_SUCCESS;
}
Above, I create a “*line” pointer to a string then pass it in to two functions. The first expects a *line as the parameter, and the second expects a line[].
Then I do a “line[]” array and pass it to the same two functions.
The results are the same:
Line 1: '1234567890'
function1():
sizeof(line) = 8
sizeof(*line) = 1
strlen(line) = 10
function2():
sizeof(line) = 8
sizeof(*line) = 1
strlen(line) = 10
Line 2: '1234567890'
function1():
sizeof(line) = 8
sizeof(*line) = 1
strlen(line) = 10
function2():
sizeof(line) = 8
sizeof(*line) = 1
strlen(line) = 10
And, if you use a “good compiler,” you may get a warning about doing sizeof() like this:
main.c: In function ‘function2’:
main.c:16:44: warning: ‘sizeof’ on array function parameter ‘line’ will return size of ‘char *’ [-Wsizeof-array-argument]
16 | printf ("sizeof(line) = %zu\n", sizeof(line));
Notice that warning was from function2(), and not from function1(). This is one difference in using “*line” versus “line[]” in the functions. For function1(), no warning is given:
void function1 (char *line)
{
printf ("function1():\n");
printf ("sizeof(line) = %zu\n", sizeof(line));
printf ("sizeof(*line) = %zu\n", sizeof(*line));
printf ("strlen(line) = %zu\n", strlen(line));
}
Since the function takes a “pointer to one or more chars”, doing sizeof() on that pointer makes sense: you get the size of the pointer, which is what you asked for. The “C gibberish” website says:
declare line as pointer to char
– https://cdecl.org/?q=char+*line
But for the second one…
void function2 (char line[])
{
printf ("function2():\n");
printf ("sizeof(line) = %zu\n", sizeof(line));
printf ("sizeof(*line) = %zu\n", sizeof(*line));
printf ("strlen(line) = %zu\n", strlen(line));
}
…a warning is given about the “sizeof(line)” because it cannot tell us the size of a line[] array. The array became a pointer to the character memory when it went into the function. The warning appears because the function parameter was written as “line[]”, which reads as:
declare line as array of char
– https://cdecl.org/?q=char+line%5B%5D
Doing sizeof() on an “array of char” is valid. But even though the parameter was declared as line[], what gets passed into the function is a pointer to the data, so the array size is lost. I guess this is one of those “I’m sorry, Dave. I’m afraid I can’t do that” moments ;-)
Is this useful? It certainly will let you use sizeof() instead of strlen() on a string if you have direct access to the string variable. But passing strings into functions? Not so much. (Or am I mistaken?)
But I do think I am going to try to go back to using “line[]” for my string declarations. I like retro.
Until next time…
No! Don’t do that with strncpy()! The length is for the maximum number of bytes to copy, which needs to be at most the size of the destination buffer, not the length of the source!
Also, if you hit the size to copy limit, the destination will not be null terminated. I always found this resulted in awkward code forcing a terminating null at the end of the destination after calling strncpy(), because you never knew if the destination has a terminating null.
Wait until you see my post on this “strncpy_s” thing I just learned about when I started using an MS compiler for the first time :-)
I am looking forward! Ah, yes, the _s functions taming the old Wild West style of development.
I have to research. I had never heard of it until I ran code through this compiler, and I have been #ifdef’ing code to take advantage of it. strncpy should use destination size, and pray the source has a NIL at the end, and then -1 is using sizeof, yes?
See also: https://subethasoftware.com/2016/01/13/c-strcat-strcpy-and-armageddon-part-1/
The strncpy() function isn’t for what you think it is. It was originally written to handle filenames for the original Unix file system. On it, directories were special files where each entry was 16 bytes in size, two bytes for the inode, and 14 bytes for the name, padded with NUL bytes. That is the exact use case strncpy() was written for, and it got stuck into the C89 standard.
A much better version of both strncpy() and strcat() is the snprintf() function. For a copy: snprintf(buffer,sizeof(buffer),"%s",src). For concatenation, snprintf(buffer,sizeof(buffer),"%s%s%s",s1,s2,s3). If you can’t use snprintf() for any reason (say, a crappy embedded C compiler that doesn’t support C99), you are probably better off writing your own versions of strncpy() and strcat() that take a length parameter, and will always NUL terminate the result. Just don’t call your custom functions strncpy() or strcat().
Wow, that’s some background on strncpy I did not know. Was strcpy designed for something else as well?
Using snprintf like that is interesting. On some of these systems I have, they provide strcpy but not strNcpy, and leave out a lot of the heavier functions. For instance, printf is missing many of the formatting features. “%x” works, but you may not have “%04x” to pad with leading zeros, for example.
I think I could wean myself into using snprintf with a bit of practice. I think I’ll use this as an excuse to write a blog post about this, and maybe it can be a New Trick for me. Thank you so much. Your comments are greatly appreciated. I’ve been learning from them!
You can use strncpy() if you have a structure with a fixed length character field that doesn’t need to be NUL terminated, but that’s about all it’s good for. And I, too, have come across limited versions of printf(); I’ve even written a few myself.
The “will not put a 0 at the end if it reaches max size” is something I do not think I knew, or if I did, I had forgotten.
char line[30];
strncpy (line, something, 30);
If no 0 is found by 30 in “something”, you get bad line. I must have known about this, because I found old code where I was doing two steps every time:
strncpy(dst, src, dstSize);
dst[dstSize - 1] = '\0';
My sizeof() experiments lately are missing this.
As for using char * vs. char [], I use the former for strings (usually written as char const *), and for passing in a buffer to be filled with a string, char [] and size_t. It’s all about signalling intent.
Help me understand. For constant strings, this:
const char *errMsg = "?SN ERROR";
But then you might use this to indicate “this holds an array of characters”:
char lineBuffer[80];
…
status = lineInput(lineBuffer, sizeof(lineBuffer));
…
bool lineInput(char line[], size_t lineSize)
{
}
…or something?
That said, what about non-string buffers:
uint8_t txBuf[128];
uint8_t rxBuf[128];
Same type of style, or do you use the [] method specifically for string/character buffers?
Yes. As for non-string buffers or arrays in general, yes I also pass them using a [] instead of a pointer. Again, signalling intent.
uint8_t txBuffer[256];
clearBuffer(txBuffer);
I can see it. “I am passing in an array” versus “it points somewhere, to something”.