C strcat, strcpy and armageddon, part 5

See also: part 1part 2part 3 and part 4.

In our quest to find the safest (or at least a safer) way to copy or append strings, we have been discussing the C library functions strncpy() and strncat(). In the previous installment I mentioned there is a potential problem. That problem is with strlen(), which we are using for strncat() to determine how much data is already in the destination buffer:

// Create full name by starting with first name...
strncpy( fullName, firstName, FULLNAME_SIZE-1 );
fullName[FULLNAME_SIZE-1] = '\0';

// ...then appending a space...
strncat( fullName, " ", FULLNAME_SIZE-strlen(fullname)-1 );

// ...then appending the last name.
strncat( fullName, lastName, FULLNAME_SIZE-strlen(fullName)-1 );

We are assuming that the length of the data in the buffer is never greater than what the buffer can hold. But what if, somehow, the existing string was too large? If strlen(fullName) was larger than FULLNAME_SIZE, the math would break. Let’s say FULLNAME_SIZE is 20, but the string copied there was 25 (a buffer overrun):

FULLNAME_SIZE-strlen(fullName)-1

…is 20-25-1 which is -6 which means:

strncat( fullName, " ", -6 ); // error!

I would think that strncat() should fail because it is being passed a negative number for the count, but that third parameter is of type size_t, which, according to the cplusplus.com page, is:

Unsigned integral type
Alias of one of the fundamental unsigned integer types.It is a type able to represent the size of any object in bytes: size_t is the type returned by the sizeof operator and is widely used in the standard library to represent sizes and counts.

Unsigned numbers can only be positive (0 and up), while signed numbers can be negative or positive. If you pass in a -1 as an unsigned value, it will actually look like a huge positive number. (For signed numbers, one of the bits is used to indicate if it is positive or negative. If you use an 8-bit value, that can represent the unsigned values of 0-255, or the signed values of -128 to 127).

Thus, because of bad math, strncat() will operate as it was instructed, thinking the limit is some huge number.

Gotcha.

What we really need is a version or strlen() that stops counting when it his a maximuim size (like a strnlen() call). If we did, we could also give it the limit of our buffer:

FULLNAME_SIZE-strnlen(fullName, FULLNAME_SIZE)-1

That way, strnlen() would never return anything greater than FULLNAME_SIZE, so if it didn’t find a null (0) terminator before that character, it would just return the max. Thus, the above example would be 20-20-1 which is… -1.

Crap.

We would have to treat strnlen() the same way we do the other calls, and subtract one from it (a 40 character max buffer would return 39, always leaving room for an extra null terminator):

FULLNAME_SIZE-strnlen(fullName, FULLNAME_SIZE-1)-1

That would give us 20-19-1 which is 0, and passing in a 0 to strncat() should fail, not appending anything.

Problem solved.

Unfortunately, the ANSI-C standard has no such function (though some do offer it as a non-standard function). This means to be safe in these situations, you would have to create our own strnlen() function. A simple wrapper to the existing strnlen() should work:

// Return the length of the string, or num,
// whichever is smaller.
size_t strnlen( const char * str, size_t num )
{
  size_t len;

  len = strlen(str);

  // If string len is longer than we want...
  if (len > num)
  {
    // ...limit the len to be the max num.
    len = num;
  }

  // Return actual len, or max len.
  return len;
}

Problem actually solved.

This may seem to be a very unlikely error, but maybe you have a function written to deal with a buffer up up to 40 characters, but the caller was creating one that could hold up to 60. They might have a string there longer than you expect, so when you go to append (thinking 40 is the max) there might already be a longer string there.

This type of problem can be quite common when using code from other libraries or projects.

Or, perhaps someone properly used the buffer size, but used strncpy() and it filled it up without a null terminator to stop strlen(). Ah! This seems like a much more common issue.

Pity my proposed strnlen() is still potentially inefficient (and could even cause a crash). Do you see why? I guess we will have to fix that, too.

In the next part, I will summarize all of this on one simple “better practices” page for dealing with string copies or appends.

See you then…

4 thoughts on “C strcat, strcpy and armageddon, part 5

    1. allenhuffman

      I will look at this, since the next part is a final summary of suggestions. I was hoping to get your attention with what’s coming next. There are two parts showing replacement functions I wrote in a rather simple manner, and I would love you to see if you had some ways to make them more efficient. (Lots of this[old]=that[new] type stuff which can be done better with pointers and such, I think.)

      Reply
    2. allenhuffman

      Question, James… When I do the strncat(), I am doing the strlen() each time. Is that what you meant? Or am I missing something?

      In the next article, I roll my own version of strncat() that looks at the destination buffer size for me, and also makes sure it’s internal strlen() is limited so it won’t walk through memory endlessly looking for a zero if it does not find one. I kinda have an strncpynull() (same but always null terminates), strncatdst() (looks at destination buffer for limit) and an strnlen().

      Reply
  1. Pingback: C strcat, strcpy and armageddon, part 6 | Sub-Etha Software

Leave a Reply