A safer memcpy with very limited use cases

Here is a quick one… At my day job, I found lines of code like this:

memcpy(systemDataPtr->serialNumber, resp.serialNumber, 16);

A quick peek at systemDataPtr->serialNumber shows it defined as this:

unsigned char serialNumber[MAX_SERIAL_NUMBER_LENGTH];

…with that constant defined as:

#define MAX_SERIAL_NUMBER_LENGTH        16

So while 16 is correct, the use of hard-coded “magic numbers” (hat tip to a previous manager, Pete S., who introduced me to that term) is probably best to be avoided. Change that #define, and things could go horribly wrong with a memory overrun or massive nuclear explosion or something.

One simple fix is to use the #define in the memcpy:

memcpy(systemDataPtr->serialNumber, resp.serialNumber, MAX_SERIAL_NUMBER_LENGTH);

This, of course, assumes that resp.serialNumber is also 16. Let’s see:

char serialNumber[16];

Ah, magic number! In this case, it comes from a DLL header file that does not share that #define, and the header file for the DLL was made by someone who had never made a Windows DLL before (me) and did not make #defines for these various lengths.

What if the DLL value ever got out-of-sync? Worst case, not all data would be copied (only 16 bytes). That seems fine. But if the DLL value became smaller, like 10, then the memcpy would still copy 16 bytes, copying the 10 from the DLL buffer plus 6 bytes of data in memory after it — buffer overrun?

In this case, since the destination buffer can hold 16 bytes, and we only copy up 16 bytes, the worst case is we could get some unintended data in that buffer.

sizeof() exists for a reason.

One thing I tend to do is use sizeof() instead of hard-coded numbers or the #define, since it will continue to work if the source buffer ever got changed from using the #define:

memcpy(systemDataPtr->serialNumber, resp.serialNumber, sizeof(systemDataPtr->serialNumber));

But this still has the same issue if the source resp.serialNumber became larger.

A safer, and more ridiculous, memcpy

Naturally, I came up with a ridiculous “solution”: A safer memcpy() that is much more of a pain to use because you have to know the size of each buffer and tell it the size of each buffer so it can make sure not to copy something larger than will fit into the destination buffer.

Here is the prototype of memcpy():

void * memcpy ( void * destination, const void * source, size_t num );

It will blindly copy up to “num” bytes from “source” to “destination”. But a ridiculous safer memcpy might look like this:

void * memcpy_safer ( void * destination, size_t sizeOfDestination,
                      const void * source, size_t sizeOfSource,
                      size_t num );

Just think of the extra overhead to add two more parameters for every use! Plus, it is a longer function name so you get to type even more! Wonderful.

Here is a quick implementation:

void * memcpy_safer ( void * destination, size_t sizeOfDestination,
                      const void * source, size_t sizeOfSource,
                      size_t num )
{
    // Use whichever size is the smallest.
    if ((num > sizeOfDestination) || (num > sizeOfSource))
    {
        if (sizeOfDestination < sizeOfSource)
        {
            num = sizeOfDestination;
        }
        else
        {
            num = sizeOfSource;
        }
    }
    
    return memcpy ( destination, source, num);
}

Now that the function knows the size of source and destination, it can do a simple check to make sure to use whichever is smallest for the max number of bytes to copy — regardless (or irregardless, depending on which you prefer) of how many bytes you specified.

If both buffers are larger than the “num” passed in, num is used. But if “num” is larger than either buffer, it is reduced to match the largest of the two buffers.

Note that this does not do any NULL checks, so it is relying on the behavior of memcpy() which would likely be a crash if NULL is passed in. To improve, NULL checks could be added.

I decided to pass this function into Bing CoPilot and see what it “thought” about this:

Limitations & Risks

Silent truncation: If num is too large, the function silently reduces it. This can lead to subtle bugs if the caller expects all bytes to be copied.

No error signaling: There’s no return code or logging to indicate that truncation occurred.

Still not “safe” in all cases: If destination or source are NULL, or if the buffers overlap, memcpy may still cause undefined behavior.

No runtime buffer validation: It assumes the caller provides correct buffer sizes, which may not be guaranteed.

– Bing CoPilot

And, of course, the A.I. offered up these changes:

#include <string.h>
#include <stddef.h>
#include <stdio.h>

void * memcpy_safer ( void * destination, size_t sizeOfDestination,
                      const void * source, size_t sizeOfSource,
                      size_t num )
{
    if (!destination || !source) {
        fprintf(stderr, "Error: NULL pointer passed to memcpy_safer.\n");
        return NULL;
    }

    if (num > sizeOfDestination || num > sizeOfSource) {
        size_t adjusted = (sizeOfDestination < sizeOfSource) ? sizeOfDestination : sizeOfSource;
        fprintf(stderr, "Warning: Truncating copy from %zu to %zu bytes.\n", num, adjusted);
        num = adjusted;
    }

    return memcpy(destination, source, num);
}

That version adds NULL checks, returns a NULL if either buffer passed in was NULL, and adds prints to standard error if a NULL happens or if the value was truncated.

Not bad, predictive language model.

My ridiculous test program

Here is my test program, which I wrote using the Online GDB C compiler:

/******************************************************************************

Welcome to GDB Online.
  GDB online is an online compiler and debugger tool for C, C++, Python, PHP, Ruby, 
  C#, OCaml, VB, Perl, Swift, Prolog, Javascript, Pascal, COBOL, HTML, CSS, JS
  Code, Compile, Run and Debug online from anywhere in world.

*******************************************************************************/
#include <stdint.h> // for uint8_t
#include <stdio.h>  // for printf()
#include <stdlib.h> // for EXIT_SUCCESS
#include <string.h> // for memcpy()

/*---------------------------------------------------------------------------*/
// PROTOTYPES
/*---------------------------------------------------------------------------*/

void * memcpy_safer ( void * destination, size_t sizeOfDestination,
                      const void * source, size_t sizeOfSource,
                      size_t num );

void * memcpy_safer2 ( void * destination, size_t sizeOfDestination,
                       const void * source, size_t sizeOfSource,
                       size_t num );

void initializeBuffer (void *dataPtr, size_t dataSize, uint8_t value);

void dumpBuffer (const char* prefix, void *dataPtr, size_t dataSize);

/*---------------------------------------------------------------------------*/
// MAIN
/*---------------------------------------------------------------------------*/

int main()
{
    uint8_t smallerBuffer[10];
    uint8_t largerBuffer[15];
    
    // Test 1: copy longer buffer into smaller buffer.
    
    printf ("\nInitialized buffers:\n\n");    
    
    // Initialize buffers with something we can identify later.
    initializeBuffer (smallerBuffer, sizeof(smallerBuffer), 0x1);
    dumpBuffer ("smallerBuffer", smallerBuffer, sizeof(smallerBuffer));

    initializeBuffer (largerBuffer, sizeof(largerBuffer), 0x2);
    dumpBuffer ("largerBuffer ", largerBuffer, sizeof(largerBuffer));

    printf ("\nTest 1: Copying largerBuffer into smallerBuffer...\n\n");

    memcpy_safer (smallerBuffer, sizeof(smallerBuffer), largerBuffer, sizeof(largerBuffer), 42);

    dumpBuffer ("smallerBuffer", smallerBuffer, sizeof(smallerBuffer));

    // Test 2: copy smaller buffer into larger buffer.

    printf ("\nInitialized buffers:\n\n");

    // Initialize buffers with something we can identify later.
    initializeBuffer (smallerBuffer, sizeof(smallerBuffer), 0x1);
    dumpBuffer ("smallerBuffer", smallerBuffer, sizeof(smallerBuffer));

    initializeBuffer (largerBuffer, sizeof(largerBuffer), 0x2);
    dumpBuffer ("largerBuffer ", largerBuffer, sizeof(largerBuffer));

    printf ("\nTest 2: Copying smallerBuffer into largerBuffer...\n\n");

    memcpy_safer (largerBuffer, sizeof(largerBuffer), smallerBuffer, sizeof(smallerBuffer), 42);

    dumpBuffer ("largerBuffer ", largerBuffer, sizeof(largerBuffer));

    return EXIT_SUCCESS;
}


/*---------------------------------------------------------------------------*/
// FUNCTIONS
/*---------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*/
// My ridiculous "safer" memcpy.
/*---------------------------------------------------------------------------*/
void * memcpy_safer ( void * destination, size_t sizeOfDestination,
                      const void * source, size_t sizeOfSource,
                      size_t num )
{
    // Use whichever size is the smallest.
    if ((num > sizeOfDestination) || (num > sizeOfSource))
    {
        if (sizeOfDestination < sizeOfSource)
        {
            num = sizeOfDestination;
        }
        else
        {
            num = sizeOfSource;
        }
    }
    
    return memcpy ( destination, source, num);
}


/*---------------------------------------------------------------------------*/
// Bing CoPilot changes.
/*---------------------------------------------------------------------------*/
void * memcpy_safer2 ( void * destination, size_t sizeOfDestination,
                       const void * source, size_t sizeOfSource,
                       size_t num )
{
    if (!destination || !source) {
        fprintf(stderr, "Error: NULL pointer passed to memcpy_safer.\n");
        return NULL;
    }

    if (num > sizeOfDestination || num > sizeOfSource) {
        size_t adjusted = (sizeOfDestination < sizeOfSource) ? sizeOfDestination : sizeOfSource;
        fprintf(stderr, "Warning: Truncating copy from %zu to %zu bytes.\n", num, adjusted);
        num = adjusted;
    }

    return memcpy(destination, source, num);
}


/*---------------------------------------------------------------------------*/
// Utility function to initialize a buffer to a set value.
/*---------------------------------------------------------------------------*/
void initializeBuffer (void *dataPtr, size_t dataSize, uint8_t value)
{
    if (NULL != dataPtr)
    {
        memset (dataPtr, value, dataSize);
    }
}


/*---------------------------------------------------------------------------*/
// Utility function to dump bytes in a buffer, with an optional prefix.
/*---------------------------------------------------------------------------*/
void dumpBuffer (const char* prefix, void *dataPtr, size_t dataSize)
{
    if (NULL != dataPtr)
    {
        if (NULL != prefix)
        {
            printf ("%s: ", prefix);
        }

        for (size_t idx=0; idx<dataSize; idx++)
        {
            printf ("%02x ", ((uint8_t*)dataPtr)[idx]);
        }
        printf ("\n");
    }
}

// End of memcpy_safer.c

If you want to run it there, you can use this link:

https://onlinegdb.com/Eu7FToIcQ

But of course, I am not using this code. It is ridiculous and requires extra typing.

Besides, I know exactly what I am doing in C and never make any mistakes… Really.

Until next time…

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.