This came up at my day job when two programmers were trying to get a block of data to be the size they both expected. Consider this example:
typedef struct
{
uint8_t byte1; // 1
uint16_t word1; // 2
uint8_t byte2; // 1
uint16_t word2; // 2
uint8_t byte3; // 1
// 7 bytes
} MyStruct1;
The above structure represents three 8-bit byte values and two 16-bit word values for a total of 7 bytes.
However, if you compile this code with GCC on Windows and print the sizeof() of that structure, you will see it returns 10:
sizeof(MyStruct1) = 10
This is due to the compiler inserting padding bytes so that each value starts on a 16-bit boundary.
The expected data storage in memory feels like it should be:
[..|..|..|..|..|..|..] = 7 bytes
| | | | | | |
| | | | \ / byte3
| | | | word2
| \ / byte2
| word1
byte1
But, using GCC on a Windows 10 machine shows that each value is stored on a 16-bit boundary, leaving unused padding bytes after the 8-bit values:
[..|xx|..|..|..|xx|..|..|..|xx] = 10 bytes
| | | | | | |
| | | | \ / byte3
| | | | word2
| \ / byte2
| word1
byte1
As you can see, three extra bytes were added to the “blob” of memory that contains this structure. This is done so each element starts on an even-byte address (0, 2, 4, etc.). Some processors require this, but if you were using one that allowed odd-byte access, you would likely get a sizeof() of 7.
Do not rely on processor architecture
To create portable C, you must not rely on how things happen to work in your environment. The same code can (and will) produce different results in a different environment.
See also: sizeof() matters, where I demonstrated a simple example of using “int” and how it was quite different on a 16-bit Arduino versus a 32/64-bit PC.
Make it smaller
One easy way to reduce wasted memory in structures is to group the 8-bit values together. Using the earlier structure example, simply changing the order of the values reduces the amount of memory it uses:
typedef struct
{
uint8_t byte1; // 1
uint8_t byte2; // 1
uint8_t byte3; // 1
uint16_t word1; // 2
uint16_t word2; // 2
// 7 bytes
} MyStruct2;
On a Windows 10 GCC compiler, this will produce:
sizeof(MyStruct2) = 8
It is still not the 7 bytes we might expect, but at least the waste is less. In memory, it looks like this:
[..|..|..|xx|..|..|..|..] = 8 bytes
| | | | | \ /
| | | \ / word2
| | | word1
| | byte3
| byte2
byte1
You can see an extra byte of padding being added after the third 8-bit value. Just out of curiosity, I moved the third byte to the end of the structure like this:
typedef struct
{
uint8_t byte1; // 1
uint8_t byte2; // 1
uint16_t word1; // 2
uint16_t word2; // 2
uint8_t byte3; // 1
// 7 bytes
} MyStruct3;
…but that also produced 8. In this case the compiler adds an extra byte of padding at the end. That trailing padding makes the structure’s size a multiple of its strictest alignment requirement, which matters when you put these structures in an array: without it, the 16-bit members of every other element would land on odd addresses.
[..|..|..|..|..|..|..|xx] = 8 bytes
| | | | | | |
| | | | \ / byte3
| | \ / word2
| | word1
| byte2
byte1
Because you cannot know how a structure ends up in memory without knowing how the compiler works, it is best not to rely on or expect a structure to be “packed” with all the bytes laid out exactly as written in the code. You also cannot expect the memory usage to be just the sum of the values the structure contains.
I do frequently see programmers attempt to massage the structure by adding in padding values, such as:
typedef struct
{
uint8_t byte1; // 1
uint8_t padding1; // 1
uint16_t word1; // 2
uint8_t byte2; // 1
uint8_t padding2; // 1
uint16_t word2; // 2
uint8_t byte3; // 1
uint8_t padding3; // 1
// 10 bytes
} MyPaddedStruct1;
At least on a system that aligns values to 16-bits, the structure now matches what we actually get. But what if you used a processor where everything was aligned to 32-bits?
It is always best to not assume. Code written for an Arduino one day (with 16-bit integers) may be ported to a 32-bit Raspberry Pi Pico at some point, and not work as intended.
Here’s some sample code to try. You would have to change the printfs to Serial.println() and change how it prints the sizeof() values, but then you could see what it does on a 16-bit Arduino UNO versus a 32-bit PC or other system.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
typedef struct
{
uint8_t byte1; // 1
uint16_t word1; // 2
uint8_t byte2; // 1
uint16_t word2; // 2
uint8_t byte3; // 1
// 7 bytes
} MyStruct1;
typedef struct
{
uint8_t byte1; // 1
uint8_t byte2; // 1
uint8_t byte3; // 1
uint16_t word1; // 2
uint16_t word2; // 2
// 7 bytes
} MyStruct2;
typedef struct
{
uint8_t byte1; // 1
uint8_t byte2; // 1
uint16_t word1; // 2
uint16_t word2; // 2
uint8_t byte3; // 1
// 7 bytes
} MyStruct3;
int main()
{
printf ("sizeof(MyStruct1) = %u\n", (unsigned int)sizeof(MyStruct1));
printf ("sizeof(MyStruct2) = %u\n", (unsigned int)sizeof(MyStruct2));
printf ("sizeof(MyStruct3) = %u\n", (unsigned int)sizeof(MyStruct3));
return EXIT_SUCCESS;
}
Until next time…