Category Archives: C Programming

char versus C versus C#

Updates:

2020-07-16 – Added a few additional notes.

I am mostly a simple C programmer, but I do touch a bit of C# at my day job.

If you don’t think about what is going on behind the scenes, languages like Java and C# are quite fun to work with. For instance, if I was pulling bytes out of a buffer in C, I’d have to write all the code manually. For example:

Buffer: [ A | B | C | D | D | E | E | E | E ]

Let’s say I wanted to pull three one-byte values out of the buffer (A, B and C), followed by a two-byte value (DD), and a four byte value (EEEE). There are many ways to do this, but one lazy way (which breaks if the data is written on a system with different endianness to how data is stored) might be:

#include <stdint.h>

uint8_t a, b, c;
uint16_t d;
uint32_t e;

a = Buffer[0];
b = Buffer[1];
c = Buffer[2];
memcpy (&d, &Buffer[3], 2);
memcpy (&e, &Buffer[5], 4);

There is much to critique about this example, but this post is not about being “safe” or “portable.” It is just an example.

In C#, I assume there are many ways to achieve this, but the one I was exposed to used a BitConverter class that can pull bytes from a buffer (array of bytes) and load them in to a variable. I think it would look something like this:

UInt8 a, b, c;
UInt16 d;
UInt32 e;

a = Buffer[0];
b = Buffer[1];
c = Buffer[2];
d = BitConverter.ToInt16(Buffer, 3);
e = BitConverter.ToInt32(Buffer, 5);

…or something like that. I found this code in something new I am working on. It makes sense, but I wondered why some bytes were being copied directly (a, b and c) and others went through BitConverter. Does BitConverter not have a byte copy?

I checked the BitConverter page and saw there was no ToUInt8 method, but there was a ToChar method. In C, “char” is signed, representing -127 to 128. If we wanted a byte, we’d really want an “unsigned char” (0-255), and I did not see a ToUChar method. Was that why the author did not use it?

Here’s where I learned something new…

The description of ToChar says it “Returns a Unicode character converted from two bytes“.

Two bytes? Unicode can represent more characters than normal 8-bit ASCII, so it looks like a C# char is not the same as a C char. Indeed, checking the Char page confirms it is a 16-bit value.

I’m glad I read the fine manual before trying to “fix” this code like this:

Char a, b, c;
UInt16 d;
UInt32 e;

// The ToChar will not work as intended!
a = BitConverter.ToChar(Buffer, 0); //Buffer[0];
b = BitConverter.ToChar(Buffer, 1); //Buffer[1];
c = BitConverter.ToChar(Buffer, 2); //Buffer[2];
d = BitConverter.ToInt16(Buffer, 3);
e = BitConverter.ToInt32(Buffer, 5);

For a C programmer experimenting in C# code, that might look fine, but had I just tried it without reading the manual first, I’d have been puzzled why it was not copying the values I expected from the buffer.

Making assumptions about languages, even simple things like a data type “char”, can cause problems.

Thank you, manual.

Researching 8-bit floating point.

6 Replies

Recently, I ran in to a situation where a floating point value (represent current) was being converted to a byte value before being sent off in a status message. Thus, any calculations on the other side were being done with whole values (1, 2, 42, etc.). I was asked if the precision could be increased (adding a decimal place) and realized “you can’t get there from here.”

This made me wonder how much could be done with an 8-bit floating point representation. A quick web search led me to this article:

http://www.toves.org/books/float/

I also found a few others discussion other methods of representing a floating point value with only 8-bits.

Now I kinda want to code up such a routine in C and do some tests to see if it would be better than our round-to-whole-number approach.

Has anyone reading this already done this? I think it would be a fun way to learn more about how floating point representation (sign, mantissa, exponent) works.

But it doesn’t seem very useful.

C, printf, and pointers.

3 Replies

I learned a new C thing today.

But first, an old C thing.

When you write “standard C” and want to print a value using printf, you are at the mercy of standard C data types. For example, “%d” is a “Signed decimal integer” and “%u” is an “Unsigned decimal integer.”

There apparently mean the data types “int” and “unsigned int”.

int x;
printf ("x is %d\n", x);

unsinged int y;
printf ("y is %u\n", y);

At my day job, I am using a PIC24 processor, which is a 16-bit chip and is Harvard Architecture, similar to an Arduino UNO processor. When I try to write generic code on a PC using GCC (64-bit compiler), I run in to things like this:

char *ptr = 0x1234;
printf ("ptr is 0x%x\n", ptr);

%x expects an “Unsigned decimal value”. ptr is NOT a “Unsigned decimal value” that %x expects, so it will give a warning like:

warning: initialization of 'char *' from 'int' makes pointer from integer without a cast [-Wint-conversion]

warning: format '%x' expects argument of type 'unsigned int', but argument 2 has type 'char *' [-Wformat=]

This seems like a reasonable warning. In the past, I’ve cast the pointer to an unsigned int like this:

char *ptr = 0x1234;
printf ("ptr is 0x%x\n", (unsigned int)ptr);

A 16-bit compiler may be just fine with this, but it would generate warnings on a 32 or 64-bit compiler because the “int” is a “long” or “long long” to them. Thus, different casting is needed.

That makes code non-portable, because printing a long int (“%lu”) expects different things on different architectures. In other words, I can cast my code to compile cleanly on the 16-bit system, but then I get warnings on the 64-bit system. Or vise versa.

…but that’s not what this post is about. Instead, it’s about why I was casting in the first place — to print pointers.

I was unaware that a “%p” had been added specifically for printing pointers. Unfortunately, it was not supported in the 16-bit PIC compiler I am using, so that won’t work.

inttypes.h

Instead, I found a stack overflow post that introduced me to “inttypes.h” and uintptr_t. I was unaware of this and see the casting lets me do something like this:

printf ("Dump (0x%x, %u)\r\n", (uintptr_t)ptr, size);

This will cast the pointer to an unsigned integer value which can then be printed!

On some systems.

There’s still the problem with a %u expecting an “int” but the value might be a “long int” (if it’s a 32-bit or 64-bit pointer).

PRIxWHAT?

It looks like someone thought of this, and my original “non portable printf” casting issue.

I see this file also contains #defines for various printf outputting types such as PRIXPTR, PRIX, PRIU8, and lowercase versions where it makes sense like PRIxPTR. These turn in to quoted strings like “d” or “X” or whatever. Meaning, you could use them to get the output type that matches what you want on the compiler you are using, rather than hard-coding %d.

uint32_t x = 42;
printf ("This is my value: %" PRIu32 "\n", x);

Above, PRIu32 turns in to the appropriate %u for printing a 32-bit value. This might be %u on one system, or %lu on another (depending on the size of “int” on the architecture).

Neat!

That lets me printfs be portable, IF the compiler supports these defines.

…as long as your compiler supports it, that is.

Fortunately, my PIC compiler does, so from now on I’ll stop hard-coding %x when printing pointers, and using those defines when printing stdint types like uint32_t:

void Dump(void *ptr, unsigned int size)
 {
     printf ("Dump (0x%"PRIXPTR", %u)\r\n", (uintptr_t)ptr, size);

At least, I think I will.

Until next time…

warning: comparing floating point with == or != is unsafe [-Wfloat-equal] – Part 2

Leave a reply

Previously, I started discussing C compiler warnings, with specific intent to discuss this one:

warning: comparing floating point with == or != is unsafe [-Wfloat-equal]

I presented a simple question — what would this print?

int main()
{
    float var1;

    var1 = 902.1;

    if (var1 == 902.1)
    {
        printf ("it is!\n");
    }
    else
    {
        printf ("it is NOT.\n");
    }

    return EXIT_SUCCESS;
}

On my Windows 10 computer running GCC (via the free Code:Blocks editor), I get see “it is NOT.” This goes back to an earlier article I wrote about how 902.1 cannot be represented with a 32-bit floating point. In that article, I tested setting a “float” and a “double” to 902.1 and printed the results:

float 902.1 = 902.099976
double 902.1 = 902.100000

As you can see, a 64-bit double precision floating point value can accurately display 902.1, while a 32-bit floating point cannot. The key here is that in C a floating point number defaults to being double precision, so when you say:

var = 123.456;

…you are assigning 123.456 (a double) to var (a float).

float var1;

var1 = 902.1;

var1 (a float) was being assigned 902.1 (a double) and being truncated. There is a potential loss of precision that a compiler warning could tell you about.

If, with warnings enabled, comparing a floating point using == (equal) or != (not equal) is considered “unsafe,” how do we make it safe?

To be continued…

warning: comparing floating point with == or != is unsafe [-Wfloat-equal]

7 Replies

Ah, compiler warnings. We hate you, yet we need you.

Recently at my day job, I noticed we were running with a very low C compiler warning level. I decided to crank it up a bit then rebuild.

It was a disasters. Thousands of compiler warnings scrolled by as it churned away for about five minutes. Many were coming from header files provided by the tool company! Things like windows.h and even things they included from standard ANSI C headers like stdint.h. What a mess!

I was able to figure out how to disable certain warnings to clean up the ones I could not fix, leaving me with a ton of legitimate warnings in our own code.

Some were harmless (if you knew what you were doing), like comparing signed to unsigned values, which I’ve previously discussed. Some complained about missing or incorrect prototypes.

A favorite useful one is when it complains about initialized variables that could be accessed, such as:

int variable;

switch (something)
{
   case 1:
      variable = 10;
      break;

   default:
      break;
}

printd ("variable: %d\n", variable);

Above, if “something” did not match one of the case statements (1 in this example), it would never be set, and the print would be using whatever random crap was in the space occupied by “variable” later. This is a nice warning to have.

Others were about unused variables or unused parameters. For example:

int function (int notUsed)
{
     return 42;
}

Compiler warnings can notify you that you have a variable declared that isn’t getting used. You can tell the compiler “Yeah, I know” by doing this:

int function (int notUsed)
{
    (void)notUsed; // Not used!
    return 42;
}

Each time you clean up a warning like that, you tell the compiler (and others who inspect the code) “I meant to do that!” And, sometimes you find actual bugs that were hidden before.

I’ve written a bit about floating point issues in the past, but the warning today is this one:

warning: comparing floating point with == or != is unsafe [-Wfloat-equal]

Basically, while you can do…

int a;
int b;

if (a == b) { ...something... };

…you cannot…

float a;
float b;

if (a == b) { ...something...}

…at least, you can’t without getting a warning. Due to how floating point works, simple code may fail in interesting ways:

int main()
{
    float var1;

    var1 = 902.1;

    if (var1 == 902.1)
    {
        printf ("it is!\n");
    }
    else
    {
        printf ("it is NOT.\n");
    }

    return EXIT_SUCCESS;
}

What do you think that will print?

To be continued…

Can you initialize a static linked list in C?

5 Replies

Updates:

2020-04-30 – Added missing semicolon in code and updated example.

NOTE: This article was originally written a few years ago, so some references may be out of date.

I have been enjoying working on the SirSound project the past month, as well as some fun challenges at my day job. Every now and then I run into something I’d like to do that is not doable in C, or doable but not proper to do in C, or … maybe doable. It’s sometimes difficult for me to find answers when I do not know how to ask the question.

With that said, I have a C question for anyone who might be able to answer it.

Using my multi-track music sequencer as an example, consider representing data like this:

[sequence]
  [track]
    [note /]
    [note /]
    [note /]
  [/track]
  [track]
    [note /]
    [note /]
    [note /]
  [/track]
[/sequence]

It’s easy to create a sequence like this in C. Here’s some pseudo code:

track1 = { C, C, C, C };
track2 = { E, E, E, E };
sequence1 = { track1, track 2};

I thought there might be a clever way to do all of this with one initializer. If I treat the data like nested XML (like the first example), I thought it might be possible to do something like this:

typedef struct SentenceStruct SentenceStruct;

struct SentenceStruct
{
  char           *Word;
  SentenceStruct *Next;
};

Something like this allows me to represent that tree of data very easily, and I find many examples of building things like this in C code:

int main()
{
   SentenceStruct Word1;
   SentenceStruct Word2;
   SentenceStruct Word3;

   Word1.Word = "sequence";
   Word1.Next = &Word2;

   Word2.Word = "track";
   Word2.Next = &Word3;

   Word3.Word = "note1";
   Word3.Next = NULL;

   SentenceStruct *Word;

   Word = &Word1;

   while( Word != NULL )
   {
      printf("%s ", Word->Word);
      if (Word->Next == NULL) break;
      Word = Word->Next;
   }
   printf("\n");

   return EXIT_SUCCESS;
}

This creates a single linked list of words, then prints them out.

I thought it might be possible to initialize the whole thing, like this:

 SentenceStruct sentence =
 {
   { "kitchen", { "light", { "ceiling", { "on", NULL }}}};
 }

…though the address of “light” is quite different than the address of a structure which contains a pointer that should point to “light”.

I was going to make each one a pointer to an array of words, so I could have a tree of words like the earlier example (with kitchen/light and kitchen/fan):

typedef struct SentenceStruct SentenceStruct;

struct SentenceStruct
{
  char *Word;
  SentenceStruct *Next[];
}

Does anyone know how (or if) this can be done in C?

Thoughts?

Floating point and 902.1 follow-up

1 Reply

Yesterday, I wrote a short bit about how I wasted a work day trying to figure out why we would tell our hardware to go to “902.1 Mhz” but it would not.

The root cause was a limitation in the floating point used in the C program I was working with. A float cannot represent every value, and it turns out values we were using were some of them. I showed a sample like this:

#include <stdio.h>
#include <stdlib.h>
 
int main()
{
   float f = 902.1;
   printf ("float 902.1  = %f\r\n", f);
 
   double d = 902.1;
   printf ("double 902.1 = %f\r\n", d);
 
   return EXIT_SUCCESS;
}

The float representation of 902.1 would display as 902.099976, and this caused a problem when we multiplied it by one million to turn it into hertz and send it to the hardware.

I played WHAC-A-MOLE trying to find all that places in the code that needed to change floats to doubles, and then played more WHAC-A-MOLE to update the user interface to change fields that use floats to use doubles…

Then, I realized we don’t actually care about precision past one decimal point. Here is the workaround I decided to go with, which means I can stop whacking moles.

float MHz;
uint32_t Hertz;
uint32_t Hertz2;
for (MHz=902.0; MHz < 902.9; MHz+=.1)
{
    Hertz = (MHz * 1000000);

    // Round to nearest 10000th of Hertz.
    Hertz2 = (((Hertz + 50000) / 100000)) * 100000;

    printf ("%f * 1000000 = %u -> %u\n", MHz, Hertz, Hertz2);
}

I kept the existing conversion (which would be off by quite a bit after being multiplied by one million) and then did a quick-and-dirty hack to round that Hertz value to the closest decimal point.

Here are the results, showing the MHz float, the converted Hertz, and the new rounded Hertz we actually use:

902.000000 * 1000000 = 902000000 -> 902000000
902.099976 * 1000000 = 902099975 -> 902100000
902.199951 * 1000000 = 902199951 -> 902200000
902.299927 * 1000000 = 902299926 -> 902300000
902.399902 * 1000000 = 902399902 -> 902400000
902.499878 * 1000000 = 902499877 -> 902500000
902.599854 * 1000000 = 902599853 -> 902600000
902.699829 * 1000000 = 902699829 -> 902700000
902.799805 * 1000000 = 902799804 -> 902800000
902.899780 * 1000000 = 902899780 -> 902900000

There. Much nicer. It’s not perfect, but it’s better.

Now I can get back to my real project.

Until next time…

Floating point and 902.1

5 Replies

I wasted most of my work day trying to figure out why some hardware was not going to the proper frequency. In my case, it looked fine going to 902.0 mHz, but failed on various odd values such as 902.1 mHz, 902.3 mHz, etc. But, I was told, there was an internal “frequency sweep” function that went through all those frequencies and it worked fine.

I finally ended up looking at the difference between what our host system sent (“Go to frequency X”) and what the internal function was doing (“Scan from frequency X to frequency Y”).

Then I saw it.

The frequency was being represented in hertz as a large 32-bit value, such as 902000000 for 902 mHz, or 902100000 for 902.1 mHz. The host program was taking its 902.1 floating point value and converting it to a 32-bit integer by multiplying that by 1,000,000… which resulted it it sending 902099975… which was then fed into some formula and resulted in enough drift due to being slightly off that the end results was also off.

902099975 is not what I expected from “multiply 902.1 by 1,000,000”.

I keep forgetting how bad floating point is. Try this:

#include <stdio.h>
#include <stdlib.h>
int main()
{
   float f = 902.1;
   printf ("float 902.1  = %f\r\n", f);
   double d = 902.1;
   printf ("double 902.1 = %f\r\n", d);
   return EXIT_SUCCESS;
}

It prints:

float 902.1 = 902.099976
double 902.1 = 902.100000

A double precision floating point can correctly represent 902.1, but a single precision float cannot.

The Windows GUI was correctly showing “902.1”, though, probably because it was taking the actual value and rounding it to one decimal place. Thank you, GUI, for hiding the problem.

I guess now I have to go through and change all those floats to doubles so the user gets what the user wants.

Until next time…

C compiler bugs that make you go hmmm…

Leave a reply

There is a PIC compiler from CCS that I use at work. Every now and then we run across a bug (most of which get quickly fixed by CCS). The latest is a rather bizarre issue where strings are getting mixed up.

Attempting to do something like this:

printf("Value %d", value);

…created output like:

XXXXXX42

…because somewhere else in the code was this:

printf("XXXXXXXXXXXXX");

The compiler was using the correct format string (to know where to put the %d variable on output) but was displaying the string from a different bit of code.

And that other bit of code was unused (not being called at all).

This may be a compiler optimizer bug, but I found it amusing. I haven’t seen a mixed up printf() issue before.

Until next time…

Converting two 8-bit values to one 16-bit value in C

Leave a reply

This is a follow-up to an earlier article I wrote about converting 8-bit values to 16-bit values.

There are several common ways to convert two byte *8-bit) values into one word (16-bit) value in C. Here are the ones I see often:

Math

This is how we learned to do it in BASIC. “C = A * 256 + B”:

uint8_t byte1;
uint8_t byte2;
uint16_t word;

byte1 = 0x12;
byte2 = 0x32;

word = byte1 * 256 + byte2;

Bit Shifting

This way generally makes less code, making use of the processor’s ability to bit shift and perform an OR.

word = (byte1 << 8) | byte2;

Unions

union {
   // First union element contains two bytes.
   struct {
      uint8_t byte1;
      uint8_t byte2;
   } Bytes;
   // Second union element is a 16-bit value.
   uint16_t word;
} MyUnion;

MyUnion.Bytes.byte1 = byte1;
MyUnion.Bytes.byte2 = byte2;

That one looks like much more source code but it may generate the smallest amount of executable instructions.

Array Access

And here was one I found at a previous job, which I hadn’t seen before, and haven’t seen since:

uint8_t  bytes[2];
uint16_t value;

value = 0x1234;

bytes[0] = *((uint8_t*)&(value)+1); //high byte (0x12)
bytes[1] = *((uint8_t*)&(value)+0); //low byte  (0x34)

That example is backwards from the topic of this post, but hopefully you can see the approach.

The question today is … which would you choose?

Clean Code would say writing it out in the simplest form is the best, since it would take less time for someone to understand what it was doing. But if you needed every last byte of code space, or every last clock cycle, you might want to do something else.

Which do you think is smallest?

Which do you think is fastest?

Which do you think is easiest to understand?

Comments appreciated.

Until next time…

Sub-Etha Software

"In Support of the CoCo and OS-9 since 1990!"

Category Archives: C Programming

char versus C versus C#

Researching 8-bit floating point.

C, printf, and pointers.

inttypes.h

PRIxWHAT?

warning: comparing floating point with == or != is unsafe [-Wfloat-equal] – Part 2

warning: comparing floating point with == or != is unsafe [-Wfloat-equal]

Can you initialize a static linked list in C?

Floating point and 902.1 follow-up

Floating point and 902.1

C compiler bugs that make you go hmmm…

Converting two 8-bit values to one 16-bit value in C

Math

Bit Shifting

Unions

Array Access