Explain the specifics of typecasting between signed and unsigned numbers in C, what pitfalls arise, how to avoid unforeseen consequences in arithmetic and comparison of signed/unsigned types, and how this affects program portability?

Answer

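In C, implicit conversion between operands of different types follows the "usual arithmetic conversions". When a signed and an unsigned integer meet in one expression, the following rules apply (after integer promotion of types narrower than int):

  • If the unsigned operand's type has the same or higher rank than the signed operand's type (for example, int and unsigned int), the signed operand is converted to the unsigned type.
  • A negative value converted this way wraps around modulo 2^N and becomes a large positive number, which silently changes the result of comparisons and arithmetic.
  • The sizes of the types also matter: if the signed type is wider and can represent every value of the unsigned type (for example, a 64-bit long long and a 32-bit unsigned int), the unsigned operand is converted to the signed type instead. If it cannot, both operands are converted to the unsigned counterpart of the signed type.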
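The effect of operand rank and size can be checked with a short sketch. The function names are hypothetical and exist only for illustration, and the results noted in the comments assume the common case where int and unsigned int are 32 bits, short is 16 bits, and long long is 64 bits.

```c
/* Same rank: int vs unsigned int. The int operand is converted to
   unsigned int, so -1 becomes UINT_MAX and the comparison yields 0. */
int same_rank_less(void)
{
    int a = -1;
    unsigned int b = 1u;
    return a < b;
}

/* Wider signed type: long long vs unsigned int. Assuming long long can
   hold every unsigned int value, b is converted to long long instead,
   and the comparison is mathematically correct: it yields 1. */
int wider_signed_less(void)
{
    long long a = -1;
    unsigned int b = 1u;
    return a < b;
}

/* Narrow types: both operands are first promoted to int (assuming int
   can represent every unsigned short value), so nothing wraps: yields 1. */
int promoted_less(void)
{
    short a = -1;
    unsigned short b = 1;
    return a < b;
}
```

Compilers typically flag the first comparison when warnings such as GCC's or Clang's -Wsign-compare are enabled, which is a cheap way to find these cases in existing code.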

Example of dangerous arithmetic:

int a = -1; // signed
unsigned int b = 1;
printf("%d\n", a < b); // prints 0: the comparison is always false, as a is converted to a very large unsigned value

The result: when converted to unsigned int, -1 becomes UINT_MAX (4294967295 if unsigned int is 32 bits), so a < b evaluates to 0.

What to remember:

  • Always explicitly cast types if there is potential confusion with signs.
  • Pay attention to the sizes of types (int, long, uint32_t, etc.) to ensure predictable conversions.
  • Separate the logic of handling signed and unsigned variables, especially in boundary checks and arithmetic.
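One way to apply this advice is to handle the sign explicitly before comparing. The following is a minimal sketch; int_less_than_uint is a hypothetical helper name, not a standard function.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when the signed value s is
   mathematically less than the unsigned value u. */
bool int_less_than_uint(int s, unsigned int u)
{
    /* A negative number is smaller than any unsigned value. */
    if (s < 0) {
        return true;
    }
    /* s is non-negative here, so the cast to unsigned preserves its value. */
    return (unsigned int)s < u;
}
```

With this helper, int_less_than_uint(-1, 1u) is true, whereas the bare expression -1 < 1u is false.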

Trick question

Question: What result will the expression (int)(unsigned)-1 return?

Expected incorrect answer: "-1, since -1 is cast to int."

Correct answer: The two casts are not symmetric. (unsigned)-1 is well-defined: -1 is reduced modulo UINT_MAX + 1, giving UINT_MAX (0xFFFFFFFF when unsigned int is 32 bits). Converting that value back to int is different: UINT_MAX does not fit in int, so the result is implementation-defined (or an implementation-defined signal is raised). On two's complement implementations, which covers practically all mainstream compilers, the result is -1 again, but the C standard does not guarantee it.

Example:

int x = (int)(unsigned)-1; // x == -1 on most platforms
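When the round trip must not rely on implementation-defined behavior, the range can be checked before converting back. This is a sketch only; uint_to_int_or and its fallback parameter are hypothetical names chosen for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Well-defined on every implementation: -1 is reduced modulo UINT_MAX + 1. */
bool minus_one_is_uint_max(void)
{
    return (unsigned int)-1 == UINT_MAX;
}

/* Hypothetical helper: converts u to int when it fits, otherwise returns
   the caller-supplied fallback. The out-of-range cast is never executed. */
int uint_to_int_or(unsigned int u, int fallback)
{
    return (u <= (unsigned int)INT_MAX) ? (int)u : fallback;
}
```

For example, uint_to_int_or(42u, -1) returns 42, while uint_to_int_or(UINT_MAX, -1) returns the fallback instead of depending on how the compiler converts an out-of-range value.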

Examples of real errors caused by overlooking these nuances


Story

A string handler validated a size: if the string length was negative, the program was supposed to report an error. However, the length was of type size_t (unsigned), so the check if (length < 0) was always false. The error path never ran, which led to an infinite loop and a memory overflow.
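This class of bug can be avoided by testing the sign while the value is still signed, and by writing countdown loops so that they do not depend on an unsigned index going below zero. The code below is an illustrative sketch, not the code from the story; both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical validator: raw_len comes from a signed source (for
   example, a function that returns a negative value on error). The sign
   is checked BEFORE any unsigned comparison takes place. */
bool signed_length_is_valid(long raw_len, size_t max_len)
{
    if (raw_len < 0) {
        return false;                      /* error value: reject */
    }
    /* raw_len is non-negative, so the cast preserves its value. */
    return (unsigned long)raw_len <= max_len;
}

/* Visits indices n-1 down to 0 and returns how many were visited.
   Writing `for (size_t i = n - 1; i >= 0; i--)` would never terminate,
   because i >= 0 is always true for an unsigned type. */
size_t count_backwards(size_t n)
{
    size_t visited = 0;
    for (size_t i = n; i-- > 0; ) {
        visited++;
    }
    return visited;
}
```

The loop form `i-- > 0` tests the index before decrementing it, so it also terminates immediately when n is 0.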


Story

When parsing protocols, the fields in network packets were unsigned, while the local variables that held them were signed. For certain values, mixing the two caused wraparound in the packet length calculation, and the incorrect length resulted in a buffer overflow vulnerability.
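A length check of this kind can be kept entirely in unsigned arithmetic, with each subtraction guarded so that it cannot wrap. The sketch below assumes a made-up packet layout: HEADER_SIZE, payload_fits, and the 16-bit length field are hypothetical and do not describe any particular protocol.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_SIZE 4u  /* hypothetical fixed header size, in bytes */

/* Returns true when a packet of `received` bytes really contains the
   `payload_len` bytes that its header declares. */
bool payload_fits(size_t received, uint16_t payload_len)
{
    if (received < HEADER_SIZE) {
        return false;                 /* too short to hold the header */
    }
    /* Safe: received >= HEADER_SIZE, so the difference cannot wrap. */
    return payload_len <= received - HEADER_SIZE;
}
```

Without the first check, received - HEADER_SIZE would wrap to a huge value for short packets, and any declared payload length would appear to fit.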


Story

A module that compared dates in logs stored them as unsigned int but performed date range searches using int. For some boundary values, instead of producing the expected error, the comparisons silently filtered records incorrectly, and important logs were lost.