Undefined Behaviour and For-Loops
David Jackson 2024-06-22
Recently, my boss was interviewing for an intern and shared some of the example-code written by the candidates. One of these was interesting in its wrong-ness. Roughly-speaking, the candidate had written a for-loop that looked like this:
for (int i = 0; i++) {
// ...
}
This is, quite obviously, not a valid for loop in Java (the language that they
wanted to write their code in) because i++
does not evaluate to a boolean and
Java is picky about that here. It is, however, kind of a valid for-loop in
C if you add a semicolon after the i++
. I wondered what would happen when you
compiled that.
The syntax of the for loop in C is:
for (initializer; condition; update) { ... }
C has no dedicated boolean type, so 1
is true and 0
is false. This means
that i++
, since it evaluates to a number, can be considered to be a boolean
value in C. Our test program looks like this:
#include <stdio.h>
int main(void)
{
int n = 0;
int i;
for (i = 0; i++;) {
n++;
}
printf("%d\n", n);
}
I compiled this (on Linux, on x8664, using GCC) and ran it and it immediately
output `0`. Because the `++` is evaluated after_ the value of i
is taken,
since i
starts at 0
it's immediately false so the loop is never entered.
For the sake of continuing this experiment, I switched the code to ++i
so
that the loop would actually be run.
Running the ++i
version of the program took a while longer, but it eventually
output -1
. This happened because the integer "wrapped around" from a large
positive number to a negative number and eventually it gets back to being zero.
This is C, though, and C has
undefined behaviour. In C,
the act of adding to an integer, when that addition would cause integer
overflow, is undefined behaviour. When the compiler encounters an operation
that would result in undefined behaviour, it's allowed to do anything that it
wants to do. The reason that we got the "boring" behaviour in our test program
is that we were compiling without optimization. If we turn on optimization, via
the -O2
compiler flag, things start to get interesting.
$ cc -O2 -o ubfor ubfor.c
ubfor.c: In function ‘main’:
ubfor.c:7:21: warning: iteration 2147483647 invokes undefined behavior [-Waggressive-loop-optimizations]
7 | for (i = 0; ++i;) {
|
This time when we attempt to compile the program, we get a warning about the fact that this for-loop will result in undefined behaviour. When we run the program this time, no matter how long we wait, it will never end—i.e. it has been compiled to an infinite loop. We can see this if we disassemble the executable (using intel assembly syntax) as follows:
$ objdump -M intel -d ubfor
The main
function, in its entirety, is:
0000000000001020 <main>:
1020: eb fe jmp 1020 <main>
1022: 66 2e 0f 1f 84 00 00 cs nop WORD PTR [rax+rax*1+0x0]
1029: 00 00 00
102c: 0f 1f 40 00 nop DWORD PTR [rax+0x0]
As soon as we get into main
, we jump to the beginning of main
, forever.
This program does nothing other than loop eternally. The nop
operation was
more confusing to me because it seemed awfully complicated for a "do nothing"
operation. Looking at
the x86\_64 developer manual
(on page 1378) we find the multi-bye NOP opcode. Weirdly, none of the
instructions match exactly, but it's close to the 9-byte one which is
66 NOP DWORD ptr [EAX + EAX*1 + 00000000H]
having the byte sequence
66 0F 1F 84 00 00 00 00 00H
In our instruction, we have an extra 2e
, presumably just to pad the
instruction. The actual meaning of the bytes in our 10-bytes nop are
as follows:
66
- Operand-size override prefix: sets the operand-size to the non-default
value (default can be 16 or 32 bits)
* 2e
- CS segment override prefix: sets the default segment register used
for memory accesses to CS
* 0F
- NOP (instruction)
* 1F
- DWORD (double word): 4-byte argument
* 84 00 00 00 00H
is a memory access
Since the instruction is a NOP, the memory instruction is irrelevant apart from
its use to fill space, but I wanted to know what it meant anyway. On page 522
of the x86_64 manual, we find a table explaining the bits in the 84
byte.
The first two bits (10
in our case from 8 = 0b1000
) are the "mode" bit. The
next 3 bits are the "reg/opcode" bits and they specify if a register or an
opcode extension will be used. The final 3 bits are the R/M bits and these
specify either a particular register or memory addressing mode.
10
- mode is "register + displacement"000
- register EAX100
-[--][--]+disp32
a scale/index/base byte follows the ModR/M byte
After this (84
) byte is the SIB which is 00
. Its bits are:
00
- Scale is "multiplied by 1"000
- Index = EAX000
- Base = EAX
Therefore this byte means EAX + EAX*1
. Finally we have the 32-bit (4-byte)
displacement value, which is 0. Ultimately, the memory access is, as the
disassembler said: [EAX + EAX*1 + 00000000H]
(by the way, the "H" means
"hexadecimal"). We now know exactly what all of this code means. It's very
complex for code that's meant to do nothing!
This brings us back to how we got here in the first place: undefined behaviour.
Since addition-with-overflow is undefined behaviour, during optimization the
compiler assumes that it is impossible, meaning that the condition of the for
loop will never be true, so rather than do the work of running any code after
the infinite loop, it gets deleted. Then the code inside the loop does nothing
of interest so it is deleted as well. That's how we get from a badly-coded
for-loop to a main
function that does nothing, forever.