These articles are written by Codalogic empowerees as a way of sharing knowledge with the programming community. They do not necessarily reflect the opinions of Codalogic.

Beware C++ Undefined Behaviour

By: Pete, November 2022

In C++, overflow of signed integers results in undefined behaviour (UB), whereas overflow of unsigned integers is defined.

However, on pretty much every microprocessr since the 1980s the behaviour of signed overflow has been the same. They just wrap in the same way that unsigned integers wrap. That's because a CPU ALU does not take into account whether an integer is signed or unsigned when it operates. It's merely the way the bits are interpreted post-computation that differentiates between signed and unsigned operations.

So, if you're like I was, you might be inclined to think that signed integer overflow is no big deal. After all, how bad can an overflow be?

But that's not the case by a long shot.

Take the program below (https://godbolt.org/z/d4v6bbrK6) and compile it with GCC with -O3 optimisations and what would you expect the result to be?

#include <iostream>
int main() {
    for (int i = 0; i < 4; ++i){
        int q = i * 1000000000;
        std::cout << q << std::endl;
    }
    return 0;
}

First you might think it's:

0
1000000000
2000000000
3000000000

But then quickly realise that 3000000000 is too big to fit into a signed 32-bit integer, so you'll get wrapping and end up with:

0
1000000000
2000000000
-1294967296

Alas, that's also wrong. What you end up with is:

0
1000000000
2000000000
-1294967296
-294967296
705032704
1705032704
-1589934592
-589934592
410065408
1410065408
-1884901888
-884901888
...

In fact it is an infinite loop. To all intents and purposes your program has crashed - simply due to a signed overflow.

Wow. You would at least hope arithmetic UB would stay in its own lane!!!

It turns out that GCC is doing aggressive loop optimisation.

Post-optimisation the code it effectively executes is as follows:

int main() {
    for (int q = 0; i < 4000000000; q += 1000000000){
        std::cout << q << std::endl;
    }
    return 0;
}

The signed integer i can never get greater than 4000000000 so the loop carries on forever.

Indeed, GCC realises that i can never get greater than 4000000000 and so removes that test. The code actually generated by GCC is effectively (see assembly diff below):

int main() {
    for (int q = 0; true; q += 1000000000){
        std::cout << q << std::endl;
    }
    return 0;
}

Conversely, Clang does loop unrolling and constant folding to arrive at the following effective code:

int main() {
    std::cout << 0 << std::endl;
    std::cout << 100000000 << std::endl;
    std::cout << 200000000 << std::endl;
    std::cout << -1294967296 << std::endl;
    return 0;
} 

Nice!

Interestingly, back on GCC, if you cast i to unsigned int before doing the multiplication, so you have:

int main() {
    for (int i = 0; i < 4; ++i){
        int q = static_cast<unsigned int>(i) * 1000000000;
        std::cout << q << std::endl;
    }
    return 0;
}

You get the result you'd expect:

0
1000000000
2000000000
-1294967296

This is because the behaviour of unsigned integer overflow is defined and the compiler implementers are not allowed to ignore that.

The code generated code using the unsigned cast is effectively:

int main() {
    for (int q = 0; true; q += 1000000000){
        std::cout << q << std::endl;
        if( q == -1294967296 )
            break;
    }
    return 0;
}

You can see the difference between the generated signed and unsigned in the assembly code.

Recalling that the effective signed code is:

int main() {
    for (int q = 0; true; q += 1000000000){
        std::cout << q << std::endl;
    }
    return 0;
}

And the effective unsigned code is:

int main() {
    for (int q = 0; true; q += 1000000000){
        std::cout << q << std::endl;
        if( q == -1294967296 )
            break;
    }
    return 0;
}

The assembly diff is as follows:

    main:
            push    r12
            xor     r12d, r12d
            push    rbp
            push    rbx
    .L9:
            mov     esi, r12d
            mov     edi, OFFSET FLAT:_ZSt4cout
            call    std::basic_ostream<...>::operator<<(int)
            mov     rbx, rax
            mov     rax, QWORD PTR [rax]
            mov     rax, QWORD PTR [rax-24]
            mov     rbp, QWORD PTR [rbx+240+rax]
            test    rbp, rbp
            je      .L10
            cmp     BYTE PTR [rbp+56], 0
            je      .L5
            movsx   esi, BYTE PTR [rbp+67]
    .L6:
            mov     rdi, rbx
            add     r12d, 1000000000
            call    std::basic_ostream<...>::put(char)
            mov     rdi, rax
            call    std::basic_ostream<...>::flush()
 <!         jmp     .L9
 !>         cmp     r12d, -294967296
 !>         jne     .L9
 !>         pop     rbx
 !>         xor     eax, eax
 !>         pop     rbp
 !>         pop     r12
 !>         ret

The jmp .L9 corresponds to the true in the signed for-loop and the cmp r12d, -294967296 is the start of the if() clause in the unsigned case.

(I'm not sure why the program output shows -1294967296 whereas the disassembly shows -294967296. The actual op-codes show a value of 0xee6b2800 which is 4000000000 in hex and -294967296 when a 32-bit value is converted to decimal. So I'm not sure why the program displays -1294967296.)

As a small post-note, if you try to actually write:

int main() {
    for (int q = 0; true; q += 1000000000){
        std::cout << q << std::endl;
        if( q == -1294967296 )
            break;
    }
    return 0;
}

the if() clause is dead code elided (because if you start at 0 and only add to it you can never get a negative number) and you'll get the forever output again!

In summary, don't take C++ undefined behaviour (UB) lightly. Don't assume you know what will happen and decide it is unimportant. The compiler writers might be making other assumptions and they might want to teach you a lesson :)

Keywords