Raymond K. Zhao

CTO & Co-founder@ExeQuantum | Ex CSIRO, MonashUni

21 March 2026

`long` Is Not a Fixed-Width Type: A Silent Data Corruption Bug in GMP

by Raymond K. Zhao

I recently spent days debugging an infinite loop in my cryptographic code on Windows. The function worked perfectly on Linux. All unit tests passed on both platforms. The root cause turned out to be a single call to GMP’s mpz_set_si:

// GMP's API signature:
//   void mpz_set_si(mpz_t rop, signed long int op);
//
// Our code passes a value that may exceed 32-bit range:
mpz_set_si(result, large_value);

The function mpz_set_si takes a signed long int. On Linux (LP64 data model), long is 64 bits, so a value like -3,784,023,924 is passed without issue. On Windows (LLP64 data model), long is 32 bits, and the value lies outside the 32-bit signed range of approximately ±2.1 billion. What the implicit conversion to signed long then produces is not something portable code can rely on: in C, an out-of-range conversion from a wider integer type is implementation-defined, and an out-of-range conversion from a floating-point type is undefined behavior. Depending on the source type and the toolchain, the value may be saturated, truncated, or turned into something else. On our toolchain, the stored value was -2,147,483,648 (INT32_MIN), bearing no mathematical relationship to the correct input.

There was no compiler warning. No runtime error. No assertion failure. The wrong value simply propagated through subsequent computations, producing results that always failed a validity check, causing an infinite retry loop. A function that completes in milliseconds on Linux hangs forever on Windows.

The Problem

GMP was first released in 1991. Its public API uses signed long int and unsigned long int throughout:

void mpz_set_si(mpz_t rop, signed long int op);
signed long int mpz_get_si(const mpz_t op);
void mpz_set_ui(mpz_t rop, unsigned long int op);
unsigned long int mpz_get_ui(const mpz_t op);
void mpz_mul_si(mpz_t rop, const mpz_t op1, signed long int op2);
unsigned long int mpz_cdiv_r_ui(mpz_t r, const mpz_t n, unsigned long int d);
// ... dozens more

In 1991, long was the widest standard integer type in C. When 64-bit Unix systems adopted LP64 (long = 64 bits), GMP’s API became 64-bit for free. But Microsoft chose LLP64 for 64-bit Windows, keeping long at 32 bits for backward compatibility. GMP never adapted. The need for an mp_size_t rewrite was discussed on the GMP mailing list as early as 2008. It has never been completed.

This is not an obscure edge case. GMP is used as the backend for Python’s gmpy2, GCC’s compile-time arithmetic, Mathematica, Maple, and countless mathematical / cryptographic implementations. Every one of these projects on Windows is exposed to silent data corruption if any value passed through mpz_set_si or retrieved via mpz_get_si exceeds ±2^31.

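The two directions have different failure modes. For mpz_set_si, the damage happens in the caller, during the implicit conversion to signed long, before GMP ever sees the value. If the source is a wider integer type, the result is implementation-defined (most compilers keep the low 32 bits). If the source is a floating-point value, the conversion is undefined behavior, and depending on the compiler, platform, and optimization level the result may be saturated to INT32_MIN / INT32_MAX, truncated, or something else entirely. For mpz_get_si, GMP’s documentation specifies that if the stored value does not fit in signed long, the function returns the least significant part of the value with the same sign, and it points to mpz_fits_slong_p for checking in advance. That is a documented but equally dangerous silent truncation.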
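To make the outcomes concrete, here is a small sketch in standard C that models wrapping and clamping explicitly for the value from the introduction. The helper names wrap_to_32 and saturate_to_32 are made up for illustration, and the functions only imitate what a toolchain might do; they say nothing about what a given compiler actually emits.

```c
#include <stdint.h>

/* Keep only the low 32 bits, then reinterpret them as a signed value.
 * Every step uses explicit arithmetic, so the result is well defined. */
static int32_t wrap_to_32(int64_t v) {
    uint32_t low = (uint32_t)(uint64_t)v;      /* reduction modulo 2^32 */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    /* low is in [2^31, 2^32): map it onto the negative range. */
    return (int32_t)(low - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* Clamp to the 32-bit signed range. */
static int32_t saturate_to_32(int64_t v) {
    if (v < INT32_MIN) return INT32_MIN;
    if (v > INT32_MAX) return INT32_MAX;
    return (int32_t)v;
}
```

For -3,784,023,924 the wrapping model gives 510,943,372 and the clamping model gives -2,147,483,648. Only the second matches the value we observed on Windows.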

How We Found It

The computation involved converting values from a high-precision intermediate representation into GMP arbitrary-precision integers, then using those integers in further computation. The intermediate values routinely reach magnitudes of 3-4 billion — well beyond the 32-bit signed range.

We isolated the bug by forcing the computation to run exactly once (bypassing its internal retry loop) and comparing the full output byte-by-byte across platforms. The first portion matched perfectly. The divergence started at exactly the point where values were converted to GMP integers via mpz_set_si:

Linux:  g[0] = -3784023924   (correct)
Windows: g[0] = -2147483648   (wrong - INT32_MIN)
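The comparison step itself needs very little code. A minimal sketch, assuming each platform's output has been dumped into a byte buffer of the same length (first_mismatch is a hypothetical name, not a function from the actual project):

```c
#include <stddef.h>

/* Return the index of the first differing byte, or n if the buffers match. */
static size_t first_mismatch(const unsigned char *a,
                             const unsigned char *b,
                             size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return i;
    }
    return n;
}
```

Mapping the returned offset back to the stage of the computation that produced those bytes is what points at the faulty conversion.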

The Security Implications

In our case, the corrupted value caused an infinite loop — a denial of service. But this was actually the lucky outcome. If the validity check had happened to pass with the wrong data, the function would have silently returned a cryptographic key computed from corrupted intermediate values. The user would receive a key that appears valid but was derived from incorrect mathematics. Depending on the scheme, this could mean a key with reduced security, a key that fails interoperability with other implementations, or a key that leaks information about the secret material.

Silent data corruption in cryptographic code is strictly worse than a crash or a hang. A crash gets noticed. A hang gets noticed. A quietly wrong key does not.

The Official Workaround

GMP’s recommended solution is mpz_import and mpz_export, which operate on raw byte arrays with explicit size parameters:

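#include <assert.h>
#include <stdint.h>
#include <gmp.h>

// Set rop from a 64-bit signed value without passing through long.
static inline void mpz_set_int64(mpz_t rop, int64_t op) {
    // Take the magnitude in unsigned arithmetic: -op overflows for INT64_MIN.
    uint64_t u = (op < 0) ? (uint64_t)0 - (uint64_t)op : (uint64_t)op;
    // One 8-byte word, native endianness, no nail bits.
    mpz_import(rop, 1, -1, sizeof(uint64_t), 0, 0, &u);
    if (op < 0)
        mpz_neg(rop, rop);
}

// Read op as a 64-bit signed value. The caller must ensure that op lies in
// [INT64_MIN, INT64_MAX]; the assertion only protects the export buffer.
static inline int64_t mpz_get_int64(const mpz_t op) {
    uint64_t u = 0;
    // mpz_export writes as many words as the value needs, so never let it
    // write more than the single word that u provides.
    assert(mpz_sizeinbase(op, 2) <= 64);
    mpz_export(&u, NULL, -1, sizeof(uint64_t), 0, 0, op);
    if (mpz_sgn(op) < 0)
        return -(int64_t)(u - 1) - 1;  // avoids negating INT64_MIN
    return (int64_t)u;
}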

This pattern has been independently rediscovered by multiple projects that hit the same bug on Windows.

The GMP developers have been aware of this since at least 2009, when mpz_import was recommended on the mailing list as the portable solution for int64_t conversion. Adding mpz_set_si64 / mpz_get_si64 functions would have been trivial. It has never been done.
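The sign-and-magnitude split that such helpers rely on has one sharp edge: INT64_MIN has no positive counterpart in int64_t, so negating it as a signed value overflows. The following GMP-free sketch shows one way to handle both directions using only standard C. The function names are hypothetical, and this is an illustration of the arithmetic rather than a drop-in replacement for anything in GMP.

```c
#include <stdbool.h>
#include <stdint.h>

/* Magnitude of v as an unsigned 64-bit value. The subtraction is done in
 * unsigned arithmetic, so INT64_MIN maps to 2^63 without signed overflow. */
static uint64_t magnitude_i64(int64_t v) {
    return v < 0 ? (uint64_t)0 - (uint64_t)v : (uint64_t)v;
}

/* Rebuild a signed value from sign and magnitude.
 * Returns false if the result would not fit in int64_t. */
static bool from_sign_magnitude(bool negative, uint64_t mag, int64_t *out) {
    if (!negative) {
        if (mag > (uint64_t)INT64_MAX) return false;
        *out = (int64_t)mag;
        return true;
    }
    if (mag > (uint64_t)INT64_MAX + 1u) return false;
    /* For mag >= 1, -(mag - 1) - 1 stays inside the int64_t range. */
    *out = (mag == 0) ? 0 : -(int64_t)(mag - 1u) - 1;
    return true;
}

/* Split v and put it back together; returns 0 if reassembly fails. */
static int64_t roundtrip_i64(int64_t v) {
    int64_t out = 0;
    (void)from_sign_magnitude(v < 0, magnitude_i64(v), &out);
    return out;
}
```

The explicit range check in from_sign_magnitude is the part that a bare cast hides: a magnitude above 2^63 simply has no int64_t representation, and the caller should hear about it.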

The Real Lesson

The long type in C has no guaranteed width. The C standard only requires it to be at least 32 bits. Its actual width depends on the platform’s data model:

Type widths in bits:

Data model   int   long   long long   void *   Platforms
ILP32        32    32     64          32       32-bit Linux, 32-bit Windows
LP64         32    64     64          64       64-bit Linux, macOS, BSDs
LLP64        32    32     64          64       64-bit Windows

On LP64 systems (most of the Unix world), long happens to be 64 bits. This creates a dangerous illusion: code that uses long as a 64-bit type works fine everywhere the developer tests it, then silently breaks on Windows.
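A quick way to see which row of the table a given build falls into is to ask the compiler. This is a minimal sketch with hypothetical helper names; it only distinguishes the three data models listed above and reports anything else as "other".

```c
#include <limits.h>
#include <stddef.h>

/* Classify a data model from the widths (in bits) of int, long and pointers.
 * Only the three models from the table are recognised. */
static const char *classify_data_model(size_t int_bits,
                                       size_t long_bits,
                                       size_t ptr_bits) {
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32) return "ILP32";
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64) return "LP64";
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 64) return "LLP64";
    return "other";
}

/* Data model of the platform this translation unit is compiled for. */
static const char *current_data_model(void) {
    return classify_data_model(sizeof(int) * CHAR_BIT,
                               sizeof(long) * CHAR_BIT,
                               sizeof(void *) * CHAR_BIT);
}
```

Printing current_data_model() from a test binary on each target makes the difference between the Linux and Windows builds visible before any arithmetic runs.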

The C99 standard introduced <stdint.h> with fixed-width types: int8_t, int16_t, int32_t, int64_t, and their unsigned counterparts. Wherever these types are defined, they have exactly the width their name says. The standard makes them formally optional, but every mainstream desktop, server, and mobile toolchain provides them, and they have been part of the language for over 25 years.

There is no reason to use long in any cross-platform C or C++ code written after 1999. If you mean 32 bits, write int32_t. If you mean 64 bits, write int64_t. If you mean "pointer-sized integer", write intptr_t or uintptr_t. If you mean "the size of an object", write size_t.
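The same discipline applies when printing these values: a hard-coded %ld carries the same platform dependence as long itself. A minimal sketch using the format macros from <inttypes.h>, which C99 added alongside the fixed-width types (the helper name format_i64 is made up for illustration):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value portably. PRId64 expands to whichever conversion
 * specifier matches int64_t on the platform being compiled for.
 * Returns the number of characters the full output requires. */
static int format_i64(char *buf, size_t size, int64_t v) {
    return snprintf(buf, size, "%" PRId64, v);
}
```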

The only legitimate use of long is when calling an API that requires it — such as GMP’s. And when you do, you must account for the possibility that it is narrower than the value you are passing.

Recommendations

For Library Authors

  • Never use long or unsigned long in a public API. Use int32_t, int64_t, size_t, or intptr_t. Your users should not have to guess what width your function accepts.

  • If you must accept long for backward compatibility, provide _i64 / _u64 variants. This is what the Windows API itself does (SetFilePointer vs SetFilePointerEx).

  • Document the width assumptions. If your function silently corrupts data when the input exceeds a platform-dependent range, that is a bug, not a design choice.
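As a sketch of the second point, a library can keep its long-based entry point for source compatibility and route it through a fixed-width one. Everything here is hypothetical: mylib_int stands in for a real big-number handle, and the function names only mirror the _si / _i64 naming discussed above.

```c
#include <stdint.h>

/* Hypothetical library type; stands in for a real big-number handle. */
typedef struct {
    int64_t value;
} mylib_int;

/* Preferred entry point: the accepted width is part of the signature. */
static void mylib_set_i64(mylib_int *rop, int64_t op) {
    rop->value = op;
}

/* Legacy entry point kept for source compatibility. Widening long to
 * int64_t is lossless on ILP32, LP64 and LLP64 alike. */
static void mylib_set_si(mylib_int *rop, long op) {
    mylib_set_i64(rop, (int64_t)op);
}
```

Existing callers keep compiling, while new code can pass an int64_t without any platform-dependent narrowing at the call site.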

For Library Users

  • Never pass a 64-bit value to a function that takes long without checking. On Windows, this is a narrowing conversion that your compiler may not warn you about unless conversion warnings are enabled (for example -Wconversion on GCC and Clang).

  • Audit every long-taking call site in your cross-platform code. If you use GMP, grep for _si(, _ui(, _get_si, _get_ui and check the maximum possible value at each site.

  • Consider NTL for cryptographic prototyping. Victor Shoup’s NTL library provides arbitrary-precision integers (ZZ), modular arithmetic (ZZ_p), and polynomial rings (ZZX, zz_pX) with a clean C++ API. NTL is already widely used in cryptographic research implementations, including HElib, and it can use GMP as a backend for performance. Be aware that NTL’s own scalar conversions are also declared in terms of long, so the same range checks apply there on LLP64 platforms.
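The check recommended in the first bullet can be as small as this. It is a sketch in standard C with a hypothetical helper name; the caller decides what to do when the value does not fit, for example falling back to a byte-oriented import.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Store v in *out only if it is representable as long on this platform.
 * Returns false, leaving *out untouched, when the value would not fit. */
static bool narrow_to_long(int64_t v, long *out) {
    if (v < LONG_MIN || v > LONG_MAX)
        return false;
    *out = (long)v;
    return true;
}
```

On LP64 the check always succeeds and costs almost nothing. On LLP64 it turns a silent narrowing into an explicit error path.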

Final Thoughts

Silent data corruption in cryptographic code is not a minor inconvenience. In the best case, you get a hang or a crash. In the worst case, you get a key that looks valid but is not, and you will never know. The fix was 20 lines of code. The diagnosis took days.

Use fixed-width integer types. Always.
