draft: Store integers in ob_size field of PyLongObjects #31595

Draft
wants to merge 8 commits into
base: main
from
55 changes: 44 additions & 11 deletions Include/cpython/longintrepr.h
@@ -61,19 +61,52 @@ typedef long stwodigits; /* signed variant of twodigits */
#define PyLong_BASE ((digit)1 << PyLong_SHIFT)
#define PyLong_MASK ((digit)(PyLong_BASE - 1))

/* Long integer representation.
The absolute value of a number is equal to
SUM(for i=0 through abs(ob_size)-1) ob_digit[i] * 2**(SHIFT*i)
Negative numbers are represented with ob_size < 0;
zero is represented by ob_size == 0.
In a normalized number, ob_digit[abs(ob_size)-1] (the most significant
digit) is never zero. Also, in all cases, for all valid i,
0 <= ob_digit[i] <= MASK.
The allocation function takes care of allocating extra memory
so that ob_digit[0] ... ob_digit[abs(ob_size)-1] are actually available.
/* Long Integer Representation
---------------------------

There are two representations of long objects: the inlined
Member

Technically the optimized representation isn't "inlined" -- I would think that that term might be reserved for a version using tagged pointer values.

Contributor Author

I'll change this to "small num" and "big num", or something like that, to make this clearer.

representation, where the sign and value is stored within the ob_size
field, and the bignum implementation, where the ob_size stores both the sign
and number of digits in the ob_digits field.

To distinguish between either representation, one looks at the least significant
bit of the ob_size field; if it's set, the value is inlined in that field; if it's
unset, then it should be treated as the number of digits in ob_digit.

For inlined longs, their value can be obtained with this expression:

ob_size >> 1
Member

Also document the macro one is supposed to use. :-)
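For readers following the thread, the tagging scheme described in the comment above can be sketched roughly as follows. The helper names (`tagged_is_small`, `tagged_from_small`, `tagged_small_value`) and the `tagged_size` typedef are hypothetical stand-ins, not the macros this PR defines; the sketch also assumes the compiler implements `>>` on negative signed values as an arithmetic shift, which the `ob_size >> 1` expression in the comment relies on as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Py_ssize_t ob_size field. */
typedef intptr_t tagged_size;

/* Low bit set: the remaining bits hold the signed value itself.
   Low bit clear: the field holds a (signed) digit count instead. */
static inline bool
tagged_is_small(tagged_size s)
{
    return (s & 1) != 0;
}

/* Encode a small value: shift left by one and set the tag bit.
   The shift is done on the unsigned type so that negative inputs do not
   trigger undefined behaviour; the conversion back to the signed type is
   implementation-defined but wraps on mainstream compilers. */
static inline tagged_size
tagged_from_small(intptr_t value)
{
    return (tagged_size)(((uintptr_t)value << 1) | 1u);
}

/* Decode: the right shift drops the tag bit and, assuming an arithmetic
   shift, preserves the sign. */
static inline intptr_t
tagged_small_value(tagged_size s)
{
    assert(tagged_is_small(s));
    return s >> 1;
}
```

Whatever the final macro is called, routing every read through one accessor like `tagged_small_value()` keeps the shift in a single place, which is presumably what the request to document the macro is getting at.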


For inlined longs:
* These integers have a capacity of 62bits on 64-bit architectures:
one bit for the "is inlined" flag, and one sign bit. This is 30 bits on
32-bit architectures (for the same reasons).
Comment on lines +81 to +83
Member

Shouldn't that be called a capacity of 63 (or 31) bits? When we use the full 64-bit word we don't say that the capacity is 63 bits plus sign, we usually just say "64-bit signed integers".
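To make the counting concrete: with a 64-bit field and one tag bit, the payload is an ordinary two's-complement integer of 63 bits, so it covers -2**62 through 2**62 - 1. The sketch below only illustrates that range; `SMALL_MIN`, `SMALL_MAX`, `fits_in_small` and `tag_small` are hypothetical names, and a 64-bit `ob_size` is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed: a 64-bit ob_size with one bit reserved for the tag,
   leaving a 63-bit signed payload. */
#define SMALL_MAX (INT64_MAX / 2)   /*  2**62 - 1 */
#define SMALL_MIN (INT64_MIN / 2)   /* -2**62     */

/* True if v can be stored in the tagged representation. */
static inline bool
fits_in_small(int64_t v)
{
    return v >= SMALL_MIN && v <= SMALL_MAX;
}

/* Tag a value that is known to fit; anything outside the range would
   lose its top bit in the shift, hence the assertion. */
static inline int64_t
tag_small(int64_t v)
{
    assert(fits_in_small(v));
    return (int64_t)(((uint64_t)v << 1) | 1u);
}
```

On a 32-bit build the same reasoning gives a 31-bit signed payload.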

* Inlined longs are always normalized, as they use the machine
representation for integers.
* Allocation functions won't allocate the space for the ob_digit buffer,
because these are never used with this representation.
* As a consequence of the previous point, the width of a digit for
long longs can be either 15 or 30, and this doesn't affect the
representation of inlined longs.

For bignum longs, their absolute value can be obtained with this expression:

SUM(for i=0 through abs(ob_size >> 1)-1) ob_digit[i] * 2**(SHIFT*i)

In this representation:
* These numbers can also be normalized. In a normalized number,
ob_digit[abs(ob_size)-1] (the most significant digit) is never zero.
Comment on lines +97 to +98
Member

Isn't there also a stronger guarantee that when the object leaves longobject.c it is always normalized? (I.e. unnormalized longs only exist as intermediary results.)
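As a reminder of what "normalized" means for the bignum form, here is a minimal sketch of the trimming step. It is not the code in longobject.c; `digit_t` and `normalized_ndigits` are hypothetical names, and sign handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CPython's `digit` type. */
typedef uint32_t digit_t;

/* Return the number of significant digits, i.e. the digit count after
   dropping zero digits from the most significant end. Digits are stored
   least significant first, as in ob_digit. A result of 0 means the value
   is zero. */
static inline size_t
normalized_ndigits(const digit_t *digits, size_t ndigits)
{
    assert(digits != NULL || ndigits == 0);
    while (ndigits > 0 && digits[ndigits - 1] == 0) {
        ndigits--;
    }
    return ndigits;
}
```

If every result passes through a step like this before it is returned to callers, the stronger guarantee asked about here holds by construction.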

* Also, in all cases, for all valid i, 0 <= ob_digit[i] <= MASK.
* The allocation function takes care of allocating extra memory
so that ob_digit[0] ... ob_digit[abs(ob_size)-1] are actually available.

In either case:
* Negative numbers are represented by (ob_size >> 1) < 0
* Zero is represented by (ob_size >> 1) == 0

CAUTION: Generic code manipulating subtypes of PyVarObject has to
aware that ints abuse ob_size's sign bit.
aware that ints abuse ob_size's sign bit and its least significant
bit.
Comment on lines +108 to +109
Member

I'd rather say that they abuse the ob_size value. :-)

*/

struct _longobject {
149 changes: 131 additions & 18 deletions Include/internal/pycore_bitutils.h
@@ -139,33 +139,46 @@ _Py_popcount32(uint32_t x)
#endif
}

static inline int
_Py_popcount64(uint64_t x)
{
#if (defined(__clang__) || defined(__GNUC__))
if (sizeof(long long) == sizeof(uint64_t)) {
return __builtin_popcountll(x);
}
if (sizeof(long) == sizeof(uint64_t)) {
return __builtin_popcountl(x);
}
#endif
return _Py_popcount32(x >> 32) + _Py_popcount32((uint32_t)x);
}

static inline int
_Py_popcount(long x)
{
if (sizeof(long) == sizeof(uint32_t)) {
return _Py_popcount32(x);
}
if (sizeof(long) == sizeof(uint64_t)) {
return _Py_popcount64(x);
}
_Py_UNREACHABLE();
Member

The actual macro has no leading underscore: it is Py_UNREACHABLE().

}

// Return the index of the most significant 1 bit in 'x'. This is the smallest
// integer k such that x < 2**k. Equivalent to floor(log2(x)) + 1 for x != 0.
static inline int
_Py_bit_length(unsigned long x)
_Py_bit_length32(uint32_t x)
{
#if (defined(__clang__) || defined(__GNUC__))
if (x != 0) {
// __builtin_clzl() is available since GCC 3.4.
// Undefined behavior for x == 0.
return (int)sizeof(unsigned long) * 8 - __builtin_clzl(x);
}
else {
return 0;
}
// __builtin_clzl() is undefined for x = 0.
Py_BUILT_ASSERT(sizeof(long) <= sizeof(uint32_t));
Member

Did you mean >=?
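One way to sidestep the question of which direction the assertion should go is to subtract from the actual width of `unsigned long` rather than from a hard-coded 32, since `__builtin_clzl()` counts leading zeros over the full width of its `unsigned long` argument. The following is a sketch under that assumption, not the PR's code; `bit_length32` is a hypothetical name, and the runtime `assert` stands in for the build-time check.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static inline int
bit_length32(uint32_t x)
{
    if (x == 0) {
        return 0;   /* the clz builtins are undefined for 0 */
    }
#if defined(__clang__) || defined(__GNUC__)
    /* The argument must not be truncated, so unsigned long has to be
       at least as wide as uint32_t (the ">=" direction). */
    assert(sizeof(unsigned long) >= sizeof(uint32_t));
    /* Use the real width of unsigned long so the result is the same
       whether long is 32 or 64 bits wide. */
    return (int)(sizeof(unsigned long) * CHAR_BIT)
           - __builtin_clzl((unsigned long)x);
#else
    /* Portable fallback: count shifts until the value is exhausted. */
    int n = 0;
    while (x != 0) {
        n++;
        x >>= 1;
    }
    return n;
#endif
}
```

With a hard-coded `32 - __builtin_clzl(x)`, a platform where `long` is 64 bits would return values that are 32 too small.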

return x == 0 ? 0 : 32 - __builtin_clzl(x);
#elif defined(_MSC_VER)
// _BitScanReverse() is documented to search 32 bits.
Py_BUILD_ASSERT(sizeof(unsigned long) <= 4);
unsigned long msb;
if (_BitScanReverse(&msb, x)) {
return (int)msb + 1;
}
else {
return 0;
}
return _BitScanReverse(&msb, x) ? msb + 1 : 0;
#else
const int BIT_LENGTH_TABLE[32] = {
static const int BIT_LENGTH_TABLE[32] = {
0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5
};
@@ -180,6 +193,106 @@ _Py_bit_length(unsigned long x)
}


// Return the index of the most significant 1 bit in 'x'. This is the smallest
// integer k such that x < 2**k. Equivalent to floor(log2(x)) + 1 for x != 0.
// (Same as _Py_bit_length(), but works for 64-bit integers.)
static inline int
_Py_bit_length64(uint64_t x)
{
#if (defined(__clang__) || defined(__GNUC__))
/* __builtin_clzll() is undefined for x = 0 */
return x == 0 ? 0 : 64 - __builtin_clzll(x);
#elif defined(_MSC_VER) && defined(_WIN64)
// FIXME(lpereira): Is _WIN64 sufficient to test for Aarch64 and x86-64?
// _BitScanReverse64() is only defined for 64-bit Windows, either on x86,
// or on ARM:
// https://docs.microsoft.com/en-us/cpp/intrinsics/bitscanreverse-bitscanreverse64
unsigned long msb;
return _BitScanReverse64(&msb, x) ? msb + 1 : 0;
#else
int upper_bits = _Py_bit_length32((uint32_t)(x >> 32));
int lower_bits = _Py_bit_length32((uint32_t)x);
return upper_bits + lower_bits;
Comment on lines +213 to +215
Member

That doesn't look right? It should use the bit length of the upper bits, plus 32, if the upper bits are nonzero, else the bit length of the lower bits, or something like that.

(I wonder if there should be a mode where you don't use the GCC/clang/MSVC versions for any of these functions, to test that the fallback versions are correct.)

Contributor Author

You're right. Good catch. I was really tired when I wrote this code and ended up not testing/proving this is correct.
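The fix suggested in the review can be sketched like this: if the upper half is nonzero, the answer is 32 plus its bit length, otherwise it is the bit length of the lower half. The function names are hypothetical, and the 32-bit helper is a plain loop rather than the table-driven version in this file, so that the sketch stays independent of compiler builtins.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 32-bit bit length: number of shifts needed to reach zero. */
static inline int
bit_length32_portable(uint32_t x)
{
    int n = 0;
    while (x != 0) {
        n++;
        x >>= 1;
    }
    return n;
}

/* 64-bit bit length built from the 32-bit helper. Adding the two halves'
   bit lengths, as the diff does, is wrong: for 2**32 + 1 it would give
   1 + 1 = 2 instead of 33. */
static inline int
bit_length64_fallback(uint64_t x)
{
    uint32_t upper = (uint32_t)(x >> 32);
    int n = (upper != 0)
        ? 32 + bit_length32_portable(upper)
        : bit_length32_portable((uint32_t)x);
    assert(n >= 0 && n <= 64);
    return n;
}
```

A build mode that forces these fallback paths, as floated in the review, would let the same assertions run against both the builtin and the portable versions.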

#endif
}

static inline int _Py_bit_length(unsigned long x)
{
_Py_BUILD_ASSERT(sizeof(x) == sizeof(uint32_t) || sizeof(x) == sizeof(uint64_t));
Member

No leading _ (the macro is Py_BUILD_ASSERT).


if (sizeof(x) == sizeof(uint32_t)) {
return _Py_bit_length32(x);
}
if (sizeof(x) == sizeof(uint64_t)) {
return _Py_bit_length64(x);
}

_Py_UNREACHABLE();
}

static inline bool _Py_add_overflow32(int32_t a, int32_t b, int32_t *result)
Member

This file needs #include <stdbool.h> somewhere near the top (say at line 19).

Member

(Also our C style guide requires a line break between the return type and the function name.)

{
#if (defined(__clang__) || defined(__GNUC__))
return __builtin_add_overflow(a, b, result);
#else
*result = (int64_t)((uint32_t)a + (uint32_t)b);
/* When adding, signed overflow only happens if the result has different sign when
* both inputs have the same sign. */
return ((*result ^ a) & ~(a ^ b)) >> 31;
#endif
}

static inline bool _Py_add_overflow64(int64_t a, int64_t b, int64_t *result)
{
#if (defined(__clang__) || defined(__GNUC__))
return __builtin_add_overflow(a, b, result);
#else
*result = (int64_t)((uint64_t)a + (uint64_t)b);
return ((*result ^ a) & ~(a ^ b)) >> 63;
#endif
}

static inline bool _Py_sub_overflow32(int32_t a, int32_t b, int32_t *result)
{
#if (defined(__clang__) || defined(__GNUC__))
return __builtin_sub_overflow(a, b, result);
#else
*result = (int32_t)((uint32_t)a - (uint32_t)b);
return ((*result ^ a) & (a ^ b)) >> 31;
#endif
}

static inline bool _Py_sub_overflow64(int64_t a, int64_t b, int64_t *result)
{
#if (defined(__clang__) || defined(__GNUC__))
return __builtin_sub_overflow(a, b, result);
#else
*result = (int64_t)((uint64_t)a - (uint64_t)b);
return ((*result ^ a) & (a ^ b)) >> 63;
#endif
}

static inline bool _Py_mul_overflow32(int32_t a, int32_t b, int32_t *result)
{
#if (defined(__clang__) || defined(__GNUC__))
return __builtin_mul_overflow(a, b, result);
#else
uint64_t result64 = (uint64_t)((uint64_t)a * (uint64_t)b);
*result = (int32_t)result64;
return result64 <= INT32_MAX;
#endif
}

static inline bool _Py_mul_overflow64(int64_t a, int64_t b, int64_t *result)
{
#if (defined(__clang__) || defined(__GNUC__))
return __builtin_mul_overflow(a, b, result);
#else
*result = (uint64_t)a + (uint64_t)b;
return (a >= INT32_MAX || b >= INT32_MAX) && a > 0 && INT64_MAX / a < b;
#endif
}

#ifdef __cplusplus
}
#endif
2 changes: 1 addition & 1 deletion Include/internal/pycore_global_objects.h
@@ -31,7 +31,7 @@ struct _Py_global_objects {
* The integers that are preallocated are those in the range
* -_PY_NSMALLNEGINTS (inclusive) to _PY_NSMALLPOSINTS (exclusive).
*/
PyLongObject small_ints[_PY_NSMALLNEGINTS + _PY_NSMALLPOSINTS];
PyVarObject small_ints[_PY_NSMALLNEGINTS + _PY_NSMALLPOSINTS];

PyBytesObject bytes_empty;
struct {
2 changes: 1 addition & 1 deletion Include/internal/pycore_long.h
@@ -43,7 +43,7 @@ PyObject *_PyLong_Subtract(PyLongObject *left, PyLongObject *right);

/* Used by Python/mystrtoul.c, _PyBytes_FromHex(),
_PyBytes_DecodeEscape(), etc. */
PyAPI_DATA(unsigned char) _PyLong_DigitValue[256];
PyAPI_DATA(const unsigned char) _PyLong_DigitValue[256];

/* Format the object based on the format_spec, as defined in PEP 3101
(Advanced String Formatting). */