bpo-33234 Improve list() pre-sizing for inputs with known lengths (no __length_hint__) #9846

Merged 10 commits on Oct 28, 2018
2 changes: 1 addition & 1 deletion Lib/test/test_descr.py
@@ -2028,7 +2028,7 @@ class X(Checker):
                 setattr(X, attr, obj)
             setattr(X, name, SpecialDescr(meth_impl))
             runner(X())
-            self.assertEqual(record, [1], name)
+            self.assertGreaterEqual(record.count(1), 1, name)

             class X(Checker):
                 pass
9 changes: 9 additions & 0 deletions Lib/test/test_list.py
@@ -1,5 +1,6 @@
 import sys
 from test import list_tests
+from test.support import cpython_only
 import pickle
 import unittest

@@ -157,5 +158,13 @@ class L(list): pass
         with self.assertRaises(TypeError):
             (3,) + L([1,2])

+    @cpython_only
+    def test_preallocation(self):
+        iterable = [0] * 10
+        iter_size = sys.getsizeof(iterable)
+
+        self.assertEqual(iter_size, sys.getsizeof(list([0] * 10)))
+        self.assertEqual(iter_size, sys.getsizeof(list(range(10))))
+
 if __name__ == "__main__":
     unittest.main()
@@ -0,0 +1,2 @@
+The list constructor will pre-size and not over-allocate when
+the input length is known.
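
One way to observe the effect described in this entry from Python is to estimate how many item slots a list has allocated. The sketch below assumes CPython, where `sys.getsizeof` of a list grows by one pointer per allocated slot; the helper name `allocated_slots` is hypothetical and not part of this PR, and the exact numbers printed can differ between interpreter versions.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # bytes per list slot on this build


def allocated_slots(lst):
    """Estimate how many item slots a list has allocated (CPython detail)."""
    return (sys.getsizeof(lst) - sys.getsizeof([])) // POINTER_SIZE


literal = [0] * 10
from_list = list([0] * 10)
from_range = list(range(10))

# With exact pre-sizing, the three counts are expected to match on
# interpreters that include this change; older ones may over-allocate.
for name, value in [("literal", literal),
                    ("from_list", from_list),
                    ("from_range", from_range)]:
    print(name, len(value), allocated_slots(value))
```

This is the same comparison `test_preallocation` makes, expressed in slots rather than raw byte sizes.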
38 changes: 38 additions & 0 deletions Objects/listobject.c
@@ -76,6 +76,31 @@ list_resize(PyListObject *self, Py_ssize_t newsize)
     return 0;
 }

+static int
+list_preallocate_exact(PyListObject *self, Py_ssize_t size)
+{
+    PyObject **items;
+    size_t allocated;
+
+    allocated = (size_t)size;
+    if (allocated > (size_t)PY_SSIZE_T_MAX / sizeof(PyObject *)) {
+        PyErr_NoMemory();
+        return -1;
+    }
+
+    if (size == 0) {
+        allocated = 0;
+    }
+    items = (PyObject **)PyMem_New(PyObject*, allocated);
+    if (items == NULL) {
+        PyErr_NoMemory();
+        return -1;
+    }
+    self->ob_item = items;
+    self->allocated = allocated;
+    return 0;
+}
+
 /* Debug statistic to compare allocations with reuse through the free list */
 #undef SHOW_ALLOC_COUNT
 #ifdef SHOW_ALLOC_COUNT
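
The guard at the top of `list_preallocate_exact` rejects sizes whose byte count would not fit in a `Py_ssize_t`. A rough Python rendering of that arithmetic follows, assuming `sys.maxsize` stands in for `PY_SSIZE_T_MAX` and `struct.calcsize("P")` for `sizeof(PyObject *)`; the helper name `fits_in_ssize_t` is hypothetical and only illustrates the check.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # stands in for sizeof(PyObject *)


def fits_in_ssize_t(n_items):
    """Mirror the C guard: True if n_items pointers fit in Py_ssize_t bytes."""
    return n_items <= sys.maxsize // POINTER_SIZE


print(fits_in_ssize_t(10))
print(fits_in_ssize_t(sys.maxsize))
```

Dividing the limit rather than multiplying the item count keeps the comparison itself from overflowing in C, which is why the guard is written that way.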
@@ -2649,6 +2674,19 @@ list___init___impl(PyListObject *self, PyObject *iterable)
         (void)_list_clear(self);
     }
     if (iterable != NULL) {
+        if (_PyObject_HasLen(iterable) && self->ob_item == NULL) {
+            Py_ssize_t iter_len = PyObject_Size(iterable);
+            if (iter_len == -1) {
+                if (PyErr_ExceptionMatches(PyExc_Exception)) {
Member

What tests will fail if we do not silence any exceptions at all?

Member Author
@pablogsal, Oct 26, 2018

At least the following:

test_generators failed
**********************************************************************
File "/home/pablogsal/github/cpython/Lib/test/test_generators.py", line ?, in test.test_generators.__test__.weakref
Failed example:
    list(p)
Exception raised:
    Traceback (most recent call last):
      File "/home/pablogsal/github/cpython/Lib/doctest.py", line 1329, in __run
        compileflags, 1), test.globs)
      File "<doctest test.test_generators.__test__.weakref[9]>", line 1, in <module>
        list(p)
    TypeError: object of type 'generator' has no len()
**********************************************************************
1 items had failures:
   1 of  10 in test.test_generators.__test__.weakref
***Test Failed*** 1 failures.


test_genexps failed
**********************************************************************
File "/home/pablogsal/github/cpython/Lib/test/test_genexps.py", line ?, in test.test_genexps.__test__.doctests
Failed example:
    list(p)
Exception raised:
    Traceback (most recent call last):
      File "/home/pablogsal/github/cpython/Lib/doctest.py", line 1329, in __run
        compileflags, 1), test.globs)
      File "<doctest test.test_genexps.__test__.doctests[75]>", line 1, in <module>
        list(p)
    TypeError: object of type 'generator' has no len()
**********************************************************************
1 items had failures:
   1 of  76 in test.test_genexps.__test__.doctests
***Test Failed*** 1 failures.

Member

How can _PyObject_HasLen() return true for generators?

Member

Ah, it is not a true generator, it is a weakref. What if we silence just TypeError?

Member Author

I am a bit afraid that if someone has tests or usage similar to these two cases, they will receive new errors, since raising all exceptions would technically be a backwards-incompatible change. But if you think we should raise (maybe ignoring TypeError), I am happy to change the implementation.

Member

See PyObject_LengthHint().

Member Author

Done in 0a6c8ba
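
The policy the thread converges on, silencing only `TypeError` from the length query as `PyObject_LengthHint()` does, can be observed from Python through `operator.length_hint`. The sketch below is illustrative only: the `BrokenLen` class is hypothetical and stands in for objects such as the weakref proxy in the failing doctests, which report a length slot but raise when asked.

```python
import operator


class BrokenLen:
    """Hypothetical iterable whose __len__ raises the given exception type."""

    def __init__(self, exc_type):
        self.exc_type = exc_type

    def __len__(self):
        raise self.exc_type("length is not available")

    def __iter__(self):
        return iter((1, 2, 3))


# A TypeError from __len__ is swallowed; the default hint is returned
# and list() falls back to plain iteration.
hint = operator.length_hint(BrokenLen(TypeError), 0)
items = list(BrokenLen(TypeError))

# Any other exception raised by __len__ propagates to the caller.
try:
    list(BrokenLen(ValueError))
except ValueError as exc:
    propagated = exc
else:
    propagated = None

print(hint, items, propagated)
```

Under this policy the two doctests above keep passing, because the proxy's failure is a `TypeError`, while genuine errors from `__len__` are still reported.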

+                    PyErr_Clear();
+                } else {
Member

nitpick: PEP 7 requires

}
else {

+                    return -1;
+                }
+            }
+            if (iter_len > 0 && list_preallocate_exact(self, iter_len)) {
+                return -1;
+            }
+        }
         PyObject *rv = list_extend(self, iterable);
         if (rv == NULL)
             return -1;