Bug report
Bug description:
In commit 59f247e, the OOM errors caused by the PUT and LONG_PUT opcodes with large arguments were fixed by adding a sparse dictionary to the memo (in the C _pickle module). This enabled large memo indices in the C _pickle module, but it has also revealed another discrepancy in the PUT opcode.
In order to handle numbers > MAX_LONG (such as 999999999999999999999999), C _pickle must treat the numbers as PyLongs instead of a built-in type (like ssize_t or long). In the C _pickle source code for loading the PUT opcode during pickle deserialization, the memo index is first parsed as a PyLong, and then converted to a Py_ssize_t, which is really just a ssize_t under the hood.
_pickle.c, lines 6544 to 6547 in 29acc08:

    key = PyLong_FromString(s, NULL, 10);
    if (key == NULL)
        return -1;
    idx = PyLong_AsSsize_t(key);
_pickle.c, line 1625 in 29acc08:

    _Unpickler_MemoPut(UnpicklerObject *self, size_t idx, PyObject *value)
This conversion is done because idx is eventually passed to _Unpickler_MemoPut(), which takes size_t idx as one of its arguments. However, it has the unintended side effect of making PUT indices > MAX_LONG invalid in C _pickle.
payload: b'K\x01p999999999999999999999999\n.'
pickle: 1
_pickle.c: FAILURE Python int too large to convert to C ssize_t
pickletools:
0: K BININT1 1
2: p PUT 999999999999999999999999
28: . STOP
highest protocol among opcodes = 1
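For reference, a minimal reproduction sketch along these lines (my own script, not taken from the report; it assumes the "pickle" and "_pickle.c" labels in the output above refer to the pure-Python and C unpicklers respectively):

    # Repro sketch, assuming a CPython build where both the C accelerator
    # (_pickle) and the pure-Python fallback (pickle._Unpickler) are importable.
    import io
    import pickle
    import _pickle

    payload = b'K\x01p999999999999999999999999\n.'

    # Pure-Python unpickler: the memo is a plain dict, so the huge PUT
    # index is accepted and the result is 1.
    print(pickle._Unpickler(io.BytesIO(payload)).load())

    # C unpickler: the PUT index goes through PyLong_AsSsize_t(), so the
    # out-of-range value raises instead of being stored in the memo.
    try:
        _pickle.loads(payload)
    except Exception as exc:
        print("FAILURE", exc)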
Making these indices valid would likely require overloading _Unpickler_MemoPut() to accept a PyLong and somehow using that as an index into the array.
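For comparison, a rough illustration (my own, not from the report) of why a dict-keyed memo, like the pure-Python Unpickler's, side-steps the limit while an array-indexed one cannot:

    # The PUT argument from the payload, parsed as a Python int.
    idx = int(b'999999999999999999999999')

    dict_memo = {}
    dict_memo[idx] = 1        # fine: dict keys are arbitrary-precision ints
    print(dict_memo[idx])     # -> 1

    array_memo = [None] * 16  # loosely analogous to an array-backed memo
    try:
        array_memo[idx] = 1   # array indexing needs a machine-sized integer
    except (IndexError, OverflowError) as exc:
        print("array-style memo fails:", exc)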
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux