This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2015-02-01 13:57 by pkt, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| poc_unidata_normalize.py | pkt, 2015-02-01 13:57 | |
| Messages (8) | |||
|---|---|---|---|
| msg235175 | Author: paul (pkt) | Date: 2015-02-01 13:57 | |
# Bug
# ---
#
# static PyObject*
# unicodedata_normalize(PyObject *self, PyObject *args)
# {
# ...
# if (strcmp(form, "NFKC") == 0) {
# if (is_normalized(self, input, 1, 1)) {
# Py_INCREF(input);
# return input;
# }
# return nfc_nfkc(self, input, 1);
#
# We need to get past the is_normalized() check (the repeated \xa0 char takes
# care of that). nfc_nfkc then calls:
#
# static PyObject*
# nfd_nfkd(PyObject *self, PyObject *input, int k)
# {
# ...
# Py_ssize_t space, isize;
# ...
# isize = PyUnicode_GET_LENGTH(input);
# /* Overallocate at most 10 characters. */
# space = (isize > 10 ? 10 : isize) + isize;
# osize = space;
# 1 output = PyMem_Malloc(space * sizeof(Py_UCS4));
#
# 1. If isize = 2^30, then space = 2^30 + 10, so
# space * sizeof(Py_UCS4) = (2^30 + 10) * 4 == 40 (modulo 2^32), and
# PyMem_Malloc allocates a buffer too small to hold the result.
#
# Crash
# -----
#
# nfd_nfkd (self=<module at remote 0x4056e574>, input='...', k=1) at /home/p/Python-3.4.1/Modules/unicodedata.c:552
# 552 stackptr = 0;
# (gdb) n
# 553 isize = PyUnicode_GET_LENGTH(input);
# (gdb) n
# 555 space = (isize > 10 ? 10 : isize) + isize;
# (gdb) n
# 556 osize = space;
# (gdb) n
# 557 output = PyMem_Malloc(space * sizeof(Py_UCS4));
# (gdb) print space
# $9 = 1073741834
# (gdb) print space*4
# $10 = 40
# (gdb) c
# Continuing.
#
# Program received signal SIGSEGV, Segmentation fault.
# 0x40579cbb in nfd_nfkd (self=<module at remote 0x4056e574>, input='', k=1) at /home/p/Python-3.4.1/Modules/unicodedata.c:614
# 614 output[o++] = code;
#
# OS info
# -------
#
# % ./python -V
# Python 3.4.1
#
# % uname -a
# Linux ubuntu 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 15:31:16 UTC 2013 i686 i686 i386 GNU/Linux
import unicodedata as ud
s="\xa0"*(2**30)
ud.normalize("NFKC", s)
|
|||
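The arithmetic in note 1 of the report can be reproduced without a 32-bit interpreter. The following is a minimal sketch, not code from CPython: it models the `space` computation from the quoted `nfd_nfkd` source in pure Python and wraps the multiplication modulo 2^32, assuming a 32-bit `size_t` as on the reporter's i686 machine. The helper name `wrapped_alloc_size` and its `bits` parameter are hypothetical.

```python
# Sketch: model the size computation from nfd_nfkd() in pure Python.
SIZEOF_PY_UCS4 = 4   # Py_UCS4 holds one 32-bit code point
SIZE_T_BITS = 32     # assumption: 32-bit build, as in the i686 report


def wrapped_alloc_size(isize, bits=SIZE_T_BITS):
    """Return (space, bytes_requested) as a build with a `bits`-wide size_t computes them."""
    # Overallocate at most 10 characters, as in the quoted C source.
    space = (10 if isize > 10 else isize) + isize
    # Unsigned multiplication in C wraps modulo 2**bits.
    bytes_requested = (space * SIZEOF_PY_UCS4) % (2 ** bits)
    return space, bytes_requested


space, nbytes = wrapped_alloc_size(2 ** 30)
print(space, nbytes)  # 1073741834 40, matching the gdb session above
```

With `bits=64` the same call requests 4294967336 bytes, which is why the wraparound only shows up on 32-bit builds.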
| msg237058 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015-03-02 16:21 | |
New changeset 84025a32fa2b by Benjamin Peterson in branch '3.3': fix possible overflow bugs in unicodedata (closes #23367) https://hg.python.org/cpython/rev/84025a32fa2b
New changeset 90f960e79c9e by Benjamin Peterson in branch '3.4': merge 3.3 (#23367) https://hg.python.org/cpython/rev/90f960e79c9e
New changeset 93244000efea by Benjamin Peterson in branch 'default': merge 3.4 (#23367) https://hg.python.org/cpython/rev/93244000efea
New changeset 3019effc44f2 by Benjamin Peterson in branch '2.7': fix possible overflow bugs in unicodedata (closes #23367) https://hg.python.org/cpython/rev/3019effc44f2 |
|||
| msg237062 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-03-02 16:58 | |
Actually, integer overflow in the line space = (isize > 10 ? 10 : isize) + isize; is not possible. Integer overflows in PyMem_Malloc were fixed in issue23446. |
|||
| msg237068 | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2015-03-02 17:58 | |
Why can't (isize > 10 ? 10 : isize) + isize overflow? |
|||
| msg237077 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-03-02 19:24 | |
Because isize is the size of a real PyUnicode object. Its maximum value is PY_SSIZE_T_MAX - sizeof(PyASCIIObject) - 1. |
|||
| msg237080 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015-03-02 20:17 | |
Well, the test doesn't hurt. |
|||
| msg237084 | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2015-03-02 20:54 | |
True, but that could change and is not true in Python 2. I suppose we could revert the change and add a static assertion. |
|||
| msg237158 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-03-03 19:45 | |
The test doesn't hurt. |
|||
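The discussion above distinguishes two places where the size computation could overflow: the addition that produces `space` and the multiplication by `sizeof(Py_UCS4)`. As a rough illustration only, and not the code from the changesets listed in msg237058, the sketch below models both guards in Python. The helper `checked_alloc_size` is hypothetical, and `PY_SSIZE_T_MAX_32` assumes a 32-bit build where `PY_SSIZE_T_MAX` is 2^31 - 1.

```python
PY_SSIZE_T_MAX_32 = 2 ** 31 - 1   # assumption: PY_SSIZE_T_MAX on a 32-bit build
SIZEOF_PY_UCS4 = 4                # Py_UCS4 holds one 32-bit code point


def checked_alloc_size(isize, ssize_t_max=PY_SSIZE_T_MAX_32):
    """Hypothetical helper: byte count for the output buffer, or MemoryError."""
    extra = 10 if isize > 10 else isize
    # Guard the addition: isize + extra must stay within ssize_t_max.
    if isize > ssize_t_max - extra:
        raise MemoryError("space would overflow Py_ssize_t")
    space = isize + extra
    # Guard the multiplication by dividing the limit instead of multiplying.
    if space > ssize_t_max // SIZEOF_PY_UCS4:
        raise MemoryError("space * sizeof(Py_UCS4) would overflow")
    return space * SIZEOF_PY_UCS4


print(checked_alloc_size(5))   # 40
try:
    checked_alloc_size(2 ** 30)
except MemoryError as exc:
    print("rejected:", exc)
```

For the reporter's input of 2^30 characters the addition is harmless, which is consistent with msg237062; it is the second guard that rejects the request under the 32-bit assumption.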
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:12 | admin | set | github: 67556 |
| 2015-03-03 19:45:10 | serhiy.storchaka | set | messages: + msg237158 |
| 2015-03-03 05:13:01 | Arfrever | set | versions: + Python 2.7, Python 3.3, Python 3.5 |
| 2015-03-02 20:54:26 | benjamin.peterson | set | messages: + msg237084 |
| 2015-03-02 20:17:09 | vstinner | set | messages: + msg237080 |
| 2015-03-02 19:24:18 | serhiy.storchaka | set | messages: + msg237077 |
| 2015-03-02 17:58:09 | benjamin.peterson | set | nosy: + benjamin.peterson messages: + msg237068 |
| 2015-03-02 16:58:28 | serhiy.storchaka | set | messages: + msg237062 |
| 2015-03-02 16:21:38 | python-dev | set | status: open -> closed nosy: + python-dev messages: + msg237058 resolution: fixed stage: resolved |
| 2015-03-02 08:21:26 | ezio.melotti | set | nosy: + ezio.melotti, vstinner, serhiy.storchaka components: + Unicode |
| 2015-02-01 21:17:48 | Arfrever | set | nosy: + Arfrever |
| 2015-02-01 13:57:15 | pkt | create | |