homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: unicodeobject.c: use of uninitialized values
Type: behavior Stage: resolved
Components: Versions: Python 3.2, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Arfrever, lemburg, serhiy.storchaka, skrah, vstinner
Priority: normal Keywords: patch

Created on 2010年07月13日 09:24 by skrah, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
uninitialized.py skrah, 2010年07月16日 11:10
issue9242.patch serhiy.storchaka, 2013年01月06日 20:27 review
Messages (8)
msg110165 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010年07月13日 09:24
Not sure if this is valid or if there is some internal trickery that Valgrind isn't aware of. If it's the latter, perhaps an entry to
Misc/valgrind.supp could be added.
test_bug1175396 (__main__.UTF32Test) ... ==26861== Conditional jump or move depends on uninitialised value(s)
==26861== at 0x48A2DD: PyUnicodeUCS2_DecodeUTF32Stateful (unicodeobject.c:2282)
==26861== by 0x50E25C: utf_32_le_decode (_codecsmodule.c:420)
==26861== by 0x52E727: PyCFunction_Call (methodobject.c:81)
==26861== by 0x4B4EB7: call_function (ceval.c:4012)
==26861== by 0x4B1402: PyEval_EvalFrameEx (ceval.c:2665)
==26861== by 0x4B31DA: PyEval_EvalCodeEx (ceval.c:3252)
==26861== by 0x4B52C0: fast_function (ceval.c:4108)
==26861== by 0x4B4FE1: call_function (ceval.c:4033)
==26861== by 0x4B1402: PyEval_EvalFrameEx (ceval.c:2665)
==26861== by 0x4B31DA: PyEval_EvalCodeEx (ceval.c:3252)
==26861== by 0x4B52C0: fast_function (ceval.c:4108)
==26861== by 0x4B4FE1: call_function (ceval.c:4033)
==26861== 
==26861== 
==26861== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- y
==26861== starting debugger with cmd: /usr/bin/gdb -nw /proc/26862/fd/1014 26862
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
Attaching to program: /proc/26862/fd/1014, process 26862
0x000000000048a2dd in PyUnicodeUCS2_DecodeUTF32Stateful (s=0x5bd6a54 "\n", size=1, errors=0x5add144 "strict", byteorder=0x7feffb18c, 
 consumed=0x7feffb170) at Objects/unicodeobject.c:2282
2282 if (qq[iorder[2]] != 0 || qq[iorder[3]] != 0)
(gdb) p s
1ドル = 0x5bd6a54 "\n"
(gdb) p qq
2ドル = (const unsigned char *) 0x5bd6a54 "\n"
(gdb) p iorder[2]
3ドル = 2
(gdb) p iorder[3]
4ドル = 3
(gdb) p bo
5ドル = -1
(gdb) p *byteorder
6ドル = -1
msg110168 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010年07月13日 10:45
Stefan Krah wrote:
> 
> New submission from Stefan Krah <stefan-usenet@bytereef.org>:
> 
> Not sure if this is valid or if there is some internal trickery that Valgrind isn't aware of. If it's the latter, perhaps an entry to
> Misc/valgrind.supp could be added.
> 
> 
> test_bug1175396 (__main__.UTF32Test) ... ==26861== Conditional jump or move depends on uninitialised value(s)
> ==26861== at 0x48A2DD: PyUnicodeUCS2_DecodeUTF32Stateful (unicodeobject.c:2282)
> ==26861== by 0x50E25C: utf_32_le_decode (_codecsmodule.c:420)
> ==26861== by 0x52E727: PyCFunction_Call (methodobject.c:81)
> ==26861== by 0x4B4EB7: call_function (ceval.c:4012)
> ==26861== by 0x4B1402: PyEval_EvalFrameEx (ceval.c:2665)
> ==26861== by 0x4B31DA: PyEval_EvalCodeEx (ceval.c:3252)
> ==26861== by 0x4B52C0: fast_function (ceval.c:4108)
> ==26861== by 0x4B4FE1: call_function (ceval.c:4033)
> ==26861== by 0x4B1402: PyEval_EvalFrameEx (ceval.c:2665)
> ==26861== by 0x4B31DA: PyEval_EvalCodeEx (ceval.c:3252)
> ==26861== by 0x4B52C0: fast_function (ceval.c:4108)
> ==26861== by 0x4B4FE1: call_function (ceval.c:4033)
> ==26861== 
> ==26861== 
> ==26861== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- y
> ==26861== starting debugger with cmd: /usr/bin/gdb -nw /proc/26862/fd/1014 26862
> GNU gdb 6.8-debian
> Copyright (C) 2008 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu"...
> Attaching to program: /proc/26862/fd/1014, process 26862
> 0x000000000048a2dd in PyUnicodeUCS2_DecodeUTF32Stateful (s=0x5bd6a54 "\n", size=1, errors=0x5add144 "strict", byteorder=0x7feffb18c, 
> consumed=0x7feffb170) at Objects/unicodeobject.c:2282
> 2282 if (qq[iorder[2]] != 0 || qq[iorder[3]] != 0)
> (gdb) p s
> 1ドル = 0x5bd6a54 "\n"
> (gdb) p qq
> 2ドル = (const unsigned char *) 0x5bd6a54 "\n"
> (gdb) p iorder[2]
> 3ドル = 2
> (gdb) p iorder[3]
> 4ドル = 3
> (gdb) p bo
> 5ドル = -1
> (gdb) p *byteorder
> 6ドル = -1
Could you check whether the report goes away when using the following
definition of iorder in that function:
#ifdef BYTEORDER_IS_LITTLE_ENDIAN
 const int iorder[4] = {0, 1, 2, 3};
#else
 const int iorder[4] = {3, 2, 1, 0};
#endif
I guess that the missing dimension is causing the valgrind report.
From a code perspective, everything is in order, in fact I'd expect
the compiler to optimize the array away. It may even be a good idea
to replace the array with symbols:
#ifdef BYTEORDER_IS_LITTLE_ENDIAN
# define BYTEORDER_0 0
# define BYTEORDER_1 1
# define BYTEORDER_2 2
# define BYTEORDER_3 3
#else
# define BYTEORDER_0 3
# define BYTEORDER_1 2
# define BYTEORDER_2 1
# define BYTEORDER_3 0
#endif
msg110171 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010年07月13日 11:40
> const int iorder[4] = {0, 1, 2, 3};
const isn't possible, iorder is modified later on. Adding the array
dimension did not change anything.
Making everything const (see below) did not change anything either.
I presume that Valgrind regards qq[2] or qq[3] as uninitialized.
Index: Objects/unicodeobject.c
===================================================================
--- Objects/unicodeobject.c (revision 82816)
+++ Objects/unicodeobject.c (working copy)
@@ -2216,10 +2216,12 @@
 int bo = 0; /* assume native ordering by default */
 const char *errmsg = "";
 /* Offsets from q for retrieving bytes in the right order. */
+ const int iorder_le[] = {0, 1, 2, 3};
+ const int iorder_be[] = {3, 2, 1, 0};
 #ifdef BYTEORDER_IS_LITTLE_ENDIAN
- int iorder[] = {0, 1, 2, 3};
+ const int *iorder = iorder_le;
 #else
- int iorder[] = {3, 2, 1, 0};
+ const int *iorder = iorder_be;
 #endif
 PyObject *errorHandler = NULL;
 PyObject *exc = NULL;
@@ -2262,17 +2264,11 @@
 
 if (bo == -1) {
 /* force LE */
- iorder[0] = 0;
- iorder[1] = 1;
- iorder[2] = 2;
- iorder[3] = 3;
+ iorder = iorder_le;
 }
 else if (bo == 1) {
 /* force BE */
- iorder[0] = 3;
- iorder[1] = 2;
- iorder[2] = 1;
- iorder[3] = 0;
+ iorder = iorder_be;
 }
msg110427 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010年07月16日 11:10
Here is a minimal example how to reproduce this issue, extracted from
UTF32LETest.
valgrind --db-attach=yes --suppressions=Misc/valgrind-python.supp ./python uninitialized.py
It seems that in Lib/codecs.py the equivalent of "\x00".decode('utf-32-le')
is called (line 477).
This leads to:
PyUnicodeUCS2_DecodeUTF32Stateful (s=0x5b0fc0c "", size=1, errors=0x5add144 "strict", byteorder=0x7fefff39c, 
 consumed=0x0)
So we have a string of size 1, but s[2] and s[3] will be accessed
in the function.
msg179219 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013年01月06日 20:27
3.2 is affected too. The PEP-393 changes fixes this bug in 3.3+.
Here is a simple patch.
msg179371 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013年01月08日 20:51
I forgot mention the issue number in commit messages. See changeset3570e04f4ea9 and changesetbf347198fbaf.
Is the issue fixed now?
msg179372 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013年01月08日 20:51
changeset 3570e04f4ea9 and changeset bf347198fbaf 
msg179376 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2013年01月08日 22:01
The utf_32_le_decode invalid access is gone; for Valgrind the issue
is fixed (I didn't look at the patch, no time ATM, sorry).
History
Date User Action Args
2022年04月11日 14:57:03adminsetgithub: 53488
2013年01月08日 22:22:10serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2013年01月08日 22:01:25skrahsetmessages: + msg179376
2013年01月08日 20:51:57serhiy.storchakasetmessages: + msg179372
2013年01月08日 20:51:21serhiy.storchakasetmessages: + msg179371
2013年01月06日 20:27:04serhiy.storchakasetfiles: + issue9242.patch

assignee: serhiy.storchaka
versions: + Python 3.2
keywords: + patch
nosy: + serhiy.storchaka

messages: + msg179219
stage: patch review
2013年01月04日 23:40:25Arfreversetnosy: + Arfrever
2011年07月07日 10:44:13vstinnersetnosy: + vstinner
2010年07月16日 11:10:59skrahsetfiles: + uninitialized.py

messages: + msg110427
2010年07月13日 11:40:08skrahsetmessages: + msg110171
2010年07月13日 10:45:45lemburgsetmessages: + msg110168
2010年07月13日 09:24:47skrahcreate

AltStyle によって変換されたページ (->オリジナル) /