This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2012年07月19日 10:09 by aalien, last changed 2022年04月11日 14:57 by admin.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| json-raw-decoder-fix-whitespace.diff | aalien, 2012年07月19日 11:40 | Patch for Python 3.3.0 | review | |
| json-raw-decoder-fix-whitespace.diff | aalien, 2012年07月20日 12:41 | Patch for Python 3.3.0, take two | review | |
| issue15393_3.6.patch | Bayard Randel, 2016年09月13日 00:28 | review | ||
| Messages (14) | |||
|---|---|---|---|
| msg165829 - (view) | Author: Antti Laine (aalien) | Date: 2012年07月19日 10:09 | |
raw_decode on json.JSONDecoder does not handle leading whitespace. According to RFC 4627, section 2, whitespace can precede an object. With json.loads leading whitespace is handled just fine.
d = json.JSONDecoder()
d.raw_decode(' {}')
ValueError: No JSON object could be decoded
Tested with versions 2.6.7, 2.7.3, 3.1.3 and 3.2.3
|
|||
| msg165830 - (view) | Author: Antti Laine (aalien) | Date: 2012年07月19日 10:16 | |
My coworker just submitted a pull request for a possible fix to simplejson on github. https://github.com/simplejson/simplejson/pull/38 |
|||
| msg165831 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年07月19日 10:37 | |
Please, post a patch for 2.7 and 3.2/3.3, with a test. Seems quite easy. If you hurry, this could go in 3.3.0. |
|||
| msg165832 - (view) | Author: Antti Laine (aalien) | Date: 2012年07月19日 11:40 | |
Suggestion for a patch for 3.3.0. I wasn't quite sure how the testcases were supposed to be loaded. Sorry if I made a mess ;) |
|||
| msg165834 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年07月19日 11:54 | |
Antti, you are changing the signature of "decode()" and that would be a compatibility problem. Can you rewrite the patch to be more "compatible"?
In your test, please, use a bit more complicated object than "{}" :-)
|
|||
| msg165835 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2012年07月19日 11:56 | |
IMO this is not a bug, according to the current documentation it is working as designed. raw_decode says it decodes a string that *starts with* a json document. To my understanding, raw_decode is designed to be used when parsing a stream containing more than just one json document, including possibly non-json text, in which case it makes sense that the calling application would be responsible for (and want to) handle the whitespace around any such document. I recommend closing this as invalid. If the consensus is otherwise and a change is made, I think it would be an enhancement, not a bug fix. |
|||
| msg165836 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2012年07月19日 12:03 | |
Ah, I see, you are thinking that "json document" includes the possibility of leading whitespace. That is a reasonable interpretation...just make sure that we don't break backward compatibility. |
|||
| msg165918 - (view) | Author: Antti Laine (aalien) | Date: 2012年07月20日 12:41 | |
> you are changing the signature of "decode()" and that would be a compatibility problem I was changing the signature of raw_decode(), by adding a(n almost) private keyword. I really don't see how that would affect compatibility. > you are thinking that "json document" includes the possibility of leading whitespace Yes. While JSON does not have a real standard to follow, I believe that going by the RFC is the best thing to do, and the RFC very clearly states, that objects can be surrounded with whitespace. decode() depends on the functionality of raw_decode(), and its own parameter _w. I have no idea why _w is a parameter. It is not used anywhere in CPython, and, beginning with an underscore, it shouldn't be used from anywhere else. Still it offers the users a possibility to break the behaviour of decode(). If it's a performance hack, making the variable being initialized only once, then I think it's a very poor one. Now the patch leaves decode() as it is, and checks for whitespace in raw_decode(), which leads to checking for whitespace twice when using decode(). I think that a more extensive cleanup would be in order. |
|||
| msg168936 - (view) | Author: Petri Lehtinen (petri.lehtinen) * (Python committer) | Date: 2012年08月23日 12:03 | |
I think skipping preceding whitespace in raw_decode() wouldn't hurt, but skipping following whitespace would be wrong. David: Any examples of how this could break backwards compatibility? |
|||
| msg168939 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2012年08月23日 12:56 | |
I didn't have anything specific in mind, just making a general comment about the care that needs to be taken in crafting the fix. |
|||
| msg276143 - (view) | Author: Bayard Randel (Bayard Randel) * | Date: 2016年09月13日 00:28 | |
I've provided an updated patch for 3.6. Additionally I've updated the docstring for decoder.raw_decode to explain that whitespace at the beginning of the document will be ignored. I believe the concerns around the backwards incompatible change to the function signature were addressed by aalien's second patch. |
|||
| msg276161 - (view) | Author: Robert Collins (rbcollins) * (Python committer) | Date: 2016年09月13日 04:16 | |
I think the patch should either be rejected, or also handle trailing spaces: if we're taking the RFC definition of whitespace not being structural then we should also eat trailing space, which will change the check for extra data in decode to just checking end == s.len(). Obviously a test for the trailing case needs to be added as well. |
|||
| msg276291 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2016年09月13日 15:24 | |
Robert: as noted, skipping trailing whitespace would be breaking raw_decode's contract. It is designed to parse a json document and return the position of the last character of that document. If there is then another json document in the stream, parsing the remainder of the stream would skip that whitespace. If it is some other data, then the whitespace should be preserved for whatever is parsing the remainder of the stream (if anything). In any case that behavior cannot be changed because it would break backward compatibility. I would not object to rejecting the change, but it does make raw_decode slightly more awkward to use if it does not skip the leading whitespace. So I guess I'm +0.5 on accepting it as is. |
|||
| msg415523 - (view) | Author: Daniël van Noord (danielnoord) * | Date: 2022年03月18日 20:05 | |
@Bayard Randel, do you want to make this patch into a GitHub PR? If not, I could help by doing so and try to get this landed. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:33 | admin | set | github: 59598 |
| 2022年03月18日 20:05:49 | danielnoord | set | nosy:
+ danielnoord messages: + msg415523 |
| 2016年10月04日 10:56:05 | Mariatta | set | nosy:
+ Mariatta |
| 2016年09月13日 15:24:40 | r.david.murray | set | versions: + Python 3.5, Python 3.6, Python 3.7, - Python 2.7, Python 3.2, Python 3.3, Python 3.4 |
| 2016年09月13日 15:24:08 | r.david.murray | set | messages: + msg276291 |
| 2016年09月13日 07:12:07 | petri.lehtinen | set | nosy:
- petri.lehtinen |
| 2016年09月13日 04:16:10 | rbcollins | set | nosy:
+ rbcollins messages: + msg276161 |
| 2016年09月13日 00:28:28 | Bayard Randel | set | files:
+ issue15393_3.6.patch nosy: + Bayard Randel messages: + msg276143 |
| 2012年08月23日 12:56:01 | r.david.murray | set | messages: + msg168939 |
| 2012年08月23日 12:03:38 | petri.lehtinen | set | nosy:
+ petri.lehtinen messages: + msg168936 |
| 2012年07月20日 12:41:59 | aalien | set | files:
+ json-raw-decoder-fix-whitespace.diff messages: + msg165918 |
| 2012年07月19日 12:03:06 | r.david.murray | set | messages: + msg165836 |
| 2012年07月19日 11:56:17 | r.david.murray | set | nosy:
+ r.david.murray messages: + msg165835 |
| 2012年07月19日 11:54:19 | jcea | set | messages: + msg165834 |
| 2012年07月19日 11:40:55 | aalien | set | files:
+ json-raw-decoder-fix-whitespace.diff keywords: + patch messages: + msg165832 |
| 2012年07月19日 10:37:45 | jcea | set | nosy:
+ jcea messages: + msg165831 keywords: + easy stage: needs patch |
| 2012年07月19日 10:25:21 | ezio.melotti | set | nosy:
+ ezio.melotti versions: + Python 3.3, Python 3.4, - Python 2.6, Python 3.1 |
| 2012年07月19日 10:16:14 | aalien | set | messages: + msg165830 |
| 2012年07月19日 10:09:55 | aalien | create | |