homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Make dict.setdefault() atomic
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Ramchandra Apte, eric.araujo, gruszczy, jcea, jcon, meador.inge, pitrou, python-dev, rhettinger
Priority: normal Keywords: patch

Created on 2011年12月03日 00:30 by rhettinger, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
13521.patch gruszczy, 2011年12月11日 19:53 review
13521_2.patch gruszczy, 2011年12月11日 20:03 review
13521_27_1.patch gruszczy, 2011年12月18日 18:00 review
13521_27_3.patch gruszczy, 2012年01月15日 14:01
13521_27_4.patch gruszczy, 2012年02月26日 20:40 review
Messages (33)
msg148782 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011年12月03日 00:30
A dialog on python-dev suggests that dict.setdefault() was intended to be atomic and that a number of experienced developers have deployed code relying on its atomicity: http://mail.python.org/pipermail/python-3000/2007-July/008693.html
The actual implementation shows a second call to PyDict_Setitem() which can call PyObject_Hash() which can call arbitrary Python code. http://hg.python.org/cpython/file/2.7/Objects/dictobject.c#l1937
This fix is straight-forward, use the results of the initial lookup to insert the default object. This will make the operation atomic and it will make it faster.
msg148784 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年12月03日 01:27
+1.
msg148815 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011年12月03日 18:27
Atomic and faster... +1.
msg149142 - (view) Author: Ramchandra Apte (Ramchandra Apte) * Date: 2011年12月10日 04:28
+1 for atomic and more robust
msg149164 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011年12月10日 16:08
maniram: We try to keep bug reports focused on one thing, and we don’t generally collect votes (we prefer that people vote with patches, tests and messages with content—for example, saying exactly what "more robust" should be). Here I think Antoine and Jesús said +1 to agree that this should be changed even in stable versions.
Raymond: I think I could make the patch, but I’m not sure about testing the new behavior.
msg149195 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011年12月10日 22:31
Eric, overload "__hash__()" and check that is only called once, while now would be called twice.
msg149245 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2011年12月11日 19:53
I have written a patch and a test, but since it's changing C code, I am far from being sure if it's achieve the expected behavior in the right way. There are also tests and running whole test suite didn't bring any errors.
msg149246 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2011年12月11日 19:53
Also: I'll be happy to work further on this patch, if I get some comments and advice.
msg149247 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年12月11日 19:59
If _PyDict_SetItemUsingHash is module-private, it should be declared "static". Also, better if it follows the usual naming of static functions inside that C file (i.e. "dict_some_lowercase_name").
msg149249 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2011年12月11日 20:03
Done.
msg149255 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年12月11日 22:43
The patch looks ok to me. I'll let Raymond make the final call.
msg149324 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011年12月12日 16:09
Thanks for the idea Jesús, even though I didn’t get the change to use it :)
Filip: Is there a reasons for using a nonlocal count rather than e.g. self.count? Otherwise the test looks good.
msg149328 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年12月12日 16:48
> Filip: Is there a reasons for using a nonlocal count rather than e.g.
> self.count? Otherwise the test looks good.
Eric, please, could we stop such pointless nitpicking about one's
stylistic preferences? "nonlocal" is as good as any other solution here,
and reviewing is not supposed to be a pedantry contest. There are tons
of serious issues to be solved where your energy would be better spent,
IM(NSH)O.
msg149329 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2011年12月12日 16:53
I haven't given it much thought, when I was making the choice of using nonlocal rather than self.count. I was rather excited to see, if the change will work as I wanted it to. If you believe it would be better to use an attribute, I'll be happy to change it.
msg149330 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011年12月12日 16:55
Well, if this is going to be applied to 2.7, "nonlocal" is invalid syntax there. I guess that being able to use the same test in 2.7 and 3.2/3.3 would be nice. Style, but justified.
msg149332 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年12月12日 17:09
> Well, if this is going to be applied to 2.7, "nonlocal" is invalid
> syntax there.
Ah, good point.
msg149333 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2011年12月12日 17:17
I'll send a patch, when I get home from work.
msg149335 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011年12月12日 17:20
Let's just start with a working 2.7 patch and go from there.
msg149350 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2011年12月12日 19:43
I have tried to port patch to python2.7, but apparently I must be doing something wrong, because while most tests pass, several fail (including tests for unittest itself).
I would like to continue working on this test, but incoming week is pretty intensive for me. Is it ok if I returned to working on this during the weekend?
msg149367 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011年12月13日 00:37
> I would like to continue working on this test, but incoming week is
> pretty intensive for me. Is it ok if I returned to working on this
> during the weekend?
Of course. There's no rush.
(I'm really not sure it should go in 2.7 and 3.2; this looks more like a feature than a bug, since atomicity has never been documented)
msg149375 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011年12月13日 02:26
Thanks, I'll look at it over the next few days.
msg149785 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2011年12月18日 18:00
Patch for 2.7.
msg150878 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2012年01月08日 16:50
Bump! There was no activity here for two weeks. Is my patch for 2.7 ok or should I do something more about it?
msg150886 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月08日 17:58
Looking at the patch again, I think this isn't enough.
setdefault() will still call the lookup routine twice which, in the general case (i.e. lookdict() not lookdict_unicode()), can call arbitrary Python code through e.g. __eq__ methods.
msg150908 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2012年01月08日 21:30
I understand you are talking about this call:
mp->ma_lookup(mp, key, hash);
I haven't noticed that earlier. I'll try to provide a better fix (for 2.7 first, after we agree it's good enough, I will provide one for 3.3).
Do you have any idea, how I might this part? I mean how to check, that this is called only once?
msg150912 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月08日 21:42
> Do you have any idea, how I might this part? I mean how to check, that
> this is called only once?
Checking that __eq__ is called only once should be a good proxy.
msg150914 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2012年01月08日 21:46
Thanks, I'll try that. Like before I will probably find time next weekend.
msg151288 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2012年01月15日 14:01
No more double lookup.
msg151296 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月15日 17:20
> No more double lookup.
Your patch doesn't check hashed2.eq_count.
Since the dict specification doesn't say on which instance __eq__ will
be called when doing a lookup, the patch should either check
``hashed1.eq_count + hashed2.eq_count``, or make eq_count a class
attribute.
A nit: be careful not to use tabs in C files.
msg154384 - (view) Author: Filip Gruszczyński (gruszczy) Date: 2012年02月26日 20:40
Fixed both issues.
msg154421 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012年02月26日 23:48
New changeset 5c52e7c6d868 by Antoine Pitrou in branch '2.7':
Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
http://hg.python.org/cpython/rev/5c52e7c6d868 
msg154422 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012年02月27日 00:05
New changeset 90572ccda12c by Antoine Pitrou in branch '3.2':
Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
http://hg.python.org/cpython/rev/90572ccda12c
New changeset 3dfa98cf2e26 by Antoine Pitrou in branch 'default':
Issue #13521: dict.setdefault() now does only one lookup for the given key, making it "atomic" for many purposes.
http://hg.python.org/cpython/rev/3dfa98cf2e26 
msg154423 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年02月27日 00:06
Thank you Filip! I've now committed the patch.
History
Date User Action Args
2022年04月11日 14:57:24adminsetgithub: 57730
2012年05月16日 16:49:58pitroulinkissue5730 superseder
2012年02月27日 00:06:32pitrousetstatus: open -> closed
resolution: fixed
messages: + msg154423

stage: patch review -> resolved
2012年02月27日 00:05:45python-devsetmessages: + msg154422
2012年02月26日 23:48:58python-devsetnosy: + python-dev
messages: + msg154421
2012年02月26日 20:40:41gruszczysetfiles: + 13521_27_4.patch

messages: + msg154384
2012年01月15日 17:20:03pitrousetmessages: + msg151296
2012年01月15日 14:01:29gruszczysetfiles: + 13521_27_3.patch

messages: + msg151288
2012年01月08日 21:46:34gruszczysetmessages: + msg150914
2012年01月08日 21:42:19pitrousetmessages: + msg150912
2012年01月08日 21:30:00gruszczysetmessages: + msg150908
2012年01月08日 17:58:43pitrousetmessages: + msg150886
2012年01月08日 16:50:39gruszczysetmessages: + msg150878
2011年12月18日 18:00:55gruszczysetfiles: + 13521_27_1.patch

messages: + msg149785
2011年12月13日 02:26:57rhettingersetmessages: + msg149375
2011年12月13日 00:37:30pitrousetmessages: + msg149367
2011年12月12日 19:43:48gruszczysetmessages: + msg149350
2011年12月12日 17:20:10rhettingersetmessages: + msg149335
2011年12月12日 17:17:52gruszczysetmessages: + msg149333
2011年12月12日 17:09:34pitrousetmessages: + msg149332
2011年12月12日 16:55:40jceasetmessages: + msg149330
2011年12月12日 16:53:35gruszczysetmessages: + msg149329
2011年12月12日 16:48:16pitrousetmessages: + msg149328
2011年12月12日 16:09:19eric.araujosetmessages: + msg149324
2011年12月11日 22:43:20pitrousetmessages: + msg149255
2011年12月11日 20:03:27gruszczysetfiles: + 13521_2.patch

messages: + msg149249
2011年12月11日 19:59:12pitrousetmessages: + msg149247
stage: needs patch -> patch review
2011年12月11日 19:53:59gruszczysetmessages: + msg149246
2011年12月11日 19:53:36gruszczysetfiles: + 13521.patch

nosy: + gruszczy
messages: + msg149245

keywords: + patch
2011年12月10日 22:31:30jceasetmessages: + msg149195
2011年12月10日 16:08:15eric.araujosetnosy: + eric.araujo

messages: + msg149164
versions: - Python 3.1
2011年12月10日 04:28:36Ramchandra Aptesetnosy: + Ramchandra Apte
messages: + msg149142
2011年12月10日 04:01:31jconsetnosy: + jcon
2011年12月03日 18:27:44jceasetnosy: + jcea
messages: + msg148815
2011年12月03日 18:02:26meador.ingesetnosy: + meador.inge

stage: needs patch
2011年12月03日 02:15:57rhettingersetassignee: rhettinger
2011年12月03日 01:27:27pitrousetnosy: + pitrou
messages: + msg148784
2011年12月03日 00:30:28rhettingercreate

AltStyle によって変換されたページ (->オリジナル) /