homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: _PyDict_GetItem_KnownHash ignores DKIX_ERROR return
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: methane, python-dev, rhettinger, serhiy.storchaka, vstinner, xiang.zhang
Priority: high Keywords: patch

Created on 2016年09月13日 09:51 by xiang.zhang, last changed 2022年04月11日 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue28123.patch xiang.zhang, 2016年09月13日 09:54 review
issue28123_v2.patch xiang.zhang, 2016年09月14日 08:58 review
issue28123_v3.patch xiang.zhang, 2016年09月17日 14:24 review
issue28123_v4.patch xiang.zhang, 2016年10月31日 11:28 review
issue28123_v5.patch xiang.zhang, 2016年11月01日 01:12 review
issue28123_v6.patch xiang.zhang, 2016年11月03日 11:38 review
Messages (22)
msg276230 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年09月13日 09:51
_PyDict_GetItem_KnownHash should handle dk_lookup return value the same as PyDict_GetItem.
BTW, it seems PyDict_GetItem can call _PyDict_GetItem_KnownHash to remove duplicate code, if you like, maybe another issue?
diff -r 6acd2b575a3c Objects/dictobject.c
--- a/Objects/dictobject.c	Tue Sep 13 07:56:45 2016 +0300
+++ b/Objects/dictobject.c	Tue Sep 13 17:46:08 2016 +0800
@@ -1370,12 +1370,12 @@
 ix = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr, NULL);
 /* ignore errors */
 PyErr_Restore(err_type, err_value, err_tb);
- if (ix == DKIX_EMPTY)
+ if (ix < 0)
 return NULL;
 }
 else {
 ix = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr, NULL);
- if (ix == DKIX_EMPTY) {
+ if (ix < 0) {
 PyErr_Clear();
 return NULL;
msg276232 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年09月13日 09:52
Please create a patch file and attach it to the issue, so we can review it more easily.
msg276233 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年09月13日 09:54
Hmm, I thought this is trivial so I didn't. Now upload the file patch ;).
msg276237 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年09月13日 10:33
I understand the the code doesn't handle correctly lookup failures. Such failure is easy to trigger in pure Python using a custom __eq__() method for example. Can you please write an unit test for it?
msg276239 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年09月13日 10:41
_PyDict_GetItem_KnownHash is not invoked by any other dict methods. So to achieve it in pure Python level, we have to rely on others modules and objects such as OrderedDict, lru_cache. Is it a good idea to rely on those?
msg276242 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年09月13日 10:45
> _PyDict_GetItem_KnownHash is not invoked by any other dict methods.
Oh, I missed that. In this case, I suggest you to expose the function at Python level using the _testcapi module. And then use _testcapi._PyDict_GetItem_KnownHash() in test_dict.
It would be nice to have unit tests on _PyDict_GetItem_KnownHash(), this function starts to become important in Python.
msg276243 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年09月13日 10:50
How about let PyDict_GetItem call it? Just like the relationship of delitem and delitem_knownhash. You can see they share most codes. If we do that, it seems we can easily write a test(or there has already been a test) for it.
msg276246 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年09月13日 10:57
Xiang Zhang added the comment:
> How about let PyDict_GetItem call it?
PyDict_GetItem() is like the most important function in term of
performance for Python. If you want to touch it, you must benchmark
it.
I would prefer to keep it as it is.
> Just like the relationship of delitem and delitem_knownhash.
delitem is less important in term of performance, so I decided to
accept Naoki's change to merge duplicated code.
msg276403 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年09月14日 08:58
Update the patch with unittest.
msg276533 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年09月15日 08:34
If __eq__() raise an exception, _PyDict_GetItem_KnownHash() currently returns NULL and pass through the exception. To me, it looks like the correct behaviour.
With your patch, it looks like the _PyDict_GetItem_KnownHash() clears the __eq__() exception: return NULL with no exception set, as if the key is simply missing.
PyDict_GetItem() ignores *almost* all exceptions, but for bad reasons:
/* Note that, for historical reasons, PyDict_GetItem() suppresses all errors
 * that may occur (originally dicts supported only string keys, and exceptions
 * weren't possible). (...) */
I would prefer to not ignore __eq__ exceptions, but pass them through.
To be clear: this is a behaviour change compared to Python 3.5 which works as PyDict_GetItem(), ignore *all* exceptions:
 ep = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr);
 if (ep == NULL) {
 PyErr_Clear();
 return NULL;
 }
I consider that it's ok to change _PyDict_GetItem_KnownHash() behaviour because this function is private and only used 4 times in 250k lines of C code.
Would you be interested to write a different patch to pass through the exception?
Note: It seems like _count_elements() (Modules/_collectionsmodule.c) doesn't handle correctly such error.
msg276681 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年09月16日 07:24
Yes, ignoring exceptions is due to historical reasons. Although it's used rarely but I am still afraid changing it may break knowledge of devs that are already familiar with dict APIs. And what's more is there already exists two versions of getitem, one version with no exceptions and one version propagates exceptions (witherror), maybe we can also introduce a _PyDict_GetItem_KnownHashWithError?
msg276706 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年09月16日 13:00
Xiang Zhang: "I am still afraid changing it may break knowledge of
devs that are already familiar with dict APIs"
Come on, the function is only called 4 times in the whole code base,
and the function is private. Private means that we are free to break
its behaviour anytime, it's part of the contract.
msg276708 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年09月16日 13:07
Hah, Okay. I'll make the corresponding change then.
msg276796 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年09月17日 14:24
Victor, v3 now applies your suggestion.
msg279768 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年10月31日 07:44
Actually ignoring exceptions in _PyDict_GetItem_KnownHash causes a subtle difference between Python and C implementations. Making _PyDict_GetItem_KnownHash not ignoring exceptions looks right thing.
But dict_merge expects that _PyDict_GetItem_KnownHash don't raise an exception.
msg279780 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年10月31日 11:28
dict_merge was altered after the patch. I make it ignore explicitly the error now, to not affect former behaviour.
Serhiy, I apply your suggestion to use _PyLong_AsByteArray for Py_hash_t, but I am not familiar with the API. It needs a review.
msg279985 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年11月03日 11:11
If _PyDict_GetItem_KnownHash() returns an error, it is very likely that following insertdict() with same key will return an error. I would prefer to return an error right after _PyDict_GetItem_KnownHash() is failed. This would look more straightforward.
msg279987 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016年11月03日 11:38
> If _PyDict_GetItem_KnownHash() returns an error, it is very likely that following insertdict() with same key will return an error.
Make sense.
msg279991 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年11月03日 12:46
LGTM. I'll commit the patch soon if there are no comments from other core developers.
msg279996 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2016年11月03日 14:24
LGTM.
msg280135 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016年11月06日 11:19
New changeset d06a6b0fd992 by Serhiy Storchaka in branch '3.6':
Issue #28123: _PyDict_GetItem_KnownHash() now can raise an exception as
https://hg.python.org/cpython/rev/d06a6b0fd992
New changeset 805467de22fc by Serhiy Storchaka in branch 'default':
Issue #28123: _PyDict_GetItem_KnownHash() now can raise an exception as
https://hg.python.org/cpython/rev/805467de22fc 
msg280138 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年11月06日 13:15
Sorry, I didn't have time to review this issue :-(
Thanks Xiang and Serhiy for fixing the issue! I prefer the new API (don't
ignore the error).
History
Date User Action Args
2022年04月11日 14:58:36adminsetgithub: 72310
2016年11月06日 13:15:14vstinnersetmessages: + msg280138
2016年11月06日 11:20:24serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: commit review -> resolved
2016年11月06日 11:19:59python-devsetnosy: + python-dev
messages: + msg280135
2016年11月03日 14:24:03methanesetmessages: + msg279996
2016年11月03日 12:46:20serhiy.storchakasetmessages: + msg279991
stage: patch review -> commit review
2016年11月03日 11:38:11xiang.zhangsetfiles: + issue28123_v6.patch
assignee: vstinner -> serhiy.storchaka
messages: + msg279987
2016年11月03日 11:11:13serhiy.storchakasetmessages: + msg279985
2016年11月01日 01:12:25xiang.zhangsetfiles: + issue28123_v5.patch
2016年10月31日 11:28:42xiang.zhangsetfiles: + issue28123_v4.patch

messages: + msg279780
2016年10月31日 07:53:22serhiy.storchakasetpriority: normal -> high
stage: patch review
2016年10月31日 07:44:05serhiy.storchakasetmessages: + msg279768
2016年10月23日 20:25:07serhiy.storchakalinkissue28457 dependencies
2016年10月18日 15:19:40matrixisesetassignee: vstinner
2016年09月17日 14:24:33xiang.zhangsetfiles: + issue28123_v3.patch

messages: + msg276796
2016年09月16日 13:13:32serhiy.storchakasetnosy: + rhettinger, serhiy.storchaka
2016年09月16日 13:07:49xiang.zhangsetmessages: + msg276708
2016年09月16日 13:00:36vstinnersetmessages: + msg276706
2016年09月16日 07:24:14xiang.zhangsetmessages: + msg276681
2016年09月15日 08:34:53vstinnersetmessages: + msg276533
2016年09月14日 08:58:22xiang.zhangsetfiles: + issue28123_v2.patch

messages: + msg276403
2016年09月13日 10:57:00vstinnersetmessages: + msg276246
2016年09月13日 10:50:09xiang.zhangsetmessages: + msg276243
2016年09月13日 10:45:21vstinnersetmessages: + msg276242
2016年09月13日 10:41:39xiang.zhangsetmessages: + msg276239
2016年09月13日 10:33:56vstinnersetmessages: + msg276237
2016年09月13日 09:54:18xiang.zhangsetfiles: + issue28123.patch
keywords: + patch
messages: + msg276233
2016年09月13日 09:52:45vstinnersetmessages: + msg276232
2016年09月13日 09:51:18xiang.zhangcreate

AltStyle によって変換されたページ (->オリジナル) /