This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2008年03月04日 19:49 by jek, last changed 2022年04月11日 14:56 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| inherit_hash.patch | amaury.forgeotdarc, 2008年03月05日 00:00 | restore previous behavior of __hash__ inheritance | ||
| inherit_hash2.patch | amaury.forgeotdarc, 2008年03月05日 00:48 | |||
| inherit_hash3.patch | amaury.forgeotdarc, 2008年04月10日 23:44 | |||
| issue2235_fix_hash_inheritance.diff | ncoghlan, 2008年06月29日 12:25 | Patch to allow classes to explicitly block inheritance of tp_hash | ||
| issue2235_fix_hash_inheritance_with_function_ptr.diff | ncoghlan, 2008年07月06日 07:52 | Block inheritance with PyObject_HashNotImplemented | ||
| Messages (37) | |||
|---|---|---|---|
| msg63258 - (view) | Author: jason kirtland (jek) | Date: 2008年03月04日 19:49 | |
In 2.6a, seems like the __hash__ implementation and __eq__ must be defined together, in the same class. See also #1549. Ensuring that a __hash__ implementation isn't being pulled from a builtin type is probably a sufficient check...? >>> class Base(object): ... def __init__(self, name): ... self.name = name ... def __eq__(self, other): ... return self.name == other.name ... def __hash__(self): ... return hash(self.name) ... >>> class Extended(Base): ... def __eq__(self, other): ... print "trace: __eq__ called" ... return Base.__eq__(self, other) ... >>> hash(Base('b1')) 603887253 >>> hash(Extended('e1')) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'Extended' |
|||
| msg63263 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年03月04日 21:56 | |
This issue is similar to issue1549. Note that only new-style classes are concerned. I think the change was intentional. Guido, do you confirm? However there should be some documentation about it, a NEWS entry or an item in the "porting to 2.6" section. The initial modification appeared in the py3k branch (r51454): """ Change the way __hash__ is inherited; when __eq__ or __cmp__ is overridden but __hash__ is not, set __hash__ explicitly to None (and tp_hash to NULL). """ |
|||
| msg63264 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2008年03月04日 22:15 | |
Wouldn't you expect this sort of thing to break code? Does it meet the criteria for backporting to 2.6? |
|||
| msg63265 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2008年03月04日 22:17 | |
The py3k feature was intentional. I'm not sure who did the backport (could've been me, long ago). I think the backport should be replaced with a warning that is only issued when -3 is given. |
|||
| msg63269 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年03月05日 00:00 | |
The attached patch reverts r59576 and the part of r59106 about the tp_hash slot. It also adds the py3k warning:: type defines __eq__ but not __hash__, and will not be hashable in 3.x printed when calling hash() on such an object. |
|||
| msg63270 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年03月05日 00:05 | |
I noticed that my patch uses Py_TYPE(self)->tp_hash, whereas normal processing of slot_tp_hash() uses lookup_method(self,"__hash__",hash_str). I am almost sure that both expressions return the same value. Is this correct? Py_TYPE(self)->tp_hash seems much faster. |
|||
| msg63271 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2008年03月05日 00:12 | |
> I noticed that my patch uses Py_TYPE(self)->tp_hash, whereas normal > processing of slot_tp_hash() uses lookup_method(self,"__hash__",hash_str). > > I am almost sure that both expressions return the same value. > Is this correct? Py_TYPE(self)->tp_hash seems much faster. HERE BE DRAGONS There are cases where Py_TYPE(self)->tp_hash is the function currently executing (some wrapper(s) in typeobject.c). OTOH, lookup_method(...) finds the Python code in the class __dict__s along the MRO. |
|||
| msg63272 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年03月05日 00:48 | |
Ah, I remember that it took me some time to understand the boundaries between a slot and the corresponding special method. Here is another version of the patch, which does not test tp_hash while we are currently running the tp_hash function... I also added the warning for old-style classes. |
|||
| msg65297 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2008年04月10日 16:46 | |
Hi Amaury, thanks for your work on this. I applied your patch and looked at the code but there seem to be some issues left still. (1) I don't think you need to add a warning to classic classes that define __eq__ without __hash__ -- these aren't hashable and never were. The problem is purely with new classes AFAICT. (2) I can't seen to trigger the warning for new classes (and yes, I specified -3 on the command line :-): >>> class C(object): ... def __eq__(self, other): return False ... >>> hash(C()) -134976284 (3) test_Hashable in test_collections.py fails. This is because the Hashable class believes that list, set, and dict are hashable. Something odd is going on: >>> list.__hash__([]) -135100116 but >>> hash([]) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'list' Note that in 2.5, both raise TypeError. |
|||
| msg65298 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2008年04月10日 16:47 | |
Let's try to have this resolved before the next release. Also let's try not to accidentally merge the 2.6 changes related to this issue into 3.0. :-) |
|||
| msg65299 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2008年04月10日 16:51 | |
Note: some more tests also fail: test_descr, test_quopri, test_set |
|||
| msg65329 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年04月10日 23:44 | |
Here is a new patch, all tests pass. - no warning for classic classes - added tests for -3 warnings I had to add two TPSLOT entries to the slodefs list; can you please check that this is correct (I wanted to reset tp_hash to NULL when __eq__ or __cmp__ are defined). |
|||
| msg66399 - (view) | Author: Barry A. Warsaw (barry) * (Python committer) | Date: 2008年05月08日 03:43 | |
Guido, can you comment on Amaury's latest patch? I'm going to bump this back to critical so as not to hold up 2.6 alpha, but it should be marked as a release blocker for the first beta. |
|||
| msg68792 - (view) | Author: Glyph Lefkowitz (glyph) (Python triager) | Date: 2008年06月26日 16:50 | |
As barry said, this should have been a release blocker for the first beta... ;-) |
|||
| msg68795 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2008年06月26日 17:06 | |
I'm going to have to ask someone else to look at this code. I am too busy with too many things to be able to take on a detailed code review. |
|||
| msg68828 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年06月27日 13:23 | |
Amaury's patch doesn't currently remove the new-in-2.6 code that turns "tp_hash = NULL" at the C level into "__hash__ = None" at the Python level. I suspect that will prove to be a problem (and may be the cause of Django's woes, if I remember Glyph's python-dev posts correctly). I also don't believe the new TPSLOT entries should be necessary - we should be able to just revert all of this tp_hash related code to match the 2.5 implementation, and then see if we can find a way to emit the Py3k warning when __cmp__ or __eq__ are overridden without overriding __hash__ without breaking any existing code (if we can't, we may have to figure out a way to get 2to3 to check for it). P.S. you'd think I would have learnt by now not to try to make sense of typeobject.c when I'm tired ;) |
|||
| msg68839 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2008年06月27日 18:05 | |
Thanks for giving this some time. I think that backwards compatibility should be a higher priority than being able to issue -3 warnings -- if the warnings can't be generated, too bad, we'll just have to document it. (I don't think that checking things at class scope is an easy task in 2to3, though I may be underestimating it.) So, concluding, insofar as the proposal is to revert 2.6 to the 2.5 semantics where it comes to __eq__ and __hash__, I'm okay with that if there's no other way to maintain backwards compatibility, even though my preference would be to still flag this when -3 is specified. (There may be similar backwards compatibility issues caused by stricted arg checking in __new__ and __init__ -- the same remarks apply there.) |
|||
| msg68866 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年06月28日 00:56 | |
The _new__/__init__ stuff showed up on my diff as well (obviously) - that seemed to be doing a pretty good job of maintaining backwards compatibility. I'll dig into this further today and tomorrow - I'll probably start by reverting the relevant code back to exactly what is in 2.5, and then see how far I can get with adding the warning code back in while still keeping jek's test case working correctly. |
|||
| msg68937 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年06月29日 08:34 | |
Well, I think I figured out why the __hash__ changes were backported from Py3k: without them, the existence of object.__hash__ makes the Hashable ABC completely useless (every newstyle class meets it). I see two options here. Option 1 is to revert the changes so __hash__ behave entirely as it did in 2.5 and also delete collections.Hashable from 2.6 (which may break other backported Py3k items). Doesn't seem like a particularly good idea. Option 2 is what I have implemented locally (and will be uploading soon): changing typeobject.c to allow Py_None to be stored in type method slots, and have it convert correctly to None at the Python level. Classes that don't want to inherit the default object.tp_hash implementation can then write "__hash__ = None" or "(hashfunc)Py_None" to block the inheritance. |
|||
| msg68948 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年06月29日 12:25 | |
Attached patch allows classes to explicitly block inheritance of object.__hash__ by setting the tp_hash slot to Py_None or the __hash__ attribute to None. This is similar to the approach used in Py3k, but allows __hash__ to be inherited by default, with classes explicitly disabling it as appropriate. I'd also suggest forward porting this approach to Py3k as well, and then possibly look at replacing a copied tp_hash slot with Py_None when the inherited tp_hash is object.tp_hash, but tp_cmp or tp_richcompare are different. |
|||
| msg68949 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年06月29日 13:30 | |
Unassigning for the moment - I'll take a look at adding the Py3k warning back in at some point, but the main thing I would like now is a sanity check from Guido or Barry (or anyone else happy to dig into the guts of typeobject.c) that the basic idea behind my fix is sound. It would be nice to avoid the explicit check for Py_None in PyObject_Compare by getting update_one_slot() to replace the Py_None it encounters in __hash__ with the default slot_tp_hash implementation, but my attempts at doing that have all led to segfaults. (Also, glyph, I'd love to know if my patch helps to clear up Django's issues with 2.6b1) |
|||
| msg69096 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年07月02日 14:42 | |
Suggestion from GvR (one I like): instead of re-using Py_None, add a new C function that is stored in the tp_hash slot as a sentinel instead of the Py_None value used in the posted version of the patch. This will avoid breaking code that just checks for NULL before calling the tp_hash slot directly, while still allowing typeobject.c to detect that the object isn't actually hashable despite the presence of a non-NULL value in the tp_hash slot. (I'll keep the None at the Python level though, since that matches the Py3k behaviour and plays nicely with collections.Hashable) |
|||
| msg69101 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年07月02日 15:15 | |
I don't like the changes in listobject.c and dictobject.c: they seem to imply that all unhashable types must be modified when upgrading to 2.6. |
|||
| msg69129 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年07月02日 21:43 | |
Unhashable types that implement an "I'm not hashable" __hash__ method will indeed require modification in 2.6 in order to avoid incorrectly passing an "isinstance(obj, collections.Hashable)" check (note that several of the mutable standard library types such as collections.deque are incorrectly detected as hashable in the current SVN trunk). |
|||
| msg69137 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年07月02日 22:32 | |
Isn't that a serious compatibility issue? 2.5 code should work on 2.6 with minimal effort. For deque objects, I don't get your point: Python 2.6b1+ (trunk, Jul 1 2008, 22:35:48) [MSC v.1500 32 bit Intel)] on win32 >>> from collections import deque >>> hash(deque()) TypeError: deque objects are unhashable |
|||
| msg69190 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年07月03日 10:50 | |
As far as deque goes, the following behaviour on the trunk is the problem I am trying to fix: Python 2.6b1+ (trunk:64655, Jul 2 2008, 22:48:24) [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from collections import deque, Hashable >>> isinstance(deque(), Hashable) True All of the container types that my patch touches were already unhashable in 2.5 - my patch just ensures that they all correctly return false for isinstance(obj, collections.Hashable) even after we make object.__hash__ inherited by default again. This is also the reason why simply reverting the trunk hash implementation to the 2.5 behaviour (which is what I first tried) is inadequate. Since collections.Hashable is new in 2.6, I can live with it returning the wrong answer for types which define __hash__ to always raise an error (that's a known limitation of the ABC system, even in Py3k). However, we should at least make sure it returns the correct answer for all of the types in the standard library. |
|||
| msg69324 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年07月06日 07:52 | |
Attached updated patch which implements Guido's suggestion from above (inheritance of __hash__ is blocked at the C level by setting the slot to PyObject_HashNotImplemented and at the Python level by setting __hash__ = None). All tests which don't depend on a -u resource flag still pass (currently running a -uall set of tests as well). |
|||
| msg69325 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年07月06日 08:09 | |
Two failures in the -uall run (test_codecmaps_hk, test_linuxaudiodev). However, I see those test failures even after reverting my changes, so they aren't related to this. |
|||
| msg69691 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年07月15日 15:49 | |
Fixed for 2.6 in r64962 and the underlying PyObject_HashNotImplemented mechanic forward ported to 3.0 in r64967. Still need to update the C API documentation and the library reference, as well as implement the Py3k warning in 2.6, but I won't get to those for this beta - dropping priority to deferred blocker. |
|||
| msg70748 - (view) | Author: Guido van Rossum (gvanrossum) * (Python committer) | Date: 2008年08月05日 16:06 | |
How's this going? |
|||
| msg70777 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年08月06日 09:44 | |
The documentation and Py3k warning will be ready for the third beta next week (the remaining part is a lot easier than the initial fix). |
|||
| msg71021 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年08月11日 15:48 | |
Py3k warnings for this have been checked in on the trunk as r65642. There's an unfortunate problem with having to define a fixed stack level for the warnings - for classes with a metaclass implemented in Python, the warning message will point to the __new__ method of the metaclass instead of to the first line of the offending class definition. I don't see a way around this problem, but it definitely caused me grief in tracking down some of the misbehaving classes in the standard library that use ABCMeta as their metaclass in 2.6 (I actually cheated and recompiled with the stack level on the warnings set to 2 instead of 1). The update also eliminates the spurious and not-so-spurious Py3k warnings that this update triggered in the test suite under the -3 flag. Most were just test suite classes that happen to become unhashable in Py3k, but a few were classes in the standard lib itself that defined __eq__ or __cmp__ methods that were incompatible with the default __hash__ implementation (collections.Mapping, collections.Set, unittest.TestSuite, xml.dom.minidom.NamedNodeMap, numbers.Number, UserList.UserList). Instances of all of the affected classes are now explicitly unhashable in 2.x as well as 3.x (note that any code which relied on any of them being hashable was already broken due to the fact that these classes didn't provide the correct hash and equality invariants). The numbers.Number change deserves special mention - the actual warning occurs further down in the number stack (when numbers.Complex defines a new __eq__ method), but it seemed appropriate to block the inheritance of object.__hash__ in Number since an id() based hash isn't appropriate for any numeric type. This particular aspect of the change should probably be forward ported to Py3k. In implementing the warnings, I'm also pretty happy with the current Py3k approach that overriding __cmp__ or __eq__ means that __hash__ isn't inherited *at all*, rather than restricting the block solely to object.__hash__. The current behaviour is simple to explain, and more importantly, I think it is more correct - any time you change the definition of equality for the class, you really do need to think carefully about what it means for mutability and hashability. P.S. I should never have said this part of the change was going to be easy, because that was just begging old Murphy to come slap me upside the head... |
|||
| msg71022 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年08月11日 15:53 | |
I still intend to do the necessary documentation updates for this, but they're going to be delayed a bit since getting the warnings right took much longer than I expected. |
|||
| msg71223 - (view) | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2008年08月16日 16:23 | |
Let's lower the priority a little then. |
|||
| msg71325 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年08月18日 13:21 | |
numbers.Number change forward ported to Py3k in r65808 Docs updated for 2.6 and 3.0 in r65810 and r65811 respectively. Which means I can finally close this one :) |
|||
| msg72114 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年08月28日 21:48 | |
Reopening - still need to fix the Python level docs for hash() and __hash__(). |
|||
| msg72207 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2008年08月31日 13:22 | |
__hash__() documentation fixed for 2.6 in r66085 and for 3.0 in r66086. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:56:31 | admin | set | github: 46488 |
| 2008年08月31日 13:22:05 | ncoghlan | set | status: open -> closed resolution: fixed messages: + msg72207 |
| 2008年08月28日 21:48:38 | ncoghlan | set | status: closed -> open resolution: fixed -> (no value) messages: + msg72114 |
| 2008年08月18日 13:21:15 | ncoghlan | set | status: open -> closed resolution: fixed messages: + msg71325 |
| 2008年08月16日 16:23:58 | benjamin.peterson | set | priority: release blocker -> critical nosy: + benjamin.peterson messages: + msg71223 |
| 2008年08月11日 15:53:44 | ncoghlan | set | messages: + msg71022 |
| 2008年08月11日 15:48:31 | ncoghlan | set | messages: + msg71021 |
| 2008年08月06日 09:44:26 | ncoghlan | set | messages: + msg70777 |
| 2008年08月05日 16:06:41 | gvanrossum | set | messages: + msg70748 |
| 2008年07月18日 03:45:40 | barry | set | priority: deferred blocker -> release blocker |
| 2008年07月15日 15:49:22 | ncoghlan | set | priority: release blocker -> deferred blocker messages: + msg69691 |
| 2008年07月06日 08:09:53 | ncoghlan | set | messages: + msg69325 |
| 2008年07月06日 07:52:36 | ncoghlan | set | files:
+ issue2235_fix_hash_inheritance_with_function_ptr.diff messages: + msg69324 |
| 2008年07月03日 10:50:33 | ncoghlan | set | assignee: ncoghlan messages: + msg69190 |
| 2008年07月02日 22:32:05 | amaury.forgeotdarc | set | messages: + msg69137 |
| 2008年07月02日 21:43:23 | ncoghlan | set | messages: + msg69129 |
| 2008年07月02日 15:15:24 | amaury.forgeotdarc | set | messages: + msg69101 |
| 2008年07月02日 14:42:18 | ncoghlan | set | messages: + msg69096 |
| 2008年06月29日 13:30:52 | ncoghlan | set | assignee: ncoghlan -> (no value) messages: + msg68949 |
| 2008年06月29日 12:25:39 | ncoghlan | set | files:
+ issue2235_fix_hash_inheritance.diff messages: + msg68948 |
| 2008年06月29日 08:34:33 | ncoghlan | set | messages: + msg68937 |
| 2008年06月28日 01:02:45 | ncoghlan | set | assignee: ncoghlan |
| 2008年06月28日 00:56:16 | ncoghlan | set | messages: + msg68866 |
| 2008年06月27日 18:05:09 | gvanrossum | set | messages: + msg68839 |
| 2008年06月27日 13:23:19 | ncoghlan | set | nosy:
+ ncoghlan messages: + msg68828 |
| 2008年06月26日 17:06:12 | gvanrossum | set | assignee: gvanrossum -> (no value) messages: + msg68795 |
| 2008年06月26日 16:58:44 | schmir | set | nosy: + schmir |
| 2008年06月26日 16:50:23 | glyph | set | priority: critical -> release blocker nosy: + glyph messages: + msg68792 |
| 2008年05月08日 03:43:12 | barry | set | priority: release blocker -> critical nosy: + barry messages: + msg66399 |
| 2008年04月15日 05:56:19 | nnorwitz | set | assignee: amaury.forgeotdarc -> gvanrossum |
| 2008年04月10日 23:44:06 | amaury.forgeotdarc | set | files:
+ inherit_hash3.patch messages: + msg65329 |
| 2008年04月10日 16:51:50 | gvanrossum | set | messages: + msg65299 |
| 2008年04月10日 16:47:39 | gvanrossum | set | priority: high -> release blocker messages: + msg65298 |
| 2008年04月10日 16:46:57 | gvanrossum | set | assignee: amaury.forgeotdarc messages: + msg65297 |
| 2008年03月16日 21:00:11 | eikeon | set | nosy: + eikeon |
| 2008年03月05日 00:48:31 | amaury.forgeotdarc | set | files:
+ inherit_hash2.patch messages: + msg63272 |
| 2008年03月05日 00:12:29 | gvanrossum | set | messages: + msg63271 |
| 2008年03月05日 00:05:33 | amaury.forgeotdarc | set | messages: + msg63270 |
| 2008年03月05日 00:00:51 | amaury.forgeotdarc | set | files:
+ inherit_hash.patch keywords: + patch messages: + msg63269 |
| 2008年03月04日 22:20:13 | rhettinger | set | priority: high |
| 2008年03月04日 22:17:35 | gvanrossum | set | messages: + msg63265 |
| 2008年03月04日 22:15:27 | rhettinger | set | nosy:
+ rhettinger messages: + msg63264 |
| 2008年03月04日 21:56:09 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc, gvanrossum type: crash -> behavior messages: + msg63263 |
| 2008年03月04日 19:49:23 | jek | create | |