homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: sys.getsizeof() gives an AttributeError for _sre objects.
Type: behavior Stage:
Components: Interpreter Core Versions: Python 3.0, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: schuppenies Nosy List: amaury.forgeotdarc, benjamin.peterson, facundobatista, schuppenies
Priority: normal Keywords: patch

Created on 2008年06月16日 10:51 by schuppenies, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
_sre_sizeof.patch schuppenies, 2008年06月16日 10:51 Patch against 2.6 trunk, revision 64299
sizeof.patch amaury.forgeotdarc, 2008年06月26日 22:08
sizeof2.patch schuppenies, 2008年07月03日 13:31
Messages (15)
msg68266 - (view) Author: Robert Schuppenies (schuppenies) * (Python committer) Date: 2008年06月16日 10:51
>>> import re
>>> import sys
>>> r = re.compile('')
>>> sys.getsizeof(r)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
AttributeError: __sizeof__
This applies to objects of the types _sre.SRE_Pattern,
_sre.SRE_Scanner, and _sre.SRE_Match.
The attached patch addresses this issue.
msg68788 - (view) Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008年06月26日 15:47
Robert, do you have a test suite for the sizeof functionality? If not,
you should start one ASAP, ;)
This test should be included in that suit...
msg68789 - (view) Author: Robert Schuppenies (schuppenies) * (Python committer) Date: 2008年06月26日 16:01
What would be a good way to identify *all* possible types? 
When I started, I included all objects in /Objects, but obviously this
is not sufficient.
msg68801 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008年06月26日 21:10
I think it would be better to give a TypeError rather than an
AttributeError for objects that don't support __sizeof__ as per other
special methods.
msg68802 - (view) Author: Robert Schuppenies (schuppenies) * (Python committer) Date: 2008年06月26日 21:40
The attribute error is caused by pattern_getattr, which tries to find
__sizeof__, fails and then sets the error message. I don't know if
casting the error is the right thing to do. Actually, sys.getsizeof()
should work on any type.
Another suggestion was a that sys.getsizeof allows one optional
argument for a default size. If the default argument is provided,
sys.getsizeof will not throw an exception (if the __sizeof__ method is
missing or for any other error), but instead return the given default
size.
Still, an agreement on the right error message is needed.
msg68804 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008年06月26日 22:08
I wondered why getsizeof fails for _sre.SRE_Pattern objects when it
succeeds for _socket.socket or struct.Struct.
It turns out that _sre.SRE_Pattern defines the tp_getattr slot, and this
prevents attribute lookup from searching the base class (object).
This is a pity, because the base implementation (which use tp_basicsize)
does the right thing for many objects.
So I borrowed some code from the __format__ method, and here is a patch.
Now classes are not handled specially any more; the distinction is
between old-style instances, and all other objects.
All tests pass, but I may have missed something important...
msg68805 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008年06月26日 22:24
Robert,
looking at your patch for the _sre module, I noticed that
MatchObject.__sizeof__ includes the sizeof of one of its member (regs: a
tuple of (begin, end) pairs).
Why this one and not the others? I thought the rule (if there is a rule)
was to include the used memory for members that are *not* python objects.
msg68818 - (view) Author: Robert Schuppenies (schuppenies) * (Python committer) Date: 2008年06月27日 07:12
You are right, the rule is to not include referenced objects.
But, and this has been rather informal up to now, I handled transparently
cached information as something that is added to the memory used by an
object (see unicode.defenc). The same I applied to
MatchObject.regs. The rational being that the user cannot know wether
the match positions are cached or not.
What do you think?
msg68819 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008年06月27日 07:52
I don't understand the relation between "the member is cached" and "it
counts in the object's sizeof". 
What does "cached" mean? 
Does 'self.x = 3' create a cached member?
msg68820 - (view) Author: Robert Schuppenies (schuppenies) * (Python committer) Date: 2008年06月27日 09:05
I was refering to the fact that something on the C level is cached
without the Python user ever noticing that it wasn't there before.
Caching itself is no criteria, but allocating memory without giving the
user a chance to find out should be (in this context).
Maybe I am missing something here, but calling match.regs creates a
tuple which is not there before, but cannot be removed
afterwards. This is why I handled it separately.
msg68821 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008年06月27日 09:49
> Caching itself is no criteria, but allocating memory without giving
> the user a chance to find out should be (in this context).
> ... calling match.regs creates a
> tuple which is not there before, but cannot be removed
> afterwards. This is why I handled it separately.
Well, so why only include the tuple, and not objects inside the tuple?
They may also count in allocated memory (not often: small numbers are
shared)
Does the same criteria apply to function.func_defaults and function.doc
members? Both can be None, sizeof(None) would be added twice.
Would you say the same for property members?
class C(object):
 def setx(self): self.__x = 42
 x = property(lambda self: self.__x)
the value is not there before you call o.setx(), and cannot be removed
afterwards.
IMO, the criteria (to decide whether a container should include a
particular PyObject member in its sizeof) should not include the way the
member is created, or who created it, but only the current layout in
memory. For example: can other objects hold references to this member,
does it appear in gc.objects...
And I propose this simple algorithm: do not include any referenced
PyObject :-)
msg68826 - (view) Author: Robert Schuppenies (schuppenies) * (Python committer) Date: 2008年06月27日 12:22
Okay, I get the point. With including unicode.defenc I already
included a referenced object which was ruled out in the first
place. And this for a good reason.
What bugs me, though, is that this leaves out a potentially
significant amount of memory. I know that this is already the case
for shared objects (e.g. the above mentioned numbers) or malloc
overhead, but adding yet another exception bothers me.
On the other hand, since it's hidden behind the C API, I don't know
how to address this problem. Maybe just give it some text in the
documentation is sufficient.
msg69148 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008年07月02日 23:47
It now works for SRE objects in the py3k branch since r64672, but my
patch is still needed for types that define tp_getattr.
msg69199 - (view) Author: Robert Schuppenies (schuppenies) * (Python committer) Date: 2008年07月03日 13:31
Amaury,
I was testing your patch and it turns out, that it will ignore
any __sizeof__ attribute which may be available through getattr. I
adapted it a bit, so now getsizeof will try to call the method on the
passed object first, and if it fails or the object is a type, the code
proposed by you will be executed. This also deals with old-style class
instances. The match_sizeof function in the patch is just to showcase
the example. What do you think?
msg69767 - (view) Author: Robert Schuppenies (schuppenies) * (Python committer) Date: 2008年07月16日 07:16
Fixed in r64842.
History
Date User Action Args
2022年04月11日 14:56:35adminsetgithub: 47372
2008年07月16日 07:16:12schuppeniessetstatus: open -> closed
resolution: fixed
messages: + msg69767
keywords: patch, patch
2008年07月03日 13:31:48schuppeniessetkeywords: patch, patch
files: + sizeof2.patch
messages: + msg69199
2008年07月02日 23:47:41amaury.forgeotdarcsetkeywords: patch, patch
messages: + msg69148
2008年06月27日 12:22:09schuppeniessetkeywords: patch, patch
messages: + msg68826
2008年06月27日 09:49:01amaury.forgeotdarcsetkeywords: patch, patch
messages: + msg68821
2008年06月27日 09:05:38schuppeniessetkeywords: patch, patch
messages: + msg68820
2008年06月27日 07:52:24amaury.forgeotdarcsetkeywords: patch, patch
messages: + msg68819
2008年06月27日 07:12:14schuppeniessetkeywords: patch, patch
messages: + msg68818
2008年06月26日 22:24:29amaury.forgeotdarcsetkeywords: patch, patch
messages: + msg68805
2008年06月26日 22:08:16amaury.forgeotdarcsetkeywords: patch, patch
files: + sizeof.patch
messages: + msg68804
nosy: + amaury.forgeotdarc
2008年06月26日 21:40:52schuppeniessetkeywords: patch, patch
messages: + msg68802
2008年06月26日 21:10:33benjamin.petersonsetkeywords: patch, patch
nosy: + benjamin.peterson
messages: + msg68801
2008年06月26日 16:01:55schuppeniessetkeywords: patch, patch
messages: + msg68789
2008年06月26日 15:47:08facundobatistasetkeywords: patch, patch
nosy: + facundobatista
messages: + msg68788
2008年06月16日 10:51:25schuppeniescreate

AltStyle によって変換されたページ (->オリジナル) /