This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2008年06月16日 10:51 by schuppenies, last changed 2022年04月11日 14:56 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| _sre_sizeof.patch | schuppenies, 2008年06月16日 10:51 | Patch against 2.6 trunk, revision 64299 | ||
| sizeof.patch | amaury.forgeotdarc, 2008年06月26日 22:08 | |||
| sizeof2.patch | schuppenies, 2008年07月03日 13:31 | |||
| Messages (15) | |||
|---|---|---|---|
| msg68266 - (view) | Author: Robert Schuppenies (schuppenies) * (Python committer) | Date: 2008年06月16日 10:51 | |
>>> import re
>>> import sys
>>> r = re.compile('')
>>> sys.getsizeof(r)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: __sizeof__
This applies to objects of the types _sre.SRE_Pattern,
_sre.SRE_Scanner, and _sre.SRE_Match.
The attached patch addresses this issue.
|
|||
| msg68788 - (view) | Author: Facundo Batista (facundobatista) * (Python committer) | Date: 2008年06月26日 15:47 | |
Robert, do you have a test suite for the sizeof functionality? If not, you should start one ASAP, ;) This test should be included in that suit... |
|||
| msg68789 - (view) | Author: Robert Schuppenies (schuppenies) * (Python committer) | Date: 2008年06月26日 16:01 | |
What would be a good way to identify *all* possible types? When I started, I included all objects in /Objects, but obviously this is not sufficient. |
|||
| msg68801 - (view) | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2008年06月26日 21:10 | |
I think it would be better to give a TypeError rather than an AttributeError for objects that don't support __sizeof__ as per other special methods. |
|||
| msg68802 - (view) | Author: Robert Schuppenies (schuppenies) * (Python committer) | Date: 2008年06月26日 21:40 | |
The attribute error is caused by pattern_getattr, which tries to find __sizeof__, fails and then sets the error message. I don't know if casting the error is the right thing to do. Actually, sys.getsizeof() should work on any type. Another suggestion was a that sys.getsizeof allows one optional argument for a default size. If the default argument is provided, sys.getsizeof will not throw an exception (if the __sizeof__ method is missing or for any other error), but instead return the given default size. Still, an agreement on the right error message is needed. |
|||
| msg68804 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年06月26日 22:08 | |
I wondered why getsizeof fails for _sre.SRE_Pattern objects when it succeeds for _socket.socket or struct.Struct. It turns out that _sre.SRE_Pattern defines the tp_getattr slot, and this prevents attribute lookup from searching the base class (object). This is a pity, because the base implementation (which use tp_basicsize) does the right thing for many objects. So I borrowed some code from the __format__ method, and here is a patch. Now classes are not handled specially any more; the distinction is between old-style instances, and all other objects. All tests pass, but I may have missed something important... |
|||
| msg68805 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年06月26日 22:24 | |
Robert, looking at your patch for the _sre module, I noticed that MatchObject.__sizeof__ includes the sizeof of one of its member (regs: a tuple of (begin, end) pairs). Why this one and not the others? I thought the rule (if there is a rule) was to include the used memory for members that are *not* python objects. |
|||
| msg68818 - (view) | Author: Robert Schuppenies (schuppenies) * (Python committer) | Date: 2008年06月27日 07:12 | |
You are right, the rule is to not include referenced objects. But, and this has been rather informal up to now, I handled transparently cached information as something that is added to the memory used by an object (see unicode.defenc). The same I applied to MatchObject.regs. The rational being that the user cannot know wether the match positions are cached or not. What do you think? |
|||
| msg68819 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年06月27日 07:52 | |
I don't understand the relation between "the member is cached" and "it counts in the object's sizeof". What does "cached" mean? Does 'self.x = 3' create a cached member? |
|||
| msg68820 - (view) | Author: Robert Schuppenies (schuppenies) * (Python committer) | Date: 2008年06月27日 09:05 | |
I was refering to the fact that something on the C level is cached without the Python user ever noticing that it wasn't there before. Caching itself is no criteria, but allocating memory without giving the user a chance to find out should be (in this context). Maybe I am missing something here, but calling match.regs creates a tuple which is not there before, but cannot be removed afterwards. This is why I handled it separately. |
|||
| msg68821 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年06月27日 09:49 | |
> Caching itself is no criteria, but allocating memory without giving > the user a chance to find out should be (in this context). > ... calling match.regs creates a > tuple which is not there before, but cannot be removed > afterwards. This is why I handled it separately. Well, so why only include the tuple, and not objects inside the tuple? They may also count in allocated memory (not often: small numbers are shared) Does the same criteria apply to function.func_defaults and function.doc members? Both can be None, sizeof(None) would be added twice. Would you say the same for property members? class C(object): def setx(self): self.__x = 42 x = property(lambda self: self.__x) the value is not there before you call o.setx(), and cannot be removed afterwards. IMO, the criteria (to decide whether a container should include a particular PyObject member in its sizeof) should not include the way the member is created, or who created it, but only the current layout in memory. For example: can other objects hold references to this member, does it appear in gc.objects... And I propose this simple algorithm: do not include any referenced PyObject :-) |
|||
| msg68826 - (view) | Author: Robert Schuppenies (schuppenies) * (Python committer) | Date: 2008年06月27日 12:22 | |
Okay, I get the point. With including unicode.defenc I already included a referenced object which was ruled out in the first place. And this for a good reason. What bugs me, though, is that this leaves out a potentially significant amount of memory. I know that this is already the case for shared objects (e.g. the above mentioned numbers) or malloc overhead, but adding yet another exception bothers me. On the other hand, since it's hidden behind the C API, I don't know how to address this problem. Maybe just give it some text in the documentation is sufficient. |
|||
| msg69148 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2008年07月02日 23:47 | |
It now works for SRE objects in the py3k branch since r64672, but my patch is still needed for types that define tp_getattr. |
|||
| msg69199 - (view) | Author: Robert Schuppenies (schuppenies) * (Python committer) | Date: 2008年07月03日 13:31 | |
Amaury, I was testing your patch and it turns out, that it will ignore any __sizeof__ attribute which may be available through getattr. I adapted it a bit, so now getsizeof will try to call the method on the passed object first, and if it fails or the object is a type, the code proposed by you will be executed. This also deals with old-style class instances. The match_sizeof function in the patch is just to showcase the example. What do you think? |
|||
| msg69767 - (view) | Author: Robert Schuppenies (schuppenies) * (Python committer) | Date: 2008年07月16日 07:16 | |
Fixed in r64842. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:56:35 | admin | set | github: 47372 |
| 2008年07月16日 07:16:12 | schuppenies | set | status: open -> closed resolution: fixed messages: + msg69767 keywords: patch, patch |
| 2008年07月03日 13:31:48 | schuppenies | set | keywords:
patch, patch files: + sizeof2.patch messages: + msg69199 |
| 2008年07月02日 23:47:41 | amaury.forgeotdarc | set | keywords:
patch, patch messages: + msg69148 |
| 2008年06月27日 12:22:09 | schuppenies | set | keywords:
patch, patch messages: + msg68826 |
| 2008年06月27日 09:49:01 | amaury.forgeotdarc | set | keywords:
patch, patch messages: + msg68821 |
| 2008年06月27日 09:05:38 | schuppenies | set | keywords:
patch, patch messages: + msg68820 |
| 2008年06月27日 07:52:24 | amaury.forgeotdarc | set | keywords:
patch, patch messages: + msg68819 |
| 2008年06月27日 07:12:14 | schuppenies | set | keywords:
patch, patch messages: + msg68818 |
| 2008年06月26日 22:24:29 | amaury.forgeotdarc | set | keywords:
patch, patch messages: + msg68805 |
| 2008年06月26日 22:08:16 | amaury.forgeotdarc | set | keywords:
patch, patch files: + sizeof.patch messages: + msg68804 nosy: + amaury.forgeotdarc |
| 2008年06月26日 21:40:52 | schuppenies | set | keywords:
patch, patch messages: + msg68802 |
| 2008年06月26日 21:10:33 | benjamin.peterson | set | keywords:
patch, patch nosy: + benjamin.peterson messages: + msg68801 |
| 2008年06月26日 16:01:55 | schuppenies | set | keywords:
patch, patch messages: + msg68789 |
| 2008年06月26日 15:47:08 | facundobatista | set | keywords:
patch, patch nosy: + facundobatista messages: + msg68788 |
| 2008年06月16日 10:51:25 | schuppenies | create | |