homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Improve pickle format for timezone aware datetime instances
Type: enhancement Stage: resolved
Components: Versions: Python 3.6
process
Status: closed Resolution: rejected
Dependencies: 9183 Superseder:
Assigned To: belopolsky Nosy List: alexandre.vassalotti, belopolsky, fdrake, mark.dickinson, ned.deily, pitrou, python-dev, serhiy.storchaka, tim.peters, vstinner
Priority: normal Keywords: patch

Created on 2010年06月21日 20:15 by belopolsky, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue9051.diff belopolsky, 2010年06月22日 03:26 review
issue9051-utc-pickle-proto.diff belopolsky, 2014年06月30日 01:26 review
timezone_test_pickle.patch serhiy.storchaka, 2015年10月24日 09:58 review
Messages (19)
msg108313 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010年06月21日 20:15
>>> s = pickle.dumps(timezone.utc)
>>> pickle.loads(s)
Traceback (most recent call last):
 ..
TypeError: ("Required argument 'offset' (pos 1) not found", <class 'datetime.timezone'>, ())
msg108408 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010年06月22日 18:39
Python version is fixed in sandbox and committed in r82154.
msg108492 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010年06月23日 22:20
Committed in r82184. Leaving the issue open pending a more thorough review of pickling in datetime module.
msg108608 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010年06月25日 15:55
The datetime module provides compact pickled representation for date, datetime, time and timedelta instances:
type: size
date: 34
datetime: 44
time: 36
timedelta: 37
On the other hand, current pickle size for timezone is 64 and the size of an aware datetime instance is 105 bytes.
Since stability is important for pickle format, the best format should be developed before the first release to include timezone class.
msg108611 - (view) Author: Fred Drake (fdrake) (Python committer) Date: 2010年06月25日 16:56
As part of this, we should ensure references to common timezones, like
UTC, only create references to a single instance rather than filling
memory with multiple instances.
One consequence of this is that shared instances should probably be immutable.
msg108619 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010年06月25日 19:29
I am attaching a python prototype implementing interned UTC instance pickling. The patch is against sandbox revision r82218 of datetime.py.
Note that the pickling protocol requires that an instance or factory function is defined at the module level.
The pickle size saving is substantial:
>>> len(dumps(datetime.now(timezone.utc)))
61
>>> len(dumps(datetime.now(timezone.min)))
163
but there is still room for improvement:
>>> len(dumps(datetime.now()))
44
I do feel, however, that further improvements will see diminishing returns.
msg221920 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014年06月29日 23:53
Your latest patch doesn't have a review link. Would you like to regenerate it against the latest default?
msg221933 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014年06月30日 01:26
I updated the patch. Please note that it only includes python code, so you need to suppress _datetime acceleration to test:
>>> import sys
>>> sys.modules['_datetime'] = None
>>> from pickle import *
>>> from datetime import *
>>> dumps(timezone.utc)
b'\x80\x03cdatetime\n_utc\nq\x00.'
msg221934 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014年06月30日 01:29
Hmm. Still no "review" link. Is there something special that I need to do to get it? Use different name, maybe?
msg221935 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2014年06月30日 01:33
FWIW, I see review links for both of your files: all the way over to the right in the Files section.
msg221936 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2014年06月30日 01:37
> I see review links for both of your files
Now I do too. I guess they just take time to appear. Note that I unlinked the file that Antoine complained about.
msg251342 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年09月22日 19:16
For stability it is better to use public name 'timezone.utc' instead of '_utc'. Dotted names now are supported with all protocols.
> but there is still room for improvement:
Don't worry about this. When you pickle a number of datetime objects with the same timezone, the timezone is pickled only once and for all other datetime objects only a reference is used.
msg251668 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2015年09月26日 22:54
> For stability it is better to use public name 'timezone.utc' instead of '_utc'.
Can you elaborate on the "stability" consideration?
I would like to revisit this issue since we will have some changes made to datetime pickles in the context of PEP 495.
msg251677 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年09月27日 03:12
For now "_utc" is not a part of API. When use it in pickling, you implicitly make it a part of API, and should support it forever (and force other Python implementations to support it). As far as it looks as implementation detail, there is a risk that the "_utc" variable can be renamed and this will break pickle compatibility.
msg251812 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2015年09月29日 02:23
> there is a risk that the "_utc" variable can be renamed
> and this will break pickle compatibility.
I think we will still need to maintain _utc global indefinitely to keep the old pickles readable.
On the other hand, it looks like support for qualnames is only available with pickle protocol 4, so I don't see any downside from from storing the full qualname other than an 8-byte increase in the size of the pickle.
I guess my position on this will be +0: I'll accept a patch if someone will submit it, but I am unlikely to do it myself.
msg251814 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年09月29日 02:58
> I think we will still need to maintain _utc global indefinitely to keep the old pickles readable.
We have no need to maintain _utc global indefinitely if don't add it.
> On the other hand, it looks like support for qualnames is only available with pickle protocol 4,
No, support for qualnames now is available with all pickle protocols.
msg253402 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年10月24日 09:58
Following patch adds tests for timezone pickling and copying. Note, that timezone.utc identity is preserved.
msg254721 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015年11月16日 09:21
New changeset e48da8f01316 by Serhiy Storchaka in branch '3.4':
Issue #9051: Added tests for pickling and copying the timezone objects.
https://hg.python.org/cpython/rev/e48da8f01316
New changeset 634905e9628d by Serhiy Storchaka in branch '3.5':
Issue #9051: Added tests for pickling and copying the timezone objects.
https://hg.python.org/cpython/rev/634905e9628d
New changeset 9c31544d6a66 by Serhiy Storchaka in branch 'default':
Issue #9051: Added tests for pickling and copying the timezone objects.
https://hg.python.org/cpython/rev/9c31544d6a66 
msg254724 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年11月16日 09:53
>>> import pickle, pickletools, datetime
>>> len(pickle.dumps([datetime.datetime.utcfromtimestamp(i) for i in range(10000)], 3)) / 10000
30.9283
>>> len(pickle.dumps([datetime.datetime.fromtimestamp(i, datetime.timezone.utc) for i in range(10000)], 3)) / 10000
32.936
>>> len(pickle.dumps([datetime.datetime.utcfromtimestamp(i) for i in range(10000)], 4)) / 10000
19.0074
>>> len(pickle.dumps([datetime.datetime.fromtimestamp(i, datetime.timezone.utc) for i in range(10000)], 4)) / 10000
21.0144
In best case the UTC timezone costs only 2 bytes per datetime instance (BINGET + 1-byte index). In worst case it can cost 5 bytes (LONG_BINGET + 4-bytes index). Saving few bytes for single timezone instance has significant effect only for small pickled data. But in this case it is more advantageous to save a datetime just as a timestamp. I suggest to close this issue.
History
Date User Action Args
2022年04月11日 14:57:02adminsetgithub: 53297
2017年03月06日 17:30:07belopolskysetstatus: pending -> closed
resolution: rejected
stage: patch review -> resolved
2017年03月06日 13:17:56serhiy.storchakasetstatus: open -> pending
2015年11月16日 09:54:00serhiy.storchakasetmessages: + msg254724
2015年11月16日 09:21:18python-devsetnosy: + python-dev
messages: + msg254721
2015年10月24日 09:58:03serhiy.storchakasetfiles: + timezone_test_pickle.patch

messages: + msg253402
stage: needs patch -> patch review
2015年09月29日 02:58:17serhiy.storchakasetmessages: + msg251814
2015年09月29日 02:24:12belopolskysetnosy: + tim.peters
2015年09月29日 02:23:56belopolskysetmessages: + msg251812
2015年09月27日 03:12:50serhiy.storchakasetmessages: + msg251677
2015年09月26日 22:54:31belopolskysetmessages: + msg251668
2015年09月22日 19:16:14serhiy.storchakasetversions: + Python 3.6, - Python 3.5
nosy: + serhiy.storchaka

messages: + msg251342

type: behavior -> enhancement
2014年07月01日 05:35:24rhettingersettitle: Improve pickle format for aware datetime instances -> Improve pickle format for timezone aware datetime instances
2014年06月30日 01:37:16belopolskysetmessages: + msg221936
2014年06月30日 01:33:55ned.deilysetnosy: + ned.deily
messages: + msg221935
2014年06月30日 01:29:39belopolskysetversions: + Python 3.5, - Python 3.3
2014年06月30日 01:29:20belopolskysetmessages: + msg221934
2014年06月30日 01:27:52belopolskysetfiles: - issue9051-utc-pickle-proto.diff
2014年06月30日 01:26:54belopolskysetfiles: + issue9051-utc-pickle-proto.diff

messages: + msg221933
2014年06月29日 23:53:43pitrousetmessages: + msg221920
2014年06月29日 20:08:53belopolskysetnosy: + pitrou, vstinner, alexandre.vassalotti
2011年01月11日 15:16:45belopolskysetnosy: fdrake, mark.dickinson, belopolsky
versions: + Python 3.3, - Python 3.2
2010年07月28日 18:01:38belopolskysetdependencies: + Intern UTC timezone
2010年06月25日 19:29:17belopolskysetfiles: + issue9051-utc-pickle-proto.diff
keywords: + patch
messages: + msg108619
2010年06月25日 16:56:49fdrakesetmessages: + msg108611
2010年06月25日 15:55:07belopolskysetpriority: low -> normal

nosy: + fdrake
title: Cannot pickle timezone instances -> Improve pickle format for aware datetime instances
messages: + msg108608

stage: patch review -> needs patch
2010年06月23日 22:20:32belopolskysetpriority: normal -> low
keywords: - patch, easy
messages: + msg108492
2010年06月22日 18:39:06belopolskysetmessages: + msg108408
2010年06月22日 03:26:51belopolskysetfiles: + issue9051.diff
2010年06月22日 03:26:31belopolskysetfiles: - issue8455.diff
2010年06月22日 03:26:11belopolskysetkeywords: + easy, patch
nosy: + mark.dickinson

files: + issue8455.diff
stage: test needed -> patch review
2010年06月21日 20:15:07belopolskycreate

AltStyle によって変換されたページ (->オリジナル) /