I am puzzled by an inconsistency when calling .strftime() for dates which are pre-1000 AD, using Python's datetime module.
Take the following example:
import datetime
old_date = datetime.date(year=33, month=3, day=28) # 28th March 33AD
old_date.isoformat()
>>> "0033-03-28" # Fine!
old_date.strftime("%Y-%m-%d")
>>> "33-03-28" # Woah - where did my leading zeros go?
# And even worse
datetime.datetime.strptime(old_date.strftime("%Y-%m-%d"), "%Y-%m-%d")
>>>
...
File "<input>", line 1, in <module>
File "/usr/lib/python3.12/_strptime.py", line 554, in _strptime_datetime
tt, fraction, gmtoff_fraction = _strptime(data_string, format)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/_strptime.py", line 333, in _strptime
raise ValueError("time data %r does not match format %r" %
ValueError: time data '33-03-28' does not match format '%Y-%m-%d'
The documentation shows examples of %Y yielding zero-padded years. Even using %G, which is documented to be an ISO-8601 4-digit year, is showing only two digits.
This caused a problem in an application where a user can enter a date, and if they type in an old date the exception above would arise when trying to convert a date-string back into a date.
Presumably there is something in my local configuration which is causing this, as this seems too obvious to be a bug in Python. I'm using Python 3.12 on Ubuntu 24.04.
This is caused by the C library's implementation of strftime() on Linux, which omits leading zeros from %Y and %G. There is a related issue in CPython's issue tracker.
Thanks to jonrsharpe's comment for the answer, and for highlighting the relevant section of the documentation.
This issue stems from Python (and Linux in general) using GNU libc (glibc) on Linux-based OSes. In glibc, if you need a four-digit year you can use %4Y, and this also works in a Python build that uses glibc. However, %4Y is a non-portable glibc extension, so it will cause an error on systems that do not use glibc. This includes, but is not limited to, Windows.
Examples
On Linux
>>> old_date.strftime("%4Y-%m-%d")
'0033-03-28'
On Windows
>>> old_date.strftime("%4Y-%m-%d")
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
old_date.strftime("%4Y-%m-%d")
ValueError: Invalid format string
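The difference between the two outputs above can also be detected at runtime instead of being assumed from the platform. The following is only a sketch: `supports_padded_year` and `year_directive` are hypothetical names chosen for illustration, and the probe simply tries the `%4Y` directive on a known date and checks what comes back.

```python
import datetime


def supports_padded_year() -> bool:
    """Return True if strftime() honours the glibc-style '%4Y' directive.

    Hypothetical helper for illustration; not part of the datetime module.
    """
    probe = datetime.date(year=33, month=1, day=1)
    try:
        # glibc returns '0033'; other C libraries may return something else.
        return probe.strftime("%4Y") == "0033"
    except ValueError:
        # Some platforms (e.g. Windows, as shown above) reject the directive.
        return False


# Pick the year directive once, based on what the platform actually does.
year_directive = "%4Y" if supports_padded_year() else "%Y"
```

Probing the behaviour directly avoids guessing which C library is in use, at the cost of one extra strftime() call at start-up.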
Portability
If you need portability, either because you need to run on multiple platforms, or do not know which platform your code will be running on, then you can do the following:
import datetime
import platform
# using libc_ver() means you are not using the OS as a proxy for which libc
# library Python is using. And so your code will work on any OS, regardless
# of which libc library has been used.
if platform.libc_ver()[0] == 'glibc':
    # use a glibc-only format option to get a 4 digit year
    date_format = '%4Y-%m-%d'
else:
    date_format = '%Y-%m-%d'
old_date = datetime.date(year=33, month=3, day=28)
assert '0033-03-28' == old_date.strftime(date_format)
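If the goal is only an ISO-style date string that round-trips, one alternative is to avoid strftime() and strptime() for the year altogether. This is a minimal sketch, assuming the YYYY-MM-DD format is all that is needed; `format_iso_date` and `parse_iso_date` are hypothetical helper names, and the zero-padding is done by Python's own string formatting rather than by the C library.

```python
import datetime


def format_iso_date(d: datetime.date) -> str:
    """Format a date as YYYY-MM-DD with a zero-padded four-digit year."""
    # Python's format spec does the padding, so the C library is not involved.
    return f"{d.year:04d}-{d.month:02d}-{d.day:02d}"


def parse_iso_date(text: str) -> datetime.date:
    """Parse a YYYY-MM-DD string back into a date."""
    return datetime.date.fromisoformat(text)


old_date = datetime.date(year=33, month=3, day=28)
text = format_iso_date(old_date)
assert text == "0033-03-28"
assert parse_iso_date(text) == old_date
```

For this particular format, `old_date.isoformat()` gives the same string, as the question already shows; the explicit f-string is mainly useful when other fields need to be mixed in.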