Created on 2010-12-08 15:01 by ocean-city, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| py3k_workaround_for_wcsftime.patch | ocean-city, 2010-12-08 17:57 | |
| main.c | ocean-city, 2010-12-09 10:54 | test code |
| Messages (22) | |||
|---|---|---|---|
| msg123612 | Author: Hirokazu Yamamoto (ocean-city) (Python committer) | Date: 2010-12-08 15:01 | |
The following tests fail on the official Python 3.2 Windows binary. I cannot reproduce this on VC6.
/////////////////////////////////////////////////////
C:\Python32>.\python -m test.regrtest -v test_time test_strptime
[1/2] test_time
test_asctime (test.test_time.TimeTestCase) ... ok
test_asctime_bounding_check (test.test_time.TimeTestCase) ... ok
test_clock (test.test_time.TimeTestCase) ... ok
test_conversions (test.test_time.TimeTestCase) ... ok
test_ctime_without_arg (test.test_time.TimeTestCase) ... ok
test_data_attributes (test.test_time.TimeTestCase) ... ok
test_default_values_for_zero (test.test_time.TimeTestCase) ... ok
test_gmtime_without_arg (test.test_time.TimeTestCase) ... ok
test_insane_timestamps (test.test_time.TimeTestCase) ... ok
test_localtime_without_arg (test.test_time.TimeTestCase) ... ok
test_sleep (test.test_time.TimeTestCase) ... ok
test_strftime (test.test_time.TimeTestCase) ... ok
test_strftime_bounding_check (test.test_time.TimeTestCase) ... ok
test_strptime (test.test_time.TimeTestCase) ... FAIL
test_strptime_bytes (test.test_time.TimeTestCase) ... ok
test_tzset (test.test_time.TimeTestCase) ... ok
test_bug_3061 (test.test_time.TestLocale) ... ok
======================================================================
FAIL: test_strptime (test.test_time.TimeTestCase)
----------------------------------------------------------------------
test test_time crashed -- <class 'UnicodeEncodeError'>: 'cp932' codec can't encode character '\x93' in position 495: illegal multibyte sequence
Traceback (most recent call last):
  File "C:\Python32\lib\test\regrtest.py", line 960, in runtest_inner
    indirect_test()
  File "C:\Python32\lib\test\test_time.py", line 244, in test_main
    support.run_unittest(TimeTestCase, TestLocale)
  File "C:\Python32\lib\test\support.py", line 1146, in run_unittest
    _run_suite(suite)
  File "C:\Python32\lib\test\support.py", line 1120, in _run_suite
    result = runner.run(suite)
  File "C:\Python32\lib\unittest\runner.py", line 173, in run
    result.printErrors()
  File "C:\Python32\lib\unittest\runner.py", line 110, in printErrors
    self.printErrorList('FAIL', self.failures)
  File "C:\Python32\lib\unittest\runner.py", line 117, in printErrorList
    self.stream.writeln("%s" % err)
  File "C:\Python32\lib\unittest\runner.py", line 25, in writeln
    self.write(arg)
UnicodeEncodeError: 'cp932' codec can't encode character '\x93' in position 495: illegal multibyte sequence
[2/2] test_strptime
test_basic (test.test_strptime.getlang_Tests) ... ok
test_am_pm (test.test_strptime.LocaleTime_Tests) ... ok
test_date_time (test.test_strptime.LocaleTime_Tests) ... ok
test_lang (test.test_strptime.LocaleTime_Tests) ... ok
test_month (test.test_strptime.LocaleTime_Tests) ... ok
test_timezone (test.test_strptime.LocaleTime_Tests) ... FAIL
test_weekday (test.test_strptime.LocaleTime_Tests) ... ok
test_blankpattern (test.test_strptime.TimeRETests) ... ok
test_compile (test.test_strptime.TimeRETests) ... FAIL
test_locale_data_w_regex_metacharacters (test.test_strptime.TimeRETests) ... ok
test_matching_with_escapes (test.test_strptime.TimeRETests) ... ok
test_pattern (test.test_strptime.TimeRETests) ... ok
test_pattern_escaping (test.test_strptime.TimeRETests) ... ok
test_whitespace_substitution (test.test_strptime.TimeRETests) ... ok
test_ValueError (test.test_strptime.StrptimeTests) ... ok
test_bad_timezone (test.test_strptime.StrptimeTests) ... ok
test_caseinsensitive (test.test_strptime.StrptimeTests) ... ok
test_date (test.test_strptime.StrptimeTests) ... ok
test_date_time (test.test_strptime.StrptimeTests) ... ok
test_day (test.test_strptime.StrptimeTests) ... ok
test_defaults (test.test_strptime.StrptimeTests) ... ok
test_escaping (test.test_strptime.StrptimeTests) ... ok
test_fraction (test.test_strptime.StrptimeTests) ... ok
test_hour (test.test_strptime.StrptimeTests) ... ok
test_julian (test.test_strptime.StrptimeTests) ... ok
test_minute (test.test_strptime.StrptimeTests) ... ok
test_month (test.test_strptime.StrptimeTests) ... ok
test_percent (test.test_strptime.StrptimeTests) ... ok
test_second (test.test_strptime.StrptimeTests) ... ok
test_time (test.test_strptime.StrptimeTests) ... ok
test_timezone (test.test_strptime.StrptimeTests) ... ERROR
test_unconverteddata (test.test_strptime.StrptimeTests) ... ok
test_weekday (test.test_strptime.StrptimeTests) ... ok
test_year (test.test_strptime.StrptimeTests) ... ok
test_twelve_noon_midnight (test.test_strptime.Strptime12AMPMTests) ... ok
test_all_julian_days (test.test_strptime.JulianTests) ... ok
test_day_of_week_calculation (test.test_strptime.CalculationTests) ... ERROR
test_gregorian_calculation (test.test_strptime.CalculationTests) ... ERROR
test_julian_calculation (test.test_strptime.CalculationTests) ... ERROR
test_week_of_year_and_day_of_week_calculation (test.test_strptime.CalculationTests) ... ok
test_TimeRE_recreation (test.test_strptime.CacheTests) ... ok
test_new_localetime (test.test_strptime.CacheTests) ... ok
test_regex_cleanup (test.test_strptime.CacheTests) ... ok
test_time_re_recreation (test.test_strptime.CacheTests) ... ok
======================================================================
ERROR: test_timezone (test.test_strptime.StrptimeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python32\lib\test\test_strptime.py", line 303, in test_timezone
    strp_output = _strptime._strptime_time(strf_output, "%Z")
  File "C:\Python32\lib\_strptime.py", line 482, in _strptime_time
    tt = _strptime(data_string, format)[0]
  File "C:\Python32\lib\_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)' does not match format '%Z'
======================================================================
ERROR: test_day_of_week_calculation (test.test_strptime.CalculationTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python32\lib\test\test_strptime.py", line 437, in test_day_of_week_calculation
    format_string)
  File "C:\Python32\lib\_strptime.py", line 482, in _strptime_time
    tt = _strptime(data_string, format)[0]
  File "C:\Python32\lib\_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2010 12 08 14 00 342 \x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)' does not match format '%Y %m %d %H %S %j %Z'
======================================================================
ERROR: test_gregorian_calculation (test.test_strptime.CalculationTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python32\lib\test\test_strptime.py", line 423, in test_gregorian_calculation
    format_string)
  File "C:\Python32\lib\_strptime.py", line 482, in _strptime_time
    tt = _strptime(data_string, format)[0]
  File "C:\Python32\lib\_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2010 14 58 01 3 342 \x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)' does not match format '%Y %H %M %S %w %j %Z'
======================================================================
ERROR: test_julian_calculation (test.test_strptime.CalculationTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python32\lib\test\test_strptime.py", line 414, in test_julian_calculation
    format_string)
  File "C:\Python32\lib\_strptime.py", line 482, in _strptime_time
    tt = _strptime(data_string, format)[0]
  File "C:\Python32\lib\_strptime.py", line 337, in _strptime
    (data_string, format))
ValueError: time data '2010 12 08 14 58 01 3 \x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)' does not match format '%Y %m %d %H %M %S %w %Z'
======================================================================
FAIL: test_timezone (test.test_strptime.LocaleTime_Tests)
----------------------------------------------------------------------
test test_strptime crashed -- <class 'UnicodeEncodeError'>: 'cp932' codec can't encode character '\x93' in position 192: illegal multibyte sequence
Traceback (most recent call last):
  File "C:\Python32\lib\test\regrtest.py", line 960, in runtest_inner
    indirect_test()
  File "C:\Python32\lib\test\test_strptime.py", line 557, in test_main
    CacheTests
  File "C:\Python32\lib\test\support.py", line 1146, in run_unittest
    _run_suite(suite)
  File "C:\Python32\lib\test\support.py", line 1120, in _run_suite
    result = runner.run(suite)
  File "C:\Python32\lib\unittest\runner.py", line 173, in run
    result.printErrors()
  File "C:\Python32\lib\unittest\runner.py", line 110, in printErrors
    self.printErrorList('FAIL', self.failures)
  File "C:\Python32\lib\unittest\runner.py", line 117, in printErrorList
    self.stream.writeln("%s" % err)
  File "C:\Python32\lib\unittest\runner.py", line 25, in writeln
    self.write(arg)
UnicodeEncodeError: 'cp932' codec can't encode character '\x93' in position 192: illegal multibyte sequence
2 tests failed:
    test_strptime test_time |
|||
| msg123618 | Author: Brian Curtin (brian.curtin) (Python committer) | Date: 2010-12-08 15:40 | |
I don't see this on a US/English version of Windows 7 with 3.2b1 installed. cp932 is the default on a Japanese version, correct? (I'm not very good with all of this encoding stuff so I don't know how much help I can be) |
|||
| msg123623 | Author: Hirokazu Yamamoto (ocean-city) (Python committer) | Date: 2010-12-08 17:46 | |
I think this is a locale problem. With the "C" locale on Windows, wcsftime doesn't return UTF-16 when the result contains non-ASCII characters. It behaves just like this:
char cbuf[] = "...."; /* contains non-ASCII chars in MBCS */
wchar_t wbuf[sizeof(cbuf)];
for (size_t i = 0; i < sizeof(cbuf); ++i)
    wbuf[i] = (unsigned char)cbuf[i]; /* just copy each byte; the cast avoids sign extension */
/* A non-ASCII char in MBCS uses two bytes but should occupy one wchar_t in UTF-16. In this case it occupies two wchar_t (some strange encoding). */
In Japanese, wcsftime returns the non-ASCII characters of the time zone name in this strange encoding. Python converts the result with
#ifdef HAVE_WCSFTIME
    ret = PyUnicode_FromWideChar(outbuf, buflen);
#else
so the Unicode object will contain data in this strange encoding. This is the cause of the problem. I investigated locales a little, and I learned that the C standard does not guarantee that wchar_t is always UTF-16. |
|||
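The byte-by-byte widening described in msg123623 can be imitated in Python. This is only a sketch: decoding with `latin-1` stands in for the C loop that copies each `char` into a `wchar_t`, the byte string is the one that appears in the tracebacks of msg123612, and the helper names `widen_bytes` and `recover` are made up for illustration.

```python
# cp932 bytes for the time zone name, taken from the ValueError messages.
RAW = b'\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)'


def widen_bytes(raw: bytes) -> str:
    """Map every byte to the code point with the same value (one wchar_t per byte)."""
    return raw.decode('latin-1')


def recover(mojibake: str, codepage: str = 'cp932') -> str:
    """Undo the widening, then decode the bytes with the real ANSI code page."""
    return mojibake.encode('latin-1').decode(codepage)


broken = widen_bytes(RAW)
# One code point per original byte versus one per real character.
print(len(broken), len(recover(broken)))  # 13 8
```

The recovery only works because the widening keeps every byte value intact; it says nothing about which code page was in use, so `cp932` has to be known from elsewhere.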
| msg123624 | Author: Hirokazu Yamamoto (ocean-city) (Python committer) | Date: 2010-12-08 17:57 | |
I'll attach a workaround. I previously confirmed that this works on VS8, but I don't have VS8 now. I hope it still works. |
|||
| msg123625 | Author: Alexander Belopolsky (belopolsky) (Python committer) | Date: 2010-12-08 17:58 | |
> ValueError: time data '2010 14 58 01 3 342 \x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)' does not match format '%Y %H %M %S %w %j %Z'
This looks like valid cp932 data to me
>>> b'2010 14 58 01 3 342 \x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)'.decode('cp932')
'2010 14 58 01 3 342 東京 (標準時)'
Please help me with Japanese, but I think the above means Tokyo timezone. However, strftime should have produced decoded unicode strings, not raw cp932 in a str. What does time.strftime('%Z') return on your system?
|
|||
| msg123626 | Author: Hirokazu Yamamoto (ocean-city) (Python committer) | Date: 2010-12-08 18:12 | |
Here you are.
>>> import time
>>> time.strftime('%Z')
'\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)'
|
|||
| msg123628 | Author: Alexander Belopolsky (belopolsky) (Python committer) | Date: 2010-12-08 18:18 | |
On Wed, Dec 8, 2010 at 1:12 PM, Hirokazu Yamamoto <report@bugs.python.org> wrote:
..
> >>> import time
> >>> time.strftime('%Z')
> '\x93\x8c\x8b\x9e (\x95W\x8f\x80\x8e\x9e)'
Thanks. Please bear with me for one more question: what is
>>> time.tzname
? |
|||
| msg123631 | Author: Hirokazu Yamamoto (ocean-city) (Python committer) | Date: 2010-12-08 18:50 | |
I got readable result. ;-)
>>> import time
>>> time.tzname
('東京 (標準時)', '東京 (標準時)')
|
|||
| msg123639 | Author: Alexander Belopolsky (belopolsky) (Python committer) | Date: 2010-12-08 19:45 | |
On Wed, Dec 8, 2010 at 1:50 PM, Hirokazu Yamamoto <report@bugs.python.org> wrote:
..
> I got readable result. ;-)
You mean readable to *you*. :-)
> >>> import time
> >>> time.tzname
> ('東京 (標準時)', '東京 (標準時)')
This makes sense now. There are two issues here:
1. Decoding the output of wcsftime(). Python expects mbcs (which I believe is an UTF16-like wide char encoding) while Windows apparently puts cp932 there in your locale. I don't have expertise to address this issue.
2. strptime() cannot parse strftime() output when strftime('%Z') is different from time.tzname[dst]. This issue we can address.
Note that for most of the locale information such as day of the week or month names, strptime() relies on strftime() output, so the round-tripping should work even when strftime() results are nonsensical. On the other hand, tz spellings are taken from time.tzname. I think we can make strptime() more robust by adding [time.strftime('%Z', (2000,1,1,0,0,0,0,0,dst)) for dst in (0,1)] to the list of recognized tz names if they differ from time.tzname. |
|||
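The suggestion in msg123639 could be sketched roughly as follows. This is not the code in `_strptime.py`; `candidate_tz_names` is a hypothetical helper, and the time tuple uses a consistent weekday and day of year (Saturday, day 1) instead of the zeros in the message.

```python
import time


def candidate_tz_names():
    """Return time.tzname plus any strftime('%Z') spellings that differ from it."""
    names = list(time.tzname)
    for dst in (0, 1):
        # 2000-01-01 00:00:00, Saturday (tm_wday=5), day 1 of the year.
        spelled = time.strftime('%Z', (2000, 1, 1, 0, 0, 0, 5, 1, dst))
        if spelled and spelled not in names:
            names.append(spelled)
    return names


print(candidate_tz_names())
```

On a system where `strftime('%Z')` and `time.tzname` agree, the list is just `time.tzname`; on the reporter's system it would also carry the mis-decoded spelling, which is what would let the round trip succeed.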
| msg123676 | Author: Hirokazu Yamamoto (ocean-city) (Python committer) | Date: 2010-12-09 10:54 | |
> 1. Decoding the output of wcsftime(). Python expects mbcs (which
> I believe is an UTF16-like wide char encoding) while Windows
> apparently puts cp932 there in your locale. I don't have expertise
> to address this issue.
No, mbcs is not a wide character set (wchar_t*) but an ANSI character set
(char*). In my environment, mbcs == cp932. And Python expects UTF-16.
> 2. strptime() cannot parse strftime() output when strftime('%Z') is
> different from time.tzname[dst]. (snip)
I attached a test program to test the behavior of strftime and wcsftime
under different locales. On VC6, strftime doesn't depend on the locale, whereas
the value returned by wcsftime changes depending on the locale. (I tested only the "C"
locale and the "System" locale because I could not find any other
locales that work in my environment.)
If strftime doesn't depend on the locale and equals tzname
for every locale, maybe strftime is preferable on Windows.
# Can somebody test this on VS9? And with other locales?
|
|||
| msg144419 | Author: STINNER Victor (vstinner) (Python committer) | Date: 2011-09-22 21:09 | |
See also issue #13029. |
|||
| msg145490 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2011-10-14 00:38 | |
New changeset e3d9c5e690fc by Victor Stinner in branch '3.2':
Issue #10653: On Windows, use strftime() instead of wcsftime() because
http://hg.python.org/cpython/rev/e3d9c5e690fc
New changeset 79e60977fc04 by Victor Stinner in branch 'default':
(Merge 3.2) Issue #10653: On Windows, use strftime() instead of wcsftime()
http://hg.python.org/cpython/rev/79e60977fc04 |
|||
| msg145492 | Author: STINNER Victor (vstinner) (Python committer) | Date: 2011-10-14 00:41 | |
It's a bug in the Windows API: I used the workaround suggested by Hirokazu Yamamoto. Thanks Hirokazu! Python 2.7 doesn't use wcsftime() and so it is not affected by this issue. |
|||
| msg145596 | Author: Antoine Pitrou (pitrou) (Python committer) | Date: 2011-10-15 15:42 | |
Crashes on the Windows buildbots:
f:\dd\vctools\crt_bld\self_x86\crt\src\strftime.c(832) : Assertion failed: ( "Invalid format directive" , 0 )
f:\dd\vctools\crt_bld\self_x86\crt\src\strftime.c(484) : Assertion failed: FALSE |
|||
| msg145628 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2011-10-16 17:07 | |
New changeset e3c13a1d2595 by Victor Stinner in branch 'default':
Issue #10653: Fix time.strftime() on Windows, check for invalid format strings
http://hg.python.org/cpython/rev/e3c13a1d2595 |
|||
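To make "check for invalid format strings" concrete, here is one way such a pre-check could look in Python. It is a sketch only: the real change lives in the C code of the changeset, and both the helper name `check_format` and the directive set below are assumptions chosen for illustration, not copied from it.

```python
# Assumed set of directives accepted after '%'; illustrative only.
ASSUMED_DIRECTIVES = frozenset("aAbBcdHIjmMpSUwWxXyYzZ%")


def check_format(fmt: str) -> str:
    """Raise ValueError if any '%' is not followed by an assumed-valid directive."""
    i = fmt.find("%")
    while i != -1:
        directive = fmt[i + 1:i + 2]  # empty string when '%' is the last character
        if directive not in ASSUMED_DIRECTIVES:
            raise ValueError("Invalid format string")
        # Skip the directive character so that '%%' is consumed as one unit.
        i = fmt.find("%", i + 2)
    return fmt


check_format("%Y-%m-%d %H:%M:%S %Z")  # passes silently
```

Rejecting the string before it reaches the C runtime is what avoids the debug assertions quoted in msg145596, since the CRT never sees a directive it does not know.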
| msg145629 | Author: STINNER Victor (vstinner) (Python committer) | Date: 2011-10-16 17:09 | |
> Crashes on the Windows buildbots:
Oops, it should be fixed by my last commits. |
|||
| msg145647 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2011-10-16 21:45 | |
New changeset 977c5753ca32 by Victor Stinner in branch '3.2':
Issue #10653: Fix time.strftime() on Windows, check for invalid format strings
http://hg.python.org/cpython/rev/977c5753ca32 |
|||
| msg243660 | Author: Eryk Sun (eryksun) (Python triager) | Date: 2015-05-20 13:30 | |
This solution no longer works. If the system is configured to use the Japanese system locale and language pack, then 3.4.3 returns codepage 932 mojibake for the "%Z" time zone name. Originally [this approach worked][1] because it called PyUnicode_Decode using the 'mbcs' encoding.
Currently it calls PyUnicode_DecodeLocaleAndSize, which just ends up calling mbstowcs. That's pretty much what wcsftime does. In the default C locale, mbstowcs casts the byte values to wchar_t:
>>> time.strftime('%Z')
'\x91\xbe\x95\xbd\x97m\x89\xc4\x8e\x9e\x8a\xd4'
>>> time.strftime('%Z').encode('latin-1').decode('932')
'太平洋夏時間'
The problem is worse for 3.5 built with VC++ 14. In the new CRT strftime decodes the format string via MultiByteToWideChar, calls _Wcsftime_l, and encodes the result back via WideCharToMultiByte. The outer conversions use the default LC_TIME codepage, which is ANSI (ACP), so they're not the problem. The problem is the internal _mbstowcs_s_l conversion of the ANSI time zone name, which creates the above-shown mojibake 'unicode' string. This is then compounded by calling WideCharToMultiByte on the result:
>>> time.strftime('%Z')
'?????m?A???O'
There's no way to fix this by transcoding. The result is just garbage.
[1]: https://hg.python.org/cpython/file/79e60977fc04/Modules/timemodule.c#l501
|
|||
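The difference between the recoverable 3.4 result and the unrecoverable 3.5 result in msg243660 can be approximated in Python. This is a sketch under a stated assumption: Python's `cp932` codec with `errors='replace'` stands in for the CRT's `WideCharToMultiByte` step, so the exact substitutions differ from the Windows output (Windows maps some characters to best-fit letters such as `A` and `O`).

```python
# Mojibake string from the 3.4.3 session: each cp932 byte widened to one code point.
MOJIBAKE = '\x91\xbe\x95\xbd\x97m\x89\xc4\x8e\x9e\x8a\xd4'

# Still recoverable here, because the original byte values are intact.
intact = MOJIBAKE.encode('latin-1').decode('cp932')

# Assumption: errors='replace' approximates the second, lossy conversion.
lossy = MOJIBAKE.encode('cp932', errors='replace')

print(len(intact))  # 6 characters in the real time zone name
print(lossy)        # mostly b'?' placeholders; the original bytes are gone
```

Once the placeholders have been written, no choice of code page brings the name back, which is the point of "There's no way to fix this by transcoding."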
| msg388237 | Author: Eryk Sun (eryksun) (Python triager) | Date: 2021-03-07 12:36 | |
Update since msg243660: Python 3.8+ now calls setlocale(LC_CTYPE, "") at startup in Windows, as it has always done in POSIX, so decoding the output of strftime("%Z") with PyUnicode_DecodeLocaleAndSize() works again since both agree on using the process active code page.
In 3.7+, per bpo-36779, time.tzname is set when the module is first loaded by directly querying GetTimeZoneInformation(). time.tzset() is still not supported, despite the fact that it was always supported by ucrt, so this value can become stale relative to strftime("%Z").
Starting with Windows 10 v2004 (build 19041), ucrt uses an internal wide-character version of the time-zone name that gets returned by an internal __wide_tzname() call and used for "%Z" in wcsftime(). The wide-character value gets updated by _tzset() and kept in sync with _tzname. |
|||
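A quick way to see whether `time.tzname` and `strftime("%Z")` agree on a given machine is a small report like the one below. It is a diagnostic sketch, not part of any fix; `tz_name_report` is a hypothetical helper and it only queries the locale without changing it.

```python
import locale
import time


def tz_name_report():
    """Collect the current LC_CTYPE setting and both spellings of the time zone name."""
    now = time.localtime()
    from_strftime = time.strftime("%Z", now)
    from_tzname = time.tzname[now.tm_isdst > 0]
    return {
        "lc_ctype": locale.setlocale(locale.LC_CTYPE),  # query only, no change
        "tzname": time.tzname,
        "strftime_Z": from_strftime,
        "matches": from_strftime == from_tzname,
    }


print(tz_name_report())
```

A `False` under `matches` would point at either the stale `time.tzname` case or a decoding mismatch of the kind discussed in this issue.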
| msg388238 | Author: Eryk Sun (eryksun) (Python triager) | Date: 2021-03-07 13:06 | |
> decoding the output of strftime("%Z") with PyUnicode_DecodeLocaleAndSize()
> works again since both agree on using the process active code page
At least it works as much as it ever did. It depends on the process active code page being compatible with the preferred UI language of the current process or thread. For example if the UI language is Japanese ('ja-JP') for the current user, but the process active code page is Latin 1252 (based on the system locale), then the result will be garbage. In that case, given the time-zone name is in Japanese, both LC_TIME and LC_CTYPE have to be changed to "ja-JP" in order to correctly encode (as tzname in ucrt), decode-encode (for strftime in ucrt) and finally decode again via PyUnicode_DecodeLocaleAndSize(). If Python switched back to using wcsftime() in Windows 10 2004+, then the current locale encoding would no longer be a problem for any UI language.
|
|||
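The "change both LC_TIME and LC_CTYPE" step from msg388238 might be tried from Python along these lines. This is a minimal sketch: `strftime_z_with_locale` is a made-up helper, the `"ja-JP"` name follows the message and is expected to exist only on suitably configured Windows systems, and `setlocale` affects the whole process, so the previous settings are restored afterwards.

```python
import locale
import time


def strftime_z_with_locale(name="ja-JP"):
    """Return strftime('%Z') with LC_CTYPE and LC_TIME set to `name`, or None."""
    categories = (locale.LC_CTYPE, locale.LC_TIME)
    saved = {cat: locale.setlocale(cat) for cat in categories}
    try:
        for cat in categories:
            locale.setlocale(cat, name)
        return time.strftime("%Z")
    except locale.Error:
        return None  # the locale name is not available on this system
    finally:
        # Restore the previous settings whether or not the switch succeeded.
        for cat, value in saved.items():
            locale.setlocale(cat, value)


print(strftime_z_with_locale())
```

Because the locale is process-wide state, a helper like this is only safe when no other thread is formatting or parsing locale-dependent data at the same time.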
| msg388279 | Author: STINNER Victor (vstinner) (Python committer) | Date: 2021-03-08 18:25 | |
Eryk Sun: This issue is now closed. If you want to enhance the time module, please open a new issue. |
|||
| msg388292 | Author: Eryk Sun (eryksun) (Python triager) | Date: 2021-03-08 19:52 | |
> Eryk Sun: This issue is now closed. If you want to enhance
> the time module, please open a new issue.
I was aware of that at the time, Victor. The problem can be worked on in a new issue, or in the older issue bpo-8304, which remains open. The two messages that I added are purely informative, to update my original comment in msg243660. |
|||
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:09 | admin | set | github: 54862 |
| 2021-03-08 19:52:06 | eryksun | set | messages: + msg388292 |
| 2021-03-08 18:25:07 | vstinner | set | messages: + msg388279 |
| 2021-03-07 13:06:17 | eryksun | set | messages: + msg388238 |
| 2021-03-07 12:36:19 | eryksun | set | messages: + msg388237 |
| 2019-05-07 00:38:38 | jkloth | set | nosy: + jkloth |
| 2015-05-20 13:30:40 | eryksun | set | nosy: + eryksun; messages: + msg243660; versions: + Python 3.4, Python 3.5 |
| 2011-10-16 21:45:14 | python-dev | set | messages: + msg145647 |
| 2011-10-16 20:07:05 | vstinner | set | status: open -> closed |
| 2011-10-16 17:09:11 | vstinner | set | messages: + msg145629 |
| 2011-10-16 17:07:37 | python-dev | set | messages: + msg145628 |
| 2011-10-15 15:42:24 | pitrou | set | status: closed -> open; nosy: + pitrou; messages: + msg145596; assignee: vstinner |
| 2011-10-14 00:41:19 | vstinner | set | status: open -> closed; resolution: fixed; messages: + msg145492; versions: + Python 3.3 |
| 2011-10-14 00:38:22 | python-dev | set | nosy: + python-dev; messages: + msg145490 |
| 2011-09-22 21:09:42 | vstinner | set | nosy: + vstinner; messages: + msg144419 |
| 2010-12-09 10:54:38 | ocean-city | set | files: + main.c; messages: + msg123676 |
| 2010-12-08 19:45:36 | belopolsky | set | messages: + msg123639 |
| 2010-12-08 18:50:07 | ocean-city | set | messages: + msg123631 |
| 2010-12-08 18:18:54 | belopolsky | set | messages: + msg123628 |
| 2010-12-08 18:12:54 | ocean-city | set | messages: + msg123626 |
| 2010-12-08 17:58:10 | belopolsky | set | messages: + msg123625 |
| 2010-12-08 17:57:08 | ocean-city | set | files: + py3k_workaround_for_wcsftime.patch; keywords: + patch; messages: + msg123624 |
| 2010-12-08 17:46:24 | ocean-city | set | messages: + msg123623 |
| 2010-12-08 17:33:40 | r.david.murray | set | nosy: + belopolsky |
| 2010-12-08 15:40:53 | brian.curtin | set | nosy: + brian.curtin; messages: + msg123618 |
| 2010-12-08 15:01:27 | ocean-city | create | |