This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author Roman Akopov
Recipients Roman Akopov, SilentGhost, ezio.melotti, vstinner
Date 2020-06-02.20:38:26
Message-id <1591130306.47.0.245958346607.issue40845@roundup.psfhosted.org>
In-reply-to
Content
This is how I extract the data from the Common Locale Data Repository (CLDR) v37.
The script assumes common\main as the working directory.
from os import walk
from os.path import join
from xml.etree import ElementTree

# English display names, used to pick a language by its English name
en_root = ElementTree.parse('en.xml')
for (dirpath, dirnames, filenames) in walk('.'):
    for filename in filenames:
        if filename.endswith('.xml'):
            code = filename[:-4]  # locale code, e.g. 'chr' from 'chr.xml'
            xx_root = ElementTree.parse(join(dirpath, filename))
            path = "localeDisplayNames/languages/language[@type='" + code + "']"
            xx_lang = xx_root.find(path)
            en_lang = en_root.find(path)
            # find() returns None when a file has no entry for this code
            if en_lang is None or xx_lang is None:
                continue
            if en_lang.text == 'Cherokee':
                print(en_lang.text)
                print(xx_lang.text)
                print(xx_lang.text.encode("unicode_escape"))
                print(xx_lang.text.encode('idna'))
                print(ord(xx_lang.text[0]))
                print(ord(xx_lang.text[1]))
                print(ord(xx_lang.text[2]))
The script outputs:
Cherokee
ᏣᎳᎩ
b'\\u13e3\\u13b3\\u13a9'
b'xn--tz9ata7l'
5091
5043
5033
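The lookup used by the script can be tried without a CLDR checkout. The following is only a minimal sketch: SAMPLE_LDML is a hypothetical, heavily trimmed stand-in for a file such as chr.xml, and display_name is a helper name introduced here for illustration, not something from CLDR or from the script above.

```python
from xml.etree import ElementTree

# Hypothetical, heavily trimmed stand-in for a CLDR locale file (assumption).
SAMPLE_LDML = """<ldml>
  <localeDisplayNames>
    <languages>
      <language type="chr">\u13e3\u13b3\u13a9</language>
      <language type="en">English</language>
    </languages>
  </localeDisplayNames>
</ldml>"""


def display_name(root, code):
    """Return the display name stored for a language code, or None if absent."""
    element = root.find(
        f"localeDisplayNames/languages/language[@type='{code}']"
    )
    return None if element is None else element.text


root = ElementTree.fromstring(SAMPLE_LDML)
name = display_name(root, "chr")
missing = display_name(root, "xx")  # no such entry, so this is None

print(name)
print([ord(ch) for ch in name])
print(missing)
```

Returning None for a missing entry keeps the caller from hitting an AttributeError on .text when a locale file has no entry for a given code.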
If I change the text to lower case:
 print(en_lang.text.lower())
 print(xx_lang.text.lower())
 print(xx_lang.text.lower().encode("unicode_escape"))
 print(xx_lang.text.lower().encode('idna'))
 print(ord(xx_lang.text.lower()[0]))
 print(ord(xx_lang.text.lower()[1]))
 print(ord(xx_lang.text.lower()[2]))
then the script outputs:
cherokee
ꮳꮃꭹ
b'\\uabb3\\uab83\\uab79'
b'xn--tz9ata7l'
43955
43907
43897
I am not sure where you get the '\u13e3\u13b3\u13a9' string from. '\u13e3\u13b3\u13a9'.lower().encode('unicode_escape') gives b'\\uabb3\\uab83\\uab79'.
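The case mapping described above can be checked on its own, without any CLDR data. This is a small sketch that assumes a Python 3 build whose Unicode database includes the Cherokee lowercase letters; code_points is a helper name introduced here for illustration.

```python
# Upper-case Cherokee string from the message (U+13E3 U+13B3 U+13A9).
upper = "\u13e3\u13b3\u13a9"
lower = upper.lower()


def code_points(text):
    """Format each character of text as a U+XXXX code point label."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(upper))
print(code_points(lower))
print(lower.encode("unicode_escape"))
```

Comparing code points rather than the rendered glyphs makes the difference visible even in fonts that draw both cases almost identically.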
History
Date | User | Action | Args
2020-06-02 20:38:26 | Roman Akopov | set | recipients: + Roman Akopov, vstinner, ezio.melotti, SilentGhost
2020-06-02 20:38:26 | Roman Akopov | set | messageid: <1591130306.47.0.245958346607.issue40845@roundup.psfhosted.org>
2020-06-02 20:38:26 | Roman Akopov | link | issue40845 messages
2020-06-02 20:38:26 | Roman Akopov | create |