I have a MySQL database, and the table charset is set to utf8:
...
PRIMARY KEY (`username`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 |
...
I connect to the database from Python with MySQLdb:
import MySQLdb

conn = MySQLdb.connect(host="localhost",
                       passwd="12345",
                       db="db",
                       charset="utf8",
                       use_unicode=True)
When I execute a query, the response looks as if it had been decoded with "windows-1254". Example:
curr = conn.cursor(MySQLdb.cursors.DictCursor)
select_query = 'SELECT * FROM users'
curr.execute(select_query)
for ret in curr.fetchall():
    username = ret["username"]
    print "repr-username; ", repr(username)
    print "username; ", username.encode("utf-8")
...
The output is:
repr-username; u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
username; ÅŸÃ¼krÃ¼Ã§aÄŸlÃ¼li
When I encode username as "windows-1254" before printing, it displays correctly:
...
print "repr-username; ", repr(username)
print "username; ", username.encode("windows-1254")
...
The output is:
repr-username; u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
username; şükrüçağlüli
When I try other characters, such as the Cyrillic alphabet, the decoding seems to change dynamically. How can I prevent this?
asked Aug 28, 2014 at 12:50
umut
1 Answer
I think the items were encoded incorrectly when they were INSERTed into the database.
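One way to check this is to reproduce the damage on purpose. The sketch below assumes the stored name was meant to be the Turkish text shown in the question. Encoding that text as UTF-8 and then decoding the bytes as windows-1254 gives exactly the value returned by the SELECT. That is consistent with the UTF-8 bytes having been misread somewhere on the way into the table, although it does not show where. The variable names are only for illustration.

```python
# Intended text, written with escapes so the source stays ASCII-only.
intended = u'\u015f\xfckr\xfc\xe7a\u011fl\xfcli'

# Value the SELECT returned, copied from the repr() in the question.
observed = u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'

# Simulate the suspected mistake: UTF-8 bytes interpreted as windows-1254.
simulated = intended.encode("utf-8").decode("windows-1254")

# True when the simulated damage is identical to the stored value.
matches = (simulated == observed)
print(matches)
```

If the comparison holds, the rows were most likely already wrong in the table, so changing the connection charset on the SELECT side alone would not repair them.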
I recommend python-ftfy (https://github.com/LuminosoInsight/python-ftfy), which helped me out with a similar problem:
import ftfy
username = u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
print ftfy.fix_text(username) # outputs şükrüçağlüli
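If you would rather not add a dependency, the same repair can be sketched with the standard codecs only. This assumes every affected value is UTF-8 that was decoded as windows-1254, which is what the question's output suggests. `undo_mojibake` is a hypothetical helper name, not part of MySQLdb or ftfy, and it leaves a value untouched when the round trip fails.

```python
def undo_mojibake(text, wrong_encoding="windows-1254"):
    """Hypothetical helper: reverse UTF-8 bytes that were decoded as wrong_encoding."""
    try:
        # Recover the original bytes, then decode them as what they really are.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The assumption does not hold for this value; return it unchanged.
        return text

broken = u'\xc5\u0178\xc3\xbckr\xc3\xbc\xc3\xa7a\xc4\u0178l\xc3\xbcli'
fixed = undo_mojibake(broken)

# Text that cannot be encoded as windows-1254 (here a Cyrillic letter) is returned as-is.
untouched = undo_mojibake(u'\u0434')
```

Unlike ftfy, this does not guess: it only helps when you already know which wrong encoding was applied, and rows damaged in a different way need a different `wrong_encoding`.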
Comments
INSERT and SELECT from Python. Does the problem persist? u"şükrüçağlüli" == u'\u015f\xfckr\xfc\xe7a\u011fl\xfcli'. This is not what you have. Are you certain the data was properly encoded at INSERT time?