I am trying to work with SQLite in Python:

from pysqlite2 import dbapi2 as sqlite
con = sqlite.connect('/home/argon/super.db')
cur = con.cursor()
cur.execute('select * from notes')
for i in cur.fetchall():
    print i[2]

And I sometimes get something like this (I am from Russia):

Ответ etc...

And if I pass this string to this function (it helped me in other projects):

import re
import htmlentitydefs

def unescape(text):
    def fixup(m):
        text = m.group(0)
        if text[:2] == "&#":
            # character reference
            try:
                if text[:3] == "&#x":
                    return unichr(int(text[3:-1], 16))
                else:
                    return unichr(int(text[2:-1]))
            except ValueError:
                pass
        else:
            # named entity
            try:
                text = unichr(htmlentitydefs.name2codepoint[text[1:-1]])
            except KeyError:
                pass
        return text # leave as is
    return re.sub(r"&#?\w+;", fixup, text)

I get even more weird result:

ÐÑÐ2ÐμÑÐ ̧ÑÑ Ñ ÑÐ ̧ÑÐ ̧ÑÐ3⁄4Ð2аÐ1⁄2Ð ̧ÐμÐ1⁄4 etc

What should I do to get normal Cyrillic symbols?

asked Oct 13, 2012 at 20:52

1 Answer

О looks like a UTF-8 byte pair for \xD0\x9E, or \u1054. Better known as the cyrillic character О (Capital O).

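In other words, you have strangely encoded UTF-8 data on your hands: each numeric reference holds one byte of the UTF-8 encoding rather than a whole code point. Turn the digits of each &#NNN; reference into a byte (chr(208) would do), then decode from UTF-8: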

>>> (chr(208) + chr(158)).decode('utf-8')
u'\u041e'
>>> print (chr(208) + chr(158)).decode('utf-8')
О
>>> print (chr(208) + chr(158) + chr(209) + chr(130) + chr(208) + chr(178)).decode('utf-8')
Отв
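
To apply the same idea to a whole string rather than typing the chr() calls by hand, something like the following could work. It is only a sketch, written for Python 3 (the snippets above are Python 2), and it assumes every run of consecutive &#NNN; references holds UTF-8 bytes. The names decode_byte_references and repair_latin1_mojibake are made up for illustration and do not come from any library.

```python
import re


def decode_byte_references(text, encoding="utf-8"):
    """Treat each run of &#NNN; references as raw bytes and decode the run."""
    def fixup(match):
        # Collect the decimal values from one run of consecutive references.
        values = [int(n) for n in re.findall(r"&#(\d+);", match.group(0))]
        if any(v > 255 for v in values):
            return match.group(0)  # not byte values, leave the run as is
        try:
            return bytes(values).decode(encoding)
        except UnicodeDecodeError:
            return match.group(0)  # not valid UTF-8, leave the run as is
    return re.sub(r"(?:&#\d+;)+", fixup, text)


def repair_latin1_mojibake(text, encoding="utf-8"):
    """Undo a per-byte unescape: map code points 0-255 back to bytes, then decode."""
    return text.encode("latin-1").decode(encoding)


sample = "&#208;&#158;&#209;&#130;&#208;&#178; etc"
print(decode_byte_references(sample))  # Отв etc
```

The second helper covers the case where the text has already gone through a function like unescape() from the question: that function turns each byte value into its own code point, which is why the result looks even stranger, and encoding back to Latin-1 recovers the original bytes.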
answered Oct 13, 2012 at 20:58