UTF-8 Encoding Error

subhabangalore at gmail.com
Fri Dec 23 01:38:15 EST 2016


I am getting the following error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 15: invalid start byte
when I try to read some files through TaggedCorpusReader, which is a corpus reader class
in NLTK.
My files are saved in ANSI format, the MS-Windows default.
I am using Python 2.7 on MS-Windows 7.
I have tried the following options so far:
string.encode('utf-8').strip()
unicode(string)
unicode(string, errors='replace')
unicode(string, errors='ignore')
string.decode('cp1252')
None of these has helped.
I am still trying, and would be grateful for any suggestions.
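For reference, the error can be reproduced outside NLTK with a minimal sketch. It assumes that "ANSI" here means Windows-1252 (cp1252), where byte 0x96 is an en dash. In UTF-8 that byte can only appear as a continuation byte, never at the start of a character, which matches the "invalid start byte" message. The sample line, the helper try_decode, and the file name sample.pos are made up for illustration and are not taken from the real corpus.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical tagged line containing byte 0x96 (an en dash in cp1252).
raw = b"the/DT range/NN 10\x9620/CD"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# 0x96 is not a valid UTF-8 start byte, so this decode fails.
as_utf8 = try_decode(raw, "utf-8")

# cp1252 maps 0x96 to U+2013 (EN DASH), so this decode succeeds.
as_cp1252 = try_decode(raw, "cp1252")

# Reading from disk: declare the encoding when opening the file instead of
# converting the string afterwards. io.open works on Python 2.7 and 3.
path = os.path.join(tempfile.mkdtemp(), "sample.pos")
with open(path, "wb") as f:
    f.write(raw)
with io.open(path, encoding="cp1252") as f:
    text = f.read()
```

If this assumption holds, the conversions listed above probably cannot help, because the decode fails inside the reader before any of my own code sees the string. NLTK corpus readers take an encoding argument that defaults to 'utf8', so passing encoding='cp1252' when constructing TaggedCorpusReader may be the equivalent fix; this is untested against my installed NLTK version.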


