2 of 6
added 844 characters in body
codeape

This is what's happening:

  • sampleString is a byte string (cp1255-encoded).
  • sampleString.decode("cp1255") decodes the byte string into a unicode string (decode: bytes -> unicode string).
  • print sampleString.decode("cp1255") attempts to write the unicode string to stdout. To do that, print has to encode the unicode string (encode: unicode string -> bytes) using sys.stdout.encoding, which is the terminal's encoding. The error you're seeing means that the Python print statement cannot encode the given unicode string in the console's encoding.
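The decode/encode directions described above can be sketched as follows. This is a minimal illustration written in Python 3 syntax (an assumption, since the code in this answer is Python 2), where the bytes/str distinction is explicit; the names sample_bytes, text and round_trip are chosen only for this example.

```python
import sys

# cp1255-encoded bytes (the same five bytes used elsewhere in this answer)
sample_bytes = b'\xe0\xe1\xe2\xe3\xe4'

# decode: bytes -> unicode string
text = sample_bytes.decode("cp1255")

# encode: unicode string -> bytes (here back to cp1255, so it round-trips)
round_trip = text.encode("cp1255")

# Report the types and lengths rather than the text itself, so the script
# does not depend on what the terminal is able to display.
print(type(sample_bytes).__name__, len(sample_bytes))
print(type(text).__name__, len(text))

# The encoding that print would use when writing text to stdout
print(getattr(sys.stdout, "encoding", None))
```

Printing text directly is the step that can fail: it only works if sys.stdout.encoding is able to represent those characters.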

So the problem is that your console's encoding does not support these characters. You should be able to configure the console to use another encoding. The details of how to do that depend on your OS and terminal program.

Another approach would be to manually specify the encoding to use:

print sampleString.decode("cp1255").encode("utf-8")
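For readers on Python 3, a rough equivalent of encoding manually is to encode the string yourself and write the resulting bytes to a binary stream. The sketch below is only an illustration: write_encoded is a hypothetical helper, not part of any library, and io.BytesIO stands in for a real binary stdout.

```python
import io

def write_encoded(text, stream, encoding="utf-8"):
    """Encode text explicitly and write the bytes to a binary stream.

    Hypothetical helper for illustration only.
    """
    stream.write(text.encode(encoding) + b"\n")

# Same sample data as in the answer: cp1255 bytes decoded to a unicode string
text = b'\xe0\xe1\xe2\xe3\xe4'.decode("cp1255")

# io.BytesIO is a stand-in for a binary output stream
buf = io.BytesIO()
write_encoded(text, buf)

# Show the raw bytes that would be sent to the terminal
print(repr(buf.getvalue()))
```

In a real Python 3 script you would probably pass sys.stdout.buffer instead of buf. Either way, the terminal still has to interpret the bytes as UTF-8 for the characters to display correctly.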

A simple test program you can experiment with:

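import sys
# Show the encoding Python uses when writing to stdout (the terminal's encoding)
print sys.stdout.encoding
# Five Hebrew letters, encoded as cp1255 bytes
samplestring = '\xe0\xe1\xe2\xe3\xe4'
# Decode to unicode, then encode with the codec named on the command line
print samplestring.decode("cp1255").encode(sys.argv[1])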

On my utf-8 terminal:

$ python2.6 test.py utf-8
UTF-8
אבגדה
$ python2.6 test.py latin1
UTF-8
Traceback (most recent call last):
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-4: ordinal not in range(256)
$ python2.6 test.py ascii
UTF-8
Traceback (most recent call last):
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)
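The three runs above can also be reproduced without a terminal by trying each codec in turn and catching the error. This is a sketch in Python 3 syntax (an assumption; the test program above is Python 2), and try_encode is a hypothetical helper name.

```python
def try_encode(text, encoding):
    """Return (encoded bytes, None) on success or (None, error) on failure."""
    try:
        return text.encode(encoding), None
    except UnicodeEncodeError as exc:
        return None, exc

# Same sample data as the test program: cp1255 bytes decoded to unicode
text = b'\xe0\xe1\xe2\xe3\xe4'.decode("cp1255")

results = {}
for codec in ("utf-8", "latin1", "ascii"):
    results[codec] = try_encode(text, codec)

for codec, (encoded, error) in results.items():
    if error is None:
        print(codec, "ok:", repr(encoded))
    else:
        print(codec, "failed:", type(error).__name__)
```

Only utf-8 should succeed here, because latin1 and ascii have no code points for Hebrew letters.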