You're seeing the UTF-8-encoded version of your string (which you shouldn't name str, by the way, because that shadows the built-in type). By adding the # -*- coding: utf-8 -*- line at the start of your script, you're telling Python that this is the encoding your script file is saved in. Are you sure the file really is saved in that encoding?
If that's not the case (check your editor!) or if your terminal window (where you're printing the string) happens to be using a different encoding, you'll get gibberish (or errors if the encoded string can't be interpreted in that encoding).
You only get a Unicode object once you decode your byte string.
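To make the byte string versus Unicode distinction concrete, here is a minimal sketch. It assumes the two-character string 测试 as sample data and uses GBK as a stand-in for "a terminal with a different encoding"; your actual code page may differ. It is written so that it runs on Python 2 and Python 3 alike.

```python
# -*- coding: utf-8 -*-
# The UTF-8 bytes for the two characters in the sample string (an assumption
# for illustration; any non-ASCII text shows the same effect).
raw = b"\xe6\xb5\x8b\xe8\xaf\x95"

# Decoding with the codec the bytes were written in gives the real text.
text = raw.decode("utf-8")

# Decoding the very same bytes with a different codec "works" here,
# but yields different characters: that is the gibberish you see.
garbled = raw.decode("gbk")

print("%d bytes, %d characters, %d garbled characters"
      % (len(raw), len(text), len(garbled)))
```

The bytes never change; only the codec used to interpret them does, which is why the same data can look fine in one window and wrong in another.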
So first you need to know your terminal's character encoding. Then convert all strings to Unicode as early as possible and manipulate only Unicode objects in your program. When it's time to output them, encode them to the encoding your output target expects.
For example:
# -*- coding: utf-8 -*-
s = u"测试"
s = s + u"娴嬭瘯"
print s.encode("somecodepage")  # "somecodepage" is a placeholder: use your terminal's encoding
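A slightly fuller sketch of the same decode-early, encode-late pattern follows. The helper names to_unicode and to_output are hypothetical, and falling back to UTF-8 when sys.stdout.encoding is unavailable is an assumption, not a rule; pick whatever default fits your environment. The code runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import sys


def to_unicode(raw, encoding="utf-8"):
    """Decode bytes to Unicode at the input boundary (hypothetical helper)."""
    return raw.decode(encoding)


def to_output(text, encoding=None):
    """Encode Unicode at the output boundary (hypothetical helper).

    If no encoding is given, use the one reported for stdout, and fall
    back to UTF-8 when that is missing (for example when output is piped).
    Characters the target encoding cannot represent become "?".
    """
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, "replace")


raw = b"\xe6\xb5\x8b\xe8\xaf\x95"    # bytes as they arrive from a file or socket
text = to_unicode(raw) + u"!"        # all manipulation happens on Unicode
encoded = to_output(text, "utf-8")   # bytes again, only at the very end
```

Passing "replace" as the error handler trades errors for visible "?" marks; drop it if you would rather get a UnicodeEncodeError when the terminal cannot display a character.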