I input "company\data\2012\name" to a variable.
I get "company\dataü2\name" in that variable.
I want "company\data\2012\name" in that variable.
I am using arcpy as part of Esri's ArcMap Python scripting, with a geoprocessing toolbox that I think handles the string-literal part of my inputs, if that makes sense to anyone.
Help!
2 Answers
It looks like you want to include a literal backslash in your string. Backslash is used as an escape character in Python strings, so to include a literal backslash you need to do one of the following:
- Use two backslashes, e.g. "data\\2012"
- Use a raw string literal, e.g. r"data\2012"
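As a quick check of the two options above, here is a minimal sketch (written for Python 3, though the comparison behaves the same in Python 2). The variable names are only illustrative.

```python
# Both spellings produce the same string, containing one literal backslash.
escaped = "data\\2012"   # doubled backslash
raw = r"data\2012"       # raw string literal

print(escaped)           # data\2012
print(escaped == raw)    # True
print(len(raw))          # 9 characters: 'data', one backslash, '2012'
```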
With "data\2012", the "\201" is actually interpreted as an octal escape, so that escape sequence is translated into a single character. The value 201 in base 8 is 129 in base 10, or 0x81 in hex. If you are seeing 'ü' when this is displayed, you must be using a Windows console that uses CP437 or some similar codec.
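The arithmetic above can be checked directly. This is a small sketch for Python 3; the cp437 decoding step is included only to reproduce what such a Windows console would display, and the variable names are illustrative.

```python
# "\201" is parsed as ONE character: an octal escape with value 0o201.
s = "data\2012"
print(len(s))                        # 6: 'd', 'a', 't', 'a', chr(0x81), '2'

code = ord(s[4])
print(code, hex(code), oct(code))    # 129 0x81 0o201

# Decode byte 0x81 with code page 437 to see what a CP437 console shows.
shown = bytes([code]).decode("cp437")
print(shown == "\u00fc")             # True: U+00FC is 'ü'
```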
The number is still there, it's just in the string. This may not get you 100% of the way there, but it should be close. Basically, you need to determine the set of valid characters you don't want 'decoded', and then translate the rest like this:
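# The original escaped the \n correctly (as \\n), but not the \201.
testdata = "company\data\2012\\name"
print testdata
# prints: company\dataü2\name
# Keep alphanumerics and '/', '.', '\'; turn anything else back into a backslash
# plus its octal code (Python 2: oct() returns e.g. '0201', so drop the leading '0').
corrected = ''.join([x if (x.isalnum() or x in '/.\\') else '\\%s' % (oct(ord(x))[1:]) for x in testdata])
print corrected
# prints: company\data\2012\name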
You may need to add to the list of recognized punctuation, and/or limit the range of the numbers that it recognizes.
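For readers on Python 3, where print is a function and oct() returns a '0o' prefix, the same idea might look like the sketch below. The function name reescape_octal and its keep parameter are made up for illustration, and the set of characters left untouched is an assumption you would need to adjust. Unlike the one-liner above, this version pads the octal code to three digits so that a following digit cannot be absorbed into the escape.

```python
def reescape_octal(text, keep="/.\\"):
    """Turn unexpected characters back into backslash + octal digits.

    Hypothetical helper: ASCII letters/digits and anything in `keep`
    pass through; every other character becomes a backslash followed
    by its three-digit octal code.
    """
    out = []
    for ch in text:
        if (ch.isascii() and ch.isalnum()) or ch in keep:
            out.append(ch)
        else:
            out.append("\\" + format(ord(ch), "03o"))
    return "".join(out)


# "\201" was swallowed as an octal escape when this literal was parsed.
damaged = "company\\data\2012\\name"
print(reescape_octal(damaged))   # company\data\2012\name
```

Note that any punctuation not listed in keep (an underscore or a space, for example) is also rewritten as an octal escape, which is why the list of recognized characters matters.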
However, you really do need to fix it at the source... this won't help with something like this:
testdata = 'company\data\015\\name'
print testdata
# prints: \nameny\data
# (\015 is a carriage return, so "\name" is drawn over the start of the line)
or worse
testdata = 'company\data\102\\name'
print testdata
# prints: company\dataB\name
For this to work, I have to know which characters should be translated back. \201 works because it's not an otherwise expected character. The first one may be OK, since we don't really expect carriage returns either. But how would I know to convert the B? It's a valid alphabetic character, and I can't tell it apart from the rest of the real text.
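The ambiguity is easy to see in a small sketch (Python 3 syntax; the variable names are illustrative): once the literal has been parsed, a B that came from an octal escape is identical to one that was typed.

```python
from_escape = "company\\data\102\\name"   # "\102" is octal for 'B'
typed = "company\\dataB\\name"            # a path that really contains 'B'

print(from_escape == typed)               # True: no trace of the escape is left
print(oct(ord("B")))                      # 0o102
```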
So, this really needs to be resolved upstream.
Are you using input to get the strings? If so, just replace it with raw_input (and stop typing in the quotes when you enter the values).
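To illustrate that suggestion: in Python 2, input() evaluates what you type as a Python expression, so a quoted path goes through string-literal escape processing, while raw_input() returns the characters exactly as typed. The sketch below imitates both in Python 3, using ast.literal_eval as a stand-in for Python 2's evaluation. The typed values are made up, and whether the geoprocessing toolbox reads its parameters this way is not something this thread confirms.

```python
import ast

# What the user types at the prompt, character for character.
typed_with_quotes = r'"data\2012"'
typed_plain = r'data\2012'

# Like Python 2 input(): the text is evaluated, so "\201" becomes one character.
evaluated = ast.literal_eval(typed_with_quotes)
print(len(evaluated))      # 6

# Like Python 2 raw_input() (input() in Python 3): the text comes back unchanged.
print(typed_plain)         # data\2012
print(len(typed_plain))    # 9
```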