I know that we are supposed to use the setattr function when we are outside of an object. However, I have trouble calling setattr with a unicode key, which led me to call __setattr__ directly.
class MyObject(object):
    def __init__(self):
        self.__dict__["properties"] = dict()

    def __setattr__(self, k, v):
        self.properties[k] = v

obj = MyObject()
And I get the following content of obj.properties:
1) setattr(obj, u"é", u"à"): raises UnicodeEncodeError
2) setattr(obj, "é", u"à"): {'\xc3\xa9': u'\xe0'}
3) obj.__setattr__(u"é", u"à"): {u'\xe9': u'\xe0'}
I don't understand why Python behaves differently in these three cases.
I am using Python 2.7.10 (default, Oct 14 2015, 16:09:02). (comment by jbaptiste.trb, Apr 21, 2016 at 14:34)
2 Answers
Python 2.7? ASCII identifiers only. Your call 2) gets through because "é" there is a plain byte string (str), but call 1) passes a unicode string containing a non-ASCII character, and that is rejected.
See also the related question "Unicode identifiers in Python?".
3) involves setting a unicode key within a dictionary, which is legal.
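The difference between the three results can be reproduced without any class at all. This is a minimal sketch, assuming (my understanding of CPython 2.7, not something stated in the question) that setattr converts a unicode attribute name to a byte string with the default ASCII codec before doing anything else. The names key, ascii_ok and utf8_key are only for illustration.

```python
# -*- coding: utf-8 -*-
key = u"\xe9"  # the character "é" as a unicode string

# Converting to ASCII fails; this is the error reported for case 1).
try:
    key.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 succeeds and yields the two bytes seen as the key in case 2).
utf8_key = key.encode("utf-8")

print(ascii_ok)                    # False
print(utf8_key == b"\xc3\xa9")     # True
```

Under that assumption, encoding the name yourself (for example with encode('utf-8')) before calling setattr avoids the error, at the cost of storing a byte-string key instead of a unicode one.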
Note that __setattr__ is almost never meant to be used the way you are using it. It is meant to set attributes on an object, not to intercept them and stuff them into an internal dict attribute. I'd also avoid properties as a name, since it is easily confused with properties in the getter/setter sense.
Generally you want to use setattr, not the double underscore variant. Unlike your opening sentence.
You also typically don't call double-underscore methods yourself; you define them, and Python's data model calls them on your behalf. It's a bit like the implicit get/set calls in JavaBeans (I think).
__setattr__ can be tricky. If you are not careful, it blocks "setting activities" in unexpected ways.
Here's a silly example:
class Foo(object):
    def __setattr__(self, attrname, value):
        """ let's uppercase variables starting with k"""
        if attrname.lower().startswith("k"):
            self.__dict__[attrname.upper()] = value

foo = Foo()
foo.kilometer = 1000
foo.meter = 1
print "foo.KILOMETER:%s" % getattr(foo, "KILOMETER", "unknown")
print "foo.meter:%s" % getattr(foo, "meter", "unknown")
print "foo.METER:%s" % getattr(foo, "METER", "unknown")
output:
foo.KILOMETER:1000
foo.meter:unknown
foo.METER:unknown
You need an else after the if, otherwise every attribute that does not start with k is silently dropped:
        else:
            self.__dict__[attrname] = value
output:
foo.KILOMETER:1000
foo.meter:1
foo.METER:unknown
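For reference, here is the silly example with the else branch merged in, as one self-contained sketch. It uses print(...) with a single argument so that it runs unchanged on both Python 2.7 and Python 3; nothing else differs from the snippets above.

```python
class Foo(object):
    def __setattr__(self, attrname, value):
        """Uppercase names starting with k; store every other name unchanged."""
        if attrname.lower().startswith("k"):
            self.__dict__[attrname.upper()] = value
        else:
            # Without this branch, non-"k" attributes are silently discarded.
            self.__dict__[attrname] = value


foo = Foo()
foo.kilometer = 1000
foo.meter = 1

print("foo.KILOMETER:%s" % getattr(foo, "KILOMETER", "unknown"))
print("foo.meter:%s" % getattr(foo, "meter", "unknown"))
print("foo.METER:%s" % getattr(foo, "METER", "unknown"))
```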
Last, if you are just starting out and unicode is a big deal, I'd evaluate Python 2 vs. Python 3: Python 3 has much better, unified unicode support. There are plenty of reasons you might or might not need 2.7 rather than 3, but unicode "pushes towards" 3.
2 Comments
encode('utf-8') before calling setattr. Otherwise, concerning the last point, I have the following requirement: be able to access the object property "toto" via obj.properties["toto"] and also directly via obj.toto. Thus, intercepting setattr and getattr seems to be the only solution.
toto vs foobar gives that away ;-) If you only need to access via obj.toto for reads, then you can leave setattr alone and instead write a __getattr__ that returns obj.properties[attrname]. Overriding __getattr__ is common; __setattr__ is more of a special case and needs careful consideration. I'd have something like my silly example with the k variable names and test for a leading _ in attribute names to allow for normal internal variables.
Python 2 doesn't allow unicode identifiers:
>>> é = 3
  File "<stdin>", line 1
    é = 3
    ^
SyntaxError: invalid syntax
Presumably Python is so insistent on this point that you can't work around it the way you're trying, because setattr does some processing of the attribute name before calling __setattr__. You can show this by inserting a print at the very start of __setattr__: nothing gets printed, so the issue is not in your code.
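For the requirement raised in the comments (reaching the same value as obj.properties["toto"] and as obj.toto), the __getattr__ suggestion could look roughly like the sketch below. It is only one possible arrangement, not the asker's or the commenter's actual code: it also overrides __setattr__ so that writes land in the dict, and it applies the leading-underscore test mentioned in the comment so that internal names stay ordinary attributes. The attribute _cache is a hypothetical internal name used for illustration.

```python
class MyObject(object):
    def __init__(self):
        # Store the dict as a real attribute, bypassing our own __setattr__.
        object.__setattr__(self, "properties", {})

    def __setattr__(self, name, value):
        if name.startswith("_"):
            # Internal names behave like normal instance attributes.
            object.__setattr__(self, name, value)
        else:
            self.properties[name] = value

    def __getattr__(self, name):
        # Called only when normal attribute lookup has already failed.
        try:
            return self.__dict__["properties"][name]
        except KeyError:
            raise AttributeError(name)


obj = MyObject()
obj.toto = 1                 # write through attribute syntax
obj.properties["titi"] = 2   # write through the dict
obj._cache = []              # hypothetical internal attribute

print(obj.properties["toto"])        # 1
print(obj.titi)                      # 2
print("_cache" in obj.properties)    # False
```

Because __getattr__ runs only after the normal lookup fails, real attributes such as properties and _cache are never shadowed by dictionary entries. If only reads through obj.toto are needed, the __setattr__ override can be dropped entirely.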