I've got a use case for namedtuple that requires some of the fields to be floats. I could require the code that's constructing the namedtuple to supply floats, but I'd like for the namedtuple class to be smart enough to convert strings it gets into floats for me (but only for a specified subset of fields). I have code to do this, but it's ugly:
from collections import namedtuple, OrderedDict

# This list is built at runtime in my application
_FIELDS_AND_TYPES = [("num1", True), ("num2", True), ("label", False)]
FIELDS_FORCE_FLOAT = OrderedDict(_FIELDS_AND_TYPES)

_MyNamedTuple = namedtuple("_MyNamedTuple", FIELDS_FORCE_FLOAT.keys())

class TypedNamedTuple(_MyNamedTuple):
    fields_force_float = FIELDS_FORCE_FLOAT

    def __new__(cls, *values_in):
        super_obj = super(_MyNamedTuple, cls)
        superclass = super_obj.__thisclass__
        values = [float(value) if cls.fields_force_float[fieldname] else value
                  for fieldname, value in zip(superclass._fields, values_in)]
        self = super_obj.__new__(cls, values)
        return self

print TypedNamedTuple("1.0", "2.0", "3.0")
# _MyNamedTuple(num1=1.0, num2=2.0, label='3.0')
Things I don't like about this code:
- The output of print TypedNamedTuple("1.0", "2.0", "3.0") starts with _MyNamedTuple rather than TypedNamedTuple. I think this is a general problem with subclassing namedtuples, so fair enough. I could give both classes the same name to solve this.
- My code has to pull in FIELDS_FORCE_FLOAT from a global variable.
- TypedNamedTuple is inefficient (it runs zip and does a bunch of dictionary lookups). This is not so bad in my context but I'd like to know how to handle this "right."
- If I wanted to create another namedtuple subclass that forces some of its args to be float, I'd basically be starting from scratch. My TypedNamedTuple is not reusable.
Is there a cleaner way to do this?
As for why I'm attempting to use namedtuples here at all (H/T @jonrsharpe), it's not set in stone but here are some things that pushed me in this direction:
- The list of fields I'm interested in isn't known until runtime.
- Immutability: I do not, in fact, have a reason to change any of these objects once they're constructed.
1 Answer
Reusing existing code (that is, namedtuple) is a reasonable motive, so this approach should be okay. It might also make things harder in the long run, in which case a regular class, metaclass, or decorator may be an easier way to enforce this kind of logic. I also see a problem with automatic conversion of values, since the conversion has to be communicated to users and may need debugging later on, but if it fits the use case, why not. Since the tuple is immutable there's no normal way to circumvent the conversion, so I guess it is less of an issue than with a mutable class.
Also, have you looked at the collections.py source? I mean it's good that it works, but honestly that code generation is a bit scary. It is, however, a way to generate all this stuff at runtime. If you don't need separate Python classes, you could also just use a single class and pass in the valid fields as a separate parameter, then use functions to create the separate tuple "types".
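One possible reading of that last suggestion, as a rough sketch only: a single tuple subclass that receives the (name, force_float) pairs together with the values, plus a small function per tuple "type". The names TypedRecord and make_measurement are hypothetical and do not come from the question, and the sketch is written to run on Python 3 (the plain zip is fine there).

```python
from functools import partial


class TypedRecord(tuple):
    """Single tuple subclass; the field list is passed in with every instance."""

    def __new__(cls, fields_and_types, values):
        fields_and_types = tuple(fields_and_types)
        values = tuple(values)
        if len(values) != len(fields_and_types):
            raise TypeError("expected %d values, got %d"
                            % (len(fields_and_types), len(values)))
        # Convert only the values whose field is flagged as float.
        converted = [float(value) if force else value
                     for (_, force), value in zip(fields_and_types, values)]
        obj = super(TypedRecord, cls).__new__(cls, converted)
        # Remember the field names so attribute access can find them.
        obj._fields = tuple(name for name, _ in fields_and_types)
        return obj

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        fields = self.__dict__.get("_fields", ())
        if name in fields:
            return self[fields.index(name)]
        raise AttributeError(name)


# A plain function stands in for a separate tuple "type".
make_measurement = partial(
    TypedRecord, [("num1", True), ("num2", True), ("label", False)])

m = make_measurement(["1.0", "2.0", "3.0"])
print(m.num1, m.num2, m.label)
```

The trade-off is that every instance carries its own field list in an instance dictionary, so it is less compact than a namedtuple and attributes can still be assigned on it, which weakens the immutability the question asks for.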
For the presented code / your questions I have some other suggestions:
- You can override __repr__ to fix the "wrong" printed representation. I'm not quite sure exactly why it is kind of hardcoded in collections.py, but generally I'd prefer things to print their own class name instead, as written below. The formatting is copied from the standard namedtuple.
- If it's happening once, that is not nice, but certainly not the biggest issue. Again, you could follow the code generation model and pass it in as a parameter, e.g. typednamedtuple('TypedNamedTuple', FIELDS_AND_TYPES).
- Use izip to prevent allocating a new list. Otherwise looks good? I can't see how you would otherwise do the conversion.
- Subclass and override the field? Note the AnotherTypedNamedTuple below. Otherwise see above.
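The typednamedtuple(...) call mentioned in the list is only a proposed name, not an existing function. A minimal sketch of what such a factory could look like, assuming the same (name, force_float) pairs as in the question and written to run on Python 3:

```python
from collections import namedtuple


def typednamedtuple(typename, fields_and_types):
    """Build a namedtuple subclass that converts flagged fields to float.

    fields_and_types is a sequence of (field_name, force_float) pairs.
    """
    fields_and_types = list(fields_and_types)
    field_names = [name for name, _ in fields_and_types]
    # Choose one converter per field up front, so no dict lookups per instance.
    converters = tuple(float if force else (lambda value: value)
                       for _, force in fields_and_types)
    # The base gets the same name, so the printed representation matches.
    base = namedtuple(typename, field_names)

    def __new__(cls, *args, **kwargs):
        # Let namedtuple check the argument count and resolve keywords first.
        raw = base(*args, **kwargs)
        return base.__new__(cls, *[convert(value) for convert, value
                                   in zip(converters, raw)])

    return type(typename, (base,), {"__new__": __new__, "__slots__": ()})


TypedNamedTuple = typednamedtuple(
    "TypedNamedTuple", [("num1", True), ("num2", True), ("label", False)])

print(TypedNamedTuple("1.0", "2.0", "3.0"))
# TypedNamedTuple(num1=1.0, num2=2.0, label='3.0')
```

Because the field list is an argument rather than a global, a second typed tuple is just another call to the factory. Each instance is built twice here (once raw, once converted), which keeps the sketch short at a small runtime cost.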
Also:
- Don't assign to self. That looks super wrong.
from collections import namedtuple, OrderedDict
from itertools import izip

# This list is built at runtime in my application
_FIELDS_AND_TYPES = [("num1", True), ("num2", True), ("label", False)]
FIELDS_FORCE_FLOAT = OrderedDict(_FIELDS_AND_TYPES)

_MyNamedTuple = namedtuple("_MyNamedTuple", FIELDS_FORCE_FLOAT.keys())

class TypedNamedTuple(_MyNamedTuple):
    fields_force_float = FIELDS_FORCE_FLOAT

    def __new__(cls, *values_in):
        super_obj = super(_MyNamedTuple, cls)
        superclass = super_obj.__thisclass__
        values = [float(value) if cls.fields_force_float[fieldname] else value
                  for fieldname, value in izip(superclass._fields, values_in)]
        return super_obj.__new__(cls, values)

    def __repr__(self):
        return "{}({})".format(
            self.__class__.__name__,
            ', '.join("{}=%r".format(name) for name in self._fields) % self)

print TypedNamedTuple("1.0", "2.0", "3.0")
# TypedNamedTuple(num1=1.0, num2=2.0, label='3.0')

class AnotherTypedNamedTuple(TypedNamedTuple):
    fields_force_float = OrderedDict([("num1", False), ("num2", False), ("label", False)])

print AnotherTypedNamedTuple("1.0", "2.0", "3.0")
# AnotherTypedNamedTuple(num1='1.0', num2='2.0', label='3.0')
_MyNamedTuple = namedtuple("TypedNamedTuple", FIELDS_FORCE_FLOAT.keys()). Does _MyNamedTuple get used elsewhere?