Creating a Python namedtuple class that forces some of its fields to be floats

Question 1

I've got a use case for namedtuple that requires some of the fields to be floats. I could require the code that's constructing the namedtuple to supply floats, but I'd like for the namedtuple class to be smart enough to convert strings it gets into floats for me (but only for a specified subset of fields). I have code to do this, but it's ugly:

    from collections import namedtuple, OrderedDict

    # This list is built at runtime in my application
    _FIELDS_AND_TYPES = [("num1", True), ("num2", True), ("label", False)]
    FIELDS_FORCE_FLOAT = OrderedDict(_FIELDS_AND_TYPES)
    _MyNamedTuple = namedtuple("_MyNamedTuple", FIELDS_FORCE_FLOAT.keys())

    class TypedNamedTuple(_MyNamedTuple):
        fields_force_float = FIELDS_FORCE_FLOAT

        def __new__(cls, *values_in):
            super_obj = super(_MyNamedTuple, cls)
            superclass = super_obj.__thisclass__
            values = [float(value) if cls.fields_force_float[fieldname] else value
                      for fieldname, value in zip(superclass._fields, values_in)]
            self = super_obj.__new__(cls, values)
            return self

    print TypedNamedTuple("1.0", "2.0", "3.0")
    # _MyNamedTuple(num1=1.0, num2=2.0, label='3.0')

Things I don't like about this code:

1. The output of print TypedNamedTuple("1.0", "2.0", "3.0") starts with _MyNamedTuple rather than TypedNamedTuple. I think this is a general problem with subclassing namedtuples, so fair enough. I could give both classes the same name to solve this.
2. My code has to pull in FIELDS_FORCE_FLOAT from a global variable.
3. TypedNamedTuple is inefficient (it runs zip and does a bunch of dictionary lookups). This is not so bad in my context, but I'd like to know how to handle this "right."
4. If I wanted to create another namedtuple subclass that forces some of its args to be float, I'd basically be starting from scratch. My TypedNamedTuple is not reusable.

Is there a cleaner way to do this?

As for why I'm attempting to use namedtuples here at all (H/T @jonrsharpe), it's not set in stone but here are some things that pushed me in this direction:

The list of fields I'm interested in isn't known until runtime.
Immutability: I do not, in fact, have a reason to change any of these objects once they're constructed.

Comment

I'm sure I'm just missing something, but why can't you just do this: _MyNamedTuple = namedtuple("TypedNamedTuple", FIELDS_FORCE_FLOAT.keys()). Does _MyNamedTuple get used elsewhere?

Comment

@SuperBiasedMan, yes, you could do this. I think that's what I meant by "I could give both classes the same name to solve this" in #1.

Answer

Reusing namedtuple this way is reasonable. It might make things harder in the long run, though, so a regular class, a metaclass, or a decorator could be an easier way to enforce this kind of logic. I also see a problem with converting values automatically: the behaviour has to be communicated to callers and may need debugging later on. But if it fits the use case, why not. Since the tuple is immutable there's no normal way to circumvent the conversion, so I guess it is less of an issue than with a mutable class.

Also, have you looked at the collections.py source? It's good that it works, but honestly that code generation is a bit scary. It is, however, a way to generate all this stuff at runtime. If you don't need separate Python classes, you could also use a single class that takes the valid fields as a separate parameter, and then use functions to create the separate tuple "types".
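
The decorator idea could be sketched roughly as follows. This is only an illustration under stated assumptions: force_floats and Record are hypothetical names that do not appear in the question, and the code targets Python 3 rather than the Python 2 used in the original.

```python
from collections import namedtuple


def force_floats(*float_fields):
    """Hypothetical class decorator: pass the named fields through float()."""
    def decorate(base):
        unknown = set(float_fields) - set(base._fields)
        if unknown:
            raise ValueError("unknown fields: %s" % sorted(unknown))
        # Work out the positions once, not on every instantiation.
        float_positions = frozenset(
            base._fields.index(name) for name in float_fields)

        class Converted(base):
            __slots__ = ()  # keep instances as light as a plain namedtuple

            def __new__(cls, *values):
                converted = [float(v) if i in float_positions else v
                             for i, v in enumerate(values)]
                return super().__new__(cls, *converted)

        # Report the wrapped class's name instead of "Converted".
        Converted.__name__ = base.__name__
        Converted.__qualname__ = base.__name__
        return Converted
    return decorate


Record = force_floats("num1", "num2")(
    namedtuple("Record", ["num1", "num2", "label"]))

print(Record("1.0", "2.0", "3.0"))
# Record(num1=1.0, num2=2.0, label='3.0')
```

The sketch accepts positional arguments only. Note also that namedtuple's _make and _replace build instances without calling __new__, so values passed through them would not be converted.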

For the presented code / your questions I have some other suggestions:

1. You can override __repr__ to fix the "wrong" printed representation. I'm not sure why the name is more or less hardcoded in collections.py, but generally I'd prefer things to print their own class name instead, as written below. The formatting is copied from the standard namedtuple.
2. If the global is only read once, that is not nice, but certainly not the biggest issue. Again, you could follow the code generation model and pass the mapping in as a parameter, e.g. typednamedtuple('TypedNamedTuple', FIELDS_AND_TYPES).
3. Use izip instead of zip to avoid allocating a new list (in Python 2, zip builds a full list). Otherwise it looks good; I can't see how else you would do the conversion.
4. Subclass and override the fields_force_float class attribute? Note the AnotherTypedNamedTuple below. Otherwise see above.

Also:

Don't assign to self in __new__; just return the new object. Assigning to self looks super wrong.

    from collections import namedtuple, OrderedDict
    from itertools import izip

    # This list is built at runtime in my application
    _FIELDS_AND_TYPES = [("num1", True), ("num2", True), ("label", False)]
    FIELDS_FORCE_FLOAT = OrderedDict(_FIELDS_AND_TYPES)
    _MyNamedTuple = namedtuple("_MyNamedTuple", FIELDS_FORCE_FLOAT.keys())

    class TypedNamedTuple(_MyNamedTuple):
        fields_force_float = FIELDS_FORCE_FLOAT

        def __new__(cls, *values_in):
            super_obj = super(_MyNamedTuple, cls)
            superclass = super_obj.__thisclass__
            values = [float(value) if cls.fields_force_float[fieldname] else value
                      for fieldname, value in izip(superclass._fields, values_in)]
            return super_obj.__new__(cls, values)

        def __repr__(self):
            return "{}({})".format(
                self.__class__.__name__,
                ', '.join("{}=%r".format(name) for name in self._fields) % self)

    print TypedNamedTuple("1.0", "2.0", "3.0")
    # TypedNamedTuple(num1=1.0, num2=2.0, label='3.0')

    class AnotherTypedNamedTuple(TypedNamedTuple):
        fields_force_float = OrderedDict([("num1", False), ("num2", False), ("label", False)])

    print AnotherTypedNamedTuple("1.0", "2.0", "3.0")
    # AnotherTypedNamedTuple(num1='1.0', num2='2.0', label='3.0')
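
The factory idea mentioned above, typednamedtuple('TypedNamedTuple', FIELDS_AND_TYPES), might look roughly like the sketch below. Only the call shape comes from this answer; the body is my own assumption about one way to fill it in, and it is written for Python 3 rather than the Python 2 of the code above. The converters are worked out once per class, so no dictionary lookup by field name happens at construction time, and nothing is read from a global.

```python
from collections import namedtuple


def typednamedtuple(typename, fields_and_types):
    """Build a namedtuple subclass; fields flagged True go through float()."""
    names = [name for name, _ in fields_and_types]
    # One converter per position, computed once when the class is created.
    converters = tuple(float if force else None
                       for _, force in fields_and_types)
    base = namedtuple(typename, names)

    def __new__(cls, *values):
        if len(values) != len(converters):
            raise TypeError("%s expects %d values, got %d"
                            % (typename, len(converters), len(values)))
        return tuple.__new__(
            cls,
            [v if conv is None else conv(v)
             for conv, v in zip(converters, values)])

    # Create the subclass under the requested name so repr() reports it.
    return type(typename, (base,), {"__slots__": (), "__new__": __new__})


# The field list can be built at runtime, as in the question.
FIELDS_AND_TYPES = [("num1", True), ("num2", True), ("label", False)]
TypedNamedTuple = typednamedtuple("TypedNamedTuple", FIELDS_AND_TYPES)

print(TypedNamedTuple("1.0", "2.0", "3.0"))
# TypedNamedTuple(num1=1.0, num2=2.0, label='3.0')
```

Another tuple type with different float fields is then one more call to typednamedtuple, which addresses the reusability concern from the question.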

ferada · Accepted Answer · 2015-10-01 22:17:42Z
