Extending Python list functionalities

Question 1

I am extending the functionality of a Python list and would like to include a method that normalizes a vector to the [0, 1] range using element-wise operations. I came up with this solution, but using two classes does not seem clean. The main motivation for the two classes is that data - min(data) in normalize() returns a plain Python list (because of how __sub__() was implemented), and the native list does not implement __truediv__().

How can I implement the normalize() method without creating the intermediate _BaseList class? The project I am working on has very constrained memory, so I cannot use NumPy.

class _BaseList(list):
    def __init__(self, data):
        super().__init__(data)
    def __sub__(self, value):
        if type(value) in (int, float):
            return [elem - value for elem in self]
        elif type(value) is list and len(value) == len(self):
            return [a - b for a, b in zip(value, self)]
    def __truediv__(self, value):
        if type(value) in (int, float):
            return [elem / value for elem in self]
        elif type(value) is list and len(value) == len(self):
            return [a / b for a, b in zip(value, self)]

class Array(_BaseList):
    def __init__(self, data=None):
        super().__init__(data)
    def normalize(self):
        print(type(self))
        return _BaseList((self - min(self))) / float(max(self) - min(self))

Question 2

Why do you need to extend list? Are there existing methods that you'll make extensive use of?

Question 3

Also, is it only normalize that you need to add or do you plan on adding more methods?

Question 4

Are you reimplementing NumPy?

Answer

You say you need the extra class, "due to how __sub__() was implemented"; because it returns a vanilla list, with a list comprehension. However, note:

1. If you replace e.g.

       return [elem - value for elem in self]

   with:

       return _BaseList(elem - value for elem in self)

   __sub__ will return a _BaseList instead; and

2. If you put those four methods in one class calling normalize would still work anyway, because you do convert to _BaseList after the subtraction.

In fact, three methods, because the __init__ is redundant - if all your subclass method does is call the superclass version, you can just let Python handle that for you.
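As a minimal sketch of why that __init__ can be dropped (the class name Plain is hypothetical and not part of the question's code): a list subclass that defines no __init__ simply inherits the constructor from list.

```python
class Plain(list):
    """Hypothetical list subclass that defines no __init__ of its own."""

    def first(self):
        # Extra behaviour still works; construction is inherited from list.
        return self[0]


values = Plain((7, 8, 9))  # list's own constructor is used automatically
empty = Plain()            # the no-argument form works as well
print(values, values.first(), empty)
```

Dropping the override also sidesteps a problem with the default in the question's Array.__init__: passing data=None on to list's constructor should fail, because None is not iterable.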

The Zen of Python states that

Errors should never pass silently.

However, in both __truediv__ and __sub__, if value is not one of the three specified types (or it's a list of the wrong length), they quietly return None. Instead, you should raise TypeError if the user passes a value that can't be handled.
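A minimal sketch of failing loudly, assuming a hypothetical Vector class rather than the question's exact code: every branch either returns a result or raises, so nothing falls through to an implicit None.

```python
class Vector(list):
    """Hypothetical list subclass used only to illustrate the error handling."""

    def __sub__(self, value):
        if isinstance(value, (int, float)):
            return Vector(elem - value for elem in self)
        if isinstance(value, list):
            if len(value) != len(self):
                raise ValueError('cannot operate on a sequence of unequal length')
            return Vector(a - b for a, b in zip(self, value))
        # Nothing matched: say so instead of silently returning None.
        raise TypeError(
            'unsupported operand type for -: {!r}'.format(type(value).__name__)
        )


try:
    Vector((1, 2, 3)) - 'abc'
except TypeError as exc:
    message = str(exc)  # "unsupported operand type for -: 'str'"
```

The wording of the error message is an illustrative choice; the point is only that the caller gets an exception at the site of the mistake rather than a None that fails somewhere later.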

Speaking of specified types, don't write things like:

if type(value) in (int, float):

If you need to check types, use isinstance - this supports inheritance better:

if isinstance(value, (int, float)):

This is particularly important when you're using inheritance yourself - what happens if you try to subtract a _BaseList from another _BaseList? Python supplies abstract base classes that can help make this usage more generic.

Also note that there is another numeric type: complex. As Mathias Ettinger pointed out in the comments, you can use numbers.Number when you want to cover all three cases.
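To make the difference concrete, here is a small sketch; Celsius is a hypothetical float subclass introduced only for the illustration. It compares an exact type() check with isinstance, and shows the numbers.Number and collections.abc.Sequence abstract base classes from the standard library.

```python
from collections.abc import Sequence
from numbers import Number


class Celsius(float):
    """Hypothetical float subclass, used only for this illustration."""


reading = Celsius(21.5)

# An exact type check rejects subclasses; isinstance accepts them.
exact_match = type(reading) in (int, float)         # False
via_isinstance = isinstance(reading, (int, float))  # True

# numbers.Number covers int, float and complex in a single check.
all_numbers = all(isinstance(x, Number) for x in (1, 2.0, 3 + 4j, reading))
text_is_number = isinstance('7', Number)            # False

# Sequence matches lists, tuples and list subclasses, but not sets.
list_is_sequence = isinstance([1, 2, 3], Sequence)  # True
set_is_sequence = isinstance({1, 2, 3}, Sequence)   # False
```

Note that bool is a subclass of int, so True and False also pass these numeric checks, and str is a Sequence, so strings pass the sequence check.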

There's a lot of duplication between the __sub__ and __truediv__ implementations; you could factor this out by extracting a method that also takes the operation to apply, then pass in the functions defined in the operator module.

from collections.abc import Sequence
from operator import sub, truediv

class Array(list):
    def normalize(self):
        min_ = min(self)  # calculate this once
        return (self - min_) / (max(self) - min_)
    def __sub__(self, other):
        return self._process(other, sub)
    def __truediv__(self, other):
        return self._process(other, truediv)
    def _process(self, other, op):
        if isinstance(other, Sequence):
            if len(other) == len(self):
                return Array(op(a, b) for a, b in zip(self, other))
            raise ValueError('cannot operate on a sequence of unequal length')
        return Array(op(a, other) for a in self)

In use:

>>> a = Array((7, 8, 9))
>>> a
[7, 8, 9]
>>> a - 2
[5, 6, 7]
>>> a - [1, 2, 3]
[6, 6, 6]
>>> a.normalize()
[0.0, 0.5, 1.0]
>>> a - 'abc'
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<stdin>", line 6, in __sub__
 File "<stdin>", line 12, in _process
 File "<stdin>", line 12, in <genexpr>
TypeError: unsupported operand type(s) for -: 'int' and 'str'
>>> a - [1, 2, 3, 4]
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<stdin>", line 6, in __sub__
 File "<stdin>", line 13, in _process
ValueError: cannot operate on a sequence of unequal length

It might also be worth implementing __repr__, so you can tell more easily when you have a vanilla list and when you have an Array.
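One possible shape for that __repr__, shown on a cut-down stand-in for Array so the snippet runs on its own; the exact output format is an assumption, not something the answer prescribes.

```python
class Array(list):
    """Cut-down stand-in for the Array class above; only __repr__ is shown."""

    def __repr__(self):
        # Reuse list's own repr for the contents and wrap it in the class name.
        return '{}({})'.format(type(self).__name__, super().__repr__())


print(repr(Array((7, 8, 9))))  # Array([7, 8, 9])
print(repr([7, 8, 9]))         # [7, 8, 9]
```

With this in place, the interactive results shown earlier would print as Array([5, 6, 7]) and so on, which makes it obvious when an operation has handed back a vanilla list instead.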

Question 6

Thank you very much for all the suggestions! Just adapted them for this specific constrained context, and they work great.

jonrsharpe · Accepted Answer · 2017-12-24 09:07:48Z

