I want to gain experience creating data structures that look and feel like Python builtin types. As a first exercise, I've written a WraparoundList
class meant to be identical to the builtin list
, except that accessing out-of-bounds elements "wraps around".
Goals:
- The only behavior that differs from that of a
list
is when explicitly indexed with[]
. - Should look and feel like the Python builtin, i.e., wouldn't look too out of place in the
collections
module. - Should be compatible with both Python 2.7.x and 3.x (though I only have tested on 2.7.13).
The complete source code with doctests follows:
#!/usr/bin/env python
from sys import maxint as MAXINT
class WraparoundList(list):
"""A list whose index wraps around when out of bounds.
A `WraparoundList` is the same as an ordinary `list`,
except that out-of-bounds indexing causes the index
value to wrap around. The wrapping behavior is as if
after reaching the last element, one returned to the
other end of the list and continued counting.
>>> x = WraparoundList('abcd')
>>> x
['a', 'b', 'c', 'd']
>>> x[3]
'd'
>>> x[4] # wraps to x[0]
'a'
>>> x[-6] = 'Q' # wraps to x[-2]
>>> x
['a', 'b', 'Q', 'd']
>>> del x[7] # wraps to x[3]
>>> x
['a', 'b', 'Q']
Indices used in out-of-range slices also wrap around.
If the slice's `start` or `stop` is out-of-bounds, it
gets wrapped around.
>>> x = WraparoundList('abcd')
>>> x
['a', 'b', 'c', 'd']
>>> x[:10] # wraps to x[:2]
['a', 'b']
>>> x[-7:3] # wraps to x[-3:3]
['b', 'c']
The one way in which slicing a `WraparoundList` differs
from slicing an ordinary `list` is the case of using the
list length as the upper limit.
>>> x = WraparoundList('abcd')
>>> x
['a', 'b', 'c', 'd']
>>> x[2:]
['c', 'd']
>>> x[2:4] # wraps to x[2:0]
[]
Initializing a `WraparoundList` with a nested iterable
does not cause inner indices to wrap. To have a multi-
dimensional `WraparoundList`, all the elements of the
outer `WraparoundList` must also be `WraparoundList`s.
>>> x = WraparoundList([list('abc'), list('def')])
>>> x
[['a', 'b', 'c'], ['d', 'e', 'f']]
>>> x[3]
['d', 'e', 'f']
>>> x[3][5]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> y = WraparoundList([WraparoundList(i) for i in x])
>>> y[3][5]
'f'
"""
def __getitem__(self, i):
"""x.__getitem__(i) <=> x[i]"""
if isinstance(i, slice):
return list.__getitem__(self, self._wrap_slice(i))
else:
return list.__getitem__(self, self._wrap_index(i))
def __getslice__(self, i, j):
"""x.__getslice__(i, j) <=> x[i:j]"""
return self.__getitem__(slice(i, j, None))
def __setitem__(self, i, y):
"""x.__setitem__(i, y) <=> x[i] = y"""
if isinstance(i, slice):
list.__setitem__(self, self._wrap_slice(i), y)
else:
list.__setitem__(self, self._wrap_index(i), y)
def __setslice__(self, i, j):
"""x.__setslice__(i, j) <=> x[i:j] = y"""
self.__setitem__(slice(i, j, None))
def __delitem__(self, i):
"""x.__delitem__(i, y) <=> del x[i]"""
if isinstance(i, slice):
list.__delitem__(self, self._wrap_slice(i))
else:
list.__delitem__(self, self._wrap_index(i))
def __delslice__(self, i, j):
"""x.__delslice__(i, j) <=> del x[i:j]"""
self.__delitem__(slice(i, j, None))
def _wrap_index(self, i):
_len = len(self)
if i >= _len:
return i % _len
elif i < -_len:
return i % (-_len)
else:
return i
def _wrap_slice(self, slc):
if slc.start is None:
start = None
else:
start = self._wrap_index(slc.start)
if slc.stop is None:
stop = None
elif slc.stop == MAXINT:
# __*slice__ methods treat absent upper bounds as sys.maxint, which would
# wrap around to a system-dependent (and probably unexpected) value. Setting
# to `None` in this case forces the slice to run to the end of the list.
stop = None
else:
stop = self._wrap_index(slc.stop)
step = slc.step
return slice(start, stop, step)
def main():
pass
if __name__ == '__main__':
main()
2 Answers 2
This is well documented, well commented code.
The docstring says:
The one way in which slicing a
WraparoundList
differs from slicing an ordinarylist
is the case of using the list length as the upper limit.but this isn't quite the whole story — an ordinary
list
can also be sliced using a value greater than the list length, and in that caseWraparoundList
also has a different behaviour:>>> x = [1, 2, 3] >>> x[:10] [1, 2, 3] >>> x = WraparoundList(x) >>> x[:10] [1]
The code is not portable to Python 3, because there's no
sys.maxint
(all integers in Python 3 are "long"). I suggest something like this:try: # In Python 2.7, when __*slice__ methods are called with no "stop" # value, sys.maxint is passed instead. from sys import maxint as NO_STOP except ImportError: # Python 3 does not have sys.maxint or use the __*slice__ methods. NO_STOP = object()
I prefer a name like
NO_STOP
because it communicates the intention rather than the implementation._wrap_index
raisesZeroDivisionError
if the list is empty:>>> w = WraparoundList([]) >>> w[0] Traceback (most recent call last): File "<stdin>", line 1, in <module> File "cr153920.py", line 79, in __getitem__ return list.__getitem__(self, self._wrap_index(i)) File "cr153920.py", line 110, in _wrap_index return i % _len ZeroDivisionError: integer division or modulo by zero
Raising an exception is the right thing to do in this case, but I would expect to get an
IndexError
instead.The code calls
list.__getitem__
directly, rather than via thesuper
function. But this has the unsatisfactory consequence that if someone has another classC
also inheriting fromlist
and overriding the__getitem__
method, and combinesWraparoundList
andC
via inheritance, like this:class D(WraparoundList, C): pass
Then
D()[0]
callsWraparoundList.__getitem__
, which callslist.__getitem__
, butC.__getitem__
is never called, contrary to what one would expect. If you want to support subclassing ofWraparoundList
, then you need to write:return super(WraparoundList, self).__getitem__(self._wrap_slice(i))
and so on.
With a little refactoring, you could avoid some of the repetition. In particular, if you had a method like this:
def _wrap_arg(self, i): if isinstance(i, slice): return self._wrap_slice(i) else: return self._wrap_index(i)
Then you'd be able to write:
def __getitem__(self, i): """x.__getitem__(i) <=> x[i]""" return super(WraparoundList, self).__getitem__(self._wrap_arg(i))
and so on.
Once you've done the refactoring above, you'll see that
_wrap_slice
is only called from one place, so it could be inlined at its point of use.There is no need to include an empty
main
function or anif __name__ == '__main__':
section — if there's nothing to do, then there's no need to write code to do it.
-
\$\begingroup\$ Wonderful! One clarification: Do I understand correctly that the
__*slice__
methods are redundant if I'm handling slices in__*item__
, even in 2.7.X? \$\endgroup\$Endulum– Endulum2017年01月29日 21:08:35 +00:00Commented Jan 29, 2017 at 21:08 -
\$\begingroup\$ See the documentation: "built-in types in CPython currently still implement
__getslice__()
. Therefore, you have to override it in derived classes when implementing slicing." \$\endgroup\$Gareth Rees– Gareth Rees2017年01月29日 21:16:49 +00:00Commented Jan 29, 2017 at 21:16
On top of Gareth Ree's excellent answer: because the WraparoundList
and the usual list
behave differently (for instance when sliced using a value greater than the list length), your class does not respect Liskov substitution principle (see also less formal explanation).
In a nutshell: because some code using a list
would behave differently if it was to use a WraparoundList
, then WraparoundList
should not inherit from list
even if the "WraparoundList is a list" relationship is respected.
A way to change this could be to stop inheriting from list
but instead to use a list
internally to store data (Composition over inheritance).
Explore related questions
See similar questions with these tags.
itertools.cycle
. \$\endgroup\$itertools.cycle
does not appear to have real Python source code, apart from being plugged into the Python object API. The source is here in CPython: github.com/python/cpython/blob/master/Modules/… \$\endgroup\$x=range(4); x[-10]
raisesIndexError
, at least in 2.7.13. Do you mean "wrap" in a different sense? \$\endgroup\$