Programmer's Python Data - A Custom Data Class
Written by Mike James
Monday, 15 January 2024
Article Index
Programmer's Python Data - A Custom Data Class
Node Numbering
Equality and Hash
The Complete Tree
Page 1 of 4

Python has a lot of data structures, but what if you want to create your own? Find out how it all works in this extract from Programmer's Python: Everything is Data.

Programmer's Python
Everything is Data

Is now available as a print book: Amazon

pythondata360Contents

  1. Python – A Lightning Tour
  2. The Basic Data Type – Numbers
    Extract: Bignum
  3. Truthy & Falsey
  4. Dates & Times
    Extract Naive Dates
  5. Sequences, Lists & Tuples
    Extract Sequences
  6. Strings
    Extract Unicode Strings
  7. Regular Expressions
    Extract Simple Regular Expressions
  8. The Dictionary
    Extract The Dictionary
  9. Iterables, Sets & Generators
    Extract Iterables
  10. Comprehensions
    Extract Comprehensions
  11. Data Structures & Collections
    Extract Stacks, Queues and Deques
    Extract Named Tuples and Counters
  12. Bits & Bit Manipulation
    Extract Bits and BigNum
    Extract Bit Masks ***NEW!!!
  13. Bytes
    Extract Bytes And Strings
    Extract Byte Manipulation
  14. Binary Files
    Extract Files and Paths
  15. Text Files
    Extract Text Files & CSV
  16. Creating Custom Data Classes
    Extract A Custom Data Class
  17. Python and Native Code
    Extract Native Code
    Appendix I Python in Visual Studio Code
    Appendix II C Programming Using Visual Studio Code

<ASIN:1871962765>

<ASIN:1871962749>

<ASIN:1871962595>

<ASIN:B0CK71TQ17>

<ASIN:187196265X>

Most of the time you can find a built-in class, or something from a predefined module, to satisfy your need for a data structure. If you need something custom then how much effort is worth expending on such a creation depends on many factors, but in most cases some level of integration with the general way Python handles data objects is desirable. In this chapter we look at ways of achieving this using magic methods and the Abstract Base Classes.

Indexing

How do you support indexing and slicing on a custom object? In many languages the answer would be that you can’t or that it was very difficult. In Python it is very easy and very natural. When you use [] with an object one of three methods is called:

  • __getitem__(key) when an indexed value is needed a=b[key]

  • __setitem__(key,value) when an indexed value is changed b[key]=a

  • __delitem(key)__ when an indexed element is to be deleted del b[key]- should really only be used for dictionary-like objects.


As an example we can add index access to the binary tree class described in Chapter 10. To do this we need to indicate how an index might map onto nodes of a binary tree. The simplest way of doing this is to use a breadth-first enumeration:

[画像:treeicon]

The problem of converting the index to the corresponding node is tricky, but given a function getNode(key) which returns the node at the specified key location then writing the index access functions is easy:

def __getitem__(self,key):
 if isinstance(key,int): 
 return self.getNode(key)
 raise TypeError("invalid index")
 
def __setitem__(self,key,value):
 if isinstance(key,int):
 self.getNode(key).value=value
 else:
 raise TypeError("invalid index")

You can see that the __setitem__ method simply sets the value attribute on the node. An alternative is to replace the entire node with a new node. With these and the getNode method defined you can write things like:

for i in range(0,16):
 print(tree[i].value)
print(tree[16].value)
tree[10]=42
print(tree[10].value)

Notice that both __getitem__ and __setitem__ should raise an IndexError exception if the key is out of range, but in this case the error is delegated to getNode, see later.

Slicing

What if you want to extend the indexing to slicing? There is very little new in adding slicing to indexing – the same methods are used, the only difference is that a slice object is passed in place of an index. The real problem is how to interpret a slice for a non-linear data structure. In the case of the Tree example we can interpret tree[s:e] as being a request for the node objects with index tree[s] through to tree[e] returned not as a tree but as a list. This isn’t conceptually "pure" as a list isn’t really a slice of a tree, but it could be useful.

The only implementation problem is in handling all of the missing values that are possible in a slice. The slice object that the __getitem__ and __setitem__ methods use has three attributes start, stop and step. The problem is that not all slices define each one and when you don’t specify one of them they default to None. So, you can’t simply use:

range(slice.start,slice.stop,slice.step) 

to generate index values in the specified range because range doesn’t accept None for any of its parameters.

A reasonable solution is to set up a maximum range for the object and then use the slice object to reduce it to the desired slice. That is with:

range(0,len(obj))[slice]

the [slice] notation is quite happy to accept None as any of the slice parameters.

Assuming that we have defined __len__ for the Tree class and it simply returns that total number of nodes in the Tree, we can write the __getitem__ method to accept a slice and return a list of nodes in the specified slice:

def __getitem__(self,key):
 if isinstance(key,slice):
 temp=[]
 for i in range(0,len(self))[key]:
 temp.append(self.getNode(i))
 return temp
 if isinstance(key,int): 
 return self.getNode(key)
 raise TypeError("invalid index")

Notice that it still handles a simple index request. This probably isn’t the most efficient way to construct a slice from a tree, but it works. After defining this method you can write things like:

print(list(node.value for node in tree[3:9]))

As an alternative you could define the return slice as a list of node values rather than node objects.

To accept a slice the best interpretation is to allow a list of values to replace the values in the nodes in the specified slice. To do this we simply use the same technique, but this time we set the value of each node in the slice:

def __setitem__(self,key,value):
 if isinstance(key,slice) and isinstance(value,list):
 for i in range(0,len(self))[key]:
 self.getNode(i).value=value.pop(0)
 elif isinstance(key,int):
 self.getNode(key).value=value
 else:
 raise TypeError("invalid index or value") 

Notice that no check is made to ensure that value has enough elements to satisfy the number of elements in the slice. Once you have this modification you can write things like:

tree[3:6]=[1,2,3]


Prev - Next >>

Last Updated ( Monday, 15 January 2024 )