I created a mutable String class in Python, based on the builtin str class.
I can change the first character, but when I call capitalize(), it uses the old value instead:
    class String(str):
        def __init__(self, string):
            self.string = list(string)

        def __repr__(self):
            return "".join(self.string)

        def __str__(self):
            return "".join(self.string)

        def __setitem__(self, index, value):
            self.string[index] = value

        def __getitem__(self, index):
            if isinstance(index, slice):
                return "".join(self.string[index])
            return self.string[index]

        def __delitem__(self, index):
            del self.string[index]

        def __add__(self, other_string):
            return String("".join(self.string) + other_string)

        def __len__(self):
            return len(self.string)

    text = String("cello world")
    text[0] = "h"
    print(text)
    print(text.capitalize())
Expected Output:
    hello world
    Hello world
Actual Output:
    hello world
    Cello world
3 Answers
Your implementation inherits from str, so it brings along all the methods that str implements. However, the implementation of the str.capitalize() method is not designed to take that into account. Methods like str.capitalize() return a new str object with the required change applied.
Moreover, the Python built-in types do not store their state in a __dict__ mapping of attributes, but use internal struct data structures that are only accessible at the C level; your self.string attribute is not where the (C equivalent of) str.__new__() stores the string data. The str.capitalize() method bases its return value on the value stored in the internal data structure when the instance was created, which can't be altered from Python code.
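To make that concrete, here is a minimal sketch (the class name Demo is made up for illustration) showing that the data stored by str.__new__() and the self.string list are two separate pieces of state:

```python
class Demo(str):
    """Hypothetical str subclass that also keeps a mutable list copy."""

    def __init__(self, string):
        # str.__new__() has already stored "cello world" internally by now;
        # this list is an unrelated, ordinary instance attribute.
        self.string = list(string)

    def __str__(self):
        return "".join(self.string)


text = Demo("cello world")
text.string[0] = "h"

print(str(text))          # hello world  (uses the overridden __str__)
print(str.__str__(text))  # cello world  (reads the internal str data)
print(text.capitalize())  # Cello world  (same internal data)
```

Changing the list never touches the value that the inherited str methods read.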
You'll have to shadow all the str methods that return a new value, including str.capitalize(), to make them behave differently. If you want to switch those methods from returning a new instance to changing the value in place, you have to do so yourself:
    class String(str):
        # ...
        def capitalize(self):
            """Capitalize the string, in place"""
            self.string[:] = ''.join(self.string).capitalize()
            return self  # or return None, like other mutable types would do
That can be a lot of work, writing methods like these for every possible str method that returns an updated value. Instead, you could use a __getattribute__ hook to redirect methods:
    _MUTATORS = {'capitalize', 'lower', 'upper', 'replace'}  # add as needed

    class String(str):
        # ...
        def __getattribute__(self, name):
            if name in _MUTATORS:
                def mutator(*args, **kwargs):
                    orig = getattr(''.join(self.string), name)
                    self.string[:] = orig(*args, **kwargs)
                    return self  # or return None for Python type consistency
                mutator.__name__ = name
                return mutator
            return super().__getattribute__(name)
Demo with the __getattribute__ method above added to your class:
    >>> text = String("cello world")
    >>> text[0] = "h"
    >>> print(text)
    hello world
    >>> print(text.capitalize())
    Hello world
    >>> print(text)
    Hello world
One side note: the __repr__ method should use repr() to return a proper representation, not just the value:
    def __repr__(self):
        return repr(''.join(self.string))
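As a rough illustration of the difference, the sketch below uses a hypothetical Chars class (a plain list wrapper standing in for the class from the question). With repr() delegated to str, quotes and escapes show up wherever Python displays the object, for example inside a list:

```python
class Chars:
    """Hypothetical stand-in for the String class: just wraps a list."""

    def __init__(self, string):
        self.string = list(string)

    def __str__(self):
        return "".join(self.string)

    def __repr__(self):
        # Delegate to str's repr so quotes and escapes are included.
        return repr("".join(self.string))


text = Chars("hello\tworld")
print(str(text))   # the text itself, with a real tab character
print(repr(text))  # 'hello\tworld'
print([text])      # ['hello\tworld']
```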
Also, take into account that most Python APIs that are coded in C and take a str value as input are likely to use the C API for Unicode strings. They will not only completely ignore your custom implementations but, like the original str.capitalize() method, will also ignore the self.string attribute. Instead, they too will interact with the internal str data.
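A few operations where this shows up, sketched with a cut-down version of the class from the question. The behaviour shown is what CPython does; other implementations may differ:

```python
class String(str):
    """Cut-down version of the class from the question."""

    def __init__(self, string):
        self.string = list(string)

    def __str__(self):
        return "".join(self.string)

    def __setitem__(self, index, value):
        self.string[index] = value


text = String("cello world")
text[0] = "h"

print(text)                      # hello world (print() calls __str__)
print(text == "cello world")     # True: str.__eq__ compares the internal data
print("-".join([text, text]))    # cello world-cello world
print({"cello world": 1}[text])  # 1: hashing also uses the internal data
```

Only code paths that go through your overridden methods (here, __str__) ever see the modified list.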
@khelwood: it'll test as a string anywhere isinstance(object, str) is used. Given that there is no collections.abc type you can use instead, that's a decent reason to continue doing this. - Martijn Pieters, May 29, 2019 at 11:28

@khelwood: that depends on what the OP hopes to use this class for. - Martijn Pieters, May 29, 2019 at 11:29

@khelwood: and in programming, there is no such thing as 'misleading', that's what inheritance is all about! To extend and alter behaviour but still be compatible with the interface of the parent class. - Martijn Pieters, May 29, 2019 at 11:30

@khelwood: yes, so? That's how a lot of objects work. list, tuple, bytes and str are all sequences, so have a lot of methods in common. Yet their implementations are very different for all that the method names and signatures are the same. - Martijn Pieters, May 29, 2019 at 11:41

@MichaelFish: Use vars(str) to get a dictionary of all attributes that the type defines. Methods are just attributes that happen to be callable (and usually, implement the descriptor protocol). Take into account there is at least one static method on the type, str.maketrans(). - Martijn Pieters, May 29, 2019 at 11:44
The approach below is inferior to the already suggested answers: there is more overhead because you don't get to just track things as a list, and isinstance(s, str) won't work, for example.
Another way to accomplish this is to subclass collections.UserString. It's a wrapper around the built-in string type that stores the string as a member named data. So you could do something like
    from collections import UserString

    class String(UserString):
        def __init__(self, string):
            super().__init__(string)

        def __setitem__(self, index, value):
            data_list = list(self.data)
            data_list[index] = value
            self.data = "".join(data_list)

        # etc.
And then you will get capitalize and the other string methods for free.
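Here is a runnable sketch of that idea applied to the example from the question. Note that UserString.capitalize() returns a new instance of your class rather than modifying the existing one, so this is not an in-place capitalize:

```python
from collections import UserString


class String(UserString):
    """Sketch of a mutable string built on UserString."""

    def __setitem__(self, index, value):
        # self.data is an ordinary immutable str, so rebuild it on each change.
        chars = list(self.data)
        chars[index] = value
        self.data = "".join(chars)


text = String("cello world")
text[0] = "h"

print(text)                              # hello world
print(text.capitalize())                 # Hello world
print(type(text.capitalize()).__name__)  # String
print(isinstance(text, str))             # False
```

Because the inherited methods all read self.data, they see the modified value without any extra work.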
You inherited str's definition of capitalize, which ignores your class's behaviors and just uses the underlying data of the "real" str.
Inheriting from a built-in type like this effectively requires you to reimplement every method, or do some metaprogramming with __getattribute__; otherwise, the underlying type's behaviors will be inherited unmodified.