I am trying to get my head around mutable and immutable objects. I have read that strings are immutable and that each string is a separate object with its own object ID. I tried to verify this with the simple code below, but I see the same object ID for multiple strings that are not equal. Can someone please clarify this? Thanks in advance.

mystring = ""
mylist = ["This ", "That ", "This ", "That ", "This ", "That ", "This ", "That "]
for item in mylist:
 mystring = mystring + item
 print("mystring: ", mystring, "ID of mystring: ", id(mystring))

which results in the output below:

mystring: This ID of mystring: 6407264
mystring: This That ID of mystring: 42523448
mystring: This That This ID of mystring: 42523448
mystring: This That This That ID of mystring: 6417200
mystring: This That This That This ID of mystring: 42785608
mystring: This That This That This That ID of mystring: 42785608
mystring: This That This That This That This ID of mystring: 42837536
mystring: This That This That This That This That ID of mystring: 42775856
eyllanesc
asked May 29, 2018 at 4:30
  • ids are reclaimed when not used, so not surprising you are seeing the same id because you are discarding the old strings.
    AChampion
    Commented May 29, 2018 at 4:32
  • @AChampion: Except that the lifetimes really should be overlapping, so ID reuse should be invalid. There's an optimization going on here that doesn't quite preserve the language's guarantees about id return values and string immutability. Commented May 29, 2018 at 4:34
  • @user2357112 the lifetimes are not overlapping.
    wim
    Commented May 29, 2018 at 4:35
  • @wim: Between the computation of mystring + item and the assignment to mystring, the lifetimes of successive mystring values should overlap. Lifetime overlap isn't transitive, but that doesn't matter, because we're seeing ID reuse for successive mystring values. If it weren't for the in-place optimization of mystring = mystring + item, this ID reuse wouldn't happen. Commented May 29, 2018 at 4:37
  • @wim: Without the optimization, the new mystring value would come into existence before the name binding operation, and then the name binding would end the lifetime of the old mystring value. There would be a lifetime overlap between the + and the =. Commented May 29, 2018 at 5:02

2 Answers

Python is allowed to reuse object IDs for objects with non-overlapping lifetimes, but you're seeing ID reuse in cases where there should be a lifetime overlap. Specifically, during execution of this statement:

mystring = mystring + item

between the evaluation of mystring + item and the assignment to mystring, there should be a lifetime overlap between any two successive values of mystring. You're seeing ID reuse for successive values of mystring, which shouldn't happen.

The effect you're seeing happens because of an optimization in the CPython bytecode evaluation loop, where statements of the form

string1 = string1 + string2

or

string1 += string2

are detected, and if the interpreter can confirm that string1 has no other references, it attempts to perform the concatenation by mutating string1 in-place. You can see the code in Python/ceval.c under unicode_concatenate. This optimization is mostly invisible, due to the refcount check, but the effect on id values is one way it's visible.
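The role of the refcount check can be illustrated with a small sketch. It assumes CPython, and the function name concat_keeping_alias is made up for this example. While a second name still refers to the old string, the old and new strings are alive at the same time, so the old one cannot be resized in place and the two ids must differ:

```python
def concat_keeping_alias(parts):
    """Concatenate non-empty parts while a second name holds the old string.

    Returns one (old_unchanged, ids_differ) pair per step.
    Hypothetical helper, written only for illustration.
    """
    result = ""
    checks = []
    for part in parts:
        previous = result            # second reference to the current string
        old_length = len(previous)
        result = result + part       # must build a separate object now
        checks.append((len(previous) == old_length,
                       id(previous) != id(result)))
    return checks


for old_unchanged, ids_differ in concat_keeping_alias(["This ", "That ", "This "]):
    print("old string unchanged:", old_unchanged, "| ids differ:", ids_differ)
```

Dropping the previous name from the loop removes the extra reference, which is the situation in which the in-place path described above can apply.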

answered May 29, 2018 at 4:48

Strings are immutable. No str method exists that mutates a string.
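As a quick check of that claim, here is a minimal sketch (the helper name try_to_mutate is invented for this example). Item assignment on a str raises TypeError, and a method such as upper returns a new string rather than changing the original:

```python
def try_to_mutate(text):
    """Attempt to change a string and report what happened.

    Returns (assignment_raised, text_afterwards, uppercased_copy).
    """
    try:
        text[0] = "X"              # str does not support item assignment
        assignment_raised = False
    except TypeError:
        assignment_raised = True
    uppercased = text.upper()      # returns a new string; text is untouched
    return assignment_raised, text, uppercased


print(try_to_mutate("this"))
```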

That being said, the reason you see the same id multiple times is that when an object is no longer in use, Python can reuse its position in memory. In CPython, id returns the address of the object in memory, so it is only guaranteed to be unique among objects that are alive at the same time.
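Address reuse is easy to observe with any short-lived object. The sketch below assumes CPython's reference counting, and count_distinct_ids is a hypothetical helper. It creates many temporary lists, records only their ids, and counts how many distinct values appear. The exact number depends on the interpreter and allocator, so none is shown here, but it is typically far smaller than the number of objects created:

```python
def count_distinct_ids(n):
    """Create n short-lived lists and count how many distinct ids were seen."""
    seen = set()
    for i in range(n):
        temp = [i]            # a new object on every iteration
        seen.add(id(temp))    # keep only the integer id, not the object
        del temp              # drop the last reference so it can be freed
    return len(seen)


print("objects created: 1000, distinct ids:", count_distinct_ids(1000))
```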

One way to convince yourself that this is indeed the reason for your observation is to keep a reference to each string you create by adding it to a list.

Code

mystring = ""
mylist = ["This ", "That ", "This ", "That ", "This ", "That ", "This ", "That "]
# A list to keep a reference to each string
created_strings = []
for item in mylist:
 mystring = mystring + item
 # Prevent mystring from being garbage collected by adding it to the list
 created_strings.append(mystring)
 print("mystring: ", mystring, "ID of mystring: ", id(mystring))

Output

mystring: This ID of mystring: 2522900655888
mystring: This That ID of mystring: 2522903930416
mystring: This That This ID of mystring: 2522903930544
mystring: This That This That ID of mystring: 2522902118880
mystring: This That This That This ID of mystring: 2522900546624
mystring: This That This That This That ID of mystring: 2522900546864
mystring: This That This That This That This ID of mystring: 2522902428376
mystring: This That This That This That This That ID of mystring: 2522900907952

Notice that because every string is still referenced, no memory is reclaimed and each object has a different id.

answered May 29, 2018 at 4:50