7

I was optimizing a piece of code and figured out that list copying (shallow) was the bottleneck.

Now I am curious: why is list copying so much slower than copying an array of 8-bytes? In my opinion there should be no difference. In both cases it should be just memcpy(dst, src, sizeof(int64_t)*len(src)) since pointers are 8 bytes long. But apparently Python does more work than I expected. Is it somehow related to GC? Or is it possible that lists are implemented as linked lists?

import array
import numpy as np
import timeit
n = 100*1000
lst = [i for i in range(n)]
arr = array.array('q', lst)
nmp = np.array(arr, dtype=np.int64)
assert(arr.itemsize == 8)
n_iter = 100000
print('=== copy() ===')
print('List of int:', timeit.timeit(stmt='lst.copy()', setup='from __main__ import lst', number=n_iter))
print('Array of 8-bytes:', timeit.timeit(stmt='arr.__copy__()', setup='from __main__ import arr', number=n_iter))
print('Numpy array of int64:', timeit.timeit(stmt='nmp.copy()', setup='from __main__ import nmp', number=n_iter))

The results:

=== copy() ===
List of int: 27.434935861998383
Array of 8-bytes: 2.6839109230022586
Numpy array of int64: 2.69919407800262
kuzand
9,8764 gold badges47 silver badges49 bronze badges
asked Feb 23, 2020 at 15:27

1 Answer 1

3

With the list, it's not just about copying references to the objects. It's also increasing the reference counters inside the objects. Plus those aren't cache-friendly, since they're not nicely adjacent to each other because they're inside objects (along with other data of the objects) and those objects are somewhere on the heap. Also see Why is copying a shuffled list much slower? (you have the faster "unshuffled" case here, but the explanations there might still be useful).

answered Feb 23, 2020 at 15:40
2
  • What about numpy array? Commented Feb 23, 2020 at 22:27
  • @AndreasK. Usually NumPy arrays don't contain Python objects but for example raw 64-bit ints, as in the question. So they don't have this extra cost and get copied much faster, as shown in the question. Does that answer your question? Your profile looks like you know NumPy much better than me, so I'm not sure why/what you're asking :-) Commented Feb 23, 2020 at 23:18

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.