Difference between list(numpy_array) and numpy_array.tolist()

Question 1

What is the difference between applying list() on a numpy array vs. calling tolist()?

I checked the types of both outputs, and both report a list; however, the outputs don't look exactly the same. Is it because list() is not a numpy-specific method (i.e. it could be applied to any sequence) while tolist() is numpy-specific, and in this case they just happen to return the same type?

Input:

import numpy
points = numpy.random.random((5, 2))
print "Points type: " + str(type(points))

Output:

Points type: <type 'numpy.ndarray'>

Input:

points_list = list(points)
print points_list
print "Points_list type: " + str(type(points_list))

Output:

[array([ 0.15920058, 0.60861985]), array([ 0.77414769, 0.15181626]), array([ 0.99826806, 0.96183059]), array([ 0.61830768, 0.20023207]), array([ 0.28422605, 0.94669097])]
Points_list type: <type 'list'>

Input:

points_list_alt = points.tolist()
print points_list_alt
print "Points_list_alt type: " + str(type(points_list_alt))

Output:

[[0.15920057939342847, 0.6086198537462152], [0.7741476852713319, 0.15181626186774055], [0.9982680580550761, 0.9618305944859845], [0.6183076760274226, 0.20023206937408744], [0.28422604852159594, 0.9466909685812506]]
Points_list_alt type: <type 'list'>

Question 2

One is a list of lists, the other a list of numpy arrays -- because .tolist "walks down the tree" and correctly makes every level into lists, while list(...) just "walks" the top level. So, what is the question...?

Question 3

There is no need for the statement print "Points type: " + str(type(points)); you can check the type of a variable directly with type(varName_here), i.e. type(points).

Question 4

Your example already shows the difference; consider the following 2D array:

>>> import numpy as np
>>> a = np.arange(4).reshape(2, 2)
>>> a
array([[0, 1],
       [2, 3]])
>>> a.tolist()
[[0, 1], [2, 3]] # nested vanilla lists
>>> list(a)
[array([0, 1]), array([2, 3])] # list of arrays

tolist handles the full conversion to nested vanilla lists (i.e. list of list of int), whereas list just iterates over the first dimension of the array, creating a list of arrays (list of np.array of np.int64). Although both are lists:

>>> type(list(a))
<type 'list'>
>>> type(a.tolist())
<type 'list'>

the elements of each list have a different type:

>>> type(list(a)[0])
<type 'numpy.ndarray'>
>>> type(a.tolist()[0])
<type 'list'>

The other difference, as you note, is that list will work on any iterable, whereas tolist can only be called on objects that specifically implement that method.
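The same contrast holds at any depth. The following is a minimal sketch, assuming Python 3 and a 3-D array chosen only for illustration: list converts the outermost axis and leaves everything below it as arrays, while tolist converts every level.

```python
import numpy as np

# A 3-D array: 2 blocks, each with 2 rows and 2 columns.
a = np.arange(8).reshape(2, 2, 2)

shallow = list(a)    # converts only the outermost axis
deep = a.tolist()    # converts every axis, down to the scalars

# The elements of `shallow` are still 2-D arrays.
print(type(shallow), type(shallow[0]), shallow[0].shape)

# `deep` is lists all the way down, ending in built-in ints.
print(type(deep), type(deep[0]), type(deep[0][0]), type(deep[0][0][0]))
```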

Question 5

.tolist() appears to convert all of the values recursively to Python primitives (nested lists of built-in scalars), whereas list creates a Python list from an iterable. Since iterating over a 2D numpy array yields its rows as 1D arrays, list(...) creates a list of arrays.

You can think of list as a function that looks like this:

# Not the actual implementation, just for demo purposes
def list(iterable):
    newlist = []
    # Collect whatever the iterable yields, without looking inside each item
    for obj in iter(iterable):
        newlist.append(obj)
    return newlist
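For comparison, the recursive behaviour of tolist can be sketched in the same spirit. This is not numpy's implementation; to_nested_list is a hypothetical helper written only to show the idea of descending through every axis and converting the scalars at the bottom.

```python
import numpy as np

def to_nested_list(arr):
    """Hypothetical helper that mimics ndarray.tolist() for illustration."""
    arr = np.asarray(arr)
    if arr.ndim == 0:
        # 0-d array: .item() returns the matching built-in Python scalar.
        return arr.item()
    # Otherwise recurse into each sub-array along the first axis.
    return [to_nested_list(sub) for sub in arr]

a = np.arange(4).reshape(2, 2)
print(to_nested_list(a))                 # nested built-in lists
print(to_nested_list(a) == a.tolist())   # same result as tolist() here
```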

Question 6

The major difference is that tolist recursively converts all data to built-in Python types.

For instance:

>>> arr = numpy.arange(2)
>>> [type(item) for item in list(arr)]
[numpy.int64, numpy.int64]
>>> [type(item) for item in arr.tolist()]
[builtins.int, builtins.int]

Aside from the functional differences, tolist will generally be quicker, as it knows it has a numpy array and has access to the backing array, whereas list falls back to using an iterator to add all the elements.

In [2]: arr = numpy.arange(1000)
In [3]: %timeit arr.tolist()
10000 loops, best of 3: 33 μs per loop
In [4]: %timeit list(arr)
10000 loops, best of 3: 80.7 μs per loop

I would expect the tolist to be
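The timing above comes from IPython's %timeit. A rough equivalent using only the standard timeit module is sketched below; the call count of 1000 is an arbitrary choice, and absolute numbers will vary by machine and numpy version.

```python
import timeit

import numpy as np

arr = np.arange(1000)

# Time each conversion over the same number of calls.
n_calls = 1000
t_tolist = timeit.timeit(arr.tolist, number=n_calls)
t_list = timeit.timeit(lambda: list(arr), number=n_calls)

print(f"arr.tolist(): {t_tolist / n_calls * 1e6:.1f} us per call")
print(f"list(arr):    {t_list / n_calls * 1e6:.1f} us per call")
```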

Question 7

Additional differences:

If you have a 1D numpy array and convert it to a list with tolist(), each numpy scalar is converted to the nearest compatible built-in Python type. list(), by contrast, keeps the numpy scalar type as it is.

import numpy as np

# list(...)
a = np.uint32([1, 2])
a_list = list(a)        # [1, 2]
type(a_list[0])         # <class 'numpy.uint32'>

# .tolist()
a_tolist = a.tolist()   # [1, 2]
type(a_tolist[0])       # <class 'int'>
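One place where this type difference tends to matter is serialization. As a small sketch, using the standard json module purely as an example consumer: its default encoder accepts built-in ints but rejects numpy integer scalars.

```python
import json

import numpy as np

a = np.arange(3, dtype=np.uint32)

# tolist() yields built-in ints, which json can serialize.
print(json.dumps(a.tolist()))

# list() keeps numpy scalars, which the default encoder rejects.
try:
    json.dumps(list(a))
    list_serialized = True
except TypeError as exc:
    list_serialized = False
    print("TypeError:", exc)
```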

If the numpy array holds a single scalar value (i.e. it is a 0D array), list() raises an error, whereas tolist() simply returns the value as a Python scalar.

a = np.array(5)
list(a) # TypeError: iteration over a 0-d array
a.tolist() # 5
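If the caller always wants a list back, even for 0D input, one option is to promote the array to at least one dimension first. The helper name as_list below is hypothetical; np.atleast_1d is an existing numpy function.

```python
import numpy as np

def as_list(arr):
    """Hypothetical helper: always return a list, even for 0-d input."""
    # atleast_1d turns a 0-d array into a 1-element array; others pass through.
    return np.atleast_1d(arr).tolist()

scalar = np.array(5)

try:
    list(scalar)
    list_failed = False
except TypeError:
    # Iterating over a 0-d array is not allowed.
    list_failed = True

print(list_failed)
print(as_list(scalar))
print(as_list(np.array([1, 2])))
```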

Accepted answer (score 27, 2015-01-11 18:52:16Z): the answer beginning "Your example already shows the difference", given in full above.

