I am a bit confused about the behaviour of multi-dimensional Python "arrays" (actually lists of lists, not numpy arrays). Suppose I have a 4 x 4 array whose entries are 1 x 2 arrays (that is, entries of the 4 x 4 array are lists containing 2 lists). I can initialize such an empty array with the command:
b = [[[[], ]*2, ]*4, ]*4
This creates the following multidimensional empty array:
[[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]]]
Now I want to modify a single entry of this 4 x 4 array; for instance, I want to set the entry b[0][0] to [[1],[2]], so that b becomes:
[[[[1], [2]], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]]]
My expectation was that the following command would do that job:
b[0][0] = [[1],[2]]
However, this leads instead to the following matrix b:
[[[[1], [2]], [[], []], [[], []], [[], []]],
[[[1], [2]], [[], []], [[], []], [[], []]],
[[[1], [2]], [[], []], [[], []], [[], []]],
[[[1], [2]], [[], []], [[], []], [[], []]]]
What is the proper way to achieve this?
2 Answers
The problem is really that when you do
b = [[[[], ]*2, ]*4, ]*4
you are copying references to the same arrays rather than creating new ones, so your new large multi-dimensional array contains many references to the same few arrays.
Here's an example with IPython:
In [22]: a = [[],]*10
In [23]: a
Out[23]: [[], [], [], [], [], [], [], [], [], []]
In [24]: a[1].append(12)
In [25]: a
Out[25]: [[12], [12], [12], [12], [12], [12], [12], [12], [12], [12]]
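The same sharing can be checked directly on the nested b from the question. This is only a small sketch using the `is` operator (which tests object identity); the variable names `same_rows`, `same_cells` and `same_leaves` are made up for the illustration.

```python
# Build the array exactly as in the question.
b = [[[[], ]*2, ]*4, ]*4

# `is` is True only when both sides are the very same object in memory.
same_rows = b[0] is b[3]                # all four rows are one list
same_cells = b[0][0] is b[0][3]         # all four cells in a row are one list
same_leaves = b[0][0][0] is b[0][0][1]  # both innermost lists are one list

print(same_rows, same_cells, same_leaves)  # True True True
```

So although b prints as 4 x 4 x 2 empty lists, at every level the slots refer to a single shared object.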
For a better way to create your array, head over here
Possible solution to what you want:
m = [[[[] for x in range(2)] for x in range(4)] for x in range(4)]
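As a quick check that the comprehension really gives independent entries, here is a minimal sketch that repeats the assignment from the question on m. The value 7 and the indices used below are arbitrary and only serve the illustration.

```python
# Each comprehension level evaluates [] (or the inner comprehension) afresh,
# so every slot gets its own list object.
m = [[[[] for _ in range(2)] for _ in range(4)] for _ in range(4)]

# The assignment from the question now touches a single entry only.
m[0][0] = [[1], [2]]
print(m[0][0])  # [[1], [2]]
print(m[1][0])  # [[], []]

# In-place changes stay local as well.
m[2][3][0].append(7)
print(m[2][3])  # [[7], []]
print(m[3][3])  # [[], []]
```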
5 Comments
numpy and then converting back to a list of lists?
+ to concatenate lists, append, etc. So it would take me considerable effort to modify the code so that it works with numpy arrays instead of lists of lists (and performance of the module is not an important aspect here).
[[]]*2 will create an array of two references to the same empty array. When you change one of them in place (for example with append), the other reference reflects the change since it points to the same one.
In your case each row points to the same array. When you change the first one, you're changing the array all four rows point to.
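To make this concrete, here is a small sketch (the appended value 9 is arbitrary) that reproduces the output from the question and also shows what happens when one of the shared innermost lists is changed in place.

```python
b = [[[[], ]*2, ]*4, ]*4

# Item assignment replaces slot 0 of the one row list that all four rows share,
# so the first column changes in every row while the other columns do not.
b[0][0] = [[1], [2]]
print(b[3][0])  # [[1], [2]]
print(b[3][1])  # [[], []]

# append() changes the single shared empty list in place, so every remaining
# reference to it shows the new element.
b[0][1][0].append(9)
print(b[2][3])  # [[9], [9]]

# The entry assigned above holds new lists and is not affected.
print(b[0][0])  # [[1], [2]]
```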
Feel free to get acquainted with the NumPy library.
Comments
b = numpy.empty((4,4,2), dtype=int). Even the simple step of creating the array is 20 to 30 times faster. And if you do anything interesting with the data, numpy is frequently hundreds or thousands of times faster than MightyPork's (entirely valid) solution, and basically never slower (as far as I've seen).