I am trying to make an array of lists with numpy

Question 1

I need an array that holds either an array of records or a list of records. I need something like this:

people = [['David','Blue','Dog','Car'],['Sally','Yellow','Cat','Boat']]

or:

people = (['David','Blue','Dog','Car'],['Sally','Yellow','Cat','Boat'])

But I keep getting:

people = ['David','Blue','Dog','Car','Sally','Yellow','Cat','Boat']

I've tried append vs. concatenate, different axes, and different np initializations, but the result is always the same. Here is my latest version. What am I doing wrong?

import numpy as np
# Tried
# people = np.empty((0,0), dtype='S')
# people = np.array([[]])
people = np.array([])
records = GetRecordsFromDB()
for record in records:
 # Do some stuff 
 
 # Tried
 # person = [name, color, animal, vehicle]
 person = np.array([name, color, animal, vehicle])
 # Tried this with different axis
 # people = np.append(people, person, axis=0)
 people = np.concatenate((people, person))

Thank you.

EDIT: This will be the input for a Pandas DataFrame if that helps.

Question 2

What is the format/structure of record? Also, please provide a sample of the data in record

Question 3

Four strings: name, color, animal, vehicle. I can put them in any format, but I need to be able to add those, as a list, to an array. So I end up with an array that holds n number of records in a list.

Question 4

Use np.c_

people = np.c_[people, person]

Question 5

That is the opposite of what I want. Please reread what I need.

Question 6

Check now @user390480

Question 7

I figured it out. The problem was adding the first record. Your code would only work for me in the context of my if statement.

Question 8

Here is how I solved this:

import numpy as np
people = np.array([])
records = GetRecordsFromDB()
for record in records:
    # Do some stuff
    person = np.array([name, color, animal, vehicle])

    if len(people) == 0:
        people = [person]
    else:
        people = np.append(people, [person], axis=0)

Question 9

np.array([person]) is a (1,4) shaped array. You could just as well have initialized people = None and tested with if people is None: .... None is a common "uninitialized" value. Keep in mind, though, that this array append will be slower than a list append.
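A minimal sketch of the None-initialized version suggested in this comment. The records list is made-up sample data standing in for GetRecordsFromDB(), which the question does not show.

```python
import numpy as np

# Hypothetical sample data standing in for the database records.
records = [
    ("David", "Blue", "Dog", "Car"),
    ("Sally", "Yellow", "Cat", "Boat"),
]

people = None  # "uninitialized" marker instead of an empty array
for name, color, animal, vehicle in records:
    person = np.array([name, color, animal, vehicle])
    if people is None:
        # First record: start a (1, 4) array.
        people = np.array([person])
    else:
        # Later records: [person] is treated as a (1, 4) block and added as a row.
        people = np.append(people, [person], axis=0)

print(people.shape)  # (2, 4)
```

Each np.append call still copies the whole array, so this is mainly useful for a small number of records.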

Question 10

So you really aren't trying to make an array of lists. You want to iteratively make a 2d array. Details matter when programming as you found out, and they also help when asking questions.

Question 11

@hpaulj, sorry, yes, I don't know the difference between an array of lists and a 2d array, but thank you for your help.

Question 12

Then you haven't read enough basic numpy documentation.

Question 13

edit

Looking at your answer I realized I took your subject line too literally. [360] and following makes an array of lists. But [356] is a 2d array, where each "row" looks like a list - but it isn't.

earlier

In [354]: alist = [['David','Blue','Dog','Car'],['Sally','Yellow','Cat','Boat']]
 ...: 
In [355]: alist
Out[355]: [['David', 'Blue', 'Dog', 'Car'], ['Sally', 'Yellow', 'Cat', 'Boat']]
In [356]: np.array(alist)
Out[356]: 
array([['David', 'Blue', 'Dog', 'Car'],
       ['Sally', 'Yellow', 'Cat', 'Boat']], dtype='<U6')

That makes a 2d array of strings. The construction is no different from the textbook example of making a 2d numeric array:

In [358]: np.array([[1, 2], [3, 4]])
Out[358]: 
array([[1, 2],
       [3, 4]])

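hstack or concatenate, by contrast, joins the lists end to end into a single 1d array, which is the flat result you kept getting: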

In [359]: np.hstack(alist)
Out[359]: 
array(['David', 'Blue', 'Dog', 'Car', 'Sally', 'Yellow', 'Cat', 'Boat'],
      dtype='<U6')

To make an array that holds just 2 lists, you have to initialize an object dtype array first and then fill it:

In [360]: arr = np.empty(2, object)
In [361]: arr
Out[361]: array([None, None], dtype=object)
In [362]: arr[:] = alist
In [363]: arr
Out[363]: 
array([list(['David', 'Blue', 'Dog', 'Car']),
       list(['Sally', 'Yellow', 'Cat', 'Boat'])], dtype=object)

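If the lists differ in length, np.array cannot build a 2d array and, in the NumPy version used here, falls back to an object dtype array with a warning: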

In [364]: np.array([["David", "Blue"], ["Sally", "Yellow", "Cat"]])
<ipython-input-364-be12d6dec312>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
 np.array([["David", "Blue"], ["Sally", "Yellow", "Cat"]])
Out[364]: 
array([list(['David', 'Blue']), list(['Sally', 'Yellow', 'Cat'])],
      dtype=object)

Read that warning - in full.

By default np.array tries to make a multidimensional array of base classes like int or string. Only when it can't does it fall back on making an object dtype array. That kind of array should be viewed as a second-class array and used only when it is really needed. Often the list of lists is just as good.
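A small sketch of what the warning asks for: pass dtype=object yourself when you really do want an array that holds lists of different lengths. As far as I know, newer NumPy releases (1.24 and later) raise a ValueError instead of warning when dtype=object is omitted for ragged input.

```python
import numpy as np

ragged = [["David", "Blue"], ["Sally", "Yellow", "Cat"]]

# Asking for dtype=object explicitly gives a 1d array whose elements are lists.
arr = np.array(ragged, dtype=object)
print(arr.shape)     # (2,)
print(type(arr[0]))  # <class 'list'>

# With equal-length lists, dtype=object still produces a 2d array
# (of string objects), not an array of lists.
even = np.array([["David", "Blue"], ["Sally", "Yellow"]], dtype=object)
print(even.shape)    # (2, 2)
```

That second case is why the np.empty(2, object) step is needed to get an array of equal-length lists.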

Your iterative creation is one such case, where a plain list works better:

people = []
records = GetRecordsFromDB()
for record in records:
    # Do some stuff

    # Tried
    # person = [name, color, animal, vehicle]
    person = np.array([name, color, animal, vehicle])
    people.append(person)

Items, whether lists or arrays (or anything else), can be added to a list in place; the list just stores another reference. Trying to use concatenate to add rows to an array is not only harder to get right but also slower, since it makes a whole new array each time. That means a lot of copying!
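A runnable sketch of the list-append approach, with the conversion to a 2d array done once at the end. The records list is made-up sample data in place of GetRecordsFromDB().

```python
import numpy as np

# Hypothetical sample data standing in for the database records.
records = [
    ("David", "Blue", "Dog", "Car"),
    ("Sally", "Yellow", "Cat", "Boat"),
]

people = []
for name, color, animal, vehicle in records:
    # list.append works in place and only stores a reference.
    people.append([name, color, animal, vehicle])

# One conversion at the end instead of one copy per record.
people_arr = np.array(people)
print(people_arr.shape)  # (2, 4)
```

Since the question says the result feeds a pandas DataFrame, the list of lists can probably be passed to the DataFrame constructor directly, without the NumPy step.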

np.append is a badly named way of calling concatenate. It is not a list.append clone.

Using np.concatenate requires careful handling of dimensions. Don't be sloppy, thinking it will figure out what you want.
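To illustrate the dimension handling, here is a short sketch with sample values: a 1d record has to be given a leading axis before it can be joined to a 2d array as a new row.

```python
import numpy as np

people = np.array([["David", "Blue", "Dog", "Car"]])   # shape (1, 4)
person = np.array(["Sally", "Yellow", "Cat", "Boat"])  # shape (4,)

# Mixing a 2d and a 1d array fails; concatenate does not add the missing axis.
try:
    np.concatenate((people, person), axis=0)
    mixed_ok = True
except ValueError as err:
    mixed_ok = False
    print("ValueError:", err)

# Give person a leading axis so both inputs are 2d, then join on axis 0.
people = np.concatenate((people, person[None, :]), axis=0)
print(people.shape)  # (2, 4)

# Two 1d arrays join end to end, which is the flat result from the question.
flat = np.concatenate((person, person))
print(flat.shape)    # (8,)
```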

Similarly, this is not a clone of the empty list []:

In [365]: np.array([])
Out[365]: array([], dtype=float64)
In [366]: np.array([]).shape
Out[366]: (0,)

It is a 1d array with a specific shape. You can only concatenate it with another 1d array, on its only axis, 0.
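If you do want an empty array as the starting point, a sketch of one option is to give it zero rows and four columns so that its shape matches the rows you add. The '<U6' dtype is an assumption that fits the sample strings here; longer strings would need a wider dtype.

```python
import numpy as np

flat_start = np.array([])
print(flat_start.shape)  # (0,)

# Zero rows, four columns; the string dtype is chosen for this sample data.
rows_start = np.empty((0, 4), dtype="<U6")
print(rows_start.shape)  # (0, 4)

person = np.array(["David", "Blue", "Dog", "Car"])

# A (0, 4) array and a (1, 4) array join cleanly on axis 0.
rows = np.concatenate((rows_start, person[None, :]), axis=0)
print(rows.shape)        # (1, 4)
```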

ansev · Accepted Answer ("Use np.c_", above) · 2022-03-13 20:54:03Z


