numpy arrays dimension mismatch

Question 1

I am using numpy and pandas to try to concatenate a number of heterogeneous values into a single array.

np.concatenate((tmp, id, freqs))

Here are the exact values:

tmp = np.array([u'DNMT3A', u'p.M880V', u'chr2', 25457249], dtype=object)
freqs = np.array([0.022831050228310501], dtype=object)
id = "id_23728"

The dimensions of tmp, 17232, and freqs are as follows:

[in] tmp.shape
[out] (4,)
[in] np.array(17232).shape
[out] ()
[in] freqs.shape
[out] (1,)

I have also tried casting them all as numpy arrays, to no avail.

Note that the variable freqs will frequently have more than one value.

However, with both the np.concatenate and np.append functions I get the following error:

*** ValueError: all the input arrays must have same number of dimensions

These all have the same number of columns (0), so why can't I concatenate them with either of the numpy functions described above?

All I'm looking to obtain is [(tmp), 17232, (freqs)] in a single one-dimensional array, which is to be appended to the end of a pandas DataFrame.

Thanks.

Update

It appears I can concatenate the two existing arrays:

np.concatenate([tmp, freqs],axis=0)
array([u'DNMT3A', u'p.M880V', u'chr2', 25457249, 0.022831050228310501], dtype=object)

However, the integer, even when cast to an array, cannot be used in concatenate.

np.concatenate([tmp, np.array(17571)],axis=0)
*** ValueError: all the input arrays must have same number of dimensions

What does work, however, is nesting append and concatenate:

np.concatenate((np.append(tmp, 17571), freqs),)
array([u'DNMT3A', u'p.M880V', u'chr2', 25457249, 17571,
 0.022831050228310501], dtype=object)

This is kind of messy, though. Does anyone have a better solution for concatenating a number of heterogeneous arrays?

Comment

Not great, but at least it doesn't care which things are 0d: np.concatenate([np.atleast_1d(x) for x in (tmp, id, freqs)])

Comment

@askewchan that is nice, the atleast_1d function hadn't crossed my radar. Thanks

Answer

The problem is that id, and later the integer np.array(17571), are not array_like objects in the sense concatenate needs: converted to arrays they are zero-dimensional, while tmp and freqs are 1D. See here how numpy decides whether an object can be converted automatically to a numpy array or not.

The solution is to make id a 1D array_like, i.e. to wrap it as the single element of a list or tuple, so that numpy understands that id belongs to a 1D array_like structure.

It all boils down to

concatenate((tmp, (id,), freqs))

or

concatenate((tmp, [id], freqs))
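As a minimal sketch of the list-wrapping option, using the values from the question (the name id_ is my own rename, chosen only to avoid shadowing the Python builtin id):

```python
import numpy as np

# Values taken from the question.
tmp = np.array(['DNMT3A', 'p.M880V', 'chr2', 25457249], dtype=object)
freqs = np.array([0.022831050228310501], dtype=object)
id_ = "id_23728"  # renamed from `id` (assumption: avoids shadowing the builtin)

# Wrapping the scalar in a list turns it into a 1D array_like of length 1,
# so all three inputs now have one dimension.
result = np.concatenate((tmp, [id_], freqs))

print(result.shape)  # (6,)
print(result.dtype)  # object
```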

To avoid this sort of problem when dealing with input variables in functions using numpy, you can use atleast_1d, as pointed out by @askewchan. See this question/answer for more about it.

Basically, if you are unsure whether your variable id will be a single str or a list of str in different scenarios, you are better off using

concatenate((tmp, atleast_1d(id), freqs))

because the two options above will fail if id is already a list/tuple of strings.
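A rough sketch of the difference, assuming a hypothetical helper build_row (not part of numpy) and made-up id values for the list case:

```python
import numpy as np

def build_row(tmp, ids, freqs):
    """Hypothetical helper: join the pieces into a single 1D array.

    np.atleast_1d leaves 1D inputs alone and promotes scalars to shape (1,).
    """
    return np.concatenate([np.atleast_1d(p) for p in (tmp, ids, freqs)])

tmp = np.array(['DNMT3A', 'p.M880V', 'chr2', 25457249], dtype=object)
freqs = np.array([0.022831050228310501], dtype=object)

single = build_row(tmp, "id_23728", freqs)       # ids is a single str
several = build_row(tmp, ["id_1", "id_2"], freqs)  # ids is a list (made-up values)

# Wrapping a list in another list makes it 2D, so the [id] form fails here.
try:
    np.concatenate((tmp, [["id_1", "id_2"]], freqs))
    wrapping_failed = False
except ValueError:
    wrapping_failed = True

print(single.shape, several.shape, wrapping_failed)
```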

EDIT: It may not be obvious why np.array(17571) does not work here even though it is already an array. This happens because np.array(17571).shape == (), so it has no dimensions and is not iterable.
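The zero-dimensional case can be checked directly. This is only an illustrative sketch; the variable names are mine:

```python
import numpy as np

scalar = np.array(17571)
print(scalar.ndim, scalar.shape)  # 0 ()

# A 0-d array cannot be iterated over.
try:
    iter(scalar)
    iterable = True
except TypeError:
    iterable = False

# atleast_1d promotes it to a 1D array of length 1, which concatenate accepts.
promoted = np.atleast_1d(scalar)
print(promoted.ndim, promoted.shape)  # 1 (1,)
```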

gg349 · Accepted Answer · 2014-03-24 21:33:51Z
