3

I have a large array of more than 40000 elements

a = ['15', '12', '', 18909, ...., '8989', '', '90789', '8']

I'm looking for a simple way to replace the empty '' values with '0' so that I can manipulate the data in the array using NumPy.

I would then convert the elements in my array into integers using

a = map(int, a)

so that I could find the mean of the array in numpy

a_mean = np.mean(a)

My issue is that int('') raises a ValueError, so I cannot convert an array with missing values to integers and compute its mean.

asked Sep 15, 2018 at 14:43
2
  • 4
    Can you do: new_a = [int(v or 0) for v in a] and then use new_a? Commented Sep 15, 2018 at 14:47
  • I believe you can use numpy.nan_to_num? Commented Sep 15, 2018 at 14:54
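
The `v or 0` idiom in the first comment works because an empty string is falsy in Python, so `'' or 0` evaluates to `0` while any non-empty string passes through unchanged. A minimal sketch, where the short sample list is an assumption standing in for the real 40000-element array:

```python
# Sample data: a shortened stand-in for the real array (assumption).
a = ['15', '12', '', 18909, '8989', '', '90789', '8']

# '' is falsy, so `v or 0` yields 0 for empty strings;
# int() then converts the remaining strings and leaves ints as they are.
new_a = [int(v or 0) for v in a]
print(new_a)
```

This only covers values that are exactly `''`; a string such as `' '` is truthy and would still make `int()` raise a ValueError.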

4 Answers

5

You could make a small function that converts a single value exactly how you want it, e.g.:

def to_int(x):
    try:
        return int(x)
    except ValueError:
        return 0

which can be used with map:

In [22]: a = ['15', '12', '', 18909, '8989', '90789', '8']
In [23]: list(map(to_int, a))
Out[23]: [15, 12, 0, 18909, 8989, 90789, 8]

in a list comprehension:

In [25]: np.array([to_int(x) for x in a])
Out[25]: array([ 15, 12, 0, 18909, 8989, 90789, 8])

or in a generator expression to directly create a numpy array:

In [27]: np.fromiter((to_int(x) for x in a), dtype=int)
Out[27]: array([ 15, 12, 0, 18909, 8989, 90789, 8])
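
To tie this back to the question, here is a sketch of the whole path from the raw list to the mean. The sample list is an assumption standing in for the real data. Keep in mind that counting missing values as 0 pulls the mean down, which may or may not be what you want.

```python
import numpy as np

def to_int(x):
    """Convert x to int, falling back to 0 when int() raises ValueError (e.g. for '')."""
    try:
        return int(x)
    except ValueError:
        return 0

# Sample data: a shortened stand-in for the real array (assumption).
a = ['15', '12', '', 18909, '8989', '', '90789', '8']

# Build the integer array directly from a generator, then take the mean.
arr = np.fromiter((to_int(x) for x in a), dtype=int)
a_mean = np.mean(arr)
print(a_mean)
```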
answered Sep 15, 2018 at 14:57

1 Comment

Thank you, this was the simple and clean solution to my problem.
2

If I understood you correctly, it should look like this:

for i in range(len(a)):
    if a[i] == '':
        a[i] = '0'

You can also use:

a = list(map(lambda x: '0' if x == '' else x, a))
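
Once the empty strings have been replaced, the int conversion from the question goes through. A short sketch on an assumed sample list (not the asker's real data):

```python
import numpy as np

# Sample data: a shortened stand-in for the real array (assumption).
a = ['15', '12', '', 18909, '8989', '', '90789', '8']

# Step 1: replace empty strings with '0'.
a = list(map(lambda x: '0' if x == '' else x, a))

# Step 2: convert everything to int; list() is needed on Python 3,
# where map() returns an iterator rather than a list.
a = list(map(int, a))

a_mean = np.mean(a)
print(a_mean)
```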
answered Sep 15, 2018 at 14:49


1

From what I have learned on SO, you can apply the solution below to convert NaN values to zeros.

import numpy as np
a = np.array([[0, 1, 2], [3, 4, np.nan]])
where_are_NaNs = np.isnan(a)
a[where_are_NaNs] = 0

Secondly, you can use nan_to_num(), as I mentioned in my comment.

>>> import numpy as np
>>> a = np.array([[0, 1, 2], [3, 4, np.nan]])
>>> a
array([[ 0., 1., 2.],
       [ 3., 4., nan]])
>>> a = np.nan_to_num(a)
>>> a
array([[ 0., 1., 2.],
       [ 3., 4., 0.]])
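
Both isnan and nan_to_num work on numeric arrays, so for the list in the question the empty strings have to become NaN first. A rough sketch of one way to do that, assuming a shortened sample list; it also shows np.nanmean, which skips the missing values instead of counting them as 0:

```python
import numpy as np

# Sample data: a shortened stand-in for the real array (assumption).
a = ['15', '12', '', 18909, '8989', '', '90789', '8']

# Map '' to NaN and everything else to float, giving a numeric array.
arr = np.array([np.nan if x == '' else float(x) for x in a])

# Option 1: treat missing values as 0, then take the ordinary mean.
filled = np.nan_to_num(arr)
mean_with_zeros = filled.mean()

# Option 2: ignore missing values entirely.
mean_ignoring_missing = np.nanmean(arr)

print(mean_with_zeros, mean_ignoring_missing)
```

Which of the two means is appropriate depends on whether an empty entry really stands for zero or for "no measurement".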
answered Sep 15, 2018 at 14:57


0

A more verbose answer is:

acc = 0
for v in a:
    acc += int(v or 0)
a_mean = acc / len(a)
answered Sep 15, 2018 at 14:57
