
I have a numpy string array which looks like this:

[['0', '', '12.12', '140.65', '', ''],
['3', '', '10.45', '154.45', '', ''],
['5', '', '15.65', '184.74', '', '']]

What I need to do is to replace the empty cells with a number in order to convert it into a float array. I can't just delete the columns because in some cases the empty cells are filled. I tried this:

data = np.char.replace(data, '','0').astype(np.float64)

But this just inserts a 0 between all characters, which ends up as this:

[[0, 0, 1020.0102, 104000.0605, 0, 0],
[30, 0, 1000.0405, 105040.0405, 0, 0],
[50, 0, 1050.0605, 108040.0704, 0, 0]]

I can't figure out why Python does that. I searched Google but couldn't find a good explanation of numpy.char.replace. Can anyone explain to me how it works?

asked Nov 16, 2017 at 9:10
  • Possible duplicate of Numpy array, fill empty values for a single column Commented Nov 16, 2017 at 9:15
  • Your 'empty' cells contain commas. Char.replace applies the regular string replace method to each element. Commented Nov 16, 2017 at 14:28

3 Answers

>>> a = np.array([['0', '', '12.12', '140.65', '', ''],
... ['3', '', '10.45', '154.45', '', ''],
... ['5', '', '15.65', '184.74', '', '']])
>>> a[a == ''] = 0
>>> a.astype(np.float64)
array([[ 0. , 0. , 12.12, 140.65, 0. , 0. ],
 [ 3. , 0. , 10.45, 154.45, 0. , 0. ],
 [ 5. , 0. , 15.65, 184.74, 0. , 0. ]])
answered Nov 16, 2017 at 9:13

6 Comments

You might want to have empty strings in the initial array like OP
@ixop that's a copy-paste fail. Give me a second.
Perfect, this works great! Thank you for your fast answer!
@Toggo replace operates on substrings, so it more or less checks each position of each of your strings for the string to be replaced. Since '' matches everywhere, you will have '0' inserted everywhere.
@Paul Panzer Ok, I guessed it would be this way. Is there a possibility to make replace operate with the whole string rather than a substring?

data = np.char.replace(data, '','0')

It seems to replace every empty position in each string: '' has one such position, '0' has two, and '12.12' has six. The result is

[['000' '0' '01020.01020' '0104000.06050' '0' '0']
 ['030' '0' '01000.04050' '0105040.04050' '0' '0']
 ['050' '0' '01050.06050' '0108040.07040' '0' '0']]
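
The same behaviour can be reproduced with plain Python strings, which may make it easier to see where the extra zeros come from. This is only a small sketch; the names `cells` and `replaced` are made up for illustration.

```python
import numpy as np

# In plain Python, the empty string matches before every character
# and once more at the end of the string.
print('12.12'.replace('', '0'))   # 01020.01020
print(''.replace('', '0'))        # 0

# np.char.replace does the same substring replacement per element.
cells = np.array(['0', '', '12.12'])
replaced = np.char.replace(cells, '', '0')
print(replaced)
```

So a string of length n has n + 1 empty positions, and each of them receives a '0'.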

Try this:

import numpy as np
a = np.array([['0', '', '12.12', '140.65', '', ''],
 ['3', '', '10.45', '154.45', '', ''],
 ['5', '', '15.65', '184.74', '', '']])
#a[np.where(a == '')] = '0'
a[a == ''] = '0'
a = a.astype(np.float64)
print(a)
answered Nov 16, 2017 at 10:10


I know that this is an old question, but unfortunately, the accepted answer does not work properly today. If you do the [a == ''] comparison you will get a FutureWarning:

FutureWarning: elementwise comparison failed; returning scalar
instead, but in the future will perform elementwise comparison

One method that will do the trick with no warning is to use numpy.where():

import numpy as np
a = np.array([['0', '', '12.12', '140.65', '', ''],
              ['3', '', '10.45', '154.45', '', ''],
              ['5', '', '15.65', '184.74', '', '']])
result = np.where(a == '', '0', a)
print(result)

The result is

[['0' '0' '12.12' '140.65' '0' '0'] 
 ['3' '0' '10.45' '154.45' '0' '0'] 
 ['5' '0' '15.65' '184.74' '0' '0']]
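
To get the float array the question asks for, the result can then be converted with astype. A minimal sketch, assuming the array has a string dtype (the name `floats` is just illustrative):

```python
import numpy as np

a = np.array([['0', '', '12.12', '140.65', '', ''],
              ['3', '', '10.45', '154.45', '', ''],
              ['5', '', '15.65', '184.74', '', '']])

# a == '' compares whole elements, not substrings, so only the
# empty cells are replaced by '0' before the conversion to float.
floats = np.where(a == '', '0', a).astype(np.float64)
print(floats)
```

Because the comparison works on whole elements, cells such as '12.12' are left untouched, unlike with np.char.replace.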
answered Apr 29, 2021 at 0:05
