How to convert from boolean array to int array in python

Question 1

I have a 2-D NumPy array in which one column holds Boolean values (True/False). I want to convert them to the integers 1 and 0 respectively. How can I do it?

For example, my data[0::,2] is boolean. I tried

data[0::,2]=int(data[0::,2])

but it gives me this error:

TypeError: only length-1 arrays can be converted to Python scalars

The first 5 rows of my array are:

[['0', '3', 'True', '22', '1', '0', '7.25', '0'],
 ['1', '1', 'False', '38', '1', '0', '71.2833', '1'],
 ['1', '3', 'False', '26', '0', '0', '7.925', '0'],
 ['1', '1', 'False', '35', '1', '0', '53.1', '0'],
 ['0', '3', 'True', '35', '0', '0', '8.05', '0']]

Question 2

This can't be an ordinary 2D array, since all elements of a 2D array have the same type. You probably have a structured array. Could you please show a few full rows from it, along with its dtype?

Question 3

OK, those quotes should tell you that you've got an array of strings. So, again, in numpy all elements of a 2D array must have the same type. You either need a structured array, or you can drop numpy and use ordinary Python lists. Why do you need numpy, and what is your final goal?
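To make the point about a single element type concrete, here is a minimal sketch. The variable names `rows` and `arr` are illustrative, and the two rows are copied from the question; it assumes the data was loaded as text, which is what the quotes suggest.

```python
import numpy as np

# Two rows copied from the question; every value is a string.
rows = [['0', '3', 'True', '22', '1', '0', '7.25', '0'],
        ['1', '1', 'False', '38', '1', '0', '71.2833', '1']]
arr = np.array(rows)

# All elements share one dtype; kind 'U' means a Unicode string type.
print(arr.dtype.kind)

# The third column holds the strings 'True'/'False', not Python bools.
print(repr(arr[0, 2]))
```

Checking `arr.dtype` like this is usually the quickest way to tell whether a column is really boolean or just text that looks boolean.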

Question 4

Actually, I am following a tutorial for a machine learning project that uses Python, and as I am new to Python I am running into these difficulties. The tutorial asks for a numpy array. So it would be great if you could tell me how to convert this whole array of strings to float, since it is clear that it can be converted to float (treating True as 1 and False as 0).

Question 5

How do you produce the data in the first place? From a text file?

Question 6

OK, the easiest way to change the type of any array to float is:

data.astype(float)

The issue with your array is that float('True') raises a ValueError, because 'True' can't be parsed as a floating-point number. So the best fix is to change your array generation code so that it produces floats (or, at least, strings containing valid float literals) instead of bools.

In the meantime you can use this function to fix your array:

def boolstr_to_floatstr(v):
    if v == 'True':
        return '1'
    elif v == 'False':
        return '0'
    else:
        return v

And finally you convert your array like this:

new_data = np.vectorize(boolstr_to_floatstr)(data).astype(float)
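As a rough alternative sketch, the same replacement can be written with boolean masks instead of np.vectorize. This is not the approach from the answer itself; the names `cleaned` and `new_data` are illustrative, and it assumes `data` is a NumPy array of strings like the rows in the question.

```python
import numpy as np

# Two rows copied from the question, stored as a string array.
data = np.array([['0', '3', 'True', '22', '1', '0', '7.25', '0'],
                 ['1', '1', 'False', '38', '1', '0', '71.2833', '1']])

# Work on a copy so the original strings stay untouched.
cleaned = data.copy()
cleaned[data == 'True'] = '1'    # replace the text 'True' with '1'
cleaned[data == 'False'] = '0'   # replace the text 'False' with '0'

# Every element is now a valid float literal, so the cast succeeds.
new_data = cleaned.astype(float)
print(new_data[:, 2])
```

The comparison `data == 'True'` is evaluated element by element, so only exact matches are replaced and the numeric strings pass through unchanged.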

Question 7

@AkashdeepSaluja I've double-checked the code and it is working for me. Could you please update your question with the exact output of data[:5]?

Question 8

The output in the question is the exact output, do you want something else?

Question 9

@AkashdeepSaluja No, this can't be true. First of all, what I can see in the question is not a numpy array, but a Python list. And it was missing commas before I edited it, so Python could not have printed it. Second, my code works for Python lists too, so everything should be fine. Add print(data[:5]) to your code and post the exact output.

Question 10

Or, even better, do from pprint import pprint and then use pprint(data[:5]).

Question 11

boolarrayvariable.astype(int) works:

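import numpy as np

data = np.random.normal(0,1,(1,5))  # random sample, so values differ per run
threshold = 0
test1 = (data>threshold)            # boolean array
test2 = test1.astype(int)           # True -> 1, False -> 0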

Output:

data = array([[ 1.766, -1.765, 2.576, -1.469, 1.69]])
test1 = array([[ True, False, True, False, True]], dtype=bool)
test2 = array([[1, 0, 1, 0, 1]])

Question 12

If I do this on your raw data source, which is strings:

data = [['0', '3', 'True', '22', '1', '0', '7.25', '0'],
 ['1', '1', 'False', '38', '1', '0', '71.2833', '1'],
 ['1', '3', 'False', '26', '0', '0', '7.925', '0'],
 ['1', '1', 'False', '35', '1', '0', '53.1', '0'],
 ['0', '3', 'True', '35', '0', '0', '8.05', '0']]
data = [[eval(x) for x in y] for y in data]

..and then follow that with:

data = [[float(x) for x in y] for y in data]
# or this if you prefer:
import numpy
arr = numpy.array(data)

..then the problem is solved. ..you can even do it as a one-liner: numpy.array([[eval(x) for x in y] for y in data]). With this data the result is a float array, because NumPy promotes the mix of ints, bools and floats (such as 7.25) to a common float type.

..I think the problem is that numpy is keeping your numeric strings as strings, and since not all of your strings are numeric, you can't do a type conversion on the whole array. Also, if you try to do a type conversion just on the parts of the array with "True" and "False", you're not really working with booleans, but with strings. ..and the only way I know of to change that is to call eval. ..well, you could do this, too:

booltext_int = {'True': 1, 'False': 0}
clean = [[float(x) if x[-1].isdigit() else booltext_int[x]
          for x in y] for y in data]

..this way you avoid evals, which are inherently insecure. ..but that may not matter, since you may be using a trusted data source.
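For reference, a self-contained sketch of the lookup idea, with 'False' mapped to 0. The names `BOOL_TEXT` and `to_float` are hypothetical helpers introduced here for illustration; the rows are copied from the question.

```python
# Hypothetical helper names; 'True' maps to 1.0 and 'False' to 0.0.
BOOL_TEXT = {'True': 1.0, 'False': 0.0}

def to_float(token):
    """Convert one string token to a float without using eval."""
    if token in BOOL_TEXT:
        return BOOL_TEXT[token]
    # Anything else is expected to be a numeric string such as '7.25'.
    return float(token)

data = [['0', '3', 'True', '22', '1', '0', '7.25', '0'],
        ['1', '1', 'False', '38', '1', '0', '71.2833', '1']]

clean = [[to_float(x) for x in row] for row in data]
print(clean[0])
```

Checking membership in the dictionary first, rather than testing the last character, means an unexpected token fails loudly in float() instead of raising a less obvious KeyError.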

Question 13

Using @kirelagin's idea with ast.literal_eval

>>> import ast
>>> import numpy as np
>>> arr = np.array(
...     [['0', '3', 'True', '22', '1', '0', '7.25', '0'],
...      ['1', '1', 'False', '38', '1', '0', '71.2833', '1'],
...      ['1', '3', 'False', '26', '0', '0', '7.925', '0'],
...      ['1', '1', 'False', '35', '1', '0', '53.1', '0'],
...      ['0', '3', 'True', '35', '0', '0', '8.05', '0']])
>>> # np.float was removed in NumPy 1.24; the builtin float works everywhere
>>> np.vectorize(ast.literal_eval, otypes=[float])(arr)
array([[ 0. , 3. , 1. , 22. , 1. , 0. ,
 7.25 , 0. ],
 [ 1. , 1. , 0. , 38. , 1. , 0. ,
 71.2833, 1. ],
 [ 1. , 3. , 0. , 26. , 0. , 0. ,
 7.925 , 0. ],
 [ 1. , 1. , 0. , 35. , 1. , 0. ,
 53.1 , 0. ],
 [ 0. , 3. , 1. , 35. , 0. , 0. ,
 8.05 , 0. ]])

Question 14

An old question, but for reference: a bool can be converted to an int, and an int to a float.

data[0::,2]=data[0::,2].astype(int).astype(float)
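A small sketch of when this chained cast applies, assuming the column is a genuine boolean array; the names `flags` and `text_flags` are illustrative. If the column instead holds the strings 'True' and 'False', as in the question's data, the same cast raises a ValueError.

```python
import numpy as np

# A real boolean array: the chained cast works as described.
flags = np.array([True, False, True])
as_float = flags.astype(int).astype(float)
print(as_float)

# An array of strings that only look like booleans: the cast fails,
# because 'True' is not a valid integer literal.
text_flags = np.array(['True', 'False'])
try:
    text_flags.astype(int)
    string_cast_failed = False
except ValueError:
    string_cast_failed = True
print(string_cast_failed)
```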

kirelagin · Accepted Answer · 2013-06-01 07:17:23Z

