pd.read_sql unicode types causing problems

Question 1

I'm formatting some data I'm pulling from a database using sqlalchemy, pyodbc, and pandas read_sql, which I get back as a dataframe df.

I want to format the data in each 'cell' of the dataframe, row by row, excluding the first two columns, using this:

df.iloc[6, 2:] = (df.iloc[6, 2:]*100).map('{:,.2f}%'.format)

I apply a similar formatting for several other rows in the dataframe. This used to work great when I was reading my data from a csv file, but now reading from the database causes a ValueError on that line that reads:

ValueError: Unknown format code 'f' for object of type 'unicode'

I tried some other casting attempts, such as:

df.iloc[6, 2:] = (float(df.iloc[6, 2:].encode())*100).map('{:,.2f}%'.format)

but this causes additional errors.

I'm pretty sure the error is caused by the unicode type of the results. How should I format my dataframe, or modify my read_sql call, so that I don't get unicode strings? I'm using Python 2.7, by the way.

The dtype for each column is object.

Question 2

Can you provide the output of print(df.dtypes)?

Question 3

It says the dtype for each column is 'object'. Series([], Name: 6, dtype: object)

Question 4

You're trying to do string formatting for a float, but you're actually passing it a string.

To illustrate the source of your error, consider the following:

'{:,.2f}%'.format(u'1')

which raises the same error:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-41-fb59302ab6b7> in <module>()
----> 1 '{:,.2f}%'.format(u'1')
ValueError: Unknown format code 'f' for object of type 'unicode'

To solve this, cast your string (dtype = object) columns to float, e.g.

# get columns to cast to float
vals = df.select_dtypes(['object']).astype(float)
cols = vals.columns
# and replace them
df[cols] = vals
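Note that astype(float) raises as soon as one cell cannot be parsed as a number. A minimal sketch of one way around that, assuming it is acceptable for unparseable cells to become NaN, is pd.to_numeric with errors='coerce'. The frame and its column names below are made up for illustration and only stand in for the read_sql result:

```python
import pandas as pd

# Hypothetical stand-in for the read_sql result: every value arrives as a
# string (dtype object), and 'N/A' marks a missing number.
df = pd.DataFrame({
    'name': ['a', 'b', 'c'],
    'rate': ['0.125', 'N/A', '0.5'],
    'share': ['0.25', '0.75', 'N/A'],
})

# Convert every column except the first (label) column.
for col in df.columns[1:]:
    # errors='coerce' turns anything unparseable into NaN instead of raising.
    df[col] = pd.to_numeric(df[col], errors='coerce')

# The converted columns are now float, so the 'f' format code works.
formatted = (df['rate'] * 100).map('{:,.2f}%'.format)
```

The trade-off is that the original placeholder text is lost: cells that held 'N/A' now hold NaN, so this fits best when the placeholder does not need to appear in the final output.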

Alternatively, you could put some logic in your mapper, e.g.

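def safe_float_formatter(value):
    try:
        # float() lets numeric strings such as u'1' be formatted too
        return '{:,.2f}%'.format(float(value))
    except (TypeError, ValueError):
        # leave non-numeric values such as 'N/A' or None unchanged
        return value

# DataFrame.map needs pandas >= 2.1; older versions use applymap
df = df.applymap(safe_float_formatter)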
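To match the formatting in the question, which multiplies by 100 before adding the percent sign, the same idea can be sketched in plain Python. The name to_percent and the sample values are hypothetical, and the sketch assumes that anything float() rejects should pass through untouched:

```python
def to_percent(value):
    """Format a fraction such as 0.125 or '0.125' as '12.50%'.

    Values that cannot be read as a number (for example 'N/A' or None)
    are returned unchanged.
    """
    try:
        number = float(value)
    except (TypeError, ValueError):
        return value
    return '{:,.2f}%'.format(number * 100)


# Hypothetical row values: numeric strings mixed with a placeholder.
row = ['0.125', 'N/A', 12.3456, None]
formatted = [to_percent(v) for v in row]
```

For a single row of the dataframe, the function could be passed to Series.map in the same way as the original formatter, e.g. df.iloc[6, 2:] = df.iloc[6, 2:].map(to_percent).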

Question 5

Thanks Kris. One caveat, though: some of the values are actually strings, written as 'N/A', which yielded this error: ValueError: could not convert string to float: N/A

Question 6

So my question is whether there's a way to skip or ignore any value that would cause an error in the type conversion.

Question 7

I see.. I updated the answer (starting from "Alternatively ...")

Kris · Accepted Answer · 2016-08-17 22:21:10Z

