I'm formatting some data I'm pulling from a database using sqlalchemy, pyodbc, and pandas read_sql, which I get back as a dataframe df.
I want to format each 'cell' of the dataframe, row by row and excluding the first two columns, using this:
df.iloc[6, 2:] = (df.iloc[6, 2:]*100).map('{:,.2f}%'.format)
I apply a similar formatting for several other rows in the dataframe. This used to work great when I was reading my data from a csv file, but now reading from the database causes a ValueError on that line that reads:
ValueError: Unknown format code 'f' for object of type 'unicode'
I tried some other casting attempts, such as:
df.iloc[6, 2:] = (float(df.iloc[6, 2:].encode())*100).map('{:,.2f}%'.format)
But this causes some additional errors.
I'm pretty sure the error is caused by the unicode type of the results. How should I format my dataframe or modify my read_sql call so that I don't get unicode strings? I'm using Python 2.7, by the way.
The dtype for each column is object.
1 Answer
You're trying to do string formatting for a float, but you're actually passing it a string.
To illustrate the source of your error, consider the following:
'{:,.2f}%'.format(u'1')
which raises the same error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-41-fb59302ab6b7> in <module>()
----> 1 '{:,.2f}%'.format(u'1')
ValueError: Unknown format code 'f' for object of type 'unicode'
To solve this, cast your string (dtype = object) columns to float, e.g.
# get columns to cast to float
vals = df.select_dtypes(['object']).astype(float)
cols = vals.columns
# and replace them
df[cols] = vals
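As a rough end-to-end sketch of the cast-then-format idea: the frame below is a made-up stand-in for the read_sql result (its column names and values are assumptions, not the asker's data), and the formatted output is written into an object-dtype copy so that strings and numbers can share a column.

```python
import pandas as pd

# Hypothetical stand-in for the read_sql result: every value is a string.
df = pd.DataFrame({
    'key': ['k1', 'k2'],
    'label': ['growth', 'count'],
    '2015': ['0.1234', '1500'],
    '2016': ['0.05', '2750.5'],
})

# Cast the value columns (everything after the first two) to float.
value_cols = df.columns[2:]
numeric = df[value_cols].astype(float)

# Write formatted strings into an object-dtype copy of the frame.
formatted = df.astype(object)
pct = (numeric.iloc[0] * 100).map('{:,.2f}%'.format)
formatted.iloc[0, 2:] = pct.tolist()

print(formatted)
```

Keeping the numeric values in a separate frame means later rows can still be formatted from real floats rather than from already-formatted strings.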
Alternatively, you could put some logic in your mapper, e.g.
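def safe_float_formatter(value):
    # Format numbers as percentages; return anything else (e.g. strings) unchanged.
    try:
        return '{:,.2f}%'.format(value)
    except ValueError:
        return value

# Element-wise apply; returns a new frame (pandas 2.1+ calls this DataFrame.map).
df.applymap(safe_float_formatter)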
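Note that a formatter like the one above leaves unicode strings untouched, so numeric strings coming from the database would pass through unformatted. One possible variant, sketched here under the assumption that the values should be multiplied by 100 as in the question, tries to convert each value to float first. The name percent_or_passthrough is hypothetical.

```python
def percent_or_passthrough(value):
    """Return value as a percentage string if it parses as a number, else unchanged."""
    try:
        # float() accepts numbers and numeric strings such as u'0.5'.
        return '{:,.2f}%'.format(float(value) * 100)
    except (TypeError, ValueError):
        # Non-numeric strings raise ValueError; None raises TypeError.
        return value


print(percent_or_passthrough(u'0.5'))
print(percent_or_passthrough('n/a'))
```

It could then be used per row in the same way as the original mapper, e.g. df.iloc[6, 2:].map(percent_or_passthrough).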