Converting CSV data to string

Asked 8 years, 2 months ago

Viewed 3k times

I have a CSV file with 14 columns, and I'm trying to convert the data to string type. I tried this:

import pandas as pd
df = pd.read_csv("2008_data_test.csv", sep=",")
output = pd.DataFrame(columns=df.columns)
for c in df.columns:
    if df[c].dtype == object:
        print("convert", df[c].name, "to string")
        df[c] = df[c].astype(str)
output.to_csv("2008_data_test.csv_tostring2.csv", index=False)

This gives me only the headers, and I can't figure out what I missed.

Any ideas? And is it possible to convert specific columns?

edited Oct 17, 2017 at 21:35 by coldspeed95

asked Oct 17, 2017 at 19:02 by Marima

output = pd.DataFrame(columns=df.columns) is defined before the for loop. That's what you write to CSV; the for loop does basically nothing. Did you mean df.to_csv("2008_data_test.csv_tostring2.csv", index=False) instead?

roganjosh, commented Oct 17, 2017 at 19:06

@roganjosh Oh right! Stupid mistake. Thanks, it worked!

Marima, commented Oct 17, 2017 at 19:25

1 Answer

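You're modifying one DataFrame (df) but writing another (output), which is why the file contains only the headers. Write df instead, and use select_dtypes to pick the object columns rather than looping over them.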

c = df.select_dtypes(include=[object]).columns
df[c] = df[c].astype(str)
df.to_csv("2008_data_test.csv_tostring2.csv", index=False)
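To see the fix end to end without touching the file system, here is a minimal sketch. The column names and values are made up for illustration, the object dtype is set explicitly so the result does not depend on how a given pandas version infers string columns, and the CSV is written to an in-memory buffer rather than to the file from the question.

```python
import io

import pandas as pd

# Hypothetical data standing in for the CSV file; dtype=object is explicit
# so the example does not rely on pandas' string-inference defaults.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "code": pd.Series([1, "two", 3], dtype=object),
    "city": pd.Series(["Oslo", "Rome", "Lima"], dtype=object),
})

# Pick the object columns and convert them in one step.
obj_cols = df.select_dtypes(include=[object]).columns
df[obj_cols] = df[obj_cols].astype(str)

# Write the DataFrame that was actually modified (df, not an empty copy).
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_lines = buffer.getvalue().splitlines()
```

Because df itself is written, csv_lines holds the header plus one line per row instead of the header alone.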

As MaxU suggested, it might be simpler to filter by dtypes in this manner:

c = df.columns[df.dtypes.eq('object')]

The former (select_dtypes) builds a DataFrame sub-slice before its columns are accessed, whereas this version only inspects df.dtypes, so it should be cheaper.
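As a quick sanity check, the two selectors can be compared on a small frame. This is only a sketch with made-up columns; the object dtype is again set explicitly as an assumption, since newer pandas versions may infer a dedicated string dtype for text columns.

```python
import pandas as pd

# Hypothetical frame with one explicitly object-typed column.
df = pd.DataFrame({
    "id": [1, 2],
    "price": [9.5, 3.0],
    "label": pd.Series(["a", "b"], dtype=object),
})

# Route 1: build a sub-DataFrame, then take its columns.
via_select = df.select_dtypes(include=[object]).columns

# Route 2: boolean mask over df.dtypes, no intermediate DataFrame.
via_mask = df.columns[df.dtypes.eq("object")]

same = via_select.equals(via_mask)
```

Both routes should select the same column labels here; the difference is only in how much work is done to get them.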

If you want to convert specific columns only, you can remove columns as needed from c before the conversion using c = c.difference(['Col1', 'Col2', ...]).
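A minimal sketch of that exclusion step, using placeholder column names (Col1, Col2, Col3) and invented values. Note that Index.difference returns its result sorted, which does not matter for the assignment below.

```python
import pandas as pd

# Hypothetical frame where every column has object dtype.
df = pd.DataFrame({
    "Col1": pd.Series([10, "x"], dtype=object),
    "Col2": pd.Series(["u", "v"], dtype=object),
    "Col3": pd.Series([1, "y"], dtype=object),
})

c = df.columns[df.dtypes.eq("object")]

# Leave Col1 out of the conversion.
c = c.difference(["Col1"])
df[c] = df[c].astype(str)
```

After this, Col3 holds strings only, while Col1 still contains the original integer 10 alongside "x". If you already know exactly which columns you want, assigning c a plain list of names should work just as well.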

edited Oct 17, 2017 at 21:03

answered Oct 17, 2017 at 19:09 by coldspeed95

3 Comments

MaxU - stand with Ukraine, Oct 17, 2017 at 20:49

I'd use df.columns[df.dtypes.eq('object')] instead of df.select_dtypes(include=[object]).columns, as the latter (df.select_dtypes(include=[object])) first generates a DataFrame and then returns its columns...

coldspeed95, Oct 17, 2017 at 20:58

@MaxU that's great, I didn't know about that. I'll edit my answer, or will you write an answer? :)

MaxU - stand with Ukraine, Oct 17, 2017 at 20:58

@cᴏʟᴅsᴘᴇᴇᴅ, please feel free to edit your answer ... ;-)
