Data-frame manipulation in Python

I have a CSV file with two columns, a and b, as below:

I want to read and save data in a new csv file as below:

I have tried this code:

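import pandas as pd

data = pd.read_csv('./dataset/test4.csv')
chunks = []  # renamed from `list` to avoid shadowing the built-in
i = 0
while i < 6:
    # each slice is a Series named 'a' that keeps its original index labels
    chunks.append(data['a'].iloc[i:i+3])
    i += 3
df = pd.DataFrame(chunks)
print(df)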

which produces this output:

 0 1 2 3 4 5
a 601.0 602.0 603.0 NaN NaN NaN
a NaN NaN NaN 604.0 605.0 606.0

First, I need to save the list in a DataFrame with the following result:

 0 1 2 3 4 5
 601.0 602.0 603.0 604.0 605.0 606.0

and then save it to a CSV file. However, I'm stuck on the first part.

Thanks for your help.

edited Oct 30, 2017 at 20:57

cs95

asked Oct 30, 2017 at 20:57

Elham

So, every 3 elements constitute a new group?
– cs95, commented Oct 30, 2017 at 20:58

Yes, every 3 elements constitute a new group.
– Elham, commented Oct 30, 2017 at 20:59

3 Answers

Assuming every 3 items in a constitute a group in b, just do a little integer division on the index.

data['b'] = (data.index // 3 + 1)
data
 a b
0 601 1
1 602 1
2 603 1
3 604 2
4 605 2
5 606 2

Saving to CSV is straightforward - all you have to do is call df.to_csv(...).
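
As a minimal sketch of the two steps above: the sample values are taken from the question, while the in-memory buffer (used here in place of a real output path) and the index=False argument are assumptions made so the example runs on its own.

```python
import io

import pandas as pd

# Sample data from the question; the real data would come from pd.read_csv.
data = pd.DataFrame({"a": [601, 602, 603, 604, 605, 606]})

# Label every block of 3 consecutive rows: 1, 1, 1, 2, 2, 2.
data["b"] = data.index // 3 + 1

# Assumption: write to an in-memory buffer so nothing touches the disk.
# In practice a file path would be passed to to_csv instead.
# index=False leaves the row labels out of the output.
buffer = io.StringIO()
data.to_csv(buffer, index=False)

csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```

Whether to keep the row labels in the file is a judgment call; leaving index=False out would write them as an extra first column.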

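Integer division on the index works only when the index is the default consecutive integer range starting at 0 (0, 1, 2, ...); a monotonically increasing index with gaps would produce the wrong groups. Otherwise, you can use np.arange (on MaxU's recommendation), which depends only on row position: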

data['b'] = np.arange(len(data)) // 3 + 1
data
 a b
0 601 1
1 602 1
2 603 1
3 604 2
4 605 2
5 606 2
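
To illustrate why the positional version is more robust, here is a small sketch with a string index. The index labels are hypothetical and chosen only for illustration; data.index // 3 would not be usable on such an index.

```python
import numpy as np
import pandas as pd

# Hypothetical string index, to show that the labels play no role.
data = pd.DataFrame(
    {"a": [601, 602, 603, 604, 605, 606]},
    index=["u", "v", "w", "x", "y", "z"],
)

# np.arange(len(data)) is 0..5 regardless of what the index contains,
# so the grouping depends only on row position.
data["b"] = np.arange(len(data)) // 3 + 1
print(data["b"].tolist())  # [1, 1, 1, 2, 2, 2]
```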

edited Oct 30, 2017 at 21:13

answered Oct 30, 2017 at 20:59

cs95

2 Comments

I'd use np.arange(len(data))//3 + 1 instead of (data.index // 3 + 1) - as it will also work for string/datetime/etc. indexes...
– MaxU - stand with Ukraine, Oct 30, 2017 at 21:10

@MaxU Thank you, I will incorporate that into my answer.
– cs95, Oct 30, 2017 at 21:11

Using your output:

df.stack().unstack()
Out[115]: 
 0 1 2 3 4 5
a 601.0 602.0 603.0 604.0 605.0 606.0

Data Input

df
 0 1 2 3 4 5
a 601.0 602.0 603.0 NaN NaN NaN
a NaN NaN NaN 604.0 605.0 606.0
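
The stack().unstack() round trip relies on stack() discarding the NaN cells, and that behaviour may depend on the pandas version. As a hedged alternative that is not part of the original answer, the sketch below rebuilds the question's frame from its sample values and collapses the two rows with groupby(level=0).first(), which takes the first non-null value in each column for every index label.

```python
import pandas as pd

# Rebuild the frame from the question: two Series slices named 'a'.
data = pd.DataFrame({"a": [601, 602, 603, 604, 605, 606]})
chunks = [data["a"].iloc[i:i + 3] for i in range(0, 6, 3)]

# Each slice keeps its original index labels, so the rows do not overlap
# and the missing positions are filled with NaN.
df = pd.DataFrame(chunks)

# Swapped-in technique: group the rows by their shared label 'a' and keep
# the first non-null value found in each column.
collapsed = df.groupby(level=0).first()
print(collapsed)
```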

answered Oct 30, 2017 at 21:02

BENY

In [45]: df[['a']].T
Out[45]:
 0 1 2 3 4 5
a 601 602 603 604 605 606

In [39]: df.set_index('b').T.rename_axis(None, axis=1)
Out[39]:
 1 2 3 4 5 6
a 601 602 603 604 605 606
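
For a self-contained version of the first snippet, the sketch below builds the frame from the question's sample values, transposes column a into a single row, and renders it as CSV text. Returning the CSV as a string and dropping the row label with index=False are assumptions made here; passing a file path to to_csv would write a file instead.

```python
import pandas as pd

# Sample data from the question; only column 'a' is needed here.
data = pd.DataFrame({"a": [601, 602, 603, 604, 605, 606]})

# Double brackets keep a one-column DataFrame; .T turns it into one row.
row = data[["a"]].T
print(row)

# With no path given, to_csv returns the CSV content as a string.
# Assumption: the row label 'a' is not wanted in the output.
csv_lines = row.to_csv(index=False).splitlines()
print(csv_lines)
```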

answered Oct 30, 2017 at 21:05

MaxU - stand with Ukraine
