
Could you please help me understand what's wrong with the script below and how to correct it? I am trying to add a column by iterating over the rows of the file. The new column should say 'F' if the percentage of females is higher than the percentage of males. Thank you very much!

babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')
gender = []
for idx in range(len(babies_df)):
    if babies_df['perc_female'>'perc_male']:
        gender.append('F')
    else:
        gender.append('M')
babies_df['gender'] = gender
asked May 26, 2020 at 11:17
  • I think you should use babies_df['perc_female'] > babies_df['perc_male']. Commented May 26, 2020 at 11:19
  • Thanks PSKP, I tried that already but it does not work :( Commented May 26, 2020 at 11:23
  • In the if statement you are mistakenly doing the comparison inside the [ ], which is wrong. You should index each column fully first. If the error still persists, please add the error as well. Commented May 26, 2020 at 11:25
  • Sure, thanks for helping, this is the code: babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';') gender=[] for idx in range(len(babies_df)): if babies_df['perc_female'] > babies_df['perc_male']: gender.append('F') else: gender.append('M') babies_df['gender'] = gender Commented May 26, 2020 at 11:33
  • And this is the error: ValueError Traceback (most recent call last) <ipython-input-195-cbb09ab25ce5> in <module>() 4 gender=[] 5 for idx in range(len(babies_df)): ----> 6 if babies_df['perc_female']>babies_df['perc_male']: 7 idx.append('F') 8 else: Commented May 26, 2020 at 11:35

2 Answers


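The problem is that babies_df['perc_female'>'perc_male'] does not compare the two columns. Python first evaluates 'perc_female'>'perc_male' as a comparison between two plain strings, and the resulting boolean is then used as the column key.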
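To make this concrete, here is a minimal sketch of what that expression does. The small DataFrame is a hypothetical stand-in for the CSV in the question, which is assumed to have perc_female and perc_male columns.

```python
import pandas as pd

# Hypothetical stand-in for 'datasets/babynames_nysiis.csv'.
babies_df = pd.DataFrame({
    "perc_female": [0.9, 0.2],
    "perc_male": [0.1, 0.8],
})

# The comparison inside the brackets is between two strings,
# so it produces a single boolean, not a column comparison.
key = 'perc_female' > 'perc_male'  # False, because 'f' sorts before 'm'

# Using that boolean as a column label fails, since no such column exists.
try:
    babies_df[key]
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

Each column has to be selected with its own pair of brackets before the two are compared.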

You could use pandas apply with axis=1, which calls a function once per row, so the comparison is made between two single values.


import pandas as pd

babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')
# axis=1 passes each row to the lambda, so x['perc_female'] is a single value
babies_df['gender'] = babies_df.apply(
    lambda x: 'F' if x['perc_female'] > x['perc_male'] else 'M',
    axis=1
)
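If you want to try this without the CSV, the following sketch runs the same apply call on a small in-memory DataFrame. The values are made up for illustration; only the column names are taken from the question.

```python
import pandas as pd

# Made-up rows standing in for the real file.
babies_df = pd.DataFrame({
    "perc_female": [0.98, 0.01, 0.60],
    "perc_male": [0.02, 0.99, 0.40],
})

# With axis=1 the lambda receives one row at a time.
babies_df["gender"] = babies_df.apply(
    lambda row: "F" if row["perc_female"] > row["perc_male"] else "M",
    axis=1,
)

print(babies_df)
```

Note that rows where both percentages are equal end up as 'M' with this condition.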
answered May 26, 2020 at 11:51


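The problem with your code is that the loop never picks out an individual row. babies_df['perc_female'] > babies_df['perc_male'] compares the whole columns at once and gives one boolean per row, and an if statement cannot reduce that to a single True or False, which is what the ValueError is about. Iterate row by row instead: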

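import pandas as pd

babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')
gender = []  # collects one label per row
for index, row in babies_df.iterrows():
    # row holds a single record, so this compares two scalar values
    if row["perc_female"] > row["perc_male"]:
        gender.append("F")
    else:
        gender.append("M")
babies_df["gender"] = gender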
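As an alternative that avoids the explicit loop, the column comparison can be kept as a whole and mapped to labels with numpy.where. This is a sketch on made-up data, not the original file; the name is_female is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the real file.
babies_df = pd.DataFrame({
    "perc_female": [0.98, 0.01, 0.60],
    "perc_male": [0.02, 0.99, 0.40],
})

# Comparing two columns gives a boolean Series, one value per row.
is_female = babies_df["perc_female"] > babies_df["perc_male"]

# A multi-row Series has no single truth value, which is why `if` rejects it.
try:
    bool(is_female)
    ambiguous = False
except ValueError:
    ambiguous = True

# np.where picks "F" where the mask is True and "M" elsewhere.
babies_df["gender"] = np.where(is_female, "F", "M")
```

On large files this is usually much faster than iterrows, since the comparison is done on whole columns at once.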
answered May 26, 2020 at 11:43
