Could you please help me understand what's wrong with the script below and how to correct it? I am just trying to add a column by iterating over the file. The new column should say 'F' if the percentage of females is higher than the percentage of males. Thank you very much!
babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')
gender=[]
for idx in range(len(babies_df)):
    if babies_df['perc_female'>'perc_male']:
        gender.append('F')
    else:
        gender.append('M')
babies_df['gender'] = gender
Michele Giglioni
asked May 26, 2020 at 11:17
2 Answers
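The problem is the expression babies_df['perc_female'>'perc_male'].
It is valid Python, but the comparison inside the brackets is between the two strings 'perc_female' and 'perc_male'. That evaluates to False, so pandas looks for a column named False and raises a KeyError. The values in the two columns are never compared.
You could use pandas apply with axis=1 instead, which evaluates the condition once per row.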
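import pandas as pd

babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')
# Compare the two percentages row by row (axis=1 passes each row to the lambda)
babies_df['gender'] = babies_df.apply(
    lambda x: 'F' if x['perc_female'] > x['perc_male'] else 'M',
    axis=1
)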
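To see why the original condition fails, it may help to evaluate the pieces separately. The sketch below uses a small made-up DataFrame in place of babynames_nysiis.csv (the values are hypothetical, only the column names come from the question) and shows that the string comparison is resolved before pandas is involved.

```python
import pandas as pd

# Hypothetical sample data standing in for babynames_nysiis.csv
babies_df = pd.DataFrame({
    "perc_female": [62.5, 10.0, 50.0],
    "perc_male": [37.5, 90.0, 50.0],
})

# Python evaluates the string comparison first. Strings compare
# lexicographically, and 'f' sorts before 'm', so the result is False.
key = 'perc_female' > 'perc_male'

# The lookup is therefore babies_df[False], which asks for a column
# labelled False. No such column exists, so pandas raises a KeyError.
try:
    babies_df[key]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(key, lookup_failed)
```

Moving the comparison outside the brackets, so that each side is its own column lookup, is what makes pandas compare the actual values.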
answered May 26, 2020 at 11:51
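The problem with your code is that the loop never looks at the individual rows: idx is not used,
and the condition compares the two column names as strings instead of the values in each row. With iterrows you get each row in turn and can compare its two values: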
import pandas as pd

babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')
gender = []  # collects one label per row
for index, row in babies_df.iterrows():
    if row["perc_female"] > row["perc_male"]:
        gender.append("F")
    else:
        gender.append("M")
babies_df["gender"] = gender
Comments
In the if statement you are mistakenly doing the comparison inside the [ ], which is wrong. You should close the first column lookup before comparing, as in babies_df['perc_female'] > babies_df['perc_male']. If the error still persists, please add the error message as well.
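Note that the column-wise comparison produces a boolean Series, one value per row, so it cannot be used directly in a plain if statement. One way to turn it into the new column without a loop is numpy.where, which is not used anywhere in this thread and is offered here only as a possible alternative. The sample data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for babynames_nysiis.csv
babies_df = pd.DataFrame({
    "perc_female": [62.5, 10.0, 50.0],
    "perc_male": [37.5, 90.0, 50.0],
})

# Element-wise comparison of the two columns: a boolean Series, one entry per row
is_female = babies_df["perc_female"] > babies_df["perc_male"]

# Pick 'F' where the mask is True and 'M' elsewhere.
# Ties fall to 'M', matching the else branch in the question.
babies_df["gender"] = np.where(is_female, "F", "M")

print(babies_df)
```

On large files this usually runs much faster than apply or iterrows, because the comparison is done on whole columns at once.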