\$\begingroup\$

I have a dictionary that stores a single value for each key, for example:

import pandas as pd

dd = {'Alice': 40,
      'Bob': 50,
      'Charlie': 35}

Now I want to convert this dictionary to a pd.DataFrame with two named columns (say "Name" and "Age"): the first column holds the dictionary's keys, the second its values. I expected a call like this to work:

 pd.DataFrame(dd, columns=['Name', 'Age']) 

which does not give the desired output: the resulting frame has 0 rows.
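
For reference, here is a minimal sketch of why that call comes back empty, assuming a reasonably recent pandas. When the constructor is given a dict, it treats the keys as column labels, so `columns=` ends up selecting two labels ('Name' and 'Age') that do not exist in the dict. The variable name `empty` is only for illustration.

```python
import pandas as pd

dd = {'Alice': 40,
      'Bob': 50,
      'Charlie': 35}

# The dict keys ('Alice', 'Bob', 'Charlie') are interpreted as column labels.
# None of them match the requested columns, so no data is selected.
empty = pd.DataFrame(dd, columns=['Name', 'Age'])

print(list(empty.columns))  # the requested labels are kept as column names
print(len(empty))           # but there are no rows
```

The two requested column names are present, but nothing from the dict ends up in the frame.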

Currently I have two "solutions":

# Solution 1: use the keys as the index, then rename the index and reset it:
pd.DataFrame.from_dict(dd, orient='index', columns=['Age']).rename_axis('Name').reset_index()
# Solution 2: build the frame from the (key, value) pairs:
pd.DataFrame(list(dd.items()), columns=['Name', 'Age'])
# Both result in the desired output:
      Name  Age
0    Alice   40
1      Bob   50
2  Charlie   35

However, both seem a bit hacky to me, and therefore potentially inefficient and error-prone. Is there a more Pythonic way to achieve this?

AlexV
asked Jan 31, 2020 at 14:38
\$\endgroup\$
  • \$\begingroup\$ There's nothing wrong/hacky in using pd.DataFrame(dd.items(), columns=['Name', 'Age']) to get the needed result in your case \$\endgroup\$ Commented Jan 31, 2020 at 15:13
  • \$\begingroup\$ @RomanPerekhrest, I didn't realize that `list()` can be removed. Without it, this looks fine to me. Do you want to post it as an answer, so I can accept it? \$\endgroup\$ Commented Jan 31, 2020 at 15:31
  • \$\begingroup\$ Honestly, it's too simple to be a significant answer. \$\endgroup\$ Commented Jan 31, 2020 at 15:32

1 Answer

\$\begingroup\$

The advantage of your call to from_dict is that the method name makes the conversion a little more obvious (though the index manipulation that follows obscures it again). Don't call rename_axis(); instead, pass the names parameter to reset_index() (available since pandas 1.5).
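
As a rough sketch of that suggestion, the snippet below compares the original `rename_axis(...).reset_index()` chain with a single `reset_index(names=...)` call. It assumes pandas 1.5 or newer, where `reset_index` accepts `names`; the variable names `via_rename` and `via_names` are only illustrative.

```python
import pandas as pd

dd = {'Alice': 40,
      'Bob': 50,
      'Charlie': 35}

# Original approach: name the index first, then move it into a column.
via_rename = (
    pd.DataFrame.from_dict(dd, orient='index', columns=['Age'])
    .rename_axis('Name')
    .reset_index()
)

# Suggested approach: let reset_index name the new column directly.
via_names = (
    pd.DataFrame.from_dict(dd, orient='index', columns=['Age'])
    .reset_index(names='Name')
)

print(via_rename.equals(via_names))
```

Both chains should give the same frame; the second just saves one step.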

Your call to dd.items() is probably the best approach in terms of simplicity; just drop the call to list.
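
A small check that the `list` wrapper is indeed optional, again only as a sketch with illustrative variable names. The constructor accepts the `dict_items` view directly and treats each (key, value) pair as one row.

```python
import pandas as pd

dd = {'Alice': 40,
      'Bob': 50,
      'Charlie': 35}

# With the explicit list() call, as in the question.
with_list = pd.DataFrame(list(dd.items()), columns=['Name', 'Age'])

# Passing the dict_items view directly; each pair becomes one row.
without_list = pd.DataFrame(dd.items(), columns=['Name', 'Age'])

print(with_list.equals(without_list))
```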

I show two other options below: the third makes it even more obvious what's going on by passing the keys and the values in as separate columns, and the fourth is a repaired variant of the call from your "I expect to have a function call like".

import typing

import pandas as pd


def method_a(dd: dict[str, typing.Any], columns: typing.Sequence[str]) -> pd.DataFrame:
    # Keys become the index; reset_index(names=...) turns it into the first column.
    return pd.DataFrame.from_dict(
        data=dd, orient='index', columns=columns[1:],
    ).reset_index(names=columns[0])


def method_b(dd: dict[str, typing.Any], columns: typing.Sequence[str]) -> pd.DataFrame:
    # Each (key, value) pair becomes one row.
    return pd.DataFrame(data=dd.items(), columns=columns)


def method_c(dd: dict[str, typing.Any], columns: typing.Sequence[str]) -> pd.DataFrame:
    # Pass the keys and the values as two separately named columns.
    kcol, vcol = columns
    return pd.DataFrame({kcol: dd.keys(), vcol: dd.values()})


def method_d(dd: dict[str, typing.Any], columns: typing.Sequence[str]) -> pd.DataFrame:
    # Build a one-row frame with one column per key, then transpose it.
    df = pd.DataFrame(dd, index=columns[1:])
    return df.T.reset_index(names=columns[0])


def test() -> None:
    dd = {'Alice': 40,
          'Bob': 50,
          'Charlie': 35}
    ref = method_a(dd=dd, columns=('Name', 'Age'))
    for method in (method_b, method_c, method_d):
        result = method(dd=dd, columns=('Name', 'Age'))
        assert ref.equals(result)


if __name__ == '__main__':
    test()

answered Dec 22, 2024 at 21:48
\$\endgroup\$
