I am trying to build a dataframe containing columns which are conditional. My code:

from faker import Faker
import pandas as pd
import random

fake = Faker()

def create_rows_faker(num=1):
    # One dict per row; each key becomes a dataframe column.
    output = [{"name": fake.name(),
               "address": fake.address(),
               "email": fake.email()} for x in range(num)]
    return output

Calling it like this produces a dataframe with name, address, and email columns:

df = pd.DataFrame(create_rows_faker(3))
df

How can I change the definition of output so that a column is included only when a corresponding variable is set, for example only when name_column == '1', and similarly for address and email?

asked May 4, 2022 at 12:39
  • What would be wrong with removing "name": fake.name(), from your current code and instead adding if name_column == '1': for x in range(num): output[x]["name"] = fake.name()? Commented May 4, 2022 at 12:44

2 Answers

2

Use a standard for loop instead of overcomplicating the comprehension.

def create_rows_faker(num=1, name_col=True, address_col=True, email_col=False):
    output = []
    for x in range(num):
        out = {}
        if name_col:
            out["name"] = fake.name()
        if address_col:
            out["address"] = fake.address()
        if email_col:
            out["email"] = fake.email()
        output.append(out)
    return output
answered May 4, 2022 at 12:45

1 Comment

I tried this initially but I didn't like the amount of ifs inside the loop.
2

Here is an option using a dictionary of functions and a collection of the keys to use:

def create_rows_faker(num=1, use=('name', 'address', 'email')):
    options = {"name": fake.name,
               "address": fake.address,
               "email": fake.email}
    use = set(use)
    options = {k: f for k, f in options.items() if k in use}
    output = [{k: f() for k, f in options.items()} for x in range(num)]
    return output

pd.DataFrame(create_rows_faker(3, use=['name']))

output:

                 name
0  Tracy Alexander MD
1        Mark Winters
2        Lori Edwards
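
If the switches arrive as separate string flags such as name_column == '1', as in the question, they could be collapsed into a use tuple first. This is only a minimal sketch: the helper columns_from_flags and the flag names address_column and email_column are hypothetical, chosen to mirror name_column from the question.

```python
def columns_from_flags(**flags):
    """Return the column names whose flag equals the string '1'.

    Keyword names are assumed to look like '<column>_column'
    (a hypothetical naming scheme modelled on name_column).
    """
    suffix = "_column"
    return tuple(
        key[: -len(suffix)]
        for key, value in flags.items()
        if key.endswith(suffix) and value == "1"
    )


# Keyword arguments keep their order, so the columns come out in call order.
use = columns_from_flags(name_column="1", address_column="0", email_column="1")
print(use)  # ('name', 'email')
```

The resulting tuple can then be passed along as create_rows_faker(3, use=use).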
answered May 4, 2022 at 12:49

4 Comments

List as a default parameter is a shortcut to serious bugs later on.
@matszwecja like what? The list is not mutated here
I made it a tuple, but this doesn't change a thing (I agree this would if I was mutating use in the function)
I just think it's something that should be avoided wherever possible. You never know how the code might change in the future and such things might be a nightmare to debug later on.
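
For context on the disagreement above, here is a small illustration of the pitfall being discussed. The functions append_bad and append_good are made up for this example and are not part of either answer. A list default only causes trouble when the function mutates it, which create_rows_faker does not do.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # A None sentinel gives each call its own fresh list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad("name")
second = append_bad("email")
print(second)           # ['name', 'email']
print(first is second)  # True

print(append_good("name"), append_good("email"))  # ['name'] ['email']
```

A tuple default, as in the edited answer, sidesteps the question because tuples cannot be mutated in place.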
