
Say I have a Python class with a pandas DataFrame df as an attribute. I want to query df by running one or more predefined queries, using a method that takes one or more query handles as arguments:

import pandas as pd
import numpy as np
class doorn:
    def __init__(self):
        self.name = 'foo'
        self.df = pd.DataFrame(data={'A':np.arange(0, 10), 'B':np.arange(5, 15), 'C':np.arange(14, 24)}, index=[x for x in range(0, 10)])
    def query_df(self, *query):
        # query arguments must be formatted as 'q1', 'q2' etc
        queries = [q for q in query]
        q1 = self.df.loc[self.df.A > 2].index
        q2 = self.df.loc[self.df.B < 13].index
        q3 = self.df.loc[self.df.C > 15].index
        sel_rows = set().union(*[eval(x, globals(), locals()) for x in queries])
        self.df = self.df.loc[sel_rows]

Now, it seems that eval cannot find the names that the query strings refer to:

>>> foo = doorn()
>>> foo.query_df('q1', 'q2')
Traceback (most recent call last):
 File "<input>", line 1, in <module>
 File "<input>", line 17, in query_df
 File "<input>", line 17, in <listcomp>
 File "<string>", line 1, in <module>
NameError: name 'q1' is not defined

My guess is that q1, q2 and q3 are not present in the list comprehension's namespace, or something along those lines; I haven't really wrapped my head around namespaces yet. I've tried solving this by passing globals() and locals() as additional arguments to eval, as suggested in the docs, but without success.

How can I solve this? Can I even refrain from using eval altogether?

asked Apr 29, 2020 at 8:28
  • Using eval(queries[0]) works. Also, replacing the list comprehension with a for loop and appending to a list works as well. This could be related: stackoverflow.com/questions/50624105/… Commented Apr 29, 2020 at 8:47
  • Yes, it works with a for loop, thanks for suggesting. Commented Apr 29, 2020 at 9:11
  • No problems. I'm not entirely sure why it works though, so it would be nice if someone adds a good answer :) Commented Apr 29, 2020 at 9:13

1 Answer

I think this is because the locals() in your list comprehension are not the same as the ones in your function. In Python 3 a comprehension gets its own scope, so (at least before Python 3.12) calling locals() inside it does not return the function's variables and 'q1' is missing. You could use global variables, but I would not recommend it. Moreover, using eval on something that may come from user input is hazardous, as it can execute malicious code.
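To illustrate the scoping point on its own, here is a minimal sketch without pandas. The function name `lookup` and the variables `a` and `b` are made up for this illustration. It takes a snapshot of the function's namespace once, outside the comprehension, and hands that dictionary to eval explicitly, so the lookup no longer depends on what locals() returns inside the comprehension. It only demonstrates the namespace issue; the caveat about eval and untrusted input still applies.

```python
def lookup(*names):
    a = 1
    b = 2
    # Snapshot of this function's local names, taken OUTSIDE the comprehension.
    namespace = locals()
    # `namespace` is referenced by name inside the comprehension, so it is
    # visible there like any other variable of the enclosing function.
    return [eval(name, {}, namespace) for name in names]


print(lookup("a", "b"))  # prints [1, 2]
```

This would also explain why the plain for loop mentioned in the comments works: a for loop does not introduce a new scope, so locals() there is still the function's namespace.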

I suggest storing your predefined queries in a dictionary, as in this example:

import numpy as np
import pandas as pd

class doorn:
    def __init__(self):
        self.name = 'foo'
        self.df = pd.DataFrame(data={'A':np.arange(0, 10), 'B':np.arange(5, 15), 'C':np.arange(14, 24)}, index=[x for x in range(0, 10)])
    def query_df(self, *query):
        # query arguments must be formatted as 'q1', 'q2' etc
        queries = [q for q in query]
        # map each query handle to the index of the rows it selects
        possible_queries = {'q1' : self.df.loc[self.df.A > 2].index,
                            'q2' : self.df.loc[self.df.B < 13].index,
                            'q3' : self.df.loc[self.df.C > 15].index}
        # union of the selected row labels; converted to a sorted list
        # because recent pandas versions reject a set as an indexer
        sel_rows = sorted(set().union(*[possible_queries[x] for x in queries]))
        self.df = self.df.loc[sel_rows]
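A possible variation on the same idea, sketched here under my own assumptions: instead of storing precomputed indexes, the dictionary can hold small functions that build a boolean mask, so only the requested queries are evaluated and an unknown handle gives a clear error. The names `QUERIES` and `select_rows` are hypothetical, and this version returns the selection rather than overwriting `self.df`.

```python
import numpy as np
import pandas as pd

# Hypothetical registry: each handle maps to a function that builds a boolean mask.
QUERIES = {
    "q1": lambda df: df["A"] > 2,
    "q2": lambda df: df["B"] < 13,
    "q3": lambda df: df["C"] > 15,
}


def select_rows(df, *handles):
    """Return the rows of df that match at least one of the named queries."""
    unknown = [h for h in handles if h not in QUERIES]
    if unknown:
        raise KeyError(f"unknown query handle(s): {unknown}")
    # Start from an all-False mask and OR in each requested query.
    mask = pd.Series(False, index=df.index)
    for handle in handles:
        mask |= QUERIES[handle](df)
    return df.loc[mask]


df = pd.DataFrame({"A": np.arange(0, 10), "B": np.arange(5, 15), "C": np.arange(14, 24)})
result = select_rows(df, "q1", "q3")
print(result.index.tolist())  # prints [2, 3, 4, 5, 6, 7, 8, 9]
```

Combining boolean masks with `|` gives the same union of rows as the set-based version and keeps the original row order without an extra sort.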

Hope this will help you.

Shaido
answered Apr 29, 2020 at 9:15

1 Comment

Using a dictionary is a definite improvement while avoiding the eval issue.
