Hello,
I'm trying to write a CodeQL query to analyze function calls in Python code, specifically the json() method from the requests library. Here's a simplified example of the code I want to analyze:
def fetch_data(self, url):
    response = requests.get(url)
    return response.json()
I downloaded the requests library locally, and I'm using the following queries to retrieve function and attribute information:
First query:
import python
from Call c, FunctionValue funcVal, Attribute attr
where
c.getFunc().pointsTo(funcVal) and
c.getFunc() = attr
select
c.getScope(),
c.getScope().getLocation(),
attr.getName(),
funcVal.getScope().getLocation()
Second query:
import python
from Attribute t, FunctionValue f
where
t.pointsTo(f)
select t.getName(), f.getName()
I am able to retrieve most function calls, but for some reason, the call to response.json() is not being detected.
Any guidance would be greatly appreciated!
Thank you.
-
@github/codeql-python Can you help here, please?
-
Hi @Faycal572 👋🏼 Apologies for the delayed response here. I have a few follow-up questions for you:
- What version of CodeQL are you using?
- Would you be able to share the results you are seeing for these queries?
- Could you say more about your use of pointsTo in the where clauses? If the intent is to find function calls and attributes, it's not clear to me why your where clause is set up the way it is.
Running:
import python
from Attribute t, FunctionValue f
select t.getName(), f.getName(), f.getScope().getLocation() as functionLocation
Finds both get and json for me, using CodeQL 2.19.2.
-
To expand on the answer from Sid: please don't use points-to, since it's deprecated. Instead, use API graphs, which are described in more detail here: https://codeql.github.com/docs/codeql-language-guides/using-api-graphs-in-python/
So for this example, you could do something like
private import python
private import semmle.python.dataflow.new.DataFlow
private import semmle.python.ApiGraphs
select API::moduleImport("requests").getMember(["get", "post", ...]).getReturn().getMember("json").getACall()
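If it helps to see where each matched call sits, the same API graph expression can be bound to a variable and selected together with its location and enclosing scope. This is only a sketch: it assumes that restricting the member list to "get" and "post" is enough for your code base, and the variable name jsonCall is just illustrative.

```ql
import python
import semmle.python.ApiGraphs

// Sketch: calls to .json() on the value returned by requests.get / requests.post.
// The ["get", "post"] list is an assumption; extend it to the requests
// functions your code actually uses.
from API::CallNode jsonCall
where
  jsonCall =
    API::moduleImport("requests")
        .getMember(["get", "post"])
        .getReturn()
        .getMember("json")
        .getACall()
select jsonCall, jsonCall.getLocation(), jsonCall.getScope()
```

The last column is the function or method that contains the call, which for the example in the question should be fetch_data.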
-
Hi @sidshank @RasmusWL 👋🏼 Thank you for your detailed response!
I’m currently using CodeQL version 2.18.4. Your query works for me as well and captures both get and json. However, my main goal is to identify which specific json function is being called by pinpointing its exact location in the code, especially when there are multiple functions named json.
Additionally, I tried the following query:
private import python
private import semmle.python.dataflow.new.DataFlow
private import semmle.python.ApiGraphs
select API::moduleImport("requests").getMember(["get", "post"]).getReturn().getMember("json").getACall()
This query successfully returned that fetch_data calls json, but I haven't managed to generalize it to capture all functions at once.
My goal is to construct a representative graph of the code that includes all function and method calls.
Thank you again for your response and assistance!
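For readers with the same goal, one possible starting point for listing every call at once is a purely syntactic query like the sketch below. It assumes that the enclosing scope plus the written callee name is an acceptable first approximation of a call edge; it does not resolve which definition of json (or any other name) is actually invoked, and the variable names are illustrative only.

```ql
import python

// Sketch: one row per call expression, giving the enclosing function or
// module, the callee name as written in the source, and the call location.
// This is syntactic only: two unrelated methods named "json" are reported
// under the same name.
from Call call, string calleeName
where
  // Method-style calls such as response.json()
  calleeName = call.getFunc().(Attribute).getName()
  or
  // Plain calls such as fetch_data(url)
  calleeName = call.getFunc().(Name).getId()
select call.getScope(), calleeName, call.getLocation()
```

Calls whose callee is neither a plain name nor an attribute access (for example, calling the result of another call) are not reported by this sketch.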