-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[JavaScript] Functions stored in nested properties or Map not detected using DataFlow #18746
-
I'm having some trouble with getting CodeQL to detect a common pattern in a framework I am trying to model, where a global Map variable is accessed by functions. In my testing, this doesn't appear to be tracked by DataFlow, while it does work in a local scope (also proven by this existing test).
In the following source code example, source() is marked as user input and sink() represents a dangerous function:
// Global scope const variable = source() const obj = { key: variable } const obj2 = {} obj2.key = variable const map = new Map() map.set("key", variable) function accessGlobals() { sink(variable) // direct sink(obj.key) // indirect prop sink(obj2.key) // indirect prop 2 (NOT DETECTED) sink(map.get("key")) // indirect map (NOT DETECTED) } function accessLocals() { // Local scope const variable = source() const obj = { key: variable } const obj2 = {} obj2.key = variable const map = new Map() map.set("key", variable) sink(variable) // direct sink(obj.key) // indirect prop sink(obj2.key) // indirect prop 2 sink(map.get("key")) // indirect map }
I'm running the following query to track sources to sinks:
import javascript DataFlow::CallNode getACall(string name) { result.getCalleeName() = name or result.getCalleeNode().getALocalSource() = DataFlow::globalVarRef(name) } module TestingConfig implements DataFlow::ConfigSig { predicate isSource(DataFlow::Node node) { node = getACall("source") } predicate isSink(DataFlow::Node node) { node = getACall("sink").getAnArgument() } } module TestingFlow = DataFlow::Global<TestingConfig>; from DataFlow::Node source, DataFlow::Node sink where TestingFlow::flow(source, sink) select source, sink
The global obj2.key and map.get("key") aren't detected, while they are in the local scope. Even the obj = { key: ... } syntax works in a global variable. It seems like changes to a variable in a global scope aren't tracked?
Is there any easy fix I could apply to allow it to track my example code correctly, or is it a deeper underlying issue of DataFlow?
Beta Was this translation helpful? Give feedback.
All reactions
Replies: 2 comments 2 replies
-
In your original use-case, are they global variables or merely top-level variables in a module? Global variables are accessible outside the file they were declared in, whereas top-level variables are scoped to their file and are therefore tracked more precisely.
In the example code you've provided, the analysis will interpret the top-level declarations as global variables and track them less precisely. But you were to add an import or require call in the file, the analysis would interpret this as a module (as opposed to a global script) and the variables would instead be seen as top-levels and tracked precisely.
I'm saying this mainly because, in the process of crafting a minimal reproduction example, it often happens that the example file stops looking like a module and is instead interpreted as a global script. And then we get a report about global variables even though the user's original problem had nothing to do with global variables. So I was just wondering if you could double check if they are global variables or not. You can use the class GlobalVarDecl to find declarations that are interpreted as global variables.
Beta Was this translation helpful? Give feedback.
All reactions
-
Great answer! This was exactly the problem, just adding a module makes the above example work. It was an issue in minimizing my example too much.
After some more debugging, the real problem seems to lie with storing functions and nested properties. The library has a Client object on which a Map is created as a property. Handlers are saved as values in this map, and the main event loop will then look up the correct handler and pass it some user input.
With this handler function pattern, it seems like CodeQL also detects less features. Saving a handler in an object variable's property works, but saving it in a Map doesn't. In nested objects, this also doesn't work. I've updated the example below to showcase this behaviour:
// Make it detected as a module const { Client } = require('library'); function handler(input) { sink(input); } function test1() { const a = handler; const obj = {}; obj.b = handler; const map = new Map(); map.set("c", handler); // These are detected: handler(source()); a(source()); obj.b(source()); // This one is not: map.get("c")(source()); } function test2() { const client = new Client(); client.a = handler; client.obj = {}; client.obj.b = handler; client.map = new Map(); client.map.set("c", handler); // These are detected: handler(source()); client.a(source()); // These are not: client.obj.b(source()); client.map.get("c")(source()); }
The query is still the same as in my original post, it should find all ways that source() leads to sink() inside the handler() function. I feel like this must simply be a limitation of CodeQL's analysis, but I'm unsure because my previous example also had a mistake, and I'm curious if there is anything simple I can do to have it detect client.map.get("c")(source());.
Beta Was this translation helpful? Give feedback.
All reactions
-
I can confirm that we currently can't resolve functions through these levels of indirection.
Beta Was this translation helpful? Give feedback.
All reactions
-
Thanks for confirming, do you know if there is any way I could still "glue" the missing connections together? I don't really care if it's not a generic solution, I'm fine with having to hardcore a few patterns
Beta Was this translation helpful? Give feedback.