-
Notifications
You must be signed in to change notification settings - Fork 1.9k
-
Can you help get me closer to a rule that points directly at the source of an error?
Background
This is based on a real-world investigation into an alert from Likely Bugs/Memory Management/StackAddressEscapes.ql. In alerts generated by that rule, the function that is pointed to as potentially erroneous is the function that is actually assigning the function's parameter (which at least one caller passed a stack address for). Unfortunately, the project is mature, so the function's definition will not be changed. Thus, I want to find the ultimate source ... which code caused a stack pointer to actually be used?
For this project, there are three functions, each of which takes a buffer pointer as the third parameter. Someone somewhere is passing a stack address for that third parameter. Unfortunately, there are many, many callers of those functions. I'm hoping to either generate project-specific rules (strawman / first-draft below) that catch calls to these three functions, where the third parameter is a stack location. I will then update the rule to include caller functions which are simply proxies ... eventually reaching full coverage. (see first question.)
Preferably, an alert would be generated that pinpoints the source location that causes the use of the stack location for these parameters. (see second question.)
Specific depot / functions
In particular, this alert fires in the TinyUSB project.
To start, I want to add alerts for any caller of the following three functions, where the caller provides a stack location as the 3rd parameter (buffer):
bool dcd_edpt_xfer(uint8_t rhport, uint8_t ep_addr, uint8_t * buffer, uint16_t total_bytes);bool usbd_edpt_xfer(uint8_t rhport, uint8_t ep_addr, uint8_t * buffer, uint16_t total_bytes);bool tud_control_xfer(uint8_t rhport, tusb_control_request_t const * request, void* buffer, uint16_t len);
This is because, unless additional checks are done to wait for the completion of the request, such callers would invite stack corruption (as noted in StackAddressEscapes.ql).
First-pass at Query (for review)
This first draft matches when the buffer pointer is stored in the stack. I'm looking for it to match when the pointer itself points to a stack location....
/**
* @name TinyUSB - local variable address passed as buffer for endpoint transfer
* @description Passing the address of a local variable (stack address) as the
* `buffer` for these functions can cause stack corruption when the
* transfer is not completed before the stack location is re-used.
* Typically, this requires waiting on an ISR to fire, as the transfer
* may be handled asynchronously by hardware to that buffer.
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/TinyUSB/stack-address-as-buffer
* @tags reliability
*/
import cpp
import semmle.code.cpp.dataflow.StackAddress
predicate isStackPointer(Expr e) {
exists(StackVariable sv | e = sv.getAnAccess())
}
from Call call, Expr use, string msg
where
// limit call to the three interesting functions
(
call.getTarget().getName() = "dcd_edpt_xfer" or
call.getTarget().getName() = "usbd_edpt_xfer" or
call.getTarget().getName() = "tud_control_xfer"
)
and
// limit use to be the third parameter of those functions (aka `buffer`)
call.getArgument(2) = use /* all these take `buffer` as the 3rd argument, zero-indexed */
and
// limit use to cases where the parameter is a stack pointer
isStackPointer(use)
and
msg = "A stack address ($@) is passed as `buffer` to " + call.getTarget().getName() + "."
select use, msg, call
Question:
- My first draft fires when the place storing the pointer parameter is on the stack. How do I modify this to only fire when the address pointed-to is on the stack?
- Is there an easy way to have the alerts identify the "ultimate source" that provided a pointer to the stack?
... ultimate source example ...
For example:
// static struct { void* forAsyncProcessing; } s_bufferToProcess; bool usbd_edpt_xfer(uint8_t rhport, uint8_t ep_addr, uint8_t * buffer, uint16_t total_bytes) { // validates things, then a line that willl cause alert from StackAddressEscapes.ql s_bufferToProcess.forAsyncProcessing = buffer; } bool usbd_edpt_xfer_wrapper(uint8_t rhport, uint8_t ep_addr, uint8_t * buffer, uint16_t total_bytes) { // does some stuff... and then calls the function mentioned in the draft new rule usbd_edpt_xfer(rhport, ep_addr, buffer, total_bytes); } static uint8_t static_buffer[10]; void foo() { // this should not fire usbd_edpt_xfer_wrapper(0, 0, &(static_buffer[0]), 10); } void bar() { uint8_t stack_buffer[10]; // this should cause the rule to fire, pointing to next line as problem usbd_edpt_xfer_wrapper(0, 0, &(stack_buffer[0]), 10); }
I feel close ... the query is finding callers who first store the pointer on the stack (vs. the pointed-to-location being on stack), and it feels like I just need a nudge in the right direction.
Any help improving these and getting them to do the right thing is greatly appreciated!
Beta Was this translation helpful? Give feedback.
All reactions
How about this, which uses a path-problem query, a global dataflow configuration and the , true qualifier that limits stackPointerFlowsToUse to highlighting local problems:
/** * @name TinyUSB - local variable address passed as buffer for endpoint transfer * @description Passing the address of a local variable (stack address) as the * `buffer` for these functions can cause stack corruption when the * transfer is not completed before the stack location is re-used. * Typically, this requires waiting on an ISR to fire, as the transfer * may be handled asynchronously by hardware to that buffer. * @kind path-problem * @problem.severit...
Replies: 1 comment 6 replies
-
Does stackPointerFlowsToUse (requires import semmle.code.cpp.dataflow.StackAddress) as used in StackAddressEscapes.ql do what you want? It caters for quite a few cases where a stack address flows to a particular expression (here, use), and provides the parameter source to optionally export a more user-friendly description of the escaping allocation.
Beta Was this translation helpful? Give feedback.
All reactions
-
Path I've walked to getting desired information into the alert
This is now well past my ken... so my mistakes will be multiplying here as I learn the language.
The existing cpp/stack-address-escapes results in AssignExpr where a stack pointer is stored in a non-stack location. That's good, but painful as the starting point for tracking where this actually occurs.
However, that same primitive seems like it should be usable as a Sink in a DataFlow query.
class StackAddressEscapes
class StackAddressEscapes extends AssignExpr {
StackAddressEscapes() {
this = any(
AssignExpr assignExpr |
stackPointerFlowsToUse(assignExpr.getRValue(), _, _, _) and
not stackReferenceFlowsToUse(assignExpr.getLValue(), _, _, _)
)
}
boolean getEscapeIsLocal() {
result = any (
boolean isLocal |
stackPointerFlowsToUse(this.getRValue(), _, _, isLocal)
)
}
Expr getEscapeSource() {
result = any (
Expr source |
stackPointerFlowsToUse(this.getRValue(), _, source, _)
)
}
}
However, I'm stuck trying to understand...
- How to define
StackAddressAllocationclass corresponding toAllocationExprthat allocated stack variables (rather than malloc/free/placement new/etc.)? - How to use
StackAddressAllocationas thesourcenodeStackAddressEscapesclass with the ` that stack allocation class with the
Unfortunately, my strawman class returns zero results, even when it's not filtering things.
Strawman StackAllocated class
Intended result:
Matches stack-allocated expression in a form that can be used with
TaintTracking and Flow::PathGraph to find which code paths cause
the cpp/stack-address-escape to be triggered.
For first pass, it's acceptable to limit this to variables that are explicitly
allocated on the stack (e.g., not dealing with C++ classes, member
functions, etc.).
import cpp
import semmle.code.cpp.dataflow.StackAddress
import semmle.code.cpp.dataflow.new.TaintTracking
class StackAllocation extends AllocationExpr {
StackAllocation() {
// FIXME -- even with no filter, this gives zero results
this = any(
AllocationExpr e |
// FIXME -- how to filter to things on the stack?
// e.isStackAllocation() and
this = e
)
}
}
from
StackAllocation stackAllocation
, string msg
where
msg = "Foo"
select
stackAllocation
, msg
Thus, I'm hoping for help understanding where I've gone astray....
Beta Was this translation helpful? Give feedback.
All reactions
-
So if I apply my suggestion to your QL, and fix a couple of minor things (switching call and use in the output since that way around is more natural; amending the alert message format; fixing the query ID to use lower-kebab-case), I think I get results matching what you describe:
/** * @name TinyUSB - local variable address passed as buffer for endpoint transfer * @description Passing the address of a local variable (stack address) as the * `buffer` for these functions can cause stack corruption when the * transfer is not completed before the stack location is re-used. * Typically, this requires waiting on an ISR to fire, as the transfer * may be handled asynchronously by hardware to that buffer. * @kind problem * @problem.severity warning * @precision medium * @id cpp/tinyusb/stack-address-as-buffer * @tags reliability */ import cpp import semmle.code.cpp.dataflow.StackAddress from Call call, Expr use, string msg, Expr source where // limit call to the three interesting functions call.getTarget().getName() = ["dcd_edpt_xfer", "usbd_edpt_xfer", "tud_control_xfer"] and // limit use to be the third parameter of those functions (aka `buffer`) call.getArgument(2) = use and /* all these take `buffer` as the 3rd argument, zero-indexed */ // limit use to cases where the parameter is a stack pointer stackPointerFlowsToUse(use, _, source, _) and msg = "A stack address ($@, which points to $@) is passed as `buffer` to " + call.getTarget().getName() + "." select call, msg, use, use.toString(), source, source.toString()
Example result: A stack address (& ..., which points to rsp) is passed as ``buffer`` to tud_control_xfer. at usbtmc_device.c:763:15, which indeed uses a stack-allocated data structure usbtmc_initiate_abort_rsp_t rsp
This is common enough that I suspect you'll need to do some more filtering to eliminate safe uses of stack memory, but is this a start to what you want?
Beta Was this translation helpful? Give feedback.
All reactions
-
Awesome! Thanks for helping correct my CodeQL grammar.
Yes, the above query gets me closer to the ultimate goal, by showing uses of these functions where the buffer parameter is, in at least one path, a stack location. When the stack location is allocated in that calling function, that's easy to track, and "good enough" to enable the solution (with the occasional addition of more functions listed). 🎉
I can recursively start adding other functions that call into this, looking at their own corresponding incoming parameters.
Of course, this quickly gets complex and messy...
For example, the above query points one alert towards dcd_edpt_xfer() in usbd.c
A stack address (buffer, which points to buffer) is passed as "buffer" to dcd_edpt_xfer. | usbd.c:1343:7
However, as you can see, buffer is a parameter to that function. Thus, when viewing such an alert, the reviewer still has to review all calling functions, of which an unknown number pass a stack-based buffer. In this case, that's about fifteen cases spread across three files. Some of them are easy to review, but others result in messier filters.
Still, doing that where it's easy ensure the alert points "closer" to the stack allocation, so even if not the best architectural solution, it's at least a stepping stone.....
Is there an existing way to get the path from initial allocation of the stack location, to where it's being used per this query?
Specifically, any alert showing dcd_edpt_xfer() as the culprit is ... less helpful ... because it just indicates that some caller of that function passed a stack location.
I can convert the above query to a class usable as a sink filter. But ... what sort of filter/match can be used to link the original stack allocation location to each use?
Using a Path query seems promising ... but that's possibly my ignorance showing. Am I looking in the right direction, or is there a cleaner path forward (no pun intended)?
Thank you again for the excellent help!
Beta Was this translation helpful? Give feedback.
All reactions
-
How about this, which uses a path-problem query, a global dataflow configuration and the , true qualifier that limits stackPointerFlowsToUse to highlighting local problems:
/** * @name TinyUSB - local variable address passed as buffer for endpoint transfer * @description Passing the address of a local variable (stack address) as the * `buffer` for these functions can cause stack corruption when the * transfer is not completed before the stack location is re-used. * Typically, this requires waiting on an ISR to fire, as the transfer * may be handled asynchronously by hardware to that buffer. * @kind path-problem * @problem.severity warning * @precision medium * @id cpp/tinyusb/stack-address-as-buffer * @tags reliability */ import cpp import semmle.code.cpp.dataflow.StackAddress import semmle.code.cpp.dataflow.new.DataFlow module StackAllocConfig implements DataFlow::ConfigSig { predicate isSource(DataFlow::Node source) { stackPointerFlowsToUse(source.asExpr(), _, _, true) } predicate isSink(DataFlow::Node sink) { exists(Call c | sink.asExpr() = c.getArgument(2) and c.getTarget().getName() = ["dcd_edpt_xfer", "usbd_edpt_xfer", "tud_control_xfer"] ) } predicate isBarrierIn(DataFlow::Node n) { // Clean up a little noise by avoiding paths through another source node, // such as those caused by use-use flow in the context f(&x); g(&x); which // might otherwise produce a path &x -> &x -> g... when omitting that first // use-use edge would be cleaner. isSource(n) } } module StackAllocFlow = DataFlow::Global<StackAllocConfig>; // Expose `nodes` and `edges` predicates that power path-problem queries import StackAllocFlow::PathGraph from StackAllocFlow::PathNode source, StackAllocFlow::PathNode sink, Expr underlyingAlloc, Call call, string msg where StackAllocFlow::flowPath(source, sink) and stackPointerFlowsToUse(source.getNode().asExpr(), _, underlyingAlloc, _) and call.getArgument(2) = sink.getNode().asExpr() and msg = "A stack address ($@, which points to $@) is passed as `buffer` to " + call.getTarget().getName() + "." select call, source, sink, msg, source, source.toString(), underlyingAlloc, underlyingAlloc.toString()
The result for usbd.c:1343 now features a path of length 3 showing where the original stack alloc came from and how it got there. Note in this scenario you should only list under isSink those functions which are directly a problem when used with stack-allocated memory; those which are indirectly a problem by virtue of using such a function will be taken care of by the global dataflow logic.
Beta Was this translation helpful? Give feedback.
All reactions
-
Wow! Some gems in here, giving me lots of pointers to edges of my knowledge. 💕 LOVE IT! 💕
On the positive side, this confirms that the one instance has no identical variants ... only one place passes a stack location to dcd_edpt_xfer() ... hooray!
After I get some time to dig into the above, I will try to update it to match (rather than the particular function manually entered) anything matching the stackAddressEscapes() predicate:
stackPointerFlowsToUse(assignExpr.getRValue(), _, source, isLocal) and
not stackReferenceFlowsToUse(assignExpr.getLValue(), _, _, _)
Your guidance has been very helpful. Now it's time for me to explore docs, experiment with modifications, and expand my knowledge. I will update with my path-based solution here ... but if I get blocked and frustration thresholds are reached, I will instead show my attempt(s) and ask for help understanding where I went astray. Again ... thank you! 🎉
Beta Was this translation helpful? Give feedback.