Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Help finding ultimate source of a stack location.... #18388

Answered by smowton
henrygab asked this question in Q&A
Discussion options

Can you help get me closer to a rule that points directly at the source of an error?


Background

This is based on a real-world investigation into an alert from Likely Bugs/Memory Management/StackAddressEscapes.ql. In alerts generated by that rule, the function that is pointed to as potentially erroneous is the function that is actually assigning the function's parameter (which at least one caller passed a stack address for). Unfortunately, the project is mature, so the function's definition will not be changed. Thus, I want to find the ultimate source ... which code caused a stack pointer to actually be used?

For this project, there are three functions, each of which takes a buffer pointer as the third parameter. Someone somewhere is passing a stack address for that third parameter. Unfortunately, there are many, many callers of those functions. I'm hoping to either generate project-specific rules (strawman / first-draft below) that catch calls to these three functions, where the third parameter is a stack location. I will then update the rule to include caller functions which are simply proxies ... eventually reaching full coverage. (see first question.)

Preferably, an alert would be generated that pinpoints the source location that causes the use of the stack location for these parameters. (see second question.)


Specific depot / functions

In particular, this alert fires in the TinyUSB project.

To start, I want to add alerts for any caller of the following three functions, where the caller provides a stack location as the 3rd parameter (buffer):

This is because, unless additional checks are done to wait for the completion of the request, such callers would invite stack corruption (as noted in StackAddressEscapes.ql).


First-pass at Query (for review)

This first draft matches when the buffer pointer is stored in the stack. I'm looking for it to match when the pointer itself points to a stack location....

/**
 * @name TinyUSB - local variable address passed as buffer for endpoint transfer
 * @description Passing the address of a local variable (stack address) as the
 * `buffer` for these functions can cause stack corruption when the
 * transfer is not completed before the stack location is re-used.
 * Typically, this requires waiting on an ISR to fire, as the transfer
 * may be handled asynchronously by hardware to that buffer.
 * @kind problem
 * @problem.severity warning
 * @precision medium
 * @id cpp/TinyUSB/stack-address-as-buffer
 * @tags reliability
 */
 import cpp
 import semmle.code.cpp.dataflow.StackAddress
 
 predicate isStackPointer(Expr e) {
 exists(StackVariable sv | e = sv.getAnAccess())
 }
from Call call, Expr use, string msg
where
 // limit call to the three interesting functions
 (
 call.getTarget().getName() = "dcd_edpt_xfer" or
 call.getTarget().getName() = "usbd_edpt_xfer" or
 call.getTarget().getName() = "tud_control_xfer"
 )
 and
 // limit use to be the third parameter of those functions (aka `buffer`)
 call.getArgument(2) = use /* all these take `buffer` as the 3rd argument, zero-indexed */
 and
 // limit use to cases where the parameter is a stack pointer
 isStackPointer(use)
 and
 msg = "A stack address ($@) is passed as `buffer` to " + call.getTarget().getName() + "."
select use, msg, call

Question:

  1. My first draft fires when the place storing the pointer parameter is on the stack. How do I modify this to only fire when the address pointed-to is on the stack?
  2. Is there an easy way to have the alerts identify the "ultimate source" that provided a pointer to the stack?
... ultimate source example ...

For example:

// 
static struct { void* forAsyncProcessing; } s_bufferToProcess;
bool usbd_edpt_xfer(uint8_t rhport, uint8_t ep_addr, uint8_t * buffer, uint16_t total_bytes) {
 // validates things, then a line that willl cause alert from StackAddressEscapes.ql
 s_bufferToProcess.forAsyncProcessing = buffer;
}
bool usbd_edpt_xfer_wrapper(uint8_t rhport, uint8_t ep_addr, uint8_t * buffer, uint16_t total_bytes) {
 // does some stuff... and then calls the function mentioned in the draft new rule
 usbd_edpt_xfer(rhport, ep_addr, buffer, total_bytes);
}
static uint8_t static_buffer[10];
void foo() {
 // this should not fire
 usbd_edpt_xfer_wrapper(0, 0, &(static_buffer[0]), 10);
}
void bar() {
 uint8_t stack_buffer[10];
 // this should cause the rule to fire, pointing to next line as problem
 usbd_edpt_xfer_wrapper(0, 0, &(stack_buffer[0]), 10);
}

I feel close ... the query is finding callers who first store the pointer on the stack (vs. the pointed-to-location being on stack), and it feels like I just need a nudge in the right direction.

Any help improving these and getting them to do the right thing is greatly appreciated!

You must be logged in to vote

How about this, which uses a path-problem query, a global dataflow configuration and the , true qualifier that limits stackPointerFlowsToUse to highlighting local problems:

/**
 * @name TinyUSB - local variable address passed as buffer for endpoint transfer
 * @description Passing the address of a local variable (stack address) as the
 * `buffer` for these functions can cause stack corruption when the
 * transfer is not completed before the stack location is re-used.
 * Typically, this requires waiting on an ISR to fire, as the transfer
 * may be handled asynchronously by hardware to that buffer.
 * @kind path-problem
 * @problem.severit...

Replies: 1 comment 6 replies

Comment options

Does stackPointerFlowsToUse (requires import semmle.code.cpp.dataflow.StackAddress) as used in StackAddressEscapes.ql do what you want? It caters for quite a few cases where a stack address flows to a particular expression (here, use), and provides the parameter source to optionally export a more user-friendly description of the escaping allocation.

You must be logged in to vote
6 replies
Comment options

Path I've walked to getting desired information into the alert

This is now well past my ken... so my mistakes will be multiplying here as I learn the language.

The existing cpp/stack-address-escapes results in AssignExpr where a stack pointer is stored in a non-stack location. That's good, but painful as the starting point for tracking where this actually occurs.

However, that same primitive seems like it should be usable as a Sink in a DataFlow query.


class StackAddressEscapes
class StackAddressEscapes extends AssignExpr {
 StackAddressEscapes() {
 this = any(
 AssignExpr assignExpr |
 stackPointerFlowsToUse(assignExpr.getRValue(), _, _, _) and
 not stackReferenceFlowsToUse(assignExpr.getLValue(), _, _, _)
 )
 }
 boolean getEscapeIsLocal() {
 result = any (
 boolean isLocal |
 stackPointerFlowsToUse(this.getRValue(), _, _, isLocal)
 )
 }
 Expr getEscapeSource() {
 result = any (
 Expr source |
 stackPointerFlowsToUse(this.getRValue(), _, source, _)
 )
 }
}

However, I'm stuck trying to understand...

  1. How to define StackAddressAllocation class corresponding to AllocationExpr that allocated stack variables (rather than malloc/free/placement new/etc.)?
  2. How to use StackAddressAllocation as the source node StackAddressEscapes class with the ` that stack allocation class with the

Unfortunately, my strawman class returns zero results, even when it's not filtering things.

Strawman StackAllocated class

Intended result:

Matches stack-allocated expression in a form that can be used with
TaintTracking and Flow::PathGraph to find which code paths cause
the cpp/stack-address-escape to be triggered.

For first pass, it's acceptable to limit this to variables that are explicitly
allocated on the stack (e.g., not dealing with C++ classes, member
functions, etc.).

import cpp
import semmle.code.cpp.dataflow.StackAddress
import semmle.code.cpp.dataflow.new.TaintTracking 
 
class StackAllocation extends AllocationExpr {
 StackAllocation() {
 // FIXME -- even with no filter, this gives zero results
 this = any(
 AllocationExpr e |
 // FIXME -- how to filter to things on the stack?
 // e.isStackAllocation() and 
 this = e
 )
 }
}
 
from
 StackAllocation stackAllocation
 , string msg
where
 msg = "Foo"
select
 stackAllocation
 , msg

Thus, I'm hoping for help understanding where I've gone astray....

Comment options

So if I apply my suggestion to your QL, and fix a couple of minor things (switching call and use in the output since that way around is more natural; amending the alert message format; fixing the query ID to use lower-kebab-case), I think I get results matching what you describe:

/**
 * @name TinyUSB - local variable address passed as buffer for endpoint transfer
 * @description Passing the address of a local variable (stack address) as the
 * `buffer` for these functions can cause stack corruption when the
 * transfer is not completed before the stack location is re-used.
 * Typically, this requires waiting on an ISR to fire, as the transfer
 * may be handled asynchronously by hardware to that buffer.
 * @kind problem
 * @problem.severity warning
 * @precision medium
 * @id cpp/tinyusb/stack-address-as-buffer
 * @tags reliability
 */
import cpp
import semmle.code.cpp.dataflow.StackAddress
from Call call, Expr use, string msg, Expr source
where
 // limit call to the three interesting functions
 call.getTarget().getName() = ["dcd_edpt_xfer", "usbd_edpt_xfer", "tud_control_xfer"] and
 // limit use to be the third parameter of those functions (aka `buffer`)
 call.getArgument(2) = use and
 /* all these take `buffer` as the 3rd argument, zero-indexed */
 // limit use to cases where the parameter is a stack pointer
 stackPointerFlowsToUse(use, _, source, _) and
 msg =
 "A stack address ($@, which points to $@) is passed as `buffer` to " +
 call.getTarget().getName() + "."
select call, msg, use, use.toString(), source, source.toString()

Example result: A stack address (& ..., which points to rsp) is passed as ``buffer`` to tud_control_xfer. at usbtmc_device.c:763:15, which indeed uses a stack-allocated data structure usbtmc_initiate_abort_rsp_t rsp

This is common enough that I suspect you'll need to do some more filtering to eliminate safe uses of stack memory, but is this a start to what you want?

Comment options

Awesome! Thanks for helping correct my CodeQL grammar.

Yes, the above query gets me closer to the ultimate goal, by showing uses of these functions where the buffer parameter is, in at least one path, a stack location. When the stack location is allocated in that calling function, that's easy to track, and "good enough" to enable the solution (with the occasional addition of more functions listed). 🎉

I can recursively start adding other functions that call into this, looking at their own corresponding incoming parameters.

Of course, this quickly gets complex and messy...

For example, the above query points one alert towards dcd_edpt_xfer() in usbd.c

A stack address (buffer, which points to buffer) is passed as "buffer" to dcd_edpt_xfer. | usbd.c:1343:7

However, as you can see, buffer is a parameter to that function. Thus, when viewing such an alert, the reviewer still has to review all calling functions, of which an unknown number pass a stack-based buffer. In this case, that's about fifteen cases spread across three files. Some of them are easy to review, but others result in messier filters.

Still, doing that where it's easy ensure the alert points "closer" to the stack allocation, so even if not the best architectural solution, it's at least a stepping stone.....

Is there an existing way to get the path from initial allocation of the stack location, to where it's being used per this query?

Specifically, any alert showing dcd_edpt_xfer() as the culprit is ... less helpful ... because it just indicates that some caller of that function passed a stack location.

I can convert the above query to a class usable as a sink filter. But ... what sort of filter/match can be used to link the original stack allocation location to each use?

Using a Path query seems promising ... but that's possibly my ignorance showing. Am I looking in the right direction, or is there a cleaner path forward (no pun intended)?

Thank you again for the excellent help!

Comment options

How about this, which uses a path-problem query, a global dataflow configuration and the , true qualifier that limits stackPointerFlowsToUse to highlighting local problems:

/**
 * @name TinyUSB - local variable address passed as buffer for endpoint transfer
 * @description Passing the address of a local variable (stack address) as the
 * `buffer` for these functions can cause stack corruption when the
 * transfer is not completed before the stack location is re-used.
 * Typically, this requires waiting on an ISR to fire, as the transfer
 * may be handled asynchronously by hardware to that buffer.
 * @kind path-problem
 * @problem.severity warning
 * @precision medium
 * @id cpp/tinyusb/stack-address-as-buffer
 * @tags reliability
 */
import cpp
import semmle.code.cpp.dataflow.StackAddress
import semmle.code.cpp.dataflow.new.DataFlow
module StackAllocConfig implements DataFlow::ConfigSig {
 predicate isSource(DataFlow::Node source) { stackPointerFlowsToUse(source.asExpr(), _, _, true) }
 predicate isSink(DataFlow::Node sink) {
 exists(Call c |
 sink.asExpr() = c.getArgument(2) and
 c.getTarget().getName() = ["dcd_edpt_xfer", "usbd_edpt_xfer", "tud_control_xfer"]
 )
 }
 predicate isBarrierIn(DataFlow::Node n) {
 // Clean up a little noise by avoiding paths through another source node,
 // such as those caused by use-use flow in the context f(&x); g(&x); which
 // might otherwise produce a path &x -> &x -> g... when omitting that first
 // use-use edge would be cleaner.
 isSource(n)
 }
}
module StackAllocFlow = DataFlow::Global<StackAllocConfig>;
// Expose `nodes` and `edges` predicates that power path-problem queries
import StackAllocFlow::PathGraph
from
 StackAllocFlow::PathNode source, StackAllocFlow::PathNode sink, Expr underlyingAlloc, Call call,
 string msg
where
 StackAllocFlow::flowPath(source, sink) and
 stackPointerFlowsToUse(source.getNode().asExpr(), _, underlyingAlloc, _) and
 call.getArgument(2) = sink.getNode().asExpr() and
 msg =
 "A stack address ($@, which points to $@) is passed as `buffer` to " +
 call.getTarget().getName() + "."
select call, source, sink, msg, source, source.toString(), underlyingAlloc,
 underlyingAlloc.toString()

The result for usbd.c:1343 now features a path of length 3 showing where the original stack alloc came from and how it got there. Note in this scenario you should only list under isSink those functions which are directly a problem when used with stack-allocated memory; those which are indirectly a problem by virtue of using such a function will be taken care of by the global dataflow logic.

Answer selected by henrygab
Comment options

Wow! Some gems in here, giving me lots of pointers to edges of my knowledge. 💕 LOVE IT! 💕

On the positive side, this confirms that the one instance has no identical variants ... only one place passes a stack location to dcd_edpt_xfer() ... hooray!

After I get some time to dig into the above, I will try to update it to match (rather than the particular function manually entered) anything matching the stackAddressEscapes() predicate:

 stackPointerFlowsToUse(assignExpr.getRValue(), _, source, isLocal) and
 not stackReferenceFlowsToUse(assignExpr.getLValue(), _, _, _)

Your guidance has been very helpful. Now it's time for me to explore docs, experiment with modifications, and expand my knowledge. I will update with my path-based solution here ... but if I get blocked and frustration thresholds are reached, I will instead show my attempt(s) and ask for help understanding where I went astray. Again ... thank you! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Q&A
Labels
None yet
2 participants

AltStyle によって変換されたページ (->オリジナル) /