Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

How to connect two alias dataflow node in Go? #17329

Answered by rvermeulen
Lslightly asked this question in Q&A
Discussion options

I am new to dataflow library for Go. I want to automatically find the alias slice reference expression with same value when executing in Go. I have a program as follows:

func main() {
 var s []int
 s[0] = nil
 s = s[1:]
}

Is there any dataflow predicates I can use in dataflow library for Go to connect s in s[0] and s in s[1:]?

I tried two functions, localFlowStep and globalValueNumber, to connect exprNode(s in s[0]) and exprNode(s in s[1:]), but failed. I have tried following queries.

import go
import DataFlow
predicate indexOfSlice(IndexExpr sliceIdx, SliceExpr slice) {
 localFlow(exprNode(sliceIdx), exprNode(slice.getBase()))
}
from IndexExpr sliceIdx, SliceExpr slice
where indexOfSlice(sliceIdx, slice)
select sliceIdx, slice

The result of the above query contains nothing.

import go
import DataFlow
from ExprNode s, Node d
where globalValueNumber(s) = globalValueNumber(d)
select s, d

I don't find s[0] and s in s[1:] in the result using the above query.

I don't know which predicate I can use to connect these two node. Is there any predicate? The conditions requested in the question can be relaxed, i.e. connecting s[0] and s in s[1:] or s[0] and s[1:] or other pairs are all allowed.

It is not a good idea to test whether results of toString of two expressions are equal, since it can't detect alias variables with same string. For example:

func foo() {
 a := 2
 x := b.f
 x.g = &a
 b.f.g = &a
}

Obviously, x.g and b.f.g are alias expressions because x is just b.f. But the string of two expression is not the same. I think x and b.f may use the same SSA variable.

You must be logged in to vote

Hi @Lslightly,

Thanks for your question.

In the first scenario you should be able to connect s with s[0] and s[1:] through an SSAVariable and its uses.

For example,

predicate relatedBy(SsaVariable v, Name n, Name m) {
 n != m and
 v.getAUse().isFirstNodeOf(n) and v.getAUse().isFirstNodeOf(m)
}

The second example is much more complicated because you need to keep track of aliases and match access path to relate access paths.

For example, to relate x.g and b.f.g you can relate x with b through b.f and later match x.g with b.f.g through substitution of x in the access path with the access path b.f.

An overfitted, but possible useful starting point, example is:

predicate isAliasedThroug...

Replies: 1 comment 5 replies

Comment options

Hi @Lslightly,

Thanks for your question.

In the first scenario you should be able to connect s with s[0] and s[1:] through an SSAVariable and its uses.

For example,

predicate relatedBy(SsaVariable v, Name n, Name m) {
 n != m and
 v.getAUse().isFirstNodeOf(n) and v.getAUse().isFirstNodeOf(m)
}

The second example is much more complicated because you need to keep track of aliases and match access path to relate access paths.

For example, to relate x.g and b.f.g you can relate x with b through b.f and later match x.g with b.f.g through substitution of x in the access path with the access path b.f.

An overfitted, but possible useful starting point, example is:

predicate isAliasedThroughAccessPath(SsaVariable alias, SsaVariable base, string path) {
 exists(SelectorExpr selector | base.getAUse().isFirstNodeOf(selector.getBase+()) |
 globalValueNumber(DataFlow::ssaNode(alias)) = globalValueNumber(DataFlow::exprNode(selector))
 and getAccessPath(selector) = path
 )
}
string getAccessPath(SelectorExpr e) {
 if e.getBase() instanceof SelectorExpr then
 exists(string root | root = getAccessPath(e.getBase())|
 result = root + "." + e.getSelector().getName()
 )
 else 
 result = e.getBase().(VariableName) + "." + e.getSelector().getName()
}
predicate isAliasedSelector(SelectorExpr n, SelectorExpr m) {
 exists(SsaVariable baseOfN, SsaVariable baseOfM, string path |
 isAliasedThroughAccessPath(baseOfN, baseOfM, path) 
 and baseOfN.getAUse().isFirstNodeOf(n.getBase+()) and
 baseOfM.getAUse().isFirstNodeOf(m.getBase+()) and
 getAccessPath(m) = getAccessPath(n).regexpReplaceAll("^"+baseOfN.getSourceVariable().getName()+"\\.", path + ".")
 )
}
from SelectorExpr n, SelectorExpr m
where isAliasedSelector(n, m)
select n, m

I will also ask the Go team for suggestions, because they are our CodeQL Go experts.

You must be logged in to vote
5 replies
Comment options

Thank you for writing such a detailed answer. I have learned a lot from it.

For the first example, I tried SsaVariable after I posed the question and I got similar solution. Thank you for your kindness. To be more specific with regard to this case, I write the query as follows:

// `sliceIdx` represents `s[0]`, `slice` represents `s[1:]`
predicate useSameSlice(IndexExpr sliceIdx, SliceExpr sliceExpr) {
 exists(
 SsaVariable sliceDef |
 sliceDef.getType().getUnderlyingType() instanceof SliceType
 and sliceDef.getAUse() = exprNode(sliceIdx.getBase()).asInstruction()
 and sliceDef.getAUse() = exprNode(sliceExpr.getBase()).asInstruction()
 )
}

For the solution to the second example, my understanding is annotated in the following code.

// x := b.f
// x.g = &a
// b.f.g = &a
/**
 * This predicate is equal to the constraint that `base.f` is alias to `alias`
 * | ↑ |
 * `path`
 * `alias` is the ssa variable of `x`(the alias of `b.f`)
 * `base` is the ssa variable of `b`
 * `selector` is the selector expression `b.f`
 * `path` is access path string of `b.f`, evaluating to `b.f`
 */
predicate isAliasedThroughAccessPath(SsaVariable alias, SsaVariable base, string path) {
 exists(SelectorExpr selector | base.getAUse().isFirstNodeOf(selector.getBase+()) |
 globalValueNumber(DataFlow::ssaNode(alias)) = globalValueNumber(DataFlow::exprNode(selector))
 and getAccessPath(selector) = path
 )
}
/**
 * input `x.f`, output string `x.f`
 * input `b.f.g`, output string `b.f.g`
 */
string getAccessPath(SelectorExpr e) {
 if e.getBase() instanceof SelectorExpr then
 exists(string root | root = getAccessPath(e.getBase())|
 result = root + "." + e.getSelector().getName()
 )
 else 
 result = e.getBase().(VariableName) + "." + e.getSelector().getName()
}
/**
 * `n` is `x.g`
 * `m` is `b.f.g`
 * `baseOfN` is ssa variable of `x`
 * `baseOfM` is ssa variable of `b` in `b.f.g`
 * `path` is the string `b.f`
 * string `b.f.g` = string `x.g`.regexpReplaceAll("^x\\.", "b.f.")
 */
predicate isAliasedSelector(SelectorExpr n, SelectorExpr m) {
 exists(SsaVariable baseOfN, SsaVariable baseOfM, string path |
 isAliasedThroughAccessPath(baseOfN, baseOfM, path) 
 and baseOfN.getAUse().isFirstNodeOf(n.getBase+()) and
 baseOfM.getAUse().isFirstNodeOf(m.getBase+()) and
 getAccessPath(m) = getAccessPath(n).regexpReplaceAll("^"+baseOfN.getSourceVariable().getName()+"\\.", path + ".")
 )
}
/**
 * `n` is `x.g`, `m` is `b.f.g`
 */
from SelectorExpr n, SelectorExpr m
where isAliasedSelector(n, m)
select n, m

If my testing has no problems, x is a SsaVariable and evaluating b.f.g in b.f.g = &a will not use x SsaVariable to simplify the calculation but re-evalute the expression through 1) b SsaVariable and 2) b.f Instruction and 3) assignment to b.f.g Instruction. So the only approach to judge whether x and b.f are aliases is to test the equality of globalValueNumber instead of SsaVariable or IR::Instruction.

Comment options

Continue the discussion on globalValueNumber. I've tried the following query for the first example. But it returns nothing. The reason I try globalValueNumber is that I also want to recognize the equality of a.s in a.s[0] and a.s in a.s[1:] in a.s = a.s[1:]. I think (s[0], s = s[1:]) and (a.s[0], a.s = a.s[1:]) are similar. But the fact is that a.s is not a use of SsaVariable, so the equality can be only judged by globalValueNumber if the conclusion in my last reply is correct. But globalValueNumber also fails. So what is the solution for more general cases?

predicate useSameSlice(IndexExpr sliceIdx, SliceExpr sliceExpr) {
 // exists(
 // SsaVariable sliceDef |
 // sliceDef.getType().getUnderlyingType() instanceof SliceType
 // and sliceDef.getAUse() = exprNode(sliceIdx.getBase()).asInstruction()
 // and sliceDef.getAUse() = exprNode(sliceExpr.getBase()).asInstruction()
 // )
 globalValueNumber(exprNode(sliceIdx.getBase())) = globalValueNumber(exprNode(sliceExpr.getBase()))
}
from IndexExpr sliceIdx, SliceExpr slice
where useSameSlice(sliceIdx, slice)
select slice
Comment options

Before we get too deep into the AST and SSA trickery that's possible here, could you say a little about what you want to achieve by identifying possible aliases like this? That will help me propose a solution at the right level of abstraction for your goals.

Comment options

Before we get too deep into the AST and SSA trickery that's possible here, could you say a little about what you want to achieve by identifying possible aliases like this? That will help me propose a solution at the right level of abstraction for your goals.

I want to recognize the pattern as follows. expr1 in expr1[Idx] has the same value as expr2 in expr2[Idx+...:]. expr1 and expr2 evaluate to the same slice value at runtime.

expr1[Idx] = nil
expr2 = expr2[Idx+...:]
Comment options

Apologies for the late response; I must have overlooked the notification of your latest reply.

Taking the example Go program

package abc
type S struct {
 f []int
}
type T struct {
 s S
}
func f(s S, t T) int {
 ts := t.s
 return s.f[0] + s.f[1] + t.s.f[0] + ts.f[1]
}

One useful tool is the SsaWithFields class, which as its name suggests tries to identify uses of the same field by building on top of SSA variables.

This query identifies that the two s.f accesses likely match:

from SsaWithFields swf1, SsaWithFields swf2, DataFlow::Node use1, DataFlow::Node use2
where swf1 = swf2.similar()
and swf1.getAUse() = use1
and swf2.getAUse() = use2
and use1 = any(DataFlow::ElementReadNode sn).getBase()
and use2 = any(DataFlow::ElementReadNode sn).getBase()
and use1 != use2
select use1, use2

To go further and catch the matching index bases in t.s.f[0] + ts.f[1] we need to do more than the existing standard library, but if we look at the definition of SsaWithFields starting at https://github.com/github/codeql/blob/main/go/ql/lib/semmle/go/dataflow/SSA.qll#L308, it is possible to augment it so that flow steps can be taken in between field reads:

private IR::Instruction accessPathAux(TSsaWithFields base, Field f) {
 exists(IR::FieldReadInstruction fr, IR::Instruction basePathPlusFlow, IR::Instruction basePath |
 fr.getBase() = basePathPlusFlow or
 fr.getBase() = IR::implicitDerefInstruction(basePathPlusFlow.(IR::EvalInstruction).getExpr())
 |
 DataFlow::localFlow(DataFlow::instructionNode(basePath), DataFlow::instructionNode(basePathPlusFlow)) and
 base = accessPath(basePath) and
 f = fr.getField() and
 result = fr
 )
}

Note the DataFlow::localFlow(DataFlow::instructionNode(basePath), DataFlow::instructionNode(basePathPlusFlow)) added there compared to the version of that predicate in the standard library. For testing purposes I simply copied from private newtype TSsaWithFields up to the end of class SsaWithFields, and renamed the public class SsaWithFields to class SsaWithFieldsWithFlow so it wouldn't clash with the standard library version.

I can then use the same query as previously replacing SsaWithFields with SsaWithFieldsWithFlow to get both results desired.

Answer selected by Lslightly
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Q&A
Labels
None yet

AltStyle によって変換されたページ (->オリジナル) /