I want to do taint tracking with functions that taint their arguments with user input. For example:
fgets(buf, sizeof(buf), stdin); // buf is tainted
[...]
n = strlen(buf); // tainted argument to strlen
[...]
memcpy(somewhere, buf, n); // tainted call to memcpy
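For reference, the pattern above can be condensed into a single self-contained function to use as a small test case. This is only a hypothetical sketch, not the actual libssh2 code: the name `copy_line`, the 256-byte buffer, and the clamping of `n` are my own assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from libssh2): reads one line from `in` and
 * copies it into `dst`. Returns the number of bytes copied, excluding
 * the terminating NUL. */
size_t copy_line(FILE *in, char *dst, size_t dst_size)
{
    char buf[256];
    size_t n;

    assert(in != NULL && dst != NULL);

    if (dst_size == 0)
        return 0;

    if (fgets(buf, sizeof(buf), in) == NULL) { /* buf is tainted here */
        dst[0] = '\0';
        return 0;
    }

    n = strlen(buf);      /* tainted argument to strlen */
    if (n >= dst_size)
        n = dst_size - 1; /* clamp so the copy cannot overflow dst */

    memcpy(dst, buf, n);  /* tainted buffer and length reach memcpy */
    dst[n] = '\0';
    return n;
}
```

A query for this pattern would be expected to report `buf` in the `fgets` call as the source and `buf` in the `strlen` call as the sink.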
Semmle should be able to detect this flow with a query like the following (using only fgets -> strlen as an example). I am borrowing code from SecurityOptions:
import cpp
import semmle.code.cpp.dataflow.TaintTracking
class IsTaintedArg extends string {
  IsTaintedArg() { this = "IsTaintedArg" }
  predicate userInputArgument(FunctionCall functionCall, int arg) {
    exists(string fname |
      functionCall.getTarget().hasGlobalName(fname) and
      exists(functionCall.getArgument(arg)) and
      // argument 0 of fgets is tainted
      fname = "fgets" and
      arg = 0
    )
  }
  predicate isUserInput(Expr expr, string cause) {
    exists(FunctionCall fc, int i |
      this.userInputArgument(fc, i) and
      expr = fc.getArgument(i) and
      cause = fc.getTarget().getName()
    )
  }
}
class TaintedFormatConfig extends TaintTracking::Configuration {
  TaintedFormatConfig() { this = "TaintedFormatConfig" }
  override predicate isSource(DataFlow::Node source) {
    exists(IsTaintedArg opts | opts.isUserInput(source.asExpr(), _))
  }
  override predicate isSink(DataFlow::Node sink) {
    // sink: any expression passed as the first argument of strlen
    exists(FunctionCall fc |
      sink.asExpr() = fc.getArgument(0) and
      fc.getTarget().hasName("strlen")
    )
  }
}
from TaintedFormatConfig cfg, DataFlow::Node source, DataFlow::Node sink
where cfg.hasFlow(source, sink)
select sink, source
However, the query does not seem to work.
When I query cfg.isSource() or cfg.isSink() on their own, both the source and the sink are recognized. But hasFlow() still returns nothing, although a path between them should definitely exist.
I am using libssh2 to test my findings; the example flow exists here.
The query I am experimenting with is here.
Does anyone have any idea what I might be doing wrong in the query above?