Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
---
category: minorAnalysis
---
* More ways of checking that a string matches a regular expression are now considered as sanitizers for various queries, including `java/ssrf` and `java/path-injection`. In particular, being annotated with `@javax.validation.constraints.Pattern` is now recognised as a sanitizer for those queries.
1 change: 1 addition & 0 deletions java/ql/lib/java.qll
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import semmle.code.Unit
import semmle.code.java.Annotation
import semmle.code.java.Compilation
import semmle.code.java.CompilationUnit
import semmle.code.java.Concepts
import semmle.code.java.ControlFlowGraph
import semmle.code.java.Dependency
import semmle.code.java.Element
Expand Down
89 changes: 89 additions & 0 deletions java/ql/lib/semmle/code/java/Concepts.qll
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
/**
* Provides abstract classes representing generic concepts such as file system
* access or system command execution, for which individual framework libraries
* provide concrete subclasses.
*/
overlay[local?]
module;

import java
private import semmle.code.java.dataflow.DataFlow
private import semmle.code.java.frameworks.JavaxAnnotations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not imported due to being referenced here, rather it's imported as a necessary reverse dependency to ensure a consistent extent of the abstract classes. To make that more clear, put the private import inside a private module - that way it's easy to see that nothing is put in scope here. Also, the reverse dependency on java/ql/lib/semmle/code/java/frameworks/Regex.qll is missing, so a private import of that should also be included.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also include java/ql/lib/semmle/code/java/JDK.qll for completeness, btw. - even if that happens to be included in import java.


/**
* A data-flow node that executes a regular expression.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should define everything in terms of expressions here and stick to one Range class. Currently the Concepts setup suggests that models can be added using either of the two abstract range classes, but this only works if consumers consistently use the RegexExecution. This is however already very much not the case as all the added use-cases refer directly to RegexExecutionExpr::Range, which feels like slight misuse, and would certainly fail to include any potential models added via the DataFlow::Node-based abstract range class.

*
* Extend this class to refine existing API models. If you want to model new APIs,
* extend `RegexExecution::Range` instead.
*/
class RegexExecution extends DataFlow::Node instanceof RegexExecution::Range {
/** Gets the data flow node for the regex being executed by this node. */
DataFlow::Node getRegex() { result = super.getRegex() }

/** Gets a data flow node for the string to be searched or matched against. */
DataFlow::Node getString() { result = super.getString() }

/**
* Gets the name of this regex execution, typically the name of an executing method.
* This is used for nice alert messages and should include the module if possible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "module" terminology looks like left-over copy-paste from a different language. If we're going with Java-specific terminology then it should possibly say "type-qualified name" or something like that (looking at the various instances it seems like we're generally using type+name, but not the package name).

*/
string getName() { result = super.getName() }
}

/** Provides classes for modeling new regular-expression execution APIs. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "new" seems unnecessary to me.

Suggested change
/** Provides classes for modeling new regular-expression execution APIs. */
/** Provides classes for modeling regular-expression execution APIs. */

module RegexExecution {
/**
* A data flow node that executes a regular expression.
*
* Extend this class to model new APIs. If you want to refine existing API models,
* extend `RegexExecution` instead.
*/
abstract class Range extends DataFlow::Node {
/** Gets the data flow node for the regex being executed by this node. */
abstract DataFlow::Node getRegex();

/** Gets a data flow node for the string to be searched or matched against. */
abstract DataFlow::Node getString();

/**
* Gets the name of this regex execution, typically the name of an executing method.
* This is used for nice alert messages and should include the module if possible.
*/
abstract string getName();
}

private class RangeFromExpr extends Range {
private RegexExecutionExpr::Range ree;

RangeFromExpr() { this.asExpr() = ree }

override DataFlow::Node getRegex() { result.asExpr() = ree.getRegex() }

override DataFlow::Node getString() { result.asExpr() = ree.getString() }

override string getName() { result = ree.getName() }
}
}

/** Provides classes for modeling new regular-expression execution APIs. */
module RegexExecutionExpr {
/**
* An expression that executes a regular expression.
*
* Extend this class to model new APIs. If you want to refine existing API models,
* extend `RegexExecution` instead.
*/
abstract class Range extends Expr {
/** Gets the expression for the regex being executed by this node. */
abstract Expr getRegex();

/** Gets an expression for the string to be searched or matched against. */
abstract Expr getString();

/**
* Gets the name of this regex execution, typically the name of an executing method.
* This is used for nice alert messages and should include the module if possible.
*/
abstract string getName();
}
}
8 changes: 7 additions & 1 deletion java/ql/lib/semmle/code/java/JDK.qll
Original file line number Diff line number Diff line change
Expand Up @@ -48,13 +48,19 @@ class StringContainsMethod extends Method {
}

/** A call to the `java.lang.String.matches` method. */
class StringMatchesCall extends MethodCall {
class StringMatchesCall extends MethodCall, RegexExecutionExpr::Range {
StringMatchesCall() {
exists(Method m | m = this.getMethod() |
m.getDeclaringType() instanceof TypeString and
m.hasName("matches")
)
}

override Expr getRegex() { result = this.getArgument(0) }

override Expr getString() { result = this.getQualifier() }

override string getName() { result = "String.matches" }
}

/** A call to the `java.lang.String.replaceAll` method. */
Expand Down
35 changes: 35 additions & 0 deletions java/ql/lib/semmle/code/java/frameworks/JavaxAnnotations.qll
Original file line number Diff line number Diff line change
Expand Up @@ -163,3 +163,38 @@ class WebServiceAnnotation extends Annotation {
class WebServiceRefAnnotation extends Annotation {
WebServiceRefAnnotation() { this.getType().hasQualifiedName("javax.xml.ws", "WebServiceRef") }
}

/*
* Annotations in the package `javax.validation.constraints`.
*/

/**
* A `@javax.validation.constraints.Pattern` annotation.
*/
class PatternAnnotation extends Annotation, RegexExecutionExpr::Range {
PatternAnnotation() {
this.getType()
.hasQualifiedName(["javax.validation.constraints", "jakarta.validation.constraints"],
"Pattern")
}

override Expr getRegex() { result = this.getValue("regexp") }

override Expr getString() {
// Annotation on field accessed by direct read - value of field will match regexp
result = this.getAnnotatedElement().(Field).getAnAccess()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, this includes writes, which doesn't sound intended based on the comment, so perhaps write it as follows instead.

Suggested change
result = this.getAnnotatedElement().(Field).getAnAccess()
result.(FieldRead).getField() = this.getAnnotatedElement()

or
// Annotation on field accessed by getter - value of field will match regexp
result.(MethodCall).getMethod().(GetterMethod).getField() = this.getAnnotatedElement()
or
// Annotation on parameter - value of parameter will match regexp
result = this.getAnnotatedElement().(Parameter).getAnAccess().(VarRead)
or
// Annotation on method - return value of method will match regexp
result.(Call).getCallee() = this.getAnnotatedElement()
// TODO - we could also consider the case where the annotation is on a type
// but this harder to model and not very common.
}

override string getName() { result = "@javax.validation.constraints.Pattern annotation" }
}
61 changes: 61 additions & 0 deletions java/ql/lib/semmle/code/java/frameworks/Regex.qll
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ overlay[local?]
module;

import java
private import semmle.code.java.dataflow.DataFlow

/** The class `java.util.regex.Matcher`. */
class TypeRegexMatcher extends Class {
Expand All @@ -24,6 +25,16 @@ class TypeRegexPattern extends Class {
TypeRegexPattern() { this.hasQualifiedName("java.util.regex", "Pattern") }
}

/**
* The `compile` method of `java.util.regex.Pattern`.
*/
class PatternCompileMethod extends Method {
PatternCompileMethod() {
this.getDeclaringType() instanceof TypeRegexPattern and
this.hasName("compile")
}
}

/**
* The `matches` method of `java.util.regex.Pattern`.
*/
Expand Down Expand Up @@ -59,3 +70,53 @@ class PatternLiteralField extends Field {
this.hasName("LITERAL")
}
}

/** A call to the `compile` method of `java.util.regex.Pattern`. */
class PatternCompileCall extends MethodCall {
PatternCompileCall() { this.getMethod() instanceof PatternCompileMethod }
}

/** A call to the `matcher` method of `java.util.regex.Pattern`. */
class PatternMatcherCall extends MethodCall {
PatternMatcherCall() { this.getMethod() instanceof PatternMatcherMethod }
}

/** A call to the `matches` method of `java.util.regex.Pattern`. */
class PatternMatchesCall extends MethodCall, RegexExecutionExpr::Range {
PatternMatchesCall() { this.getMethod() instanceof PatternMatchesMethod }

override Expr getRegex() { result = this.getArgument(0) }

override Expr getString() { result = this.getArgument(1) }

override string getName() { result = "Pattern.matches" }
}

/** A call to the `matches` method of `java.util.regex.Matcher`. */
class MatcherMatchesCall extends MethodCall, RegexExecutionExpr::Range {
MatcherMatchesCall() { this.getMethod() instanceof MatcherMatchesMethod }

/**
* Get the call to `java.util.regex.Pattern.matcher` which returned the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Get the call to `java.util.regex.Pattern.matcher` which returned the
* Gets the call to `java.util.regex.Pattern.matcher` that returned the

* qualifier of this call. This is needed to determine the string being
* matched.
*/
PatternMatcherCall getPatternMatcherCall() {
DataFlow::localExprFlow(result, this.getQualifier())
}

/**
* Get the call to `java.util.regex.Pattern.compile` which returned the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Get the call to `java.util.regex.Pattern.compile` which returned the
* Gets the call to `java.util.regex.Pattern.compile` that returned the

* `Pattern` used by this matcher. This is needed to determine the regular
* expression being used.
*/
PatternCompileCall getPatternCompileCall() {
DataFlow::localExprFlow(result, this.getPatternMatcherCall())
}

override Expr getRegex() { result = this.getPatternCompileCall().getArgument(0) }

override Expr getString() { result = this.getPatternMatcherCall().getArgument(0) }

override string getName() { result = "Matcher.matches" }
}
13 changes: 5 additions & 8 deletions java/ql/lib/semmle/code/java/security/PathSanitizer.qll
Original file line number Diff line number Diff line change
Expand Up @@ -427,20 +427,17 @@ private class ReplaceDirectoryCharactersSanitizer extends StringReplaceOrReplace
}
}

/** Holds if `target` is the first argument of `matchesCall`. */
private predicate isMatchesTarget(StringMatchesCall matchesCall, CompileTimeConstantExpr target) {
target = matchesCall.getArgument(0)
}

/**
* Holds if `matchesCall` confirms that `checkedExpr` does not contain any directory characters
* on the given `branch`.
*/
private predicate isMatchesCall(StringMatchesCall matchesCall, Expr checkedExpr, boolean branch) {
private predicate isMatchesCall(
RegexExecutionExpr::Range regexMatch, Expr checkedExpr, boolean branch
) {
exists(CompileTimeConstantExpr target, string targetValue |
isMatchesTarget(matchesCall, target) and
target = regexMatch.getRegex() and
target.getStringValue() = targetValue and
checkedExpr = matchesCall.getQualifier()
checkedExpr = regexMatch.getString()
|
(
// Allow anything except `.`, '/', '\'
Expand Down
31 changes: 12 additions & 19 deletions java/ql/lib/semmle/code/java/security/Sanitizers.qll
Original file line number Diff line number Diff line change
Expand Up @@ -41,25 +41,11 @@ class SimpleTypeSanitizer extends DataFlow::Node {
* make the type recursive. Otherwise use `RegexpCheckBarrier`.
*/
predicate regexpMatchGuardChecks(Guard guard, Expr e, boolean branch) {
exists(Method method, MethodCall mc |
method = mc.getMethod() and
guard = mc and
branch = true
|
// `String.matches` and other `matches` methods.
method.getName() = "matches" and
e = mc.getQualifier()
or
method instanceof PatternMatchesMethod and
e = mc.getArgument(1)
or
method instanceof MatcherMatchesMethod and
exists(MethodCall matcherCall |
matcherCall.getMethod() instanceof PatternMatcherMethod and
e = matcherCall.getArgument(0) and
DataFlow::localExprFlow(matcherCall, mc.getQualifier())
)
)
exists(RegexExecutionExpr::Range ree | not ree instanceof Annotation |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concept as documented suggests that it covers regex execution in general, so possibly also group capture and substitutions. With that in mind it's not entirely clear to me whether it's reasonable to assume that a return value of true indicates a matching regex - that doesn't appear to be part of the contract of the concept.

guard = ree and
e = ree.getString()
) and
branch = true
}

/**
Expand All @@ -70,5 +56,12 @@ predicate regexpMatchGuardChecks(Guard guard, Expr e, boolean branch) {
class RegexpCheckBarrier extends DataFlow::Node {
RegexpCheckBarrier() {
this = DataFlow::BarrierGuard<regexpMatchGuardChecks/3>::getABarrierNode()
or
// Annotations don't fit into the model of barrier guards because the
// annotation doesn't dominate the sanitized expression, so we instead
// treat them as barriers directly.
exists(RegexExecutionExpr::Range ree | ree instanceof Annotation |
this.asExpr() = ree.getString()
)
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -31,11 +31,9 @@ private class ExternalRegexInjectionSanitizer extends RegexInjectionSanitizer {
*/
private class PatternLiteralFlag extends RegexInjectionSanitizer {
PatternLiteralFlag() {
exists(MethodCall ma, Method m, PatternLiteralField field | m = ma.getMethod() |
ma.getArgument(0) = this.asExpr() and
m.getDeclaringType() instanceof TypeRegexPattern and
m.hasName("compile") and
ma.getArgument(1) = field.getAnAccess()
exists(PatternCompileCall pcc, PatternLiteralField field |
pcc.getArgument(0) = this.asExpr() and
pcc.getArgument(1) = field.getAnAccess()
)
}
}
Loading