Rust: Add regular expression injection query #18946

paldepind · 2025-03-07T08:05:08Z

Note:

I've put this query under CWE-730 as that's where the Swift one was. But given that the Rust regex implementation is robust regarding DOS attacks, I think we should find another directory for it?

github-actions · 2025-03-07T08:05:57Z

QHelp previews:

rust/ql/src/queries/security/CWE-730/RegexInjection.qhelp

Regular expression injection

Constructing a regular expression with unsanitized user input can be dangerous. A malicious user may be able to modify the meaning of the expression causing it to match unexpected strings and to construct large regular expressions by using counted repetitions.

Recommendation

Before embedding user input into a regular expression, escape the input string using a function such as regex::escape to escape meta-characters that have special meaning.

If purposefully supporting user supplied regular expressions, then use RegexBuilder::size_limit to limit the pattern size such that it is no larger than necessary.

Example

The following example construct a regular expressions from the user input key without escaping it first.

use regex::Regex;

fn get_value<'h>(key: &str, property: &'h str) -> Option<&'h str> {
    // BAD: User provided `key` is interpolated into the regular expression.
    let pattern = format!(r"^property:{key}=(.*)$");
    let re = Regex::new(&pattern).unwrap();
    re.captures(property)?.get(1).map(|m| m.as_str())
}

The regular expression is intended to match strings starting with "property" such as "property:foo=bar". However, a malicious user might inject the regular expression ".*^|key" and unexpectedly cause strings such as "key=secret" to match.

If user input is used to construct a regular expression it should be escaped first. This ensures that the user cannot insert characters that have special meanings in regular expressions.

use regex::{escape, Regex};

fn get_value<'h>(key: &str, property: &'h str) -> option<&'h str> {
    // GOOD: User input is escaped before being used in the regular expression.
    let escaped_key = escape(key);
    let pattern = format!(r"^property:{escaped_key}=(.*)$");
    let re = regex::new(&pattern).unwrap();
    re.captures(property)?.get(1).map(|m| m.as_str())
}

References

regex crate documentation: Untrusted patterns
Common Weakness Enumeration: CWE-730.
Common Weakness Enumeration: CWE-400.

rust/ql/src/queries/security/CWE-730/RegexInjection.ql

paldepind · 2025-03-07T13:02:44Z

rust/ql/lib/codeql/rust/security/regex/RegexInjectionQuery.qll

+ * A taint configuration for detecting regular expression injection vulnerabilities.
+ */
+module RegexInjectionConfig implements DataFlow::ConfigSig {
+  predicate isSource(DataFlow::Node source) { source instanceof ThreatModelSource }


I've not fully looked into this, but I would've though this should have been ActiveThreatModelSource, but that made the test results disappear.

It should indeed be ActiveThreatModelSource, but then your tests will have to use a remote rather than a local source. OR we could turn on local sources as active threat models for tests. I've thought about doing the latter in the past and I do have instructions for how to do it somewhere.

paldepind · 2025-03-07T13:03:38Z

rust/ql/lib/codeql/rust/frameworks/regex.model.yml

+      pack: codeql/rust-all
+      extensible: summaryModel
+    data:
+      - ["repo:https://github.com/rust-lang/regex:regex", "crate::escape", "Argument[0].Reference", "ReturnValue", "taint", "manual"]


This model is accurate, but I mostly added it to make sure that the barrier actually worked.

geoffw0

Looks great! 🎉

We will need to check if there are any results in a DCA run, and I'm very interested to get an idea of how many sinks there are (regardless of results) as an indication of how much regex usage we are seeing.

geoffw0 · 2025-03-07T16:24:40Z

rust/ql/src/queries/security/CWE-730/RegexInjection.qhelp

+<code>"property"</code> such as <code>"property:foo=bar"</code>. However, a
+malicious user might inject the regular expression <code>".*^|key"</code> and
+unexpectedly cause strings such as <code>"key=secret"</code> to match.
+</p>


Great example! I've added the ready-for-doc-review tag so this PR to attract a member of the docs team for further review of the .qhelp.

geoffw0 · 2025-03-07T16:26:11Z

rust/ql/lib/codeql/rust/security/regex/RegexInjectionExtensions.qll

+ */
+abstract class RegexInjectionBarrier extends DataFlow::Node { }
+
+/** A sink for `a` in `Regex::new(a)` when `a` is not a literal. */


Is there any benefit to defining this sink in QL rather than models-as-data?

geoffw0 · 2025-03-07T16:30:56Z

rust/ql/lib/codeql/rust/security/regex/RegexInjectionQuery.qll

+/**
+ * Provides a taint-tracking configuration for detecting regular expression
+ * injection vulnerabilities.
+ */


We're generally going for a simpler two-files-per-query setup in Rust:

lib/.../FooExtensions.qll src/.../Foo.ql

rather than three:

lib/.../FooExtensions.qll lib/.../FooQuery.qll [this file] src/.../Foo.ql

Before we had query inline expectations tests, the latter pattern was useful as the taint config in FooQuery.qll could be used to define a (non-query) inline expectations test. These days there's less benefit to not just defining the taint configuration in the .ql, which I like because there are fewer files to navigate between.

So I'd prefer we combined this file into the .ql, though I don't feel particularly strongly if you'd rather we worry about things more consistent with other queries later on.

geoffw0 · 2025-03-07T16:32:48Z

rust/ql/lib/codeql/rust/security/regex/RegexInjectionQuery.qll

+ * A taint configuration for detecting regular expression injection vulnerabilities.
+ */
+module RegexInjectionConfig implements DataFlow::ConfigSig {
+  predicate isSource(DataFlow::Node source) { source instanceof ThreatModelSource }


It should indeed be ActiveThreatModelSource, but then your tests will have to use a remote rather than a local source. OR we could turn on local sources as active threat models for tests. I've thought about doing the latter in the past and I do have instructions for how to do it somewhere.

geoffw0 · 2025-03-07T16:35:16Z

rust/ql/src/queries/security/CWE-730/RegexInjection.ql

+ * @id rust/regex-injection
+ * @tags security
+ *       external/cwe/cwe-730
+ *       external/cwe/cwe-400


I've put this query under CWE-730 as that's where the Swift one was. But given that the Rust regex implementation is robust regarding DOS attacks, I think we should find another directory for it?

How about:

CWE-624: Executable Regular Expression Error (which sounds right, but contains hints that it might be talking about something much more specific; sadly lacking examples)

a generic CWE like CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component or CWE-20: Improper Input Validation.

leave it how it is? As you noted in the .qhelp, a more limited DOS attack is still possible, and leaving it as CWE-730 is nicely consistent with other languages.

Note that if we change the CWE tags, we need to regenerate the @security-severity value as well. I can talk you through how to do that.

github-actions bot added documentation Rust Pull requests that update Rust code labels Mar 7, 2025

github-advanced-security bot found potential problems Mar 7, 2025

View reviewed changes

rust/ql/src/queries/security/CWE-730/RegexInjection.ql Fixed Show fixed Hide fixed

paldepind force-pushed the rust-regex-injection branch 3 times, most recently from 49b18d5 to 14722dc Compare March 7, 2025 11:18

Rust: Add regular expression injection query

494f914

paldepind force-pushed the rust-regex-injection branch from 14722dc to 494f914 Compare March 7, 2025 11:37

paldepind marked this pull request as ready for review March 7, 2025 12:59

paldepind commented Mar 7, 2025

View reviewed changes

paldepind requested a review from geoffw0 March 7, 2025 13:03

geoffw0 added the ready-for-doc-review This PR requires and is ready for review from the GitHub docs team. label Mar 7, 2025

geoffw0 reviewed Mar 7, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rust: Add regular expression injection query #18946

Rust: Add regular expression injection query #18946

paldepind commented Mar 7, 2025 •

edited

Loading

github-actions bot commented Mar 7, 2025

Regular expression injection

Recommendation

Example

References

paldepind Mar 7, 2025

geoffw0 Mar 7, 2025

paldepind Mar 7, 2025

geoffw0 left a comment

geoffw0 Mar 7, 2025

geoffw0 Mar 7, 2025

geoffw0 Mar 7, 2025

geoffw0 Mar 7, 2025

geoffw0 Mar 7, 2025

Rust: Add regular expression injection query #18946

Are you sure you want to change the base?

Rust: Add regular expression injection query #18946

Conversation

paldepind commented Mar 7, 2025 • edited Loading

github-actions bot commented Mar 7, 2025

Regular expression injection

Recommendation

Example

References

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

geoffw0 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

paldepind commented Mar 7, 2025 •

edited

Loading