Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust: Add regular expression injection query #18946

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

paldepind
Copy link
Contributor

@paldepind paldepind commented Mar 7, 2025

Note:

  • I've put this query under CWE-730 as that's where the Swift one was. But given that the Rust regex implementation is robust regarding DOS attacks, I think we should find another directory for it?

@github-actions github-actions bot added documentation Rust Pull requests that update Rust code labels Mar 7, 2025
Copy link
Contributor

github-actions bot commented Mar 7, 2025

QHelp previews:

rust/ql/src/queries/security/CWE-730/RegexInjection.qhelp

Regular expression injection

Constructing a regular expression with unsanitized user input can be dangerous. A malicious user may be able to modify the meaning of the expression causing it to match unexpected strings and to construct large regular expressions by using counted repetitions.

Recommendation

Before embedding user input into a regular expression, escape the input string using a function such as regex::escape to escape meta-characters that have special meaning.

If purposefully supporting user supplied regular expressions, then use RegexBuilder::size_limit to limit the pattern size such that it is no larger than necessary.

Example

The following example construct a regular expressions from the user input key without escaping it first.

use regex::Regex;

fn get_value<'h>(key: &str, property: &'h str) -> Option<&'h str> {
    // BAD: User provided `key` is interpolated into the regular expression.
    let pattern = format!(r"^property:{key}=(.*)$");
    let re = Regex::new(&pattern).unwrap();
    re.captures(property)?.get(1).map(|m| m.as_str())
}

The regular expression is intended to match strings starting with "property" such as "property:foo=bar". However, a malicious user might inject the regular expression ".*^|key" and unexpectedly cause strings such as "key=secret" to match.

If user input is used to construct a regular expression it should be escaped first. This ensures that the user cannot insert characters that have special meanings in regular expressions.

use regex::{escape, Regex};

fn get_value<'h>(key: &str, property: &'h str) -> option<&'h str> {
    // GOOD: User input is escaped before being used in the regular expression.
    let escaped_key = escape(key);
    let pattern = format!(r"^property:{escaped_key}=(.*)$");
    let re = regex::new(&pattern).unwrap();
    re.captures(property)?.get(1).map(|m| m.as_str())
}

References

@paldepind paldepind force-pushed the rust-regex-injection branch 3 times, most recently from 49b18d5 to 14722dc Compare March 7, 2025 11:18
@paldepind paldepind force-pushed the rust-regex-injection branch from 14722dc to 494f914 Compare March 7, 2025 11:37
@paldepind paldepind marked this pull request as ready for review March 7, 2025 12:59
* A taint configuration for detecting regular expression injection vulnerabilities.
*/
module RegexInjectionConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { source instanceof ThreatModelSource }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not fully looked into this, but I would've though this should have been ActiveThreatModelSource, but that made the test results disappear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should indeed be ActiveThreatModelSource, but then your tests will have to use a remote rather than a local source. OR we could turn on local sources as active threat models for tests. I've thought about doing the latter in the past and I do have instructions for how to do it somewhere.

pack: codeql/rust-all
extensible: summaryModel
data:
- ["repo:https://github.com/rust-lang/regex:regex", "crate::escape", "Argument[0].Reference", "ReturnValue", "taint", "manual"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This model is accurate, but I mostly added it to make sure that the barrier actually worked.

@paldepind paldepind requested a review from geoffw0 March 7, 2025 13:03
@geoffw0 geoffw0 added the ready-for-doc-review This PR requires and is ready for review from the GitHub docs team. label Mar 7, 2025
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! 🎉

We will need to check if there are any results in a DCA run, and I'm very interested to get an idea of how many sinks there are (regardless of results) as an indication of how much regex usage we are seeing.

<code>"property"</code> such as <code>"property:foo=bar"</code>. However, a
malicious user might inject the regular expression <code>".*^|key"</code> and
unexpectedly cause strings such as <code>"key=secret"</code> to match.
</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great example! I've added the ready-for-doc-review tag so this PR to attract a member of the docs team for further review of the .qhelp.

*/
abstract class RegexInjectionBarrier extends DataFlow::Node { }

/** A sink for `a` in `Regex::new(a)` when `a` is not a literal. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any benefit to defining this sink in QL rather than models-as-data?

/**
* Provides a taint-tracking configuration for detecting regular expression
* injection vulnerabilities.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're generally going for a simpler two-files-per-query setup in Rust:

lib/.../FooExtensions.qll
src/.../Foo.ql

rather than three:

lib/.../FooExtensions.qll
lib/.../FooQuery.qll [this file]
src/.../Foo.ql

Before we had query inline expectations tests, the latter pattern was useful as the taint config in FooQuery.qll could be used to define a (non-query) inline expectations test. These days there's less benefit to not just defining the taint configuration in the .ql, which I like because there are fewer files to navigate between.

So I'd prefer we combined this file into the .ql, though I don't feel particularly strongly if you'd rather we worry about things more consistent with other queries later on.

* A taint configuration for detecting regular expression injection vulnerabilities.
*/
module RegexInjectionConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { source instanceof ThreatModelSource }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should indeed be ActiveThreatModelSource, but then your tests will have to use a remote rather than a local source. OR we could turn on local sources as active threat models for tests. I've thought about doing the latter in the past and I do have instructions for how to do it somewhere.

* @id rust/regex-injection
* @tags security
* external/cwe/cwe-730
* external/cwe/cwe-400
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've put this query under CWE-730 as that's where the Swift one was. But given that the Rust regex implementation is robust regarding DOS attacks, I think we should find another directory for it?

How about:

Note that if we change the CWE tags, we need to regenerate the @security-severity value as well. I can talk you through how to do that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation ready-for-doc-review This PR requires and is ready for review from the GitHub docs team. Rust Pull requests that update Rust code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants