Adding the option to disable the DNS processor failure or success cache #44932

Open
wants to merge 9 commits into
base: main
from

mjmbischoff
Contributor

@mjmbischoff mjmbischoff commented Jun 19, 2025

Proposed commit message

Adds the option to disable the success and failure cache.

Motivation

This enables use cases that need the DNS record as it exists at the current point in time, regardless of the cache or the record's TTL. Examples include monitoring the DNS server itself, or recording events that must capture the current state of the environment. The TTL defines the window during which a cached value may be used in place of the current DNS record; in other words, within that window the agent may observe either the old or the new record, depending on when the previous request was made. This unpredictability can be undesirable when optimizing time-to-intervention.

Disabling the cache has throughput implications: events are processed serially, so the time to process each event will be at least the DNS round-trip time. For example, if the round-trip time of a DNS request is 1 ms, maximum throughput is limited to 1000 events/sec. Known use cases have low throughput requirements. Parallelization, for example by deploying multiple agents, can be used to stretch this number. We would urge reevaluating the use case and the use of the cache at that point.

NOTE: Setting the TTL on the failure cache to 1ns achieves a similar but imperfect effect.
NOTE: Setting the TTL on the success cache is a valid option according to the code; it is, however, ignored, as is also documented in the code, and it is omitted as an option in the documentation. Honoring the setting together with the record TTL (min(ttl, dns_record_ttl)) would be a different route, similar to the behaviour of other DNS clients.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

None known. The default values leave the old behavior intact, and the setting that triggers the new behavior is added in this PR.

How to test this PR locally

Define the DNS processor, observe cache stats / resolver requests.

Related issues

@mjmbischoff mjmbischoff requested a review from a team as a code owner June 19, 2025 14:20
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jun 19, 2025
botelastic bot commented Jun 19, 2025

This pull request doesn't have a Team:<team> label.

Contributor

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

Contributor

mergify bot commented Jun 19, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @mjmbischoff? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@mjmbischoff mjmbischoff changed the title Adding the option to disable the DNS failure or success cache Adding the option to disable the DNS processor failure or success cache Jun 19, 2025
- QF1008, while I disagree with removing the additional qualification as it makes things more readable, removing the qualifier to appease the linter god.
Member

@andrewkroh andrewkroh left a comment

Can you please add to the proposed commit message to explain the why part and what is the use case for cache disablement.

- WHY: the rationale/motivation for the changes

Turning off the caching will significantly limit the throughput of the pipeline. Even if each request takes 1ms to complete, that means the maximum throughput is 1000 EPS.

Also, the documentation for the processor will need updated to include the new configuration parameter.

Contributor Author

mjmbischoff commented Jun 21, 2025

Can you please add to the proposed commit message to explain the why part and what is the use case for cache disablement.

- WHY: the rationale/motivation for the changes

Turning off the caching will significantly limit the throughput of the pipeline. Even if each request takes 1ms to complete, that means the maximum throughput is 1000 EPS.

Also, the documentation for the processor will need updated to include the new configuration parameter.

Added motivation.

TODO: documentation Added documentation c5de66a ab103b2 e1f60c9

- document Enabled settings
- Notes with warnings on throughput and compounding effects
Labels
enhancement needs_team Indicates that the issue/PR needs a Team:* label
2 participants