Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Recording rules overrides #3973

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

alsoba13
Copy link
Contributor

@alsoba13 alsoba13 commented Mar 5, 2025

First approach to recording rules from overrides. This replaces recording rules from envs.

Missing some test fixes and I probably will need to add some more tests.

With this change, the config for recording rules would work like this:

compaction_worker:
  metrics_exporter:
    enabled: true
    rules_source:
      client_address: <address_to_tenant_settings>  # in the presence of this url, a RecordingRulesClient will be used to fetch rules

If no client_address was provided, then a static ruler will grab recording rules from limits. Static recording rules are defaulted to empty set. You can configure them globally:

limits:
  recording_rules:
  - metric_name: 'xxx'
    matchers: ['{__profile_type__="<profile_type>",env="prod"}']
    group_by: ['pod']
    external_labels:
    - name: 'foo'
      value: 'bar'

or you can define a tenant override:

overrides:
  "13":
    recording_rules:
    - metric_name: 'xxx'
      ...

You can configure them globally with limits flags as well: -compaction-worker.metrics-exporter.rules-source.static=json rules.

@alsoba13 alsoba13 requested a review from a team as a code owner March 5, 2025 13:55
@@ -16,19 +14,19 @@ type RecordingRule struct {
ExternalLabels labels.Labels
}

func NewRecordingRule(rule *settingsv1.RecordingRule) (*RecordingRule, error) {
func NewRecordingRule(metricName string, matchers []string, groupBy []string, externalLabels labels.Labels) (*RecordingRule, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opted to change this slightly so we can create/validate rules that come from tenant-settings api or config

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but in my experience that arguments in Go are lot harder to maintain than a struct (e.g. adding /renaming fields).

In this case all of the fields of the struct are public, so maybe just having a func (r *RecordingRule) WithMatchers([]string) error might be nicer to use any how.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was reverted after the change on 9e38ac9

@@ -185,6 +191,9 @@ func (l *Limits) RegisterFlags(f *flag.FlagSet) {
f.Var(&l.IngestionRelabelingDefaultRulesPosition, "distributor.ingestion-relabeling-default-rules-position", "Position of the default ingestion relabeling rules in relation to relabel rules from overrides. Valid values are 'first', 'last' or 'disabled'.")
_ = l.IngestionRelabelingRules.Set("[]")
f.Var(&l.IngestionRelabelingRules, "distributor.ingestion-relabeling-rules", "List of ingestion relabel configurations. The relabeling rules work the same way, as those of [Prometheus](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config). All rules are applied in the order they are specified. Note: In most situations, it is more effective to use relabeling directly in Grafana Alloy.")

_ = l.RecordingRules.Set("[]")
f.Var(&l.RecordingRules, "compaction-worker.metrics-exporter.rules-source.static", "List of static recording rules of the type settingsv1.RecordingRule. Will only be use in the absence of a recording rules client.")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flags for limits may be a bit awkward sometimes. While you can set a flag for -compaction-worker.metrics-exporter.rules-source.static="<json rules>", you can't do it through regular config file with compaction_worker.metrics_exporter.rules_source.static. I opted to use a name similar to our config though, for consistency (although through override file it looks different -> limits.recording_rules or overrides.tenant.recording_rules).

I accept naming suggestions if this is missleading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not too sure I understand, I suggest you also add a couple of tests for this, because that would make this a lot clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that limits flags are just flags, not config opts.

So using -compaction-worker.metrics-exporter.rules-source.static will work. But using this pyroscope.yml won't work:

compaction_worker:
  metrics_exporter:
    rules_source:
      static: ...

But same happens with other limits flags defined here. Using -distributor.ingestion-relabeling-rules will work. But not this:

...
distributor:
  ingestion_relabeling_rules:
...

line 6: field ingestion_relabeling_rules not found in type distributor.Config

Said that, I'm just asking for naming suggestions. I chose this very specific path compaction-worker.metrics-exporter.rules-source.static to be consistent with recording rules client source, but given that it can't be combined, compaction-worker.recording-rules could be enough.

I agree this lacks some testing. Working on them 😃

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are actually there, because we expose the limits object for all tenants under limits.:

# Taken from: https://admin-dev-us-central-0.grafana-dev.net/directory-phlare/fire-dev-001/distributor/api/v1/status/config
[...]
limits:
    ingestion_rate_mb: 4
    ingestion_burst_size_mb: 3
    ingestion_limit: null
    max_label_name_length: 128
    max_label_value_length: 1024
    max_label_names_per_series: 15
    max_sessions_per_series: 0
    enforce_labels_order: false
    max_profile_size_bytes: 4194304
    max_profile_stacktrace_samples: 16000
    max_profile_stacktrace_sample_labels: 100
    max_profile_stacktrace_depth: 1000
    max_profile_symbol_value_length: 65535
    distributor_usage_groups: null
    distributor_aggregation_window: 0s
    distributor_aggregation_period: 0s
    ingestion_relabeling_rules: []
    ingestion_relabeling_default_rules_position: first
    ingestion_tenant_shard_size: 3
    max_local_series_per_tenant: 0
    max_global_series_per_tenant: 8192
    max_query_lookback: 31d
    max_query_length: 1w
    max_query_parallelism: 8
    query_analysis_enabled: true
    query_analysis_series_enabled: false
    max_flamegraph_nodes_default: 8192
    max_flamegraph_nodes_max: 0
    store_gateway_tenant_shard_size: 3
    split_queries_by_interval: 12h
    compactor_blocks_retention_period: 0s
    compactor_split_and_merge_shards: 0
    compactor_split_and_merge_stage_size: 0
    compactor_split_groups: 1
    compactor_tenant_shard_size: 0
    compactor_partial_block_deletion_delay: 1d
    compactor_downsampler_enabled: true
    s3_sse_type: ""
    s3_sse_kms_key_id: ""
    s3_sse_kms_encryption_context: ""
    reject_older_than: 1h
    reject_newer_than: 10m
    write_path: segment-writer
    write_path_ingester_weight: 1
    write_path_segment_writer_weight: 0
    write_path_segment_writer_timeout: 5s
    async_ingest: false
    enable_query_backend: false
    enable_query_backend_from: 0001-01-01T00:00:00Z
    adaptive_placement_tenant_shards: 0
    adaptive_placement_default_dataset_shards: 1
    adaptive_placement_load_balancing: dynamic
    adaptive_placement_min_dataset_shards: 1
    adaptive_placement_max_dataset_shards: 32
    adaptive_placement_unit_size_bytes: 65536
    adaptive_placement_burst_window: 17m0s
    adaptive_placement_decay_window: 19m0s
[...]

But that is super confusing anyhow.

Copy link
Contributor Author

@alsoba13 alsoba13 Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, they are under limits, but with the limits name, not the flag name:

limits:
  recording_rules:
    ...

I just think flags for limits are useful but inconsistent in terms of naming.

Comment on lines 12 to 17
type RulesConfig struct {
MetricName string `yaml:"metric_name" json:"metric_name"`
Matchers []string `yaml:"matchers" json:"matchers"`
ExternalLabels []map[string]string `yaml:"external_labels" json:"external_labels"`
GroupBy []string `yaml:"group_by" json:"group_by"`
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I accept suggestions here as well. For parsing config, I opted to do the limits style (dto with yaml and json tags). But, if we do this way we can't use model.RecordingRule (no other model structs in that packages use `yaml), nor settingsv1.RecordingRule (as they don't include the yaml tag).

So I created this yet another rules representation, which can be fairly easy outdated and unaligned with other models.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see to variants:

  • You could first convert yaml to json and then use the existing mode with json tags.

  • Add the struct tags for yaml in model.RecordingRule

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with the first option: 9e38ac9

@alsoba13 alsoba13 marked this pull request as draft March 5, 2025 14:05
@alsoba13 alsoba13 force-pushed the alsoba13/recording-rules-config branch from 5cdc427 to 9e38ac9 Compare March 6, 2025 12:04
@alsoba13 alsoba13 force-pushed the alsoba13/recording-rules-config branch from de79ca9 to 8a5269f Compare March 6, 2025 16:06
@alsoba13 alsoba13 marked this pull request as ready for review March 6, 2025 16:09
@alsoba13
Copy link
Contributor Author

alsoba13 commented Mar 7, 2025

fyi, in last commit I moved all the flag logic under v2 umbrella, with the other v2 flags, so it's hidden in --help and so
4457397

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants