
[EPIC] Metric Stats #3147

Closed
40 of 43 tasks
Dav1dde opened this issue Feb 20, 2024 · 4 comments

Dav1dde commented Feb 20, 2024

Description

We need outcomes for DDM / custom metrics.

Outcomes for metrics are a tricky problem for several reasons. Metrics are aggregated in multiple stages (SDKs, customer Relays, PoP Relays, and processing Relays), which makes reporting an accurate volume hard. On top of that, volume is only a small scaling factor we have to consider; a much bigger factor is the cardinality of a metric.

Ideally, outcomes capture both the volume and the cardinality of a metric.

Outcomes should tell us the volume of a single metric (defined by its MRI) and its cardinality per hour.

Our current outcomes cannot capture this information; we need a new mechanism to collect metric outcomes.

Requirements

  • Indefinite (?) retention
  • Fast enough to query for billing purposes
  • Volume and cardinality need to be represented
  • Sentry UI needs access to show the data to the user
  • Billing needs access for billing
  • Bizops needs access to process the data with their own pipelines (can they query Snuba instead of reading the topic?)

Why not use Outcomes?

  • Outcomes currently don't provide a way to group by metric name (incl. metric type and namespace).
  • Cardinality needs to be max()-aggregated, not sum()'ed.
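The second point can be illustrated with a short sketch (hypothetical numbers): two Relays report the hourly cardinality of the same metric, and their observed series largely overlap, so summing the reports double-counts while taking the maximum gives a usable aggregate.

```python
# Sketch: why cardinality must be max()-aggregated, not sum()'ed.
# Hypothetical per-Relay cardinality reports for the same metric and hour;
# both Relays saw mostly the same series.
reports = [950, 1000]

summed = sum(reports)       # 1950 -- overcounts the overlapping series
aggregated = max(reports)   # 1000 -- the meaningful cardinality aggregate
```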

Quantity / Volume

We want to determine the volume of metrics received by the first layer of our infrastructure (PoP Relays). Client-side aggregated metrics are counted with a quantity of 1.

For example: if Relay receives 500 statsd items for a single metric in an hour, this metric is considered to have a volume/quantity of 500 metrics per hour.
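The counting rule above can be sketched as follows. This is an illustration, not Relay's implementation; the `count_volume` helper and the sample MRI are made up for the example.

```python
from collections import Counter

# Sketch: counting volume per MRI at the first infrastructure layer.
# Every incoming statsd item counts as 1, regardless of how many data
# points the SDK aggregated into it client-side.
def count_volume(items):
    """items: iterable of (mri, payload) pairs received within one hour."""
    volume = Counter()
    for mri, _payload in items:
        volume[mri] += 1
    return volume

# 500 statsd items for a single metric -> volume 500 for that hour.
hourly = count_volume([("d:custom/foo@none", "42|d")] * 500)
```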

Cardinality

We are interested in the cardinality of a single metric (MRI) per hour.

Cardinality can be queried from storage or collected through the Relay cardinality limiter.
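To make "cardinality of a single metric per hour" concrete, here is a sketch that counts distinct series (tag combinations) per MRI with in-memory sets. Relay's actual cardinality limiter works differently (and the alternative is querying storage); the helper name and data shapes are assumptions for illustration only.

```python
# Sketch: per-MRI cardinality for one hour = number of distinct series,
# where a series is a unique combination of tags for that metric.
def hourly_cardinality(buckets):
    """buckets: iterable of (mri, tags) pairs, tags as a frozenset of items."""
    seen = {}  # mri -> set of distinct tag combinations (series)
    for mri, tags in buckets:
        seen.setdefault(mri, set()).add(tags)
    return {mri: len(series) for mri, series in seen.items()}

buckets = [
    ("d:custom/foo@none", frozenset({("env", "prod")})),
    ("d:custom/foo@none", frozenset({("env", "dev")})),
    ("d:custom/foo@none", frozenset({("env", "prod")})),  # duplicate series
]
# -> cardinality 2 for d:custom/foo@none
```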

Metric Stats Namespaces

Volume: c:metric_stats/volume@none
Tags:

  • mri: Metric Name/MRI: <type>:<namespace>/<name>[@<unit>]
  • mri.type: Metric Type
  • mri.namespace: Metric namespace (extracted from the MRI)
  • outcome.id: Outcome ID; metric outcomes share the same numeric outcome IDs as regular outcomes.
  • outcome.reason: Optional, machine-readable, free-form reason code.
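Deriving the mri.type and mri.namespace tags from an MRI of the form <type>:<namespace>/<name>[@<unit>] can be sketched as below. The regex is an assumption based on the format shown in this issue, not Relay's actual MRI parser.

```python
import re

# Assumed MRI shape: <type>:<namespace>/<name>[@<unit>]
MRI_RE = re.compile(
    r"^(?P<type>[a-z]):(?P<namespace>[a-z_]+)/(?P<name>[^@]+)(?:@(?P<unit>.+))?$"
)

def mri_tags(mri):
    """Split an MRI into the mri.* tag values used by metric stats."""
    m = MRI_RE.match(mri)
    if m is None:
        raise ValueError(f"invalid MRI: {mri}")
    return {
        "mri": mri,
        "mri.type": m["type"],
        "mri.namespace": m["namespace"],
    }

tags = mri_tags("c:metric_stats/volume@none")  # type "c", namespace "metric_stats"
```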

Cardinality: g:metric_stats/cardinality@none

  • mri: Metric Name/MRI: <type>:<namespace>/<name>[@<unit>]
  • mri.type: Metric Type
  • mri.namespace: Metric namespace (extracted from the MRI)
  • cardinality.limit: The ID of the cardinality limit that generated this report.
  • cardinality.window: Cardinality window size in seconds.
  • cardinality.scope: Cardinality scope (name, project, organization).

If cardinality is tracked per project or organization, the mri* tags will not be present.

Examples

Payloads/Metrics emitted from Relay into the generic metrics topic.

Volume - Accepted
{
   "org_id": 0,
   "project_id": 42,
   "name": "c:metric_stats/volume@none",
   "type": "c",
   "value": 2.0,
   "timestamp": 1712223931,
   "tags": {
      "mri": "d:custom/foo@none",
      "mri.type": "d",
      "mri.namespace": "custom",
      "outcome.id": "0"
   },
   "retention_days": 90
}
Cardinality by Name
{
   "org_id": 0,
   "project_id": 42,
   "name": "g:metric_stats/cardinality@none",
   "type": "g",
   "value": {
      "last": 2.0,
      "min": 2.0,
      "max": 2.0,
      "sum": 2.0,
      "count": 1
   },
   "timestamp": 1712223931,
   "tags": {
      "mri": "s:custom/bar@none",
      "mri.type": "s",
      "mri.namespace": "custom",
      "cardinality.limit": "custom-limit-with-some-id",
      "cardinality.scope": "name",
      "cardinality.window": "3600"
   },
   "retention_days": 90
}
Cardinality by Type
{
   "org_id": 0,
   "project_id": 42,
   "name": "g:metric_stats/cardinality@none",
   "type": "g",
   "value": {
      "last": 2.0,
      "min": 2.0,
      "max": 2.0,
      "sum": 2.0,
      "count": 1
   },
   "timestamp": 1712223931,
   "tags": {
      "mri.type": "s",
      "mri.namespace": "custom",
      "cardinality.limit": "custom-limit-with-some-id",
      "cardinality.scope": "type",
      "cardinality.window": "3600"
   },
   "retention_days": 90
}

Milestone 1 ✅ - Volume / Happy Path

Implement the happy path ("accepted") for volume metric stats: c:metric_stats/volume@none.

Milestone 2 ✅ - Cardinality / Happy Path

Implement the happy path ("accepted") for cardinality metric stats: g:metric_stats/cardinality@none.


Milestone 2.5 ✅ - Cardinality by Minute and Project


Milestone 3 ✅ - Negative Outcomes

Milestone 4 - Finishing Touches

mcannizz commented

@Dav1dde I am wondering if this work is still in flight, and whether we consider that work to be P0. It seems like the remaining tasks are lower priority. If you agree, could you please move the remaining stuff to a lower priority epic and close this one?


Dav1dde commented May 16, 2024

I would like to keep the Epic open. We are at a stage where we provide all the data that is needed now, but not at a stage where this data will stay correct with upcoming features.
E.g. currently metric_stats does not interact well with user-defined cardinality limits, which the product is already planning.

Basically, at this point we have implemented the minimum set everyone else needs, but the remaining legwork in Relay is just as important; it's simply not visible to anyone outside of ingest.

That being said, if moving the remaining tasks to a new Epic makes things easier for you, I am not opposed to it.

mcannizz commented

@Dav1dde makes sense, thanks. I'm going to leave this epic as is but reduce its priority, since it sounds like we've completed the P0 work. Feel free to adjust as you see fit.


Dav1dde commented Jun 3, 2024

Closing this; we got all the important bits in place, and with the current restructuring of the metrics product, the remaining pieces are not as important anymore.
