Commit 3805959

ref(metrics): Use MRI for metrics extraction [INGEST-939] (#1215)
Introduces Metric Resource Identifiers (MRI) for metrics extraction from sessions and transactions. MRIs follow three core principles:

1. **Robustness:** Metrics must be addressed via a stable identifier. During ingestion in Relay and Snuba, metrics are preaggregated and bucketed based on this identifier, so it cannot change over time without breaking bucketing.
2. **Uniqueness:** The identifier for metrics must be unique across variations of units and metric types, within and across use cases, as well as between projects and organizations.
3. **Abstraction:** The user-facing product changes its terminology over time and splits concepts into smaller parts. The internal metric identifiers must abstract from that and offer sufficient granularity to allow for such changes.

MRIs have the format `<type>:<ns>/<name>@<unit>`, comprising the following components:

- **Type:** counter (`c`), set (`s`), distribution (`d`), gauge (`g`), and evaluated (`e`) for derived numeric metrics.
- **Namespace:** Identifies the product entity and use case affiliation of the metric.
- **Name:** The display name of the metric in the allowed character set.
- **Unit:** The verbatim unit name.

Namespaces identify the product entity that the metric was extracted from, and/or the use case that the metric belongs to. These namespaces **cannot** be defined freely; instead, they are defined by Sentry. Over time, there should be more namespaces.

Some namespaces can be considered internal. These would not be allowed on the public ingestion endpoints, and would not be exposed via the user-facing Metrics API by default. Relay obtains a form of ACL (access control list) that determines which namespaces can be ingested.

The initial namespaces are:

- Performance monitoring (**public**): `transactions`
- Errors (**public**): `errors`
- Issues (_internal_): `issues`
- Release Health (**public**): `sessions`
- Alerting (_internal_): `alerts`
- Custom user-defined metrics (**public**): `custom`
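To make the `<type>:<ns>/<name>@<unit>` format concrete, here is a minimal sketch of how such an identifier could be split back into its components and checked against the public namespaces listed above. The names `ParsedMri`, `parse_mri`, and `is_public_namespace` are hypothetical and not part of Relay. The sketch assumes that the unit may be empty and that a hard-coded list is an acceptable stand-in for the ACL mentioned above.

```rust
/// Components of an MRI string. Hypothetical type for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub struct ParsedMri<'a> {
    pub ty: &'a str,
    pub namespace: &'a str,
    pub name: &'a str,
    pub unit: &'a str,
}

/// Splits `<type>:<ns>/<name>@<unit>` into its four components.
///
/// Returns `None` if a separator is missing, the type is not one of
/// `c`, `s`, `d`, `g`, `e`, or the namespace or name is empty.
pub fn parse_mri(mri: &str) -> Option<ParsedMri<'_>> {
    let (ty, rest) = mri.split_once(':')?;
    let (namespace, rest) = rest.split_once('/')?;
    // Split on the last '@' so that everything after it is the unit,
    // which may be empty (assumption: unitless metrics end in '@').
    let (name, unit) = rest.rsplit_once('@')?;

    let known_type = matches!(ty, "c" | "s" | "d" | "g" | "e");
    if !known_type || namespace.is_empty() || name.is_empty() {
        return None;
    }

    Some(ParsedMri { ty, namespace, name, unit })
}

/// Returns `true` for the namespaces marked public in the commit message.
/// A real ACL would likely be configuration-driven rather than hard-coded.
pub fn is_public_namespace(namespace: &str) -> bool {
    matches!(namespace, "transactions" | "errors" | "sessions" | "custom")
}
```

With these helpers, `parse_mri("d:sessions/duration@s")` yields the type `d`, namespace `sessions`, name `duration`, and unit `s`, while an old-style name such as `sentry.sessions.session` is rejected because it has no `:` separator.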
1 parent 9ad1c1d commit 3805959

5 files changed, +182 -171 lines changed

CHANGELOG.md

+1
@@ -10,6 +10,7 @@

 - Remove unused item types. ([#1211](https://github.com/getsentry/relay/pull/1211))
 - Pin click dependency in requirements-dev.txt. ([#1214](https://github.com/getsentry/relay/pull/1214))
+- Use fully qualified metric resource identifiers (MRI) for metrics ingestion. For example, the sessions duration is now called `d:sessions/duration@s`. ([#1215](https://github.com/getsentry/relay/pull/1215))

 ## 22.3.0

relay-metrics/src/protocol.rs

+21
@@ -407,6 +407,27 @@ pub struct Metric {
 }

 impl Metric {
+    /// Creates a new metric using the MRI naming format.
+    ///
+    /// MRI is the metric resource identifier in the format `<type>:<ns>/<name>@<unit>`. This name
+    /// ensures that just the name determines correct bucketing of metrics with name collisions.
+    pub fn new_mri(
+        namespace: impl fmt::Display,
+        name: impl fmt::Display,
+        unit: MetricUnit,
+        value: MetricValue,
+        timestamp: UnixTimestamp,
+        tags: BTreeMap<String, String>,
+    ) -> Self {
+        Self {
+            name: format!("{}:{}/{}@{}", value.ty(), namespace, name, unit),
+            unit,
+            value,
+            timestamp,
+            tags,
+        }
+    }
+
     fn parse_str(string: &str, timestamp: UnixTimestamp) -> Option<Self> {
         let mut components = string.split('|');
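The name construction in `new_mri` relies on the `Display` output of the metric type and unit. The following is a minimal, self-contained sketch of that idea using stand-in types. `SketchType`, `SketchUnit`, and `mri_name` are hypothetical and much simpler than Relay's `MetricValue` and `MetricUnit`. The sketch assumes that a missing unit formats as an empty string and seconds as `s`, which matches the names asserted in the tests of this commit.

```rust
use std::fmt;

/// Stand-in for the metric type. Relay derives this from the metric value.
#[derive(Clone, Copy, Debug)]
pub enum SketchType {
    Counter,
    Set,
    Distribution,
}

impl fmt::Display for SketchType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            SketchType::Counter => "c",
            SketchType::Set => "s",
            SketchType::Distribution => "d",
        })
    }
}

/// Stand-in for the metric unit, reduced to the two cases used by sessions.
#[derive(Clone, Copy, Debug)]
pub enum SketchUnit {
    None,
    Second,
}

impl fmt::Display for SketchUnit {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            // Assumption: no unit renders as an empty string.
            SketchUnit::None => "",
            SketchUnit::Second => "s",
        })
    }
}

/// Builds the MRI string with the same format string as `new_mri`.
pub fn mri_name(
    ty: SketchType,
    namespace: impl fmt::Display,
    name: impl fmt::Display,
    unit: SketchUnit,
) -> String {
    format!("{}:{}/{}@{}", ty, namespace, name, unit)
}
```

Because the type and unit are part of the name, two metrics that share a display name but differ in type or unit end up in different buckets. This is also why unitless metrics carry a trailing `@`, as in `c:sessions/session@`.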

relay-server/src/metrics_extraction/sessions.rs

+88 -83
@@ -18,11 +18,7 @@ fn nil_to_none(distinct_id: Option<&String>) -> Option<&String> {
     Some(distinct_id)
 }

-/// Generate a sessions-related metric name
-/// Would be nice to have this as a `const fn`, but [`Metric`] requires a [`String`] anyway.
-fn metric_name(name: &str) -> String {
-    format!("sentry.sessions.{}", name)
-}
+const METRIC_NAMESPACE: &str = "sessions";

 pub fn extract_session_metrics<T: SessionLike>(
     attributes: &SessionAttributes,
@@ -46,110 +42,119 @@ pub fn extract_session_metrics<T: SessionLike>(
     // Always capture with "init" tag for the first session update of a session. This is used
     // for adoption and as baseline for crash rates.
     if session.total_count() > 0 {
-        target.push(Metric {
-            name: metric_name("session"),
-            unit: MetricUnit::None,
-            value: MetricValue::Counter(session.total_count() as f64),
+        target.push(Metric::new_mri(
+            METRIC_NAMESPACE,
+            "session",
+            MetricUnit::None,
+            MetricValue::Counter(session.total_count() as f64),
             timestamp,
-            tags: with_tag(&tags, "session.status", "init"),
-        });
+            with_tag(&tags, "session.status", "init"),
+        ));

         if let Some(distinct_id) = nil_to_none(session.distinct_id()) {
-            target.push(Metric {
-                name: metric_name("user"),
-                unit: MetricUnit::None,
-                value: MetricValue::set_from_str(distinct_id),
+            target.push(Metric::new_mri(
+                METRIC_NAMESPACE,
+                "user",
+                MetricUnit::None,
+                MetricValue::set_from_str(distinct_id),
                 timestamp,
-                tags: with_tag(&tags, "session.status", "init"),
-            });
+                with_tag(&tags, "session.status", "init"),
+            ));
         }
     }

     // Mark the session as errored, which includes fatal sessions.
     if let Some(errors) = session.errors() {
         target.push(match errors {
-            SessionErrored::Individual(session_id) => Metric {
-                name: metric_name("session.error"),
-                unit: MetricUnit::None,
-                value: MetricValue::set_from_display(session_id),
+            SessionErrored::Individual(session_id) => Metric::new_mri(
+                METRIC_NAMESPACE,
+                "error",
+                MetricUnit::None,
+                MetricValue::set_from_display(session_id),
                 timestamp,
-                tags: tags.clone(),
-            },
-            SessionErrored::Aggregated(count) => Metric {
-                name: metric_name("session"),
-                unit: MetricUnit::None,
-                value: MetricValue::Counter(count as f64),
+                tags.clone(),
+            ),
+            SessionErrored::Aggregated(count) => Metric::new_mri(
+                METRIC_NAMESPACE,
+                "session",
+                MetricUnit::None,
+                MetricValue::Counter(count as f64),
                 timestamp,
-                tags: with_tag(&tags, "session.status", "errored_preaggr"),
-            },
+                with_tag(&tags, "session.status", "errored_preaggr"),
+            ),
         });

         if let Some(distinct_id) = nil_to_none(session.distinct_id()) {
-            target.push(Metric {
-                name: metric_name("user"),
-                unit: MetricUnit::None,
-                value: MetricValue::set_from_str(distinct_id),
+            target.push(Metric::new_mri(
+                METRIC_NAMESPACE,
+                "user",
+                MetricUnit::None,
+                MetricValue::set_from_str(distinct_id),
                 timestamp,
-                tags: with_tag(&tags, "session.status", "errored"),
-            });
+                with_tag(&tags, "session.status", "errored"),
+            ));
         }
     }

     // Record fatal sessions for crash rate computation. This is a strict subset of errored
     // sessions above.
     if session.abnormal_count() > 0 {
-        target.push(Metric {
-            name: metric_name("session"),
-            unit: MetricUnit::None,
-            value: MetricValue::Counter(session.abnormal_count() as f64),
+        target.push(Metric::new_mri(
+            METRIC_NAMESPACE,
+            "session",
+            MetricUnit::None,
+            MetricValue::Counter(session.abnormal_count() as f64),
             timestamp,
-            tags: with_tag(&tags, "session.status", SessionStatus::Abnormal),
-        });
+            with_tag(&tags, "session.status", SessionStatus::Abnormal),
+        ));

         if let Some(distinct_id) = nil_to_none(session.distinct_id()) {
-            target.push(Metric {
-                name: metric_name("user"),
-                unit: MetricUnit::None,
-                value: MetricValue::set_from_str(distinct_id),
+            target.push(Metric::new_mri(
+                METRIC_NAMESPACE,
+                "user",
+                MetricUnit::None,
+                MetricValue::set_from_str(distinct_id),
                 timestamp,
-                tags: with_tag(&tags, "session.status", SessionStatus::Abnormal),
-            });
+                with_tag(&tags, "session.status", SessionStatus::Abnormal),
+            ));
         }
     }
+
     if session.crashed_count() > 0 {
-        target.push(Metric {
-            name: metric_name("session"),
-            unit: MetricUnit::None,
-            value: MetricValue::Counter(session.crashed_count() as f64),
+        target.push(Metric::new_mri(
+            METRIC_NAMESPACE,
+            "session",
+            MetricUnit::None,
+            MetricValue::Counter(session.crashed_count() as f64),
             timestamp,
-            tags: with_tag(&tags, "session.status", SessionStatus::Crashed),
-        });
+            with_tag(&tags, "session.status", SessionStatus::Crashed),
+        ));

         if let Some(distinct_id) = nil_to_none(session.distinct_id()) {
-            target.push(Metric {
-                name: metric_name("user"),
-                unit: MetricUnit::None,
-                value: MetricValue::set_from_str(distinct_id),
+            target.push(Metric::new_mri(
+                METRIC_NAMESPACE,
+                "user",
+                MetricUnit::None,
+                MetricValue::set_from_str(distinct_id),
                 timestamp,
-                tags: with_tag(&tags, "session.status", SessionStatus::Crashed),
-            });
+                with_tag(&tags, "session.status", SessionStatus::Crashed),
+            ));
         }
     }

     // Count durations for all exited/crashed sessions. Note that right now, in the product we
     // really only use durations from session.status=exited, but decided it may be worth ingesting
     // this data in case we need it. If we need to cut cost, this is one place to start though.
-    // if session.status.is_terminal() {
     if let Some((duration, status)) = session.final_duration() {
-        target.push(Metric {
-            name: metric_name("session.duration"),
-            unit: MetricUnit::Duration(DurationPrecision::Second),
-            value: MetricValue::Distribution(duration),
+        target.push(Metric::new_mri(
+            METRIC_NAMESPACE,
+            "duration",
+            MetricUnit::Duration(DurationPrecision::Second),
+            MetricValue::Distribution(duration),
             timestamp,
-            tags: with_tag(&tags, "session.status", status),
-        });
+            with_tag(&tags, "session.status", status),
+        ));
     }
-    // }
 }

 #[cfg(test)]
@@ -210,14 +215,14 @@ mod tests {

         let session_metric = &metrics[0];
         assert_eq!(session_metric.timestamp, started());
-        assert_eq!(session_metric.name, "sentry.sessions.session");
+        assert_eq!(session_metric.name, "c:sessions/session@");
         assert!(matches!(session_metric.value, MetricValue::Counter(_)));
         assert_eq!(session_metric.tags["session.status"], "init");
         assert_eq!(session_metric.tags["release"], "1.0.0");

         let user_metric = &metrics[1];
         assert_eq!(session_metric.timestamp, started());
-        assert_eq!(user_metric.name, "sentry.sessions.user");
+        assert_eq!(user_metric.name, "s:sessions/user@");
         assert!(matches!(user_metric.value, MetricValue::Set(_)));
         assert_eq!(session_metric.tags["session.status"], "init");
         assert_eq!(user_metric.tags["release"], "1.0.0");
@@ -281,13 +286,13 @@ mod tests {

         let session_metric = &metrics[expected_metrics - 2];
         assert_eq!(session_metric.timestamp, started());
-        assert_eq!(session_metric.name, "sentry.sessions.session.error");
+        assert_eq!(session_metric.name, "s:sessions/error@");
         assert!(matches!(session_metric.value, MetricValue::Set(_)));
         assert_eq!(session_metric.tags.len(), 1); // Only the release tag

         let user_metric = &metrics[expected_metrics - 1];
         assert_eq!(session_metric.timestamp, started());
-        assert_eq!(user_metric.name, "sentry.sessions.user");
+        assert_eq!(user_metric.name, "s:sessions/user@");
         assert!(matches!(user_metric.value, MetricValue::Set(_)));
         assert_eq!(user_metric.tags["session.status"], "errored");
         assert_eq!(user_metric.tags["release"], "1.0.0");
@@ -317,19 +322,19 @@ mod tests {

         assert_eq!(metrics.len(), 4);

-        assert_eq!(metrics[0].name, "sentry.sessions.session.error");
-        assert_eq!(metrics[1].name, "sentry.sessions.user");
+        assert_eq!(metrics[0].name, "s:sessions/error@");
+        assert_eq!(metrics[1].name, "s:sessions/user@");
         assert_eq!(metrics[1].tags["session.status"], "errored");

         let session_metric = &metrics[2];
         assert_eq!(session_metric.timestamp, started());
-        assert_eq!(session_metric.name, "sentry.sessions.session");
+        assert_eq!(session_metric.name, "c:sessions/session@");
         assert!(matches!(session_metric.value, MetricValue::Counter(_)));
         assert_eq!(session_metric.tags["session.status"], status.to_string());

         let user_metric = &metrics[3];
         assert_eq!(session_metric.timestamp, started());
-        assert_eq!(user_metric.name, "sentry.sessions.user");
+        assert_eq!(user_metric.name, "s:sessions/user@");
         assert!(matches!(user_metric.value, MetricValue::Set(_)));
         assert_eq!(user_metric.tags["session.status"], status.to_string());
     }
@@ -359,7 +364,7 @@ mod tests {
         assert_eq!(metrics.len(), 1);

         let duration_metric = &metrics[0];
-        assert_eq!(duration_metric.name, "sentry.sessions.session.duration");
+        assert_eq!(duration_metric.name, "d:sessions/duration@s");
         assert!(matches!(
             duration_metric.value,
             MetricValue::Distribution(_)
@@ -402,7 +407,7 @@ mod tests {
         insta::assert_debug_snapshot!(metrics, @r###"
         [
             Metric {
-                name: "sentry.sessions.session",
+                name: "c:sessions/session@",
                 unit: None,
                 value: Counter(
                     135.0,
@@ -415,7 +420,7 @@ mod tests {
                 },
             },
             Metric {
-                name: "sentry.sessions.session",
+                name: "c:sessions/session@",
                 unit: None,
                 value: Counter(
                     5.0,
@@ -428,7 +433,7 @@ mod tests {
                 },
             },
             Metric {
-                name: "sentry.sessions.session",
+                name: "c:sessions/session@",
                 unit: None,
                 value: Counter(
                     7.0,
@@ -441,7 +446,7 @@ mod tests {
                 },
             },
             Metric {
-                name: "sentry.sessions.session",
+                name: "c:sessions/session@",
                 unit: None,
                 value: Counter(
                     15.0,
@@ -454,7 +459,7 @@ mod tests {
                 },
             },
             Metric {
-                name: "sentry.sessions.user",
+                name: "s:sessions/user@",
                 unit: None,
                 value: Set(
                     3097475539,
@@ -467,7 +472,7 @@ mod tests {
                 },
             },
             Metric {
-                name: "sentry.sessions.session",
+                name: "c:sessions/session@",
                 unit: None,
                 value: Counter(
                     3.0,
@@ -480,7 +485,7 @@ mod tests {
                 },
             },
             Metric {
-                name: "sentry.sessions.user",
+                name: "s:sessions/user@",
                 unit: None,
                 value: Set(
                     3097475539,
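
The test changes above amount to a fixed renaming of the four session metrics. As a summary, the mapping can be sketched as a small lookup. `legacy_session_name_to_mri` is a hypothetical helper that is not part of this commit; the pairs are taken from the updated assertions in `sessions.rs`.

```rust
/// Maps a pre-MRI session metric name to its MRI equivalent.
///
/// Hypothetical helper for illustration; the name pairs mirror the
/// before/after values in the updated tests. Unknown names yield `None`.
pub fn legacy_session_name_to_mri(legacy: &str) -> Option<&'static str> {
    match legacy {
        // Counter of sessions, tagged by `session.status`.
        "sentry.sessions.session" => Some("c:sessions/session@"),
        // Set of distinct user identifiers.
        "sentry.sessions.user" => Some("s:sessions/user@"),
        // Set of individually errored session identifiers.
        "sentry.sessions.session.error" => Some("s:sessions/error@"),
        // Distribution of session durations in seconds.
        "sentry.sessions.session.duration" => Some("d:sessions/duration@s"),
        _ => None,
    }
}
```

Note that the new names drop the redundant `session.` prefix from `error` and `duration`, since the `sessions` namespace already carries that information.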
