Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sampling): Add profile context early and expose getter for it #2715

Merged
merged 6 commits into from
Nov 13, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
- Group resource spans by scrubbed domain and filename. ([#2654](https://github.com/getsentry/relay/pull/2654))
- Convert transactions to spans for all organizations. ([#2659](https://github.com/getsentry/relay/pull/2659))
- Filter outliers (>180s) for mobile measurements. ([#2649](https://github.com/getsentry/relay/pull/2649))
- Allow access to more context fields in dynamic sampling and metric extraction. ([#2607](https://github.com/getsentry/relay/pull/2607), [#2640](https://github.com/getsentry/relay/pull/2640), [#2675](https://github.com/getsentry/relay/pull/2675), [#2707](https://github.com/getsentry/relay/pull/2707))
- Allow access to more context fields in dynamic sampling and metric extraction. ([#2607](https://github.com/getsentry/relay/pull/2607), [#2640](https://github.com/getsentry/relay/pull/2640), [#2675](https://github.com/getsentry/relay/pull/2675), [#2707](https://github.com/getsentry/relay/pull/2707), [#2715](https://github.com/getsentry/relay/pull/2715))
- Allow advanced scrubbing expressions for datascrubbing safe fields. ([#2670](https://github.com/getsentry/relay/pull/2670))
- Disable graphql scrubbing when datascrubbing is disabled. ([#2689](https://github.com/getsentry/relay/pull/2689))
- Track when a span was received. ([#2688](https://github.com/getsentry/relay/pull/2688))
Expand Down
21 changes: 18 additions & 3 deletions relay-event-schema/src/protocol/event.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,9 +14,9 @@ use crate::processor::ProcessValue;
use crate::protocol::{
AppContext, Breadcrumb, Breakdowns, BrowserContext, ClientSdkInfo, Contexts, Csp, DebugMeta,
DefaultContext, DeviceContext, EventType, Exception, ExpectCt, ExpectStaple, Fingerprint, Hpkp,
LenientString, Level, LogEntry, Measurements, Metrics, OsContext, RelayInfo, Request,
ResponseContext, Span, Stacktrace, Tags, TemplateInfo, Thread, Timestamp, TraceContext,
TransactionInfo, User, Values,
LenientString, Level, LogEntry, Measurements, Metrics, OsContext, ProfileContext, RelayInfo,
Request, ResponseContext, Span, Stacktrace, Tags, TemplateInfo, Thread, Timestamp,
TraceContext, TransactionInfo, User, Values,
};

/// Wrapper around a UUID with slightly different formatting.
Expand Down Expand Up @@ -708,6 +708,12 @@ impl Getter for Event {
"contexts.browser.version" => {
self.context::<BrowserContext>()?.version.as_str()?.into()
}
"contexts.profile.profile_id" => self
.context::<ProfileContext>()?
.profile_id
.value()?
.0
.into(),
"contexts.device.uuid" => self.context::<DeviceContext>()?.uuid.value()?.into(),
"contexts.trace.status" => self
.context::<TraceContext>()?
Expand Down Expand Up @@ -1102,6 +1108,11 @@ mod tests {
kernel_version: Annotated::new("17.4.0".to_string()),
..OsContext::default()
});
contexts.add(ProfileContext {
profile_id: Annotated::new(EventId(uuid!(
"abadcade-feed-dead-beef-8addadfeedaa"
))),
});
contexts
}),
..Default::default()
Expand Down Expand Up @@ -1185,6 +1196,10 @@ mod tests {
Some(Val::String("https://sentry.io")),
event.get_value("event.request.url")
);
assert_eq!(
Some(Val::Uuid(uuid!("abadcade-feed-dead-beef-8addadfeedaa"))),
event.get_value("event.contexts.profile.profile_id")
)
}

#[test]
Expand Down
13 changes: 9 additions & 4 deletions relay-profiling/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -118,10 +118,15 @@ mod utils;

const MAX_PROFILE_DURATION: Duration = Duration::from_secs(30);

/// Unique identifier for a profile.
///
/// Same format as event IDs.
pub type ProfileId = EventId;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this type alias to make the return type more obvious in function signatures.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we just new type it here instead?

Then it's actually a separate type and cant get mixed


#[derive(Debug, Deserialize)]
struct MinimalProfile {
#[serde(alias = "profile_id")]
event_id: EventId,
event_id: ProfileId,
platform: String,
#[serde(default)]
version: sample::Version,
Expand All @@ -134,7 +139,7 @@ fn minimal_profile_from_json(
serde_path_to_error::deserialize(d)
}

pub fn parse_metadata(payload: &[u8], project_id: ProjectId) -> Result<(), ProfileError> {
pub fn parse_metadata(payload: &[u8], project_id: ProjectId) -> Result<ProfileId, ProfileError> {
let profile = match minimal_profile_from_json(payload) {
Ok(profile) => profile,
Err(err) => {
Expand Down Expand Up @@ -183,10 +188,10 @@ pub fn parse_metadata(payload: &[u8], project_id: ProjectId) -> Result<(), Profi
_ => return Err(ProfileError::PlatformNotSupported),
},
};
Ok(())
Ok(profile.event_id)
}

pub fn expand_profile(payload: &[u8], event: &Event) -> Result<(EventId, Vec<u8>), ProfileError> {
pub fn expand_profile(payload: &[u8], event: &Event) -> Result<(ProfileId, Vec<u8>), ProfileError> {
let profile = match minimal_profile_from_json(payload) {
Ok(profile) => profile,
Err(err) => {
Expand Down
50 changes: 30 additions & 20 deletions relay-server/src/actors/processor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,8 @@ use relay_event_normalization::{GeoIpLookup, RawUserAgentInfo};
use relay_event_schema::processor::{self, ProcessingAction, ProcessingState};
use relay_event_schema::protocol::{
Breadcrumb, ClientReport, Contexts, Csp, Event, EventType, ExpectCt, ExpectStaple, Hpkp,
IpAddr, LenientString, Metrics, NetworkReportError, OtelContext, RelayInfo, Replay,
SecurityReportType, SessionAggregates, SessionAttributes, SessionStatus, SessionUpdate,
IpAddr, LenientString, Metrics, NetworkReportError, OtelContext, ProfileContext, RelayInfo,
Replay, SecurityReportType, SessionAggregates, SessionAttributes, SessionStatus, SessionUpdate,
Timestamp, TraceContext, UserReport, Values,
};
use relay_filter::FilterStatKey;
Expand All @@ -58,7 +58,7 @@ use {
crate::actors::project_cache::UpdateRateLimits,
crate::utils::{EnvelopeLimiter, MetricsLimiter},
relay_event_normalization::{span, StoreConfig, StoreProcessor},
relay_event_schema::protocol::{ProfileContext, Span},
relay_event_schema::protocol::Span,
relay_metrics::Aggregator,
relay_quotas::{RateLimitingError, RedisRateLimiter},
relay_redis::RedisPool,
Expand Down Expand Up @@ -1096,15 +1096,15 @@ impl EnvelopeProcessorService {
.items()
.filter(|item| item.ty() == &ItemType::Transaction)
.count();
let mut found_profile = false;
let mut profile_id = None;
state.managed_envelope.retain_items(|item| match item.ty() {
// Drop profile without a transaction in the same envelope.
ItemType::Profile if transaction_count == 0 => ItemAction::DropSilently,
// First profile found in the envelope, we'll keep it if metadata are valid.
ItemType::Profile if !found_profile => {
ItemType::Profile if profile_id.is_none() => {
match relay_profiling::parse_metadata(&item.payload(), state.project_id) {
Ok(_) => {
found_profile = true;
Ok(id) => {
profile_id = Some(id);
ItemAction::Keep
}
Err(err) => ItemAction::Drop(Outcome::Invalid(DiscardReason::Profiling(
Expand All @@ -1118,7 +1118,22 @@ impl EnvelopeProcessorService {
))),
_ => ItemAction::Keep,
});
state.has_profile = found_profile;
state.has_profile = profile_id.is_some();

if let Some(event) = state.event.value_mut() {
if event.ty.value() == Some(&EventType::Transaction) {
let contexts = event.contexts.get_or_insert_with(Contexts::new);
// If we found a profile, add its ID to the profile context on the transaction.
if let Some(profile_id) = profile_id {
contexts.add(ProfileContext {
profile_id: Annotated::new(profile_id),
});
// If not, we delete the profile context.
} else {
contexts.remove::<ProfileContext>();
}
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block was copied and adapted from process_profiles.

}

/// Normalize monitor check-ins and remove invalid ones.
Expand Down Expand Up @@ -1182,17 +1197,13 @@ impl EnvelopeProcessorService {
}
_ => ItemAction::Keep,
});
if let Some(event) = state.event.value_mut() {
if event.ty.value() == Some(&EventType::Transaction) {
let contexts = event.contexts.get_or_insert_with(Contexts::new);
// If we found a profile, add its ID to the profile context on the transaction.
if let Some(profile_id) = found_profile_id {
contexts.add(ProfileContext {
profile_id: Annotated::new(profile_id),
});
// If not, we delete the profile context.
} else {
contexts.remove::<ProfileContext>();
if found_profile_id.is_none() {
// Remove profile context from event.
if let Some(event) = state.event.value_mut() {
if event.ty.value() == Some(&EventType::Transaction) {
if let Some(contexts) = event.contexts.value_mut() {
contexts.remove::<ProfileContext>();
}
Comment on lines +1200 to +1206
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When profile processing fails, the profile ID is removed here. This means that some (invalid) profiles that were opted in by a dynamic sampling rule based on contexts.profile.profile_id might end up without a profile ID in the final payload.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this implies that we are performing sampling on profile ids that we don't know whether they are valid of not, am I right?

From the code flow, I see that we do filter the profile with context injection, we then move onto sampling (assuming the profile id refers to a valid profile), and then we process the profile and remove the context in case it's invalid?

Is this an expected behavior? Just curious

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this implies that we are performing sampling on profile ids that we don't know whether they are valid of not, am I right?

That is correct and a bit unfortunate. To fix this we would have to move the entirety of process_profiles to non-processing relays (and before dynamic sampling), and I'm not sure if that's currently possible. cc @jan-auer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks for the explanation!

}
}
}
Expand Down Expand Up @@ -2776,7 +2787,6 @@ impl EnvelopeProcessorService {

if_processing!({
self.enforce_quotas(state)?;
// We need the event parsed in order to set the profile context on it
self.process_profiles(state);
self.process_check_ins(state);
});
Expand Down