
ref(profiles): Count processed profiles with metrics #2165

Merged: 9 commits into ref/profiles-revert-count from ref/count-profiles-via-metric on Jun 2, 2023

Conversation

jjbayer (Member) commented May 31, 2023

As described in #2158, track the number of processed profiles by tagging the corresponding transaction metric.

Also: Do not accept multiple profiles per envelope.

Not in this PR:

  • Distinguishing between Profile and ProfileIndexed
  • Rate limiting processed profiles based on metrics

#skip-changelog
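For illustration, a minimal sketch of the tagging idea described above; the `ExtractedMetric` struct, the metric name, and the `tag_has_profile` helper are simplified assumptions, not Relay's actual types:

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an extracted transaction metric (illustrative only).
struct ExtractedMetric {
    name: String,
    tags: BTreeMap<String, String>,
}

/// Tag the transaction metric so downstream consumers can count
/// "processed profiles" without a separate profile metric.
fn tag_has_profile(metric: &mut ExtractedMetric, has_profile: bool) {
    if has_profile && metric.name == "transaction.duration" {
        metric.tags.insert("has_profile".to_owned(), "true".to_owned());
    }
}
```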

jjbayer changed the title Ref/count profiles via metric → ref(profiles): Count processed profiles with metrics May 31, 2023
jjbayer marked this pull request as ready for review May 31, 2023 16:13
jjbayer requested a review from a team as a code owner May 31, 2023 16:13
jjbayer requested a review from a team May 31, 2023 16:13
Comment on lines 1071 to 1079
if !found_profile {
    // Found first profile, keep it.
    found_profile = true;
} else {
    // We found a second profile, drop it.
    return ItemAction::Drop(Outcome::Invalid(DiscardReason::Profiling(
        relay_profiling::discard_reason(ProfileError::TooManyProfiles),
    )));
}
jjbayer (Member Author)

@phacops I added this logic as you suggested.
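For orientation, a rough sketch of where such a check could sit in an item-retention loop; everything except the drop logic quoted above is assumed, and this fragment presumes Relay's processor types rather than being standalone code:

```rust
// Illustrative skeleton only; not Relay's actual envelope-processing code.
let mut found_profile = false;
envelope.retain_items(|item| match item.ty() {
    ItemType::Profile => {
        if !found_profile {
            // First profile in the envelope: keep it.
            found_profile = true;
            ItemAction::Keep
        } else {
            // Every further profile is rejected with an explicit outcome.
            ItemAction::Drop(Outcome::Invalid(DiscardReason::Profiling(
                relay_profiling::discard_reason(ProfileError::TooManyProfiles),
            )))
        }
    }
    _ => ItemAction::Keep,
});
```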

Comment on lines +365 to +372
struct ExtractInput<'a> {
    aggregator_config: &'a AggregatorConfig,
    config: &'a TransactionMetricsConfig,
    event: &'a Event,
    transaction_from_dsc: Option<&'a str>,
    sampling_result: &'a SamplingResult,
    has_profile: bool,
}
jjbayer (Member Author)

The linter finally started complaining about the number of arguments to `extract_transaction_metrics_inner`, so I created this struct.
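A hedged sketch of the resulting call site; the argument sources and the `extract_transaction_metrics_inner` signature shown here are assumptions for illustration:

```rust
// Hypothetical call site; argument sources are illustrative.
let input = ExtractInput {
    aggregator_config: &aggregator_config,
    config: &tx_metrics_config,
    event: &event,
    transaction_from_dsc: transaction_from_dsc.as_deref(),
    sampling_result: &sampling_result,
    has_profile,
};
extract_transaction_metrics_inner(input, &mut metrics);
```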

jjbayer (Member Author) commented Jun 1, 2023

Letting one integration test fail deliberately. It will pass when #2166 is merged into this branch.

iker-barriocanal (Contributor) left a comment

Assuming the PR fulfills the intention in my comment below.

ItemType::Profile => {
    if !found_profile {
        // Found first profile, keep it.
        found_profile = true;
iker-barriocanal (Contributor)

Relay also counts towards outcomes dropped profiles that could not be parsed, even if they aren't dropped by dynamic sampling. #2054 mentions that only "valid" profiles should be counted:

> In other words, "processed profiles" should count all valid, non rate limited profiles seen by Relays, even if they were dropped by dynamic sampling.

jjbayer (Member Author)

The wording in that PR description is too vague; it should say: "accepted processed profiles should count all valid non rate limited profiles seen by Relays".

A better way to think of it is as two data streams splitting ways (see diagram here), with each stream counted towards a different category regardless of whether its items are accepted, rate limited, or invalid.
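To make that split concrete, a small self-contained sketch; the `ProfileFate` enum and `counts_as_processed` helper are hypothetical and only illustrate which profiles count as "processed":

```rust
/// Hypothetical classification of what happened to a profile (illustrative only).
enum ProfileFate {
    Accepted,
    DroppedByDynamicSampling,
    RateLimited,
    Invalid, // e.g. the payload failed to parse
}

/// "Processed" counts every valid, non rate limited profile, even if
/// dynamic sampling later dropped the indexed payload.
fn counts_as_processed(fate: &ProfileFate) -> bool {
    matches!(
        fate,
        ProfileFate::Accepted | ProfileFate::DroppedByDynamicSampling
    )
}
```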

jjbayer and others added 2 commits June 1, 2023 15:05
With #2165, "processed" profiles
are represented by their raw payload _before_ metrics extraction, and
represented by the `transaction.duration` metric _after_ metrics
extraction, as is the case for transactions:

```mermaid
flowchart LR
    . --"payload\n(DataCategory::Profile)"--> MetricsExtraction
    MetricsExtraction --"payloads\n(TransactionIndexed, ProfileIndexed)"-->..
    MetricsExtraction --"metrics\n(Transaction, Profile)"-->...
 ```


To prevent double counting, emit any profile-related outcome after metrics extraction as `DataCategory::ProfileIndexed`.

Co-authored-by: Iker Barriocanal <[email protected]>
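A sketch of that double-counting guard; the `profile_outcome_category` helper and its `metrics_extracted` flag are illustrative rather than Relay's actual code:

```rust
// Illustrative only: choose the data category for a profile outcome based on
// whether transaction metrics (carrying the has_profile tag) were already
// extracted for this envelope.
fn profile_outcome_category(metrics_extracted: bool) -> DataCategory {
    if metrics_extracted {
        // The metric already counted the profile as "processed";
        // report only the indexed payload from here on.
        DataCategory::ProfileIndexed
    } else {
        DataCategory::Profile
    }
}
```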
jjbayer merged commit 1ee46b8 into ref/profiles-revert-count Jun 2, 2023
jjbayer deleted the ref/count-profiles-via-metric branch June 2, 2023 09:15
jjbayer added a commit that referenced this pull request Jun 5, 2023
Instead of counting processed profiles in two different places (see
#2054), add a `has_profile` tag
to the transaction counter metric, and define

```
processed_profiles := count(processed_transactions) WHERE has_profile = true
```
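A hedged sketch of how a consumer could apply that definition to tagged counter buckets; the `Bucket` struct, metric name, and tag value are assumptions for illustration:

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a counter bucket seen by the billing metrics consumer.
struct Bucket {
    name: String,
    count: u64,
    tags: BTreeMap<String, String>,
}

/// processed_profiles := count(processed_transactions) WHERE has_profile = true
fn processed_profiles(buckets: &[Bucket]) -> u64 {
    buckets
        .iter()
        .filter(|b| b.name == "transaction.duration")
        .filter(|b| b.tags.get("has_profile").map(|v| v == "true").unwrap_or(false))
        .map(|b| b.count)
        .sum()
}
```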

## Changes

This PR contains the following changes:

* Revert PRs around counting profiles.
* Add `has_profile(s)` tag on `transaction.duration` metric. #2165
* Update metrics billing consumer to emit `accepted` outcome for profiles. getsentry/sentry#50047


## Caveats

With this PR, rate limits will be applied consistently if they are
issued for `DataCategory.Transaction`. However, if a rate limit (e.g.
spike protection) is issued _only_ for `DataCategory.Profile`, it will
not be enforced.

## Rollout Order

1. Merge and deploy getsentry/sentry#50047. The billing metrics consumer will now listen for profiles, but not receive any.
2. Deploy processing Relays. Profile counts will go down, because PoP Relays are not sending the `has_profile` tag yet.
3. Deploy PoP Relays. Profile counts should go back to normal.

## In case of rollback

1. First revert the sentry change
(getsentry/sentry#50047) and deploy it to stop
counting profiles.
2. Then revert the Relay change (this PR) and deploy to processing
Relays and PoPs.

ref: #2158

---------

Co-authored-by: Iker Barriocanal <[email protected]>