-
-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(breakdowns): Schema change for span op breakdowns #1761
Conversation
snuba/migrations/snuba_migrations/transactions/0011_transactions_add_span_op_breakdowns.py
Outdated
Show resolved
Hide resolved
snuba/migrations/snuba_migrations/transactions/0011_transactions_add_span_op_breakdowns.py
Show resolved
Hide resolved
Per the offline discussion in Slack, did you decide on whether the durations would fit into a UInt32? On the subject of UInt, are you absolutely sure that there can never be a negative duration? Are the start and end timestamps generated on the same device? Is there something that is doing |
@evanh I'm going ahead with UInt64. The delta values will be non-negative and will be guaranteed by Relay. |
@JTCunning DDL Changes:
|
@dashed Is there documentation somewhere on the decision between nanoseconds/UInt64 and milliseconds/UInt32? |
@evanh Kevan (@k-fish ) is documenting this decision here https://www.notion.so/sentry/Ops-Breakdown-c81469ee3be04f7eb9394c41e62a2dc2 |
I'm going to update this to be float64 based on feedback from Jan so that breakdowns are forward-compatible with the future metrics product. We expect to keep this column until breakdowns are ported to the metrics product. See: getsentry/relay#934 (comment) and getsentry/relay#958 |
@dashed Does this mean that the values in the breakdown field will be in seconds? What level of precision do you need for the data? I ask because Clickhouse also has a Float32 type, which would give some of the benefits we discussed with Int64 vs Int32. However there will be a loss of granularity with that choice. Assuming that the durations are in the range of a few hours at most (~3600 seconds) then the precision would be submillisecond but not microsecond. |
@evanh For span op breakdowns, they're in milliseconds. We'd like to ensure we can read microsecond values. For any general breakdowns, the values will be in float64 when transmitted from Relay. I'd like to keep the types consistent especially when values are crossing the network boundary. Would there be a risk trying to fit a float64 value into a float32 column? |
Ah if you're sending milliseconds and you want microsecond precision then I think we need to stick with Float64. I just wanted to double check. |
6adebce
to
010838b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. Please do not merge this until James has had a chance to look at the DDL.
@JTCunning New DDL:
ALTER TABLE transactions_local ADD COLUMN IF NOT EXISTS span_op_breakdowns Nested(key LowCardinality(String), value Float64) AFTER measurements;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix the after
parameter.
] | ||
), | ||
), | ||
after="measurements", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work.
the measurements
column does not exist after the DDL operation that creates it.
After such ddl you have measurements.key
and measurements.value
columns.
So this should be
after="measurements.value"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fpacifici Good catch. I updated to after="measurements.value"
.
Interestingly, the span_op_breakdowns
columns are added after the measurements
columns in my local Clickhouse instance.
Codecov Report
@@ Coverage Diff @@
## master #1761 +/- ##
==========================================
+ Coverage 90.84% 90.86% +0.01%
==========================================
Files 486 487 +1
Lines 22007 22047 +40
==========================================
+ Hits 19992 20032 +40
Misses 2015 2015
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The migration has been applied to prod. Please feel free to merge
/gcbrun |
Based on #1233
Associated change in Relay: getsentry/relay#934
Add span op breakdowns column to the transactions dataset that has identical schema to the
measurements
column.