-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ref(projects): Jitter project state requests #277
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have questions but this seems about right.
server/src/actors/project.rs
Outdated
@@ -334,8 +342,13 @@ impl ProjectState { | |||
|
|||
/// Returns whether this state is outdated and needs to be refetched. | |||
pub fn outdated(&self, config: &Config) -> bool { | |||
let jitter = config.cache_timeout_jitter(); | |||
let factor = jitter * (self.seed * 2.0 - self.seed); | |||
debug_assert!(factor >= -1.0 * self.seed && factor <= 1.0 * self.seed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't factor
never be negative? Otherwise you get an elapsed timedelta in the negative range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also can we add an assert with less variable deps?
debug_assert!(factor >= 0.0 && factor <= 1.0);
Since both seed and jitter are capped at 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, was refactoring this at the end and ended up with wrong math. The idea here is to apply the jitter in percent to the interval. So for a 6 minute interval with 50% jitter, the actual interval can be anywhere between 3 and 9 minutes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to double-check this again tomorrow. Now that I'm looking at this, random()
returns a standard deviation, so multiplying with jitter
this way will generate skew. I shouldn't code when jet-lagged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine.
The deviation will change from x
to x * jitter
, but that's what you intended to do.
The distribution is still a standard distribution, just with more spread.
Generally I believe that multiplying your random value with a constant (jitter) is fine. You will create holes in theory, but floats already have holes. You just add more holes.
server/src/actors/project.rs
Outdated
@@ -277,6 +281,9 @@ pub struct ProjectState { | |||
/// The organization id. | |||
#[serde(default)] | |||
pub organization_id: Option<u64>, | |||
/// Random seed for jitter and related operations. | |||
#[serde(skip, default = "project_state_seed")] | |||
pub seed: f64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the intent that the dice are rolled once per project over the uptime of a relay, not once per fetch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent is once per fetch (as stated in the description):
The seed changes with every fetch so that certain projects are not systematically discriminated.
... when Relay has long uptimes.
server/src/actors/project.rs
Outdated
@@ -334,11 +342,15 @@ impl ProjectState { | |||
|
|||
/// Returns whether this state is outdated and needs to be refetched. | |||
pub fn outdated(&self, config: &Config) -> bool { | |||
// Jitter is used to compute a random interval between [0; 2 * expiry_interval] | |||
let jitter = config.cache_timeout_jitter(); | |||
let factor = jitter * (self.seed * 2.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Double check this when less tired. Math is hard.
seed
is standard deviated, meaning it centers around 0.5. This makes seed * 2
center around 1.0, but jitter will make it center around something smaller. This actually needs to be something more like:
1 + jitter * (self.seed * 2.0 - 1)
Implemented a completely new approach suggested by @mitsuhiko, which is closer to how we shift quotas as well. We now shift each project to a deterministic slot within a grid of expiry intervals (e.g. 5 minutes). Each project state is still requested immediately when it is seen for the first time. Then it is updated within the next 5 minutes at the end of the current window, and then again at each window boundary. This helps to spread out requests while maintaining a deterministic order. (thinking about it, we can probably refactor some common code for this) |
This semi-deterministically jitters project state requests so that projects are not all requested in the same interval. This should help to spread out project state requests over the general 5 minute timeout interval.
The jitter is computed from a standard deviated random seed that is assigned to every unique project state. The seed changes with every fetch so that certain projects are not systematically discriminated.