Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug where purging history could lead to increase in disk space usage #18131

Merged
merged 17 commits into from
Feb 3, 2025

Conversation

erikjohnston
Copy link
Member

When purging history, we try and delete any state groups that become unreferenced (i.e. there are no longer any events that directly reference them). When we delete a state group that is referenced by another state group, we "de-delta" that state group so that it no longer refers to the state group that is deleted.

There are two bugs with this approach that we fix here:

  1. There is a common pattern where we end up storing two state groups when persisting a state event: the state before and after the new state event, where the latter is stored as a delta to the former. When deleting state groups we only deleted the "new" state and left (and potentially de-deltaed) the old state. This was due to a bug/typo when trying to find referenced state groups.
  2. There are times where we store unreferenced state groups in the DB, during the purging of history these would not get rechecked and instead always de-deltaed. Instead, we should check for this case and delete any unreferenced state groups rather than de-deltaing them.

The effect of the above bugs is that when purging history we'd end up with lots of unreferenced state groups that had been de-deltaed (i.e. stored as the full state). This can lead to dramatic increases in storage space used.

Based on #18107 and #18130

erikjohnston and others added 7 commits January 29, 2025 11:48
Currently we don't really have anything that stops us from deleting
state groups when an in-flight event references it. This is a fairly
rare race currently, but we want to be able to more aggresively delete
state groups so it is important to address this to ensure that the
database remains valid.

See the class docstring of the new data store for an explanation for how
this works.
Co-authored-by: Devon Hudson <[email protected]>
Co-authored-by: Devon Hudson <[email protected]>
Co-authored-by: Devon Hudson <[email protected]>
….sql

Co-authored-by: Devon Hudson <[email protected]>
@erikjohnston erikjohnston force-pushed the erikj/purge_unreferenced_state branch from 9fbe823 to 68c8dcd Compare February 3, 2025 13:44
@erikjohnston erikjohnston force-pushed the erikj/purge_unreferenced_state branch from 68c8dcd to e67b3bc Compare February 3, 2025 13:48
This actually makes it so that deleting state groups goes via the new
mechanism.
@erikjohnston erikjohnston force-pushed the erikj/purge_unreferenced_state branch from e67b3bc to 72d1d97 Compare February 3, 2025 14:12
@erikjohnston erikjohnston force-pushed the erikj/purge_unreferenced_state branch from 72d1d97 to 2b5cbc1 Compare February 3, 2025 14:17
@erikjohnston erikjohnston marked this pull request as ready for review February 3, 2025 14:29
@erikjohnston erikjohnston requested a review from a team as a code owner February 3, 2025 14:29
Copy link
Member

@devonh devonh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One little improvement, otherwise this looks good.
And it should drastically improve storage of state groups.

)

# Advance so that the background jobs to delete the state groups runs
self.reactor.advance(10000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should reference the deletion timeout constant

Base automatically changed from erikj/state_delete to develop February 3, 2025 17:58
…renced_state
@erikjohnston erikjohnston enabled auto-merge (squash) February 3, 2025 18:20
@erikjohnston erikjohnston merged commit c46d452 into develop Feb 3, 2025
39 checks passed
@erikjohnston erikjohnston deleted the erikj/purge_unreferenced_state branch February 3, 2025 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants