-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't create indexes during bootstrapping; wait until after replay #2161
Conversation
This PR fixes a bug where replaying databases without a snapshot of TX 0, i.e. using the entire commitlog, would fail when replaying deletes on `st_client`. This is because `st_client` was in the unique (*) position of: - Being a system table, - with a unique index, - which is mutated after bootstrapping. (*) Other system tables meet all three criteria in edge cases with schema-changing publishes, but only `st_client` meets them in everyday testing. In this case, when replaying the entire commitlog, we create the indexes on the system tables during bootstrapping before replay. (I theorize, but have not checked, that when restoring from a snapshot we do not create these indexes so eagerly, instead waiting for after replaying the commitlog suffix to create them.) However, we cannot update or use indexes during bootstrapping, at least currently, because doing so causes unique constraint violations. Prior to this commit, we had a bunch of complicated code which avoided updating indexes during replay. PR #2092, aka commit f667f65, introduced a case where we would use an index during replay: when replaying a delete for a table which had an index, and therefore had no pointer map. Prior to that commit, we would never use an index during replay, even if it existed. This all came together to mean that `st_client` had an empty index, and no pointer map, and so calls to `replay_delete_by_rel` on that table would always fail. I theorize that no one noticed this sooner because it did not manifest when restoring from a snapshot, and there is always a snapshot of TX 0 unless you manually delete it, as Kim did during his testing. The fix is simple: we do not construct indexes on the system tables during bootstrapping before replaying the commitlog, instead waiting until after replay like we do for user-defined indexes. This allows us to rip out all the aforementioned fiddly code for dodging indexes during replay, since now there will not be any indexes and therefore we will never update them.
I don't understand why that follows from this, could you explain? In particular how is the double delete thing we talked about relevant? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good; I wonder if we could add a regression unit test using something like TestDB::in_memory()
If you look at the commitlog that @gefjon posted in #bugs, there is actually no double delete. There's just the one delete for the inserted row in |
Mazdak is correct; there is only one delete, for a row which is present, but is impossible to find because its table has no pointer map. |
More specifically, as of #2092 , |
As far as I can tell, this has been there since inception of the locking datastore and could have been removed with the introduction of |
As a follow-up, we could also consider to remove the code which lets empty connect/disconnect transactions be committed to the commitlog, because there are no actually empty transactions? |
…ing; wait until after replay (#2161)
Description of Changes
This PR fixes a bug where replaying databases without a snapshot of TX 0, i.e. using the entire commitlog,
would fail when replaying deletes on
st_client
.This is because
st_client
was in the unique (*) position of:(*) Other system tables meet all three criteria in edge cases with schema-changing publishes, but only
st_client
meets them in everyday testing.In this case, when replaying the entire commitlog, we create the indexes on the system tables during bootstrapping before replay. (I theorize, but have not checked,
that when restoring from a snapshot we do not create these indexes so eagerly, instead waiting for after replaying the commitlog suffix to create them.)
However, we cannot update or use indexes during bootstrapping, at least currently, because doing so causes unique constraint violations. Prior to this commit, we had a bunch of complicated code which avoided updating indexes during replay.
PR #2092, aka commit f667f65, introduced a case where we would use an index during replay: when replaying a delete for a table which had an index, and therefore had no pointer map.
Prior to that commit, we would never use an index during replay, even if it existed.
This all came together to mean that
st_client
had an empty index, and no pointer map, and so calls toreplay_delete_by_rel
on that table would always fail. I theorize that no one noticed this soonerbecause it did not manifest when restoring from a snapshot, and there is always a snapshot of TX 0 unless you manually delete it, as Kim did during his testing.
The fix is simple: we do not construct indexes on the system tables during bootstrapping before replaying the commitlog, instead waiting until after replay like we do for user-defined indexes. This allows us to rip out all the aforementioned fiddly code for dodging indexes during replay,
since now there will not be any indexes and therefore we will never update them.
This PR also includes some new testing infra in
relational_db.rs
, and exports some test helpers from thecommitlog
crate under itstest
feature.API and ABI breaking changes
N/a
Expected complexity level and risk
4: I don't feel like I understand our bootstrapping process terribly well, it is quite fiddly, and I cannot explain why we were previously constructing system table indexes before replay. It may have been important in some way I do not realize.
Testing
replay_delete_from_st_client
, which fails on master but passes on this branch.