Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch DeleteTable impl to one based on FixedBitSet #2183

Merged
merged 1 commit into from
Jan 29, 2025

Conversation

Centril
Copy link
Contributor

@Centril Centril commented Jan 28, 2025

Description of Changes

Fixes #2147.
Changes the implementation of DeleteTable to be based on grouping by page.
To be more specific, each page (PageIndex) is assigned its own FixedBitSet, which tracks which PageOffsets, converted to indices using fixed_row_size, is contained.

Based on micro benchmarks, this brings us to from about 172 ns per iteration (BTreeSet<RowPointer>), to about 40 ns.
Using FixedBitSet performs better in all cases, including contains, insert, remove, iter. The only drawback is that it uses more space for small tables, but it should use significantly less space as the initial cost becomes negligible.

The micro benchmarks are added in the PR, but no effort is made to test and document all the other implementations.

Using DTPageAndOffsetRanges (average = 502 ms):
2025-01-28T23:34:33.505986Z  INFO: : Timing span "update_positions_by_collect": 471.953969ms
2025-01-28T23:34:34.383674Z  INFO: : Timing span "update_positions_by_collect": 505.608822ms
2025-01-28T23:34:35.447897Z  INFO: : Timing span "update_positions_by_collect": 512.471063ms
2025-01-28T23:34:36.498788Z  INFO: : Timing span "update_positions_by_collect": 510.634443ms
2025-01-28T23:34:37.534776Z  INFO: : Timing span "update_positions_by_collect": 511.801253ms

Using DTPageAndBitSet (average = 498 ms -- I've gotten higher averages too)
2025-01-28T23:48:26.766216Z  INFO: : Timing span "update_positions_by_collect": 467.907743ms
2025-01-28T23:48:27.785888Z  INFO: : Timing span "update_positions_by_collect": 510.477257ms
2025-01-28T23:48:28.716523Z  INFO: : Timing span "update_positions_by_collect": 502.43078ms
2025-01-28T23:48:29.725186Z  INFO: : Timing span "update_positions_by_collect": 511.856229ms
2025-01-28T23:48:30.714442Z  INFO: : Timing span "update_positions_by_collect": 497.093745ms

API and ABI breaking changes

None.

Expected complexity level and risk

2, not very complex, but it does touch the datastore.
However, the new code is well tested.

Testing

Proptests are added that assert that DeleteTable behaves correctly as a set, including by checking that it behaves observably as BTreeSet.

@Centril Centril force-pushed the centril/delete-table-opt branch from 6d51044 to 2e99e55 Compare January 28, 2025 20:30
@Centril Centril force-pushed the centril/delete-table-opt branch from 0169d74 to aff1690 Compare January 29, 2025 13:12
@Centril Centril marked this pull request as ready for review January 29, 2025 13:25
@Centril Centril changed the title Centril/delete table opt Switch DeleteTable impl to one based on FixedBitSet Jan 29, 2025
@Centril Centril requested a review from gefjon January 29, 2025 15:00
@Centril Centril added this pull request to the merge queue Jan 29, 2025
Merged via the queue into master with commit 6630a90 Jan 29, 2025
14 of 15 checks passed
@Centril Centril deleted the centril/delete-table-opt branch January 29, 2025 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change DeleteTable to a HashSet
2 participants