Broadcast repeated new view messages to speed up high QC sharing #2042

Merged 2 commits on Dec 19, 2024

JamesHinshelwood
Contributor

In some cases where the entire network is down, many nodes will have disjoint values of their high QC. The only way for them to sync their high QCs is to wait for a supermajority of nodes to each become the leader of a missed view, at which point that node receives `NewView` messages from everyone else and can update its own high QC.

To speed this up, we now broadcast NewViews. We still send the first NewView message directly to the leader, but we broadcast repeated messages.
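
The send policy described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `NodeId`, `Destination`, `NewViewSender`, and `next_destination` are hypothetical names, and the sketch assumes the node simply counts how many times it has sent a `NewView` for the current view.

```rust
/// Hypothetical node identifier (an assumption, not taken from the PR).
type NodeId = u64;

/// Where an outgoing `NewView` should be sent.
#[derive(Debug, PartialEq, Eq)]
enum Destination {
    /// Send only to the leader of the target view.
    Leader(NodeId),
    /// Send to every peer.
    Broadcast,
}

/// Tracks how many `NewView`s we have sent for the view we are stuck on.
#[derive(Default)]
struct NewViewSender {
    view: u64,
    sends_in_view: u32,
}

impl NewViewSender {
    /// Pick the destination for the next `NewView` for `view`.
    /// The first message for a view goes to the leader; repeats are broadcast.
    fn next_destination(&mut self, view: u64, leader: NodeId) -> Destination {
        if view != self.view {
            // We moved to a different view, so start counting again.
            self.view = view;
            self.sends_in_view = 0;
        }
        self.sends_in_view = self.sends_in_view.saturating_add(1);
        if self.sends_in_view == 1 {
            Destination::Leader(leader)
        } else {
            Destination::Broadcast
        }
    }
}
```

Keeping the first message direct keeps the normal view-change path as cheap as before, while the repeated messages, which are only sent when the view change is not making progress, reach every peer.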

We also change the NewView processing logic to ensure nodes will update their high QC based on received messages, even if they are not the leader of the view.
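
A rough sketch of that processing change, under simplifying assumptions: `QuorumCert`, `NewView`, `Node`, and `on_new_view` are hypothetical names rather than the project's real types, a QC is reduced to the view it certifies, and signature verification (which a real node would have to do before adopting a QC) is omitted.

```rust
/// Hypothetical, simplified quorum certificate: only the view it certifies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct QuorumCert {
    view: u64,
}

/// Hypothetical `NewView` message carrying the sender's high QC.
struct NewView {
    view: u64,
    high_qc: QuorumCert,
}

struct Node {
    id: u64,
    high_qc: QuorumCert,
    /// Number of `NewView`s collected for views this node leads.
    collected: u32,
}

impl Node {
    /// Process a received `NewView`. `leader_of` maps a view to its leader.
    /// Returns true if the local high QC advanced.
    /// A real node would verify the QC before adopting it; omitted here.
    fn on_new_view(&mut self, msg: &NewView, leader_of: impl Fn(u64) -> u64) -> bool {
        // Every recipient adopts a newer high QC, whether or not it leads the view.
        let advanced = msg.high_qc.view > self.high_qc.view;
        if advanced {
            self.high_qc = msg.high_qc;
        }
        // Only the leader of the message's view goes on to collect it.
        if leader_of(msg.view) == self.id {
            self.collected += 1;
        }
        advanced
    }
}
```

The point of the ordering is that the high QC update happens before, and independently of, the leader check, so a broadcast `NewView` is useful to every node that receives it.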

Contributor

🐰 Bencher Report

Branch: broadcast-newviews
Testbed: self-hosted

| Benchmark | Latency result, ns (Δ%) | Upper boundary, ns (limit %) |
|---|---|---|
| process-empty/process-empty | 9,386,100.00 (+2.81%) | 10,484,576.84 (89.52%) |
| produce-full/produce-full | 1,957,800,000.00 (-8.36%) | 2,904,582,435.22 (67.40%) |

86667 enabled auto-merge on December 19, 2024 12:38
86667 added this pull request to the merge queue on Dec 19, 2024
Merged via the queue into main with commit f074c55 on Dec 19, 2024
5 of 6 checks passed
86667 deleted the broadcast-newviews branch on December 19, 2024 13:23
JamesHinshelwood added a commit that referenced this pull request Dec 22, 2024
PR #2042 added `NewView` broadcasting, but it didn't change
anything because nodes ignored `NewView`s that were not sent via
a direct request. In fact, that PR made network recovery worse,
since the leader would no longer see repeated `NewView`s. This PR
just adds the handler for broadcast `NewView`s.
github-merge-queue bot pushed a commit that referenced this pull request Dec 22, 2024