Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(abci): ignore comet ctx and use app ctx #2568

Merged
merged 71 commits into from
Mar 6, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
71 commits
Select commit Hold shift + click to select a range
e7f0b79
Enforce synchronicity of final FCU and always return error on engine …
shotes Feb 6, 2025
a3796f5
nit
abi87 Feb 6, 2025
da52ea0
extended error
abi87 Feb 7, 2025
dbd75e1
Add error to force FCU
shotes Feb 6, 2025
3dc865d
Add retryWithTimeout
shotes Feb 7, 2025
26f4c7d
retry on unknown
shotes Feb 7, 2025
8bc54e9
comments around engine API error handling
shotes Feb 7, 2025
3b91793
Add lib for backoff func
shotes Feb 7, 2025
d20bd0d
shorten retry initial interval
shotes Feb 7, 2025
35296a1
lint and remove unused return value
shotes Feb 7, 2025
4687f00
Undo extra changes
shotes Feb 19, 2025
00ef155
remove extra return
shotes Feb 19, 2025
db423b2
Merge branch 'main' into engine-api-retry
calbera Feb 20, 2025
b58fced
Merge branch 'main' into engine-api-retry
calbera Feb 25, 2025
62b5838
Merge branch 'main' into engine-api-retry
shotes Feb 25, 2025
015698c
Merge branch 'main' into engine-api-retry
shotes Feb 28, 2025
194ba50
Merge branch 'main' into engine-api-retry
abi87 Feb 28, 2025
4e49106
Use configured values instead of consts
shotes Feb 28, 2025
2639b6b
lint
shotes Feb 28, 2025
8ec64f3
Add RPCRetryInterval flag
shotes Feb 28, 2025
f9096cb
remove debug print
shotes Feb 28, 2025
9b1255f
add to config and reorder
shotes Feb 28, 2025
b6e92ab
Add max interval and infinite retries
shotes Feb 28, 2025
5303fd5
Organize comments and increase max interval
shotes Feb 28, 2025
dc34dc9
More explicit comments
shotes Feb 28, 2025
d51d636
Condense NotifyNewPayload for funlen lint
shotes Feb 28, 2025
2069930
Allow new payload during beginning sync:
shotes Feb 28, 2025
2798898
lint
shotes Feb 28, 2025
48f6f7d
remove extraneous test files
shotes Feb 28, 2025
8dc24a5
Fix mock
shotes Feb 28, 2025
ab124a9
Add infinite duration
shotes Mar 1, 2025
5f9fe91
Make syncing status WARN
shotes Mar 1, 2025
5df3171
Merge branch 'main' of github.com:berachain/beacon-kit into engine-ap…
shotes Mar 4, 2025
69d67fd
remove extraneous file and generate mock
shotes Mar 4, 2025
156ba5f
Fix test unit cover
shotes Mar 4, 2025
0144133
Add cancellable ctx to cometbft used for stopping the service
fridrik01 Feb 8, 2025
a871e7b
Reuse cancellable context
shotes Mar 4, 2025
8a86207
Clean up comments and logging
shotes Mar 4, 2025
dcf36f0
undo IDE auto correct change
shotes Mar 4, 2025
42fa262
Make comments better
shotes Mar 4, 2025
aa7b9f7
lint
shotes Mar 4, 2025
0265130
combine ctxs
shotes Mar 4, 2025
a7b367b
reduce diff
shotes Mar 4, 2025
adc58d0
reduce diff more
shotes Mar 4, 2025
f8032c4
set max elapsed time to 0 and nits
shotes Mar 4, 2025
347cf4b
Add nil check for combineContexts for tests
shotes Mar 4, 2025
b0da96e
Merge branch 'main' into engine-api-retry
shotes Mar 4, 2025
7c22253
Merge branch 'main' into engine-api-retry
abi87 Mar 5, 2025
7c39a3f
Fixes for context cancellation:
shotes Mar 5, 2025
d422a6d
fix linter?
shotes Mar 5, 2025
73b0444
Merge branch 'engine-api-retry' into abci_context_cancel
calbera Mar 5, 2025
d375414
reorg context defining
shotes Mar 5, 2025
b225f6f
Merge branch 'abci_context_cancel' of github.com:berachain/beacon-kit…
shotes Mar 5, 2025
a1cf15a
Merge branch 'main' into engine-api-retry
rezbera Mar 5, 2025
16feb52
Merge branch 'engine-api-retry' into abci_context_cancel
rezbera Mar 5, 2025
5899cb5
Add chan to kill goroutine
shotes Mar 5, 2025
8ff93eb
lint and fix done case
shotes Mar 5, 2025
17e2094
delete extra file
shotes Mar 5, 2025
a0658c3
Add fatal errors
shotes Mar 5, 2025
d7d7449
Add IsNonFatalError
shotes Mar 5, 2025
38d5b09
Merge branch 'engine-api-retry' into abci_context_cancel
shotes Mar 5, 2025
75b5ad6
add more timeout for test and ignore temp-test-simulated.txt
shotes Mar 5, 2025
b26a276
Remove combineContext
shotes Mar 5, 2025
f8b3cbe
Check context and error in commit
shotes Mar 6, 2025
6416fb7
move ctx checks to higher level service
shotes Mar 6, 2025
994b472
lint
shotes Mar 6, 2025
a4e630d
logs and nits
abi87 Mar 6, 2025
b387d0b
Merge branch 'engine-api-retry' into abci_context_cancel
abi87 Mar 6, 2025
aea72aa
Merge branch 'main' into abci_context_cancel
abi87 Mar 6, 2025
b25d97d
check s.Ctx.Err on IniChain as well
abi87 Mar 6, 2025
f1c22ab
simplified consensus functions to avoid passing s.ctx as parameter
abi87 Mar 6, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -231,6 +231,7 @@ coverage-test-unit-cover.txt
coverage-merged.txt
test-simulated.txt
test-unit-cover.txt
temp-test-simulated.txt
.vercel

# server dev env for Air
Expand Down
55 changes: 45 additions & 10 deletions consensus/cometbft/service/abci.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,19 +37,33 @@ var (
)

func (s *Service) InitChain(
ctx context.Context,
_ context.Context,
req *cmtabci.InitChainRequest,
) (*cmtabci.InitChainResponse, error) {
return s.initChain(ctx, req)
// Check if ctx is still good. CometBFT does not check this.
if s.ctx.Err() != nil {
// If the context is getting cancelled, we are shutting down.
return &cmtabci.InitChainResponse{}, s.ctx.Err()
}
//nolint:contextcheck // see s.ctx comment for more details
return s.initChain(req) // internally this uses s.ctx
}

// PrepareProposal implements the PrepareProposal ABCI method and returns a
// ResponsePrepareProposal object to the client.
func (s *Service) PrepareProposal(
ctx context.Context,
_ context.Context,
req *cmtabci.PrepareProposalRequest,
) (*cmtabci.PrepareProposalResponse, error) {
return s.prepareProposal(ctx, req)
// Check if ctx is still good. CometBFT does not check this.
if s.ctx.Err() != nil {
// If the context is getting cancelled, we are shutting down.
// It is ok returning an empty proposal.
//nolint:nilerr // explicitly allowing this case
return &cmtabci.PrepareProposalResponse{Txs: req.Txs}, nil
}
//nolint:contextcheck // see s.ctx comment for more details
return s.prepareProposal(s.ctx, req)
}

func (s *Service) Info(context.Context,
Expand Down Expand Up @@ -77,17 +91,31 @@ func (s *Service) Info(context.Context,
// ProcessProposal implements the ProcessProposal ABCI method and returns a
// ResponseProcessProposal object to the client.
func (s *Service) ProcessProposal(
ctx context.Context,
_ context.Context,
req *cmtabci.ProcessProposalRequest,
) (*cmtabci.ProcessProposalResponse, error) {
return s.processProposal(ctx, req)
// Check if ctx is still good. CometBFT does not check this.
if s.ctx.Err() != nil {
// Node will panic on context cancel with "CONSENSUS FAILURE!!!" due to
// returning an error. This is expected. We do not want to accept or
// reject a proposal based on incomplete data.
return nil, s.ctx.Err()
}
//nolint:contextcheck // see s.ctx comment for more details
return s.processProposal(req) // internally this uses s.ctx
}

func (s *Service) FinalizeBlock(
ctx context.Context,
_ context.Context,
req *cmtabci.FinalizeBlockRequest,
) (*cmtabci.FinalizeBlockResponse, error) {
return s.finalizeBlock(ctx, req)
// Check if ctx is still good. CometBFT does not check this.
if s.ctx.Err() != nil {
// Node will panic on context cancel with "CONSENSUS FAILURE!!!" due to error.
// We expect this to happen and do not want to finalize any incomplete or invalid state.
return nil, s.ctx.Err()
}
return s.finalizeBlock(req) // internally this uses s.ctx
}

// Commit implements the ABCI interface. It will commit all state that exists in
Expand All @@ -98,9 +126,16 @@ func (s *Service) FinalizeBlock(
// against that height and gracefully halt if it matches the latest committed
// height.
func (s *Service) Commit(
ctx context.Context, req *cmtabci.CommitRequest,
_ context.Context, req *cmtabci.CommitRequest,
) (*cmtabci.CommitResponse, error) {
return s.commit(ctx, req)
// Check if ctx is still good. CometBFT does not check this.
if s.ctx.Err() != nil {
// Node will panic on context cancel with "CONSENSUS FAILURE!!!" due to error.
// We expect this to happen and do not want to commit any incomplete or invalid state.
return nil, s.ctx.Err()
}

return s.commit(req)
}

//
Expand Down
3 changes: 1 addition & 2 deletions consensus/cometbft/service/commit.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,15 +22,14 @@
package cometbft

import (
"context"
"fmt"

"cosmossdk.io/store/rootmulti"
cmtabci "github.com/cometbft/cometbft/abci/types"
)

func (s *Service) commit(
context.Context, *cmtabci.CommitRequest,
*cmtabci.CommitRequest,
) (*cmtabci.CommitResponse, error) {
if s.finalizeBlockState == nil {
// This is unexpected since CometBFT should call Commit only
Expand Down
10 changes: 5 additions & 5 deletions consensus/cometbft/service/finalize_block.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,18 +21,16 @@
package cometbft

import (
"context"
"fmt"

cmtabci "github.com/cometbft/cometbft/abci/types"
"github.com/sourcegraph/conc/iter"
)

func (s *Service) finalizeBlock(
ctx context.Context,
req *cmtabci.FinalizeBlockRequest,
) (*cmtabci.FinalizeBlockResponse, error) {
res, err := s.finalizeBlockInternal(ctx, req)
res, err := s.finalizeBlockInternal(req)
if res != nil {
res.AppHash = s.workingHash()
}
Expand All @@ -41,7 +39,6 @@ func (s *Service) finalizeBlock(
}

func (s *Service) finalizeBlockInternal(
ctx context.Context,
req *cmtabci.FinalizeBlockRequest,
) (*cmtabci.FinalizeBlockResponse, error) {
if err := s.validateFinalizeBlockHeight(req); err != nil {
Expand All @@ -53,7 +50,10 @@ func (s *Service) finalizeBlockInternal(
// here given that during block replay ProcessProposal is not executed by
// CometBFT.
if s.finalizeBlockState == nil {
s.finalizeBlockState = s.resetState(ctx)
s.finalizeBlockState = s.resetState(s.ctx)
} else {
// Preserve the CosmosSDK context while using the correct base ctx.
s.finalizeBlockState.SetContext(s.finalizeBlockState.Context().WithContext(s.ctx))
}

// Iterate over all raw transactions in the proposal and attempt to execute
Expand Down
5 changes: 1 addition & 4 deletions consensus/cometbft/service/init_chain.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,6 @@
package cometbft

import (
"context"
"fmt"

"github.com/berachain/beacon-kit/primitives/encoding/json"
Expand All @@ -31,7 +30,6 @@ import (
)

func (s *Service) initChain(
ctx context.Context,
req *cmtabci.InitChainRequest,
) (*cmtabci.InitChainResponse, error) {
if req.ChainId != s.chainID {
Expand Down Expand Up @@ -83,9 +81,8 @@ func (s *Service) initChain(
}
}

s.finalizeBlockState = s.resetState(ctx)
s.finalizeBlockState = s.resetState(s.ctx)

//nolint:contextcheck // ctx already passed via resetState
resValidators, err := s.initChainer(
s.finalizeBlockState.Context(),
req.AppStateBytes,
Expand Down
1 change: 1 addition & 0 deletions consensus/cometbft/service/prepare_proposal.go
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ func (s *Service) prepareProposal(
// Always reset state given that PrepareProposal can timeout
// and be called again in a subsequent round.
s.prepareProposalState = s.resetState(ctx)
//nolint:contextcheck // ctx already passed via resetState
s.prepareProposalState.SetContext(
s.getContextForProposal(
s.prepareProposalState.Context(),
Expand Down
6 changes: 2 additions & 4 deletions consensus/cometbft/service/process_proposal.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,15 +21,13 @@
package cometbft

import (
"context"
"fmt"
"time"

cmtabci "github.com/cometbft/cometbft/abci/types"
)

func (s *Service) processProposal(
ctx context.Context,
req *cmtabci.ProcessProposalRequest,
) (*cmtabci.ProcessProposalResponse, error) {
startTime := time.Now()
Expand All @@ -52,9 +50,9 @@ func (s *Service) processProposal(
// processed the first block, as we want to avoid overwriting the
// finalizeState
// after state changes during InitChain.
s.processProposalState = s.resetState(ctx)
s.processProposalState = s.resetState(s.ctx)
if req.Height > s.initialHeight {
s.finalizeBlockState = s.resetState(ctx)
s.finalizeBlockState = s.resetState(s.ctx)
}

s.processProposalState.SetContext(
Expand Down
21 changes: 19 additions & 2 deletions consensus/cometbft/service/service.go
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,15 @@ type Service struct {
minRetainBlocks uint64

chainID string

// ctx is the context passed in for the service. CometBFT currently does
// not support context usage. It passes "context.TODO()" to apps that
// implement the ABCI++ interface, and does not provide a context that is
// a child context of the one the node originally provides to comet.
// Thus the app cannot tell when the context as been cancelled or not.
// TODO: We must use this as a workaround for now until CometBFT properly
// generates contexts that inherit from the parent context we provide.
ctx context.Context
}

func NewService(
Expand Down Expand Up @@ -165,6 +174,7 @@ func (s *Service) Start(
return err
}

s.ctx = ctx
s.node, err = node.NewNode(
ctx,
cfg,
Expand Down Expand Up @@ -220,6 +230,12 @@ func (s *Service) Stop() error {
return errors.Join(errs...)
}

// ResetAppCtx sets the app ctx for the service. This is used
// primarily for the mock service.
func (s *Service) ResetAppCtx(ctx context.Context) {
s.ctx = ctx
}

// Name returns the name of the cometbft.
func (s *Service) Name() string {
return AppName
Expand Down Expand Up @@ -320,8 +336,9 @@ func (s *Service) getContextForProposal(
// on initialHeight. Panic appeases nilaway.
panic(fmt.Errorf("getContextForProposal: %w", errNilFinalizeBlockState))
}
ctx, _ = s.finalizeBlockState.Context().CacheContext()
return ctx
newCtx, _ := s.finalizeBlockState.Context().CacheContext()
// Preserve the CosmosSDK context while using the correct base ctx.
return newCtx.WithContext(ctx.Context())
}

// CreateQueryContext creates a new sdk.Context for a query, taking as args
Expand Down
3 changes: 2 additions & 1 deletion testing/simulated/simcomet.go
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,8 @@ func ProvideSimComet(
)}
}

func (s *SimComet) Start(_ context.Context) error {
func (s *SimComet) Start(ctx context.Context) error {
s.Comet.ResetAppCtx(ctx)
return nil
}

Expand Down
2 changes: 1 addition & 1 deletion testing/simulated/simulated_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ func (s *SimulatedSuite) SetupTest() {
}()

// Allow a short period for services to fully initialize.
time.Sleep(2 * time.Second)
time.Sleep(5 * time.Second)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noting here that this is a hacky way to make sure that goroutine running s.TestNode.Start(s.Ctx) above completes before we move on (otherwise consensus service may crash due to ctx being nil).
We will have to rework this to use a different way to check stuff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed to use a reliable method in #2572

It's not ideal, because ideally we'd have a readiness probe API but this is better than sleeping arbitrary time.

}

// TearDownTest cleans up the test environment.
Expand Down
Loading