Base op-reth Archival Node: Can't sync #11512

Closed
DaveWK opened this issue Oct 5, 2024 · 10 comments
Labels
A-op-reth (Related to Optimism and op-reth)
A-staged-sync (Related to staged sync (pipelines and stages))
C-bug (An unexpected or incorrect behavior)
S-stale (This issue/PR is stale and will close with no further activity)

Comments

@DaveWK

DaveWK commented Oct 5, 2024

Describe the bug

I have tried a few different setups, but it does not appear that I can sync an archive node (without --full) and keep it in sync on AWS.
I am using io2 storage (20k IOPS) with an r7a.2xlarge (64 GB of RAM, 8 AMD EPYC 9R14 cores), and the node keeps looping through the pipeline stages without ever catching up. The culprit appears to be MerkleExecute, and I can see from the performance metrics that it is not a CPU-bound problem: the single core (I assume this is a serialized, single-threaded stage) is not maxed out, but my disk IOPS and utilization are always at 100%. The amount of data being transferred is also quite small; even with 20k IOPS I am only reading/writing around 8 MB of data.
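
For context, this is what the disk-bound pattern looks like when watched with iostat from the sysstat package (a minimal sketch; the device name nvme1n1 is an assumption for the io2 data volume):

    # extended per-device stats in MB/s, refreshed every 2 seconds
    iostat -dxm 2 nvme1n1

The symptom described above shows up as r/s + w/s sitting near the provisioned 20k IOPS while rMB/s + wMB/s stay in the single digits, i.e. lots of small random reads and writes.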

My suspicion is that the mdbx file is too "sparse" and needs some kind of online compaction or "defrag", but I don't know how to debug this. Running mdbx_copy is not really a solution, since it takes 5 hours to run (and is not an online operation), and I am not able to sync from the available reth-base archive snapshot.
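
For anyone who wants to inspect or compact the database offline anyway, a rough sketch (assuming op-reth exposes the same db subcommands as reth, and that the mdbx_copy build includes the -c compaction option; the node must be stopped first, and the paths are illustrative):

    # table and freelist statistics for the mdbx database
    op-reth db --datadir ${HOME}/oprethdata stats

    # offline compacting copy; needs enough free space for a full second copy
    mdbx_copy -c ${HOME}/oprethdata/db ${HOME}/oprethdata-compacted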

Steps to reproduce

  1. Download the base-reth archive
  2. make op-maxperf
  3. exec op-reth node --chain=base \
       --rollup.sequencer-http https://mainnet-sequencer.base.org \
       --http --http.port 8545 --ws --ws.port 8546 \
       --http.api=web3,debug,eth,net,txpool \
       --ws.api=web3,debug,eth,net,txpool \
       --metrics=127.0.0.1:9001 \
       --ws.origins="*" \
       --http.corsdomain="*" \
       --rollup.discovery.v4 \
       --engine.experimental \
       --authrpc.jwtsecret ${HOME}/jwt.hex \
       --datadir ${HOME}/oprethdata
  4. Wait around 5 hours and see that it keeps looping through the pipeline and never catches up. The longest stage, regardless of the number of blocks, seems to be MerkleExecute, which always takes around 2-3 hours (the metrics endpoint enabled in step 3 makes this easy to watch; see the sketch below).
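
A quick way to watch the stage checkpoints is to scrape the Prometheus metrics endpoint enabled in step 3 (a sketch; it assumes the exporter serves the Prometheus text format at the listen address, and the grep filter is only illustrative):

    # snapshot of sync/stage-related metrics exposed by op-reth
    curl -s http://127.0.0.1:9001 | grep -i sync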

Node logs

No response

Platform(s)

Linux (x86)

What version/commit are you on?

v1.0.8

What database version are you on?

2

Which chain / network are you on?

base mainnet

What type of node are you running?

Archive (default)

What prune config do you use, if any?

n/a

If you've built Reth from source, provide the full command you used

make maxperf-op

Code of Conduct

  • I agree to follow the Code of Conduct
@DaveWK added the C-bug (An unexpected or incorrect behavior) and S-needs-triage (This issue needs to be labelled) labels Oct 5, 2024
@DaveWK
Author

DaveWK commented Oct 5, 2024

For the record, without --engine.experimental the performance is even worse.

@mattsse
Collaborator

mattsse commented Oct 5, 2024

very odd, same as #11306

we haven't tried to reproduce this from the snapshot yet, but we resynced Base entirely on infrastructure similar to yours without any issues.
I wonder if this has anything to do with the most recent snapshot itself; will check.

Resyncing the Base archive takes ~48 hrs, so for now I'd recommend that.

@DaveWK
Author

DaveWK commented Oct 7, 2024

I tried to sync over the weekend "from scratch" by using db drop, but the result is the same: it never catches up and keeps looping over the stages. I am now trying rm -rf db instead, to see whether that gets it in sync.

Should doing a db drop reset ALL indexes/counters/locks in mdbx?

My working theory is that something is using a non-deterministic hash as some kind of "foreign key relation", so it keeps re-inserting or updating the same records over and over and never reaches a true checkpoint. Perhaps a table uses "wall clock time" as an index and diverges because an old record with a different "wall clock time" but the same hash exists?

Another suspicion: does MDBX have a concept of iterators such that an index/key isn't being reset (or contains gaps in records)?

Another thing of note: the "safe" hash reported by op-node is always 0x0000000000000, or whatever an all-zero record would be. Is something broken, since it never sets a "safe" hash for a checkpoint?
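
For what it's worth, the safe and finalized heads as op-reth itself sees them can be checked over the HTTP RPC enabled in the repro command (a sketch, assuming the port 8545 used there):

    # ask op-reth for its current "safe" and "finalized" heads
    curl -s -X POST http://localhost:8545 -H 'Content-Type: application/json' \
      -d '{"jsonrpc":"2.0","id":1,"method":"eth_getBlockByNumber","params":["safe",false]}'
    curl -s -X POST http://localhost:8545 -H 'Content-Type: application/json' \
      -d '{"jsonrpc":"2.0","id":1,"method":"eth_getBlockByNumber","params":["finalized",false]}'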

@Rjected
Member

Rjected commented Oct 7, 2024

My working theory is that something is using a non-deterministic hash as some kind of "foreign key relation", so it keeps re-inserting or updating the same records over and over and never reaches a true checkpoint. Perhaps a table uses "wall clock time" as an index and diverges because an old record with a different "wall clock time" but the same hash exists?

This is definitely not it, but I wonder if this is because of how op-node chooses ranges to sync, combined with a poor / high-latency disk.

@DaveWK
Author

DaveWK commented Oct 7, 2024

Thanks for the clarification on that -- I think the disk latency may be symptomatic rather than a root cause. I have 20k provisioned IOPS on an AWS io2 volume, and it saturates the IOPS while only reading/writing less than 10 MB total.

This log from op-node seems wonky, but I can't tell if it's just a bad log message:

t=2024-10-07T15:34:03+0000 lvl=info msg="Sync progress" reason="new chain head block" l2_finalized=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_pending_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_unsafe=0xb09fad979d89096372aadd19969428526cdee2e2959dd676f987d0b5abefdbbe:20762948 l2_backup_unsafe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_time=1728315243

The fact that finalized/safe/pending_safe never change would support the theory that it's resyncing the same ranges every time, but it could also just be bad logging.
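
The same numbers can also be pulled from op-node directly rather than read off the log line (a sketch, assuming op-node's RPC is enabled on its default port 9545):

    # op-node's view of the unsafe/safe/finalized L2 heads
    curl -s -X POST http://localhost:9545 -H 'Content-Type: application/json' \
      -d '{"jsonrpc":"2.0","id":1,"method":"optimism_syncStatus","params":[]}'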

@DaveWK
Author

DaveWK commented Oct 8, 2024

So I think I have narrowed it down a bit.

I noticed that when I use --syncmode=execution-layer with op-node, it never updates the safe/finalized fields.

I tried using hildr to see if that would work, and with either execution-layer or full sync, hildr does indeed update those fields.

When I use hildr, however, it complains about "batches" (which I assume were previously inserted by op-node) being invalid because their wall-clock time is too far skewed from an "expected" value. I also noticed the performance is pretty bad, which I assume is also due to the batches op-node previously inserted.

I am not running with --rollup.enable-genesis-walkback, and the docs suggest that when I restart op-reth without that flag, it should at least advance the safe/finalized values to the last "unsafe" values, but it does not appear to do so.

So I think this has something to do with attempting "execution layer" sync modes with reth: either reth is not processing the batches, or op-node is not properly handing the batches off to op-reth, so they never reach a "finalized/safe" state.

Unfortunately I ran into other bugs with hildr, where it would freak out about null values in safe/finalized, as well as the batch time validation mismatches, so I can't use hildr as a replacement for op-node.

At the moment I am doing a sync from scratch with op-reth and op-node on Base with "consensus-layer" sync, which at least seems to be moving up in blocks at a good speed, but it will take a while until it's finished.
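
For reference, the op-node flags for the two modes being compared look roughly like this (a sketch; the L1/L1-beacon endpoints are placeholders, and the authrpc port 8551 is assumed to be op-reth's default):

    # consensus-layer (derivation-based) sync
    op-node \
      --network=base-mainnet \
      --l1=<l1-rpc-url> \
      --l1.beacon=<l1-beacon-url> \
      --l2=http://localhost:8551 \
      --l2.jwt-secret=${HOME}/jwt.hex \
      --syncmode=consensus-layer

    # execution-layer sync, the mode where safe/finalized never advanced:
    #   --syncmode=execution-layer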

It does appear that doing a db drop does not clear out whatever these batch entries are; they seem to persist.

@DaveWK
Author

DaveWK commented Oct 8, 2024

I'm also curious whether safe/finalized should ever actually be 0x00; having the genesis block values be the "lowest" a node can wind back to would at least prevent these null/zero value issues.

@emhane added the A-staged-sync (Related to staged sync (pipelines and stages)) and A-op-reth (Related to Optimism and op-reth) labels and removed the S-needs-triage (This issue needs to be labelled) label Oct 8, 2024
@bix29

bix29 commented Oct 18, 2024

I encountered the same issue when I tried to do the initial sync from the Base mainnet archive snapshot with --syncmode=execution-layer:

  1. The MerkleExecute stage of op-reth is slow; it seems it can never catch up with the latest block:
    ts=2024-10-18T01:00:11.571181174Z level=info target=reth::cli message=Status connected_peers=30 stage=MerkleExecute checkpoint=21195263 target=21204568 stage_progress=87.39% stage_eta="32m 12s"
  2. The op-node sync progress log shows the same all-zero safe/finalized heads:
    t=2024-10-18T01:02:57+0000 lvl=info msg="Sync progress" reason="new chain head block" l2_finalized=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_pending_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_unsafe=0x3d813e78926bbdefe1fc5cb7a87470de1216c0393ab13c5d860ce54a6d1ab665:21212015 l2_backup_unsafe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_time=1729213377

Disk IOPS / CPU / memory usage are all at relatively low levels.


This issue is stale because it has been open for 21 days with no activity.

@github-actions bot added the S-stale (This issue/PR is stale and will close with no further activity) label Nov 15, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale.

@github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Nov 23, 2024
@github-project-automation bot moved this from Todo to Done in Reth Tracker Nov 23, 2024