16GB of RAM needed to process the first block #670
Comments
I wouldn't call this a bug in Gaia. It most likely stems from the SDK and how it reads the genesis file. |
I'd say it's a bug with Gaia until we fix it in the SDK :). Right now only gaia is being significantly impacted, so it is users of gaia who would feel this. Is it possible to put the same issue on two repositories? |
The docs have the right info: https://github.com/cosmos/gaia/pull/621/files |
I guess I am calling the level of resource consumption a bug. It's of course good to have the system requirements documented, but we're dealing with a not-terribly-large JSON file, and I figure that the import should not require 16GB of RAM. |
I don't believe this to be a bug for the following reasons:
I think @faddat has highlighted an important point and instead of closing this I think we could transform this into a root cause verification and investigation that the json load actually requires 16GB. Not sure if it should be here or on tm repo (since gaia has no control over the underlying data management/memory system). |
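One way to start that kind of verification, independent of Gaia itself, is to measure how much memory a JSON parse needs relative to the size of the file. The sketch below is only an illustration in Python, not the SDK's Go code path: the synthetic document, the `parse_overhead` helper, and the account fields are all assumptions made up for the example, and the numbers it produces say nothing definitive about Gaia's genesis import.

```python
import json
import tracemalloc


def parse_overhead(num_accounts=10_000):
    """Return (serialized_bytes, peak_traced_bytes) for parsing a synthetic document.

    The document shape is hypothetical and only loosely resembles a genesis file.
    """
    doc = {
        "app_state": {
            "accounts": [
                {
                    "address": f"addr{i:08d}",
                    "coins": [{"denom": "uatom", "amount": str(i)}],
                }
                for i in range(num_accounts)
            ]
        }
    }
    raw = json.dumps(doc)
    del doc  # drop the original so only the parse is measured below

    tracemalloc.start()
    parsed = json.loads(raw)  # kept alive so the peak covers the full object tree
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del parsed

    return len(raw.encode("utf-8")), peak


if __name__ == "__main__":
    size, peak = parse_overhead()
    print(f"serialized: {size} bytes, peak while parsing: {peak} bytes")
```

Comparing the two numbers for a few document sizes gives a rough idea of how far the in-memory representation exceeds the file size in this one runtime; the equivalent measurement in Go would be needed before drawing conclusions about the SDK.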
Comparison: Another chain I work with (graphene framework) imports this file: https://gateway.pinata.cloud/ipfs/QmPrwVpwe4Ya46CN9LXNnrUdWvaDLMwFetMUdpcdpjFbyu (678MB of JSON, 1.35 million accounts with balances and multiple public keys per account) on a Raspberry Pi 4 with 4GB RAM in a few minutes. When I get back I will link to what I think is the relevant code in tendermint; I haven't been able to figure out why it's slow just yet. 16GB is fine I guess, but we only ever use 16GB once per node and it rules out a wide range of devices. |
So we eat the json here: But my feeling is that the slowdown occurs when dealing with tm.db: I am going to do a silly test: unsafe-reset-all on my 12 core / 128GB machine and start with a RAM store. The strange thing is that we don't consume much CPU while importing genesis.json, but the bottleneck probably isn't disk either. That machine is where I started the node, and it's got RAID 0 2TB NVMe disks. |
Gaia v4.0.5 resolves the startup time by bumping up to Cosmos SDK 0.41.4. It may be related to the 16GB memory requirement; startup is down to 10 mins. wdyt of running your test again? |
Absolutely. I'll try it on a Raspberry Pi now. |
@shahankhatch With what setup did you get a 10 minute start? It took 34 minutes for me on my giant machine at Hetzner; I used the quickstart snippet exactly.
|
I started it with skipping invariants. From your logs it looks like invariants took the amount of time that would coincide with a <10min startup without the invariant checks. Do you agree? Mind running it again with memory usage metrics? Just to determine if this issue is on point with the discussion. |
I don't mind at all. Is there a flag for this? Right now, I am starting it up on an rpi and timing it. Will be interesting. Fully-auto rpi4 image builds for Gaia are also near completion: https://github.com/faddat/sos/actions/workflows/gaia.yml I would like to make a PR for that here, what do you think? |
Gaia starting on an rpi:
About 36 minutes, and there was no out of memory issue. I could repeat this with a tool like htop, but if you've got another way to measure memory consumption you'd like me to use, just let me know :). I kind of reckon that it is now safe to close this issue. It's a 4GB rpi. |
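For the memory measurement discussed above, one alternative to watching htop is to launch the node from a small wrapper and read the peak resident set size the OS recorded for the child process. This is a minimal sketch, assuming a Unix-like system where Python's `resource` module is available; the `peak_child_rss` and `to_mib` helpers are hypothetical names for this example, and the command to run is whatever you normally use to start the node.

```python
import resource
import subprocess
import sys


def peak_child_rss(cmd, timeout=None):
    """Run cmd and return the peak RSS recorded for waited-for child processes.

    If timeout (seconds) expires first, the child is terminated and then waited for.
    The value is in raw ru_maxrss units: kilobytes on Linux, bytes on macOS.
    It is the maximum over all children this Python process has waited for so far.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()
        proc.wait()
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


def to_mib(ru_maxrss):
    """Convert a raw ru_maxrss value to MiB, accounting for the platform's units."""
    if sys.platform == "darwin":
        return ru_maxrss / (1024 * 1024)
    return ru_maxrss / 1024


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python peak_rss.py <command> [args...]
    peak = peak_child_rss(sys.argv[1:])
    print(f"peak RSS: {to_mib(peak):.1f} MiB")
```

Because a node normally keeps running, pass a `timeout` long enough to cover the genesis import (about 36 minutes in the run above) so the wrapper stops the process and reports the peak up to that point.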
Summary of Bug
https://github.com/cosmos/gaia/blob/a96d7f50b875c557f6c5fa98ac9db50b9fee68b5/docs/migration/cosmoshub-3.md#preliminary
I think that we should treat the RAM requirement as a bug. Other Cosmos networks will face this issue and it rules out running nodes on smaller machines. I figure that this is somehow related to #669.
Version
4.0.4
Steps to Reproduce
Run gaia on a Raspberry Pi, or any machine with less than 16GB RAM. All nodes have to process genesis when they start, meaning that this does not only affect validators, but instead anyone wanting to run gaia nodes.