Any plans to have a real training script for DSV3?
Right now run.py only runs a forward pass on dummy data for DSV2, so it's unclear how much of DSV3 is supported and whether it actually works.
For the DSV3 forward pass, I was able to run on 32 H200s but had to lower config.max_seq_len quite a bit; otherwise it OOMed when setting up symmetric memory. Would love to be able to train DSV3!