
About the scheduler used for finetuning. #724

Open
aHapBean opened this issue Feb 27, 2025 · 4 comments

@aHapBean

I noticed in the inference script (cli_demo.py) the following comment:

    # 2. Set Scheduler.
    # Can be changed to `CogVideoXDPMScheduler` or `CogVideoXDDIMScheduler`.
    # We recommend `CogVideoXDDIMScheduler` for CogVideoX-2B
    # and `CogVideoXDPMScheduler` for CogVideoX-5B / CogVideoX-5B-I2V.

So, when fine-tuning CogVideoX-2B, should I set the scheduler to CogVideoXDPMScheduler or CogVideoXDDIMScheduler? I couldn't find any documentation on this. In the released training code, CogVideoXDPMScheduler is used by default for training.

If I use CogVideoXDPMScheduler for fine-tuning and switch to CogVideoXDDIMScheduler for inference, will the results be worse compared to using CogVideoXDDIMScheduler for both fine-tuning and inference?
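For reference, the inference-side scheduler swap in cli_demo.py looks roughly like this (a minimal sketch assuming the diffusers API and the `THUDM/CogVideoX-2b` checkpoint; the `timestep_spacing="trailing"` argument mirrors what the demo passes):

    import torch
    from diffusers import CogVideoXPipeline, CogVideoXDDIMScheduler

    # Load the released 2B weights (distributed in FP16).
    pipe = CogVideoXPipeline.from_pretrained(
        "THUDM/CogVideoX-2b", torch_dtype=torch.float16
    )

    # Swap in the recommended scheduler for the 2B model, reusing the
    # pipeline's existing scheduler config.
    pipe.scheduler = CogVideoXDDIMScheduler.from_config(
        pipe.scheduler.config, timestep_spacing="trailing"
    )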

@aHapBean
Author

In addition, is it acceptable to fine-tune the 2B model in bf16 and run inference in bf16, even though the official recommendation is fp16?

I have attempted to perform inference with 'bf16' on the released CogVideoX-2B model, and the results appear to be reasonable.
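A minimal sketch of what I tried (the prompt is just an illustrative placeholder):

    import torch
    from diffusers import CogVideoXPipeline

    # The released 2B weights are FP16, but they can be cast to bf16 at load time.
    pipe = CogVideoXPipeline.from_pretrained(
        "THUDM/CogVideoX-2b", torch_dtype=torch.bfloat16
    ).to("cuda")

    video = pipe(prompt="A panda playing guitar in a bamboo forest").frames[0]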

@liyz15

liyz15 commented Mar 5, 2025

They should be identical for training: both schedulers define the same forward noising process and differ only in how sampling steps are taken at inference.

@zRzRzRzRzRzRzR
Member

You can use bf16 for 2B inference, but the released model weights are FP16. For training, you should use CogVideoXDDIMScheduler; we mainly work on the 5B model, which uses DPM. The 2B model should also work normally with DPM, though.
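In a diffusers-style fine-tuning script, that would mean loading the DDIM scheduler for the forward noising step, roughly like this (a sketch, assuming the `THUDM/CogVideoX-2b` checkpoint layout):

    from diffusers import CogVideoXDDIMScheduler

    # Load the scheduler config shipped with the checkpoint and use it
    # for the forward (noise-adding) process during fine-tuning.
    noise_scheduler = CogVideoXDDIMScheduler.from_pretrained(
        "THUDM/CogVideoX-2b", subfolder="scheduler"
    )

    # Inside the training loop:
    # noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)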

@aHapBean
Author

aHapBean commented Mar 9, 2025

Thank you for your response. It's clear to me now and I will proceed as recommended.
