eSpeak issues when running on Windows (solution) #15

Open
lethanner opened this issue Mar 5, 2025 · 3 comments

Comments

@lethanner

lethanner commented Mar 5, 2025

Installing the eSpeak .msi from its official repository is not enough. I got many errors while trying to run DiffRhythm on Windows, ranging from the common "espeak not installed" to "no such file or directory" without any details.

In order:

eSpeak not installed

Set these user environment variables in your OS:
PHONEMIZER_ESPEAK_LIBRARY -> C:\Program Files\eSpeak NG\libespeak-ng.dll
PHONEMIZER_ESPEAK_PATH -> C:\Program Files\eSpeak NG
Make sure eSpeak NG is really installed in the C:\Program Files\eSpeak NG directory.
Reboot your PC to apply the changes.
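
For anyone who wants to try the same values without touching the system settings, here is a minimal Python sketch that builds the two variables from the install directory and sets them for the current process only. The helper names `espeak_env` and `apply_espeak_env` are made up for illustration, and it is an assumption (not tested in this thread) that setting them in-process before the phonemizer is imported is enough.

```python
import os
from pathlib import PureWindowsPath

# Default install location named in the steps above.
DEFAULT_DIR = r"C:\Program Files\eSpeak NG"


def espeak_env(install_dir=DEFAULT_DIR):
    """Return the two variables from this issue for a given install dir.

    Hypothetical helper: it only builds strings and does not check
    that the files exist.
    """
    root = PureWindowsPath(install_dir)
    return {
        "PHONEMIZER_ESPEAK_LIBRARY": str(root / "libespeak-ng.dll"),
        "PHONEMIZER_ESPEAK_PATH": str(root),
    }


def apply_espeak_env(install_dir=DEFAULT_DIR):
    """Set the variables for the current process only (assumption: this
    must run before the code that reads them is imported)."""
    env = espeak_env(install_dir)
    os.environ.update(env)
    return env
```

Calling `apply_espeak_env()` at the very top of the launch script would be the intended usage. It does not persist anything, so the OS-level variables above are still the safer route.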

Could not load the mbrola.dll (failed to load voice "ja")

Download https://github.com/thiekus/MBROLA/releases/download/3.3/mbrola_build_3.3_rev2.zip. Open the Win64/Release folder in the archive (I hope that no one is still using the 32-bit version of Windows) and extract the files to the eSpeak NG installation directory.
Also download and install http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pcwin/MbrolaTools35.exe.

(source: numediart/MBROLA#46)

Could not load the specified mbrola voice file (again while loading voice "ja")

See my comment espeak-ng/espeak-ng#723 (comment).
UPDATE 07.03.2025: the Japanese voice needs to be installed only so that the voices preload without errors. It's not used in inference for now! See #17 (comment) for explanation.

No such file or directory (once more while loading voice "ja")

See numediart/MBROLA#46 (comment).
Download the archive from the link above and extract the folder inside it directly to the C:\Program Files\eSpeak NG\espeak-ng-data directory.
The final path should be C:\Program Files\eSpeak NG\espeak-ng-data\mbrola_ph.
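
A quick way to check that the steps above left everything in place is a small script like the following. It is only a sketch: `missing_espeak_files` is a hypothetical helper, and the assumption that mbrola.dll ends up directly in the installation directory comes from the extraction step above. The Japanese voice file itself is not checked because its location is described in the linked espeak-ng comment, not here.

```python
from pathlib import Path

# Paths relative to the eSpeak NG install directory, taken from the
# steps in this issue. The mbrola.dll location is an assumption.
EXPECTED = (
    "libespeak-ng.dll",
    "mbrola.dll",
    "espeak-ng-data/mbrola_ph",
)


def missing_espeak_files(install_dir):
    """Return the expected relative paths that do not exist yet."""
    root = Path(install_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_espeak_files(r"C:\Program Files\eSpeak NG")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files found.")
```

An empty result only means the files are present; it does not prove that the voices load.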

Conclusion

Now it works. Let me know if I forgot anything.
Dear developers, if you find this information useful, could you please add it to the README?

@NZqian
Collaborator

NZqian commented Mar 6, 2025

Thank you so much! We haven't tested on Windows yet. We will definitely put this information in the README. Or would you mind providing more details about your deployment on Windows and starting a PR? A deployment guide for Windows would be perfect.

@lethanner
Author

start a pr

Yes, I would really like to start a PR! But sorry, I forgot to warn you that what I posted is only a temporary workaround: it gets DiffRhythm running on an "at least it works somehow" basis.

At the moment, the question of why it uses the Japanese voice pack remains unresolved - and (probably because of this) there are serious problems with how English lyrics are performed (I have not tested Chinese yet). I will open a separate issue for this problem.

Anyway, the generation of the music (without paying attention to lyrics) pleased me with the quality and speed - thank you for your work!

So, I was planning to open a PR after solving the lyrics issue.

@lethanner
Author

lethanner commented Mar 7, 2025

UPDATE 07.03.2025: the Japanese voice needs to be installed only so that the voices preload without errors. It's not used in inference for now, don't panic! See #17 (comment) for explanation.
