To aux or not to aux? #332

macarbonneau · 2020-01-06T18:44:22Z

Hello!
First of all, thank you for the repo. I tried to train my own universal WaveRNN, but I cannot get it to generate quality samples. I used the config file that is provided here: https://github.com/erogol/WaveRNN

However, in #221 (comment) there is a trained model that I can download, so I can peek at its configuration.

In the latter case, the aux_net and upsampling net from Fatchord are used. In the former case, these augmentations are not used.

Here is my question: Is it possible to train a universal WaveRNN without the upsampling and auxiliary networks?

erogol · 2020-01-07T01:36:27Z

In my case, it was not possible to remove either of these. If you remove them, the quality degrades a lot. One option for the upsampling net is to estimate the mean value of its output and then upsample deterministically at inference time so that the result reaches the same mean.

All of these comments apply to the LJSpeech dataset. I did not try any of this with a multi-speaker dataset.
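
For illustration only (this is not code from the repo): a minimal sketch of one way to read the deterministic upsampling idea above. It assumes the mel spectrogram is a list of channels, that upsampling is plain frame repetition by the hop length, and that "reaching the same mean" means adding a constant offset. The names `repeat_upsample`, `match_mean`, `hop_length` and `target_mean` are hypothetical, and `target_mean` is assumed to have been estimated beforehand from the trained upsampling net's outputs.

```python
def repeat_upsample(mel, hop_length):
    """Repeat every frame hop_length times along the time axis.

    mel is a list of channels; each channel is a list of frame values.
    """
    return [[v for v in channel for _ in range(hop_length)] for channel in mel]


def match_mean(features, target_mean):
    """Shift all values by a constant so the global mean equals target_mean.

    Assumption: an additive offset is used; a multiplicative scale
    would be another valid reading of the suggestion.
    """
    values = [v for channel in features for v in channel]
    offset = target_mean - sum(values) / len(values)
    return [[v + offset for v in channel] for channel in features]


# Toy input: 2 mel channels, 3 frames each (real inputs would be e.g. 80 channels).
mel = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
upsampled = repeat_upsample(mel, hop_length=4)       # 3 frames -> 12 samples
conditioned = match_mean(upsampled, target_mean=1.0)
```

Whether this recovers the quality of the learned upsampling net is not something this thread establishes; the sketch only shows the mechanics.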

erogol closed this as completed Jan 7, 2020
