1. `--config` is the Style MelGAN config file. Use the same config that the model was trained with.
@ -113,3 +110,20 @@ optional arguments:
3. `--test-metadata` is the metadata of the test dataset. Use the `metadata.jsonl` in the `dev/norm` subfolder from the processed directory.
4. `--output-dir` is the directory to save the synthesized audio files.
5. `--ngpu` is the number of GPUs to use; if `ngpu == 0`, the CPU is used.
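
As a rough illustration of how the flags listed above fit together, the following sketch parses `--config`, `--test-metadata`, `--output-dir` and `--ngpu` and applies the `ngpu == 0` rule. It is not the actual synthesis script: the helper names `build_parser` and `select_device`, the default of `--ngpu`, the rejection of negative values, and the example paths are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser covering only the flags described in the list above.
    parser = argparse.ArgumentParser(description="Sketch of the synthesis flags.")
    parser.add_argument("--config", type=str, help="Style MelGAN config file.")
    parser.add_argument("--test-metadata", type=str, help="metadata.jsonl of the test set.")
    parser.add_argument("--output-dir", type=str, help="Directory for synthesized audio.")
    # The default of 1 is an assumption, not taken from the original script.
    parser.add_argument("--ngpu", type=int, default=1, help="If ngpu == 0, use cpu.")
    return parser


def select_device(ngpu: int) -> str:
    # Map the --ngpu value to a device name: 0 means CPU, anything above means GPU.
    # Rejecting negative values is an assumption of this sketch.
    if ngpu < 0:
        raise ValueError("ngpu must be >= 0")
    return "cpu" if ngpu == 0 else "gpu"


# Example paths are placeholders; substitute the ones from your own run.
args = build_parser().parse_args([
    "--config", "conf/default.yaml",
    "--test-metadata", "dump/dev/norm/metadata.jsonl",
    "--output-dir", "exp/test",
    "--ngpu", "0",
])
print(select_device(args.ngpu))  # prints "cpu"
```

In a Paddle-based script, the returned string would typically be handed to `paddle.set_device`.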
## Pretrained Models
The pretrained model can be downloaded here [style_melgan_csmsc_ckpt_0.1.1.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/style_melgan/style_melgan_csmsc_ckpt_0.1.1.zip).
A static model of Style MelGAN is not available yet.
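
One possible way to fetch and unpack the checkpoint archive linked above, using only the Python standard library, is sketched here. The helper names `archive_name` and `fetch_and_unpack` and the `pretrained_models` target directory are hypothetical; only the URL comes from this document.

```python
import urllib.request
import zipfile
from pathlib import Path
from urllib.parse import urlparse

CKPT_URL = (
    "https://paddlespeech.bj.bcebos.com/Parakeet/released_models/"
    "style_melgan/style_melgan_csmsc_ckpt_0.1.1.zip"
)


def archive_name(url: str) -> str:
    # Take the file name from the last path component of the URL.
    return Path(urlparse(url).path).name


def fetch_and_unpack(url: str, dest_dir: str) -> Path:
    # Download the archive into dest_dir (skipped if already present) and extract it there.
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name(url)
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    return dest


# Usage (performs a network download, so it is left commented out here):
# fetch_and_unpack(CKPT_URL, "pretrained_models")
```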
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/synthesize_e2e.py`, which synthesizes waveforms from a text file.
```bash
@ -142,7 +140,6 @@ usage: synthesize_e2e.py [-h]
[--waveflow-checkpoint WAVEFLOW_CHECKPOINT]
[--phones-dict PHONES_DICT] [--text TEXT]
[--output-dir OUTPUT_DIR] [--ngpu NGPU]
[--verbose VERBOSE]
Synthesize with transformer tts & waveflow.
@ -165,7 +162,6 @@ optional arguments:
--output-dir OUTPUT_DIR
output dir.
--ngpu NGPU if ngpu == 0, use cpu.
--verbose VERBOSE verbose.
```
1. `--transformer-tts-config`, `--transformer-tts-checkpoint`, `--transformer-tts-stat` and `--phones-dict` are arguments for transformer_tts, which correspond to the 4 files in the transformer_tts pretrained model.
2. `--waveflow-config`, `--waveflow-checkpoint` are arguments for waveflow, which correspond to the 2 files in the waveflow pretrained model.