pull/4068/head
zxcd 4 months ago
parent 6b04ca51f4
commit 9e25663b95

@ -109,9 +109,9 @@ pwg_aishell3_ckpt_0.5
```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
The last argument selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
```text
usage: synthesize.py [-h]
[--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}]
@ -158,9 +158,9 @@ optional arguments:
```
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
The last argument selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
```text
usage: synthesize_e2e.py [-h]
[--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then
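
A note on the hunk above: `${4:-0}` is standard shell parameter expansion, so the fourth positional argument becomes optional and falls back to `0` when it is omitted or empty. The sketch below illustrates that behaviour under stated assumptions: `pick_stage` is a hypothetical helper, and the file names passed to it are placeholders, none of which come from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the argument handling shown in the hunk.
pick_stage() {
    local config_path=$1
    local train_output_path=$2
    local ckpt_name=$3
    # The fourth argument is optional; default to 0 when unset or empty.
    local stage=${4:-0}
    local stop_stage=${4:-0}
    echo "stage=${stage} stop_stage=${stop_stage}"
}

# Placeholder arguments, for illustration only.
pick_stage conf.yaml exp/default snapshot      # prints "stage=0 stop_stage=0"
pick_stage conf.yaml exp/default snapshot 1    # prints "stage=1 stop_stage=1"
```

Because `stage` and `stop_stage` are set from the same argument, exactly one vocoder branch is selected per call.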

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -27,13 +27,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize, vocoder is pwgan by default (0); pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, vocoder is pwgan by default (0); pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

@ -32,6 +32,6 @@ if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, default speech synthesis from Chinese to English, use stage1 to switch from English to Chinese
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, run speech synthesis in both directions: Chinese to English and English to Chinese
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -28,13 +28,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize, vocoder is pwgan by default (0); pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, vocoder is pwgan by default (0); pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
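
The `run.sh` hunks gate each step with `[ ${stage} -le N ] && [ ${stop_stage} -ge N ]`, which runs step `N` only when it lies in the inclusive range from `stage` to `stop_stage`. The sketch below shows that check in isolation. The helper names `should_run` and `steps_for` are hypothetical, and the step list `0..4` is an assumption taken from the highest step visible in the hunk.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when step n lies within [stage, stop_stage].
should_run() {
    local stage=$1 stop_stage=$2 n=$3
    [ "${stage}" -le "${n}" ] && [ "${stop_stage}" -ge "${n}" ]
}

# Hypothetical helper: list the steps (0..4) that a stage range would run.
steps_for() {
    local stage=$1 stop_stage=$2 n out=""
    for n in 0 1 2 3 4; do
        if should_run "${stage}" "${stop_stage}" "${n}"; then
            out="${out}${n} "
        fi
    done
    # Drop the trailing space before printing.
    echo "${out% }"
}

steps_for 2 3   # prints "2 3"
steps_for 0 0   # prints "0"
```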

@ -116,9 +116,9 @@ pwg_baker_ckpt_0.4
```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis; stages `0-4` correspond to {`pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`}.
The last argument selects the vocoder used during synthesis; values `0-4` correspond to {`pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`}.
```text
usage: synthesize.py [-h]
@ -166,9 +166,9 @@ optional arguments:
```
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis; stages `0,1,3,4` correspond to {`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}.
The last argument selects the vocoder used during synthesis; values `0,1,3,4` correspond to {`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}.
```text
usage: synthesize_e2e.py [-h]

@ -3,8 +3,8 @@
config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -27,15 +27,15 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default stage 0
# use stage 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize, vocoder is pwgan by default (0)
# pass 1-4 to select the vocoder in {multi band melgan, style melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default stage 0
# use stage 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, vocoder is pwgan by default (0)
# pass 1, 3 or 4 to select the vocoder in {multi band melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
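
The comments in this hunk map the trailing number to a vocoder. As a reading aid, the lookup below spells out that mapping for `synthesize.sh`. It is only a sketch: `vocoder_for` is a hypothetical function, and the index-to-name order is an assumption inferred from the order of the list in the README text, not taken from the script itself.

```shell
#!/usr/bin/env bash
# Hypothetical lookup from the trailing number to the vocoder it selects.
# The order is assumed from the README list; an omitted argument means 0.
vocoder_for() {
    case "${1:-0}" in
        0) echo "pwgan" ;;
        1) echo "multi band melgan" ;;
        2) echo "style melgan" ;;
        3) echo "hifigan" ;;
        4) echo "wavernn" ;;
        *) echo "unknown vocoder index: $1" >&2; return 1 ;;
    esac
}

vocoder_for      # prints "pwgan"
vocoder_for 3    # prints "hifigan"
```

For `synthesize_e2e.sh` the README lists only `0,1,3,4`, so index `2` (`style melgan`) would not apply there.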

@ -3,6 +3,7 @@
config_path=$1
train_output_path=$2
ckpt_name=$3
stage=${4:-0}
stop_stage=${4:-0}

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -28,13 +28,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default stage 0
# use stage 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize, vocoder is pwgan by default (0)
# pass 1-4 to select the vocoder in {multi band melgan, style melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default stage 0
# use stage 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, vocoder is pwgan by default (0)
# pass 1, 3 or 4 to select the vocoder in {multi band melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi

@ -105,9 +105,9 @@ pwg_ljspeech_ckpt_0.5
```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
The last argument selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
```text
usage: synthesize.py [-h]
@ -155,9 +155,9 @@ optional arguments:
```
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
The last argument selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
```text
usage: synthesize_e2e.py [-h]

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -27,13 +27,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize, vocoder is pwgan by default (0); pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, vocoder is pwgan by default (0); pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

@ -172,9 +172,9 @@ optional arguments:
`local/pinyin_to_phone.txt` comes from the readme of the opencpop dataset, indicating the mapping from pinyin to phonemes in opencpop.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
The last argument selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
```text
usage: synthesize_e2e.py [-h]

@ -175,9 +175,9 @@ optional arguments:
`local/pinyin_to_phone.txt` comes from the README of the opencpop dataset and gives the mapping from pinyin to phonemes in opencpop.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
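`--stage` selects the vocoder model used during synthesis; it takes the value `0` or `1`, corresponding to the `pwgan` or `hifigan` model as the vocoder.
The last argument `0` selects the vocoder model used during synthesis; it takes the value `0` or `1`, corresponding to the `pwgan` or `hifigan` model as the vocoder.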
```text
usage: synthesize_e2e.py [-h]

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -32,6 +32,6 @@ if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, vocoder is pwgan by default; pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi

@ -85,9 +85,8 @@ hifigan_vctk_ckpt_0.2.0
```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name}
```
`--stage` selects the vocoder used during synthesis; the only value is `0`, which uses the `hifigan` model as vocoder.
## Speech Synthesis and Speech Editing
@ -142,7 +141,6 @@ You can check the text of downloaded wavs in `source/README.md`.
```bash
./run.sh --stage 3 --stop-stage 3 --gpus 0
```
`stage 3` of `run.sh` calls `local/synthesize_e2e.sh`, whose `stage 0` is **Speech Synthesis** and whose `stage 1` is **Speech Editing**.
You can modify `--wav_path`, `--old_str` and `--new_str` yourself. `--old_str` should be the text corresponding to the audio of `--wav_path`, and `--new_str` should be designed according to `--task_name`. Both `--source_lang` and `--target_lang` should be `en` for a model trained on the VCTK dataset.
## Pretrained Model

@ -28,10 +28,10 @@ fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is hifigan by default
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize, task_name is speech synthesize by default stage 0, stage 1 will use speech edit as taskname
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, run both speech synthesis and speech editing
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi

@ -108,9 +108,9 @@ pwg_vctk_ckpt_0.1.1
```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
The last argument selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
```text
usage: synthesize.py [-h]
@ -158,9 +158,9 @@ optional arguments:
```
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0
```
`--stage` selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
The last argument selects the vocoder used during synthesis: `0` for `pwgan` or `1` for `hifigan`.
```text
usage: synthesize_e2e.py [-h]

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -4,8 +4,8 @@ config_path=$1
train_output_path=$2
ckpt_name=$3
stage=0
stop_stage=0
stage=${4:-0}
stop_stage=${4:-0}
# pwgan
if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then

@ -27,13 +27,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize, vocoder is pwgan by default (0); pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
# synthesize_e2e, vocoder is pwgan by default (0); pass 1 to use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} 0 || exit -1
fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then
