diff --git a/docs/source/released_model.md b/docs/source/released_model.md
index 2b584163..8f855f7c 100644
--- a/docs/source/released_model.md
+++ b/docs/source/released_model.md
@@ -61,7 +61,7 @@ WaveRNN | CSMSC |[WaveRNN-csmsc](https://github.com/PaddlePaddle/PaddleSpeech/tr
 Model Type | Dataset| Example Link | Pretrained Models
 :-------------:| :------------:| :-----: | :-----:
 GE2E| AISHELL-3, etc. |[ge2e](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/other/ge2e)|[ge2e_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/ge2e/ge2e_ckpt_0.3.zip)
-GE2E + Tactron2| AISHELL-3 |[ge2e-tactron2-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/vc0)|[tacotron2_aishell3_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/tacotron2/tacotron2_aishell3_ckpt_0.3.zip)
+GE2E + Tactron2| AISHELL-3 |[ge2e-tactron2-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/vc0)|[tacotron2_aishell3_ckpt_vc0_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/tacotron2/tacotron2_aishell3_ckpt_vc0_0.2.0.zip)
 GE2E + FastSpeech2 | AISHELL-3 |[ge2e-fastspeech2-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/vc1)|[fastspeech2_nosil_aishell3_vc1_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_nosil_aishell3_vc1_ckpt_0.5.zip)
 
 
diff --git a/examples/aishell3/vc0/README.md b/examples/aishell3/vc0/README.md
index 29585eb4..664ec1ac 100644
--- a/examples/aishell3/vc0/README.md
+++ b/examples/aishell3/vc0/README.md
@@ -116,3 +116,25 @@ ref_audio
 ```bash
 CUDA_VISIBLE_DEVICES=${gpus} ./local/voice_cloning.sh ${conf_path} ${train_output_path} ${ckpt_name} ${ge2e_params_path} ${ref_audio_dir}
 ```
+
+## Pretrained Model
+[tacotron2_aishell3_ckpt_vc0_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/tacotron2/tacotron2_aishell3_ckpt_vc0_0.2.0.zip)
+
+
+Model | Step | eval/loss | eval/l1_loss | eval/mse_loss | eval/bce_loss| eval/attn_loss
+:-------------:| :------------:| :-----: | :-----: | :--------: |:--------:|:---------:
+default| 2(gpu) x 37596|0.58704|0.39623|0.15073|0.039|1.9981e-04|
+
+Tacotron2 checkpoint contains files listed below.
+(There is no need for `speaker_id_map.txt` here )
+
+```text
+tacotron2_aishell3_ckpt_vc0_0.2.0
+├── default.yaml # default config used to train tacotron2
+├── phone_id_map.txt # phone vocabulary file when training tacotron2
+├── snapshot_iter_37596.pdz # model parameters and optimizer states
+└── speech_stats.npy # statistics used to normalize spectrogram when training tacotron2
+```
+
+## More
+We strongly recommend that you use [FastSpeech2 + AISHELL-3 Voice Cloning](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/vc1) which works better.
diff --git a/examples/aishell3/vc0/conf/default.yaml b/examples/aishell3/vc0/conf/default.yaml
index 16a4a60c..26096eb2 100644
--- a/examples/aishell3/vc0/conf/default.yaml
+++ b/examples/aishell3/vc0/conf/default.yaml
@@ -77,7 +77,7 @@ optimizer:
 ###########################################################
 # TRAINING SETTING #
 ###########################################################
-max_epoch: 200
+max_epoch: 100
 num_snapshots: 5
 
 ###########################################################