diff --git a/examples/canton/tts3/README.md b/examples/canton/tts3/README.md
index f4acd131c..3bf4fd8ee 100644
--- a/examples/canton/tts3/README.md
+++ b/examples/canton/tts3/README.md
@@ -74,44 +74,4 @@ Also, there is a `metadata.jsonl` in each subfolder. It is a table-like file tha
 ### Training
 details can refer to the script of [examples/aishell3/tts3](../../aishell3/tts3).
 
-## Pretrained Model(Waiting========)
-Pretrained FastSpeech2 model with no silence in the edge of audios:
-- [fastspeech2_aishell3_ckpt_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_ckpt_1.1.0.zip)
-- [fastspeech2_conformer_aishell3_ckpt_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_conformer_aishell3_ckpt_0.2.0.zip) (Thanks for [@awmmmm](https://github.com/awmmmm)'s contribution)
-
-
-FastSpeech2 checkpoint contains files listed below.
-
-```text
-fastspeech2_aishell3_ckpt_1.1.0
-├── default.yaml # default config used to train fastspeech2
-├── energy_stats.npy # statistics used to normalize energy when training fastspeech2
-├── phone_id_map.txt # phone vocabulary file when training fastspeech2
-├── pitch_stats.npy # statistics used to normalize pitch when training fastspeech2
-├── snapshot_iter_96400.pdz # model parameters and optimizer states
-├── speaker_id_map.txt # speaker id map file when training a multi-speaker fastspeech2
-└── speech_stats.npy # statistics used to normalize spectrogram when training fastspeech2
-```
-You can use the following scripts to synthesize for `${BIN_DIR}/../sentences.txt` using pretrained fastspeech2 and parallel wavegan models.
-```bash
-source path.sh
-
-FLAGS_allocator_strategy=naive_best_fit \
-FLAGS_fraction_of_gpu_memory_to_use=0.01 \
-python3 ${BIN_DIR}/../synthesize_e2e.py \
-    --am=fastspeech2_aishell3 \
-    --am_config=fastspeech2_aishell3_ckpt_1.1.0/default.yaml \
-    --am_ckpt=fastspeech2_aishell3_ckpt_1.1.0/snapshot_iter_96400.pdz \
-    --am_stat=fastspeech2_aishell3_ckpt_1.1.0/speech_stats.npy \
-    --voc=pwgan_aishell3 \
-    --voc_config=pwg_aishell3_ckpt_0.5/default.yaml \
-    --voc_ckpt=pwg_aishell3_ckpt_0.5/snapshot_iter_1000000.pdz \
-    --voc_stat=pwg_aishell3_ckpt_0.5/feats_stats.npy \
-    --lang=zh \
-    --text=${BIN_DIR}/../sentences.txt \
-    --output_dir=exp/default/test_e2e \
-    --phones_dict=fastspeech2_aishell3_ckpt_1.1.0/phone_id_map.txt \
-    --speaker_dict=fastspeech2_aishell3_ckpt_1.1.0/speaker_id_map.txt \
-    --spk_id=0 \
-    --inference_dir=exp/default/inference
-```
+## Pretrained Model
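
Reviewer note (not part of the diff): the checkpoint layout described in the removed lines can be verified locally before the section is rewritten. A minimal sketch follows; the download URL and expected file names are taken directly from the removed section, and the availability of `wget` and `unzip` on the host is assumed.

```bash
# Sketch: fetch the AISHELL-3 FastSpeech2 checkpoint linked in the removed section
# and confirm its contents match the file tree that the removed lines documented.
wget https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_ckpt_1.1.0.zip
unzip fastspeech2_aishell3_ckpt_1.1.0.zip

# Expected contents (per the removed tree): default.yaml, energy_stats.npy,
# phone_id_map.txt, pitch_stats.npy, snapshot_iter_96400.pdz,
# speaker_id_map.txt, speech_stats.npy
ls fastspeech2_aishell3_ckpt_1.1.0
```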
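Once a Canton checkpoint is actually released, the new `## Pretrained Model` stub will need a synthesis command again. The sketch below only mirrors the flag set of the removed AISHELL-3 command; the `CANTON_CKPT` directory, the `snapshot_iter_XXXXX.pdz` file name, and the `--am=fastspeech2_canton` / `--lang=canton` values are placeholders and assumptions, not confirmed by this diff.

```bash
source path.sh

# Placeholders: replace with the real unzip path and snapshot of the future Canton release.
CANTON_CKPT=fastspeech2_canton_ckpt_x.y.z
# Vocoder checkpoint reused verbatim from the removed AISHELL-3 command.
VOC_CKPT=pwg_aishell3_ckpt_0.5

FLAGS_allocator_strategy=naive_best_fit \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python3 ${BIN_DIR}/../synthesize_e2e.py \
    --am=fastspeech2_canton \
    --am_config=${CANTON_CKPT}/default.yaml \
    --am_ckpt=${CANTON_CKPT}/snapshot_iter_XXXXX.pdz \
    --am_stat=${CANTON_CKPT}/speech_stats.npy \
    --voc=pwgan_aishell3 \
    --voc_config=${VOC_CKPT}/default.yaml \
    --voc_ckpt=${VOC_CKPT}/snapshot_iter_1000000.pdz \
    --voc_stat=${VOC_CKPT}/feats_stats.npy \
    --lang=canton \
    --text=${BIN_DIR}/../sentences.txt \
    --output_dir=exp/default/test_e2e \
    --phones_dict=${CANTON_CKPT}/phone_id_map.txt \
    --speaker_dict=${CANTON_CKPT}/speaker_id_map.txt \
    --spk_id=0 \
    --inference_dir=exp/default/inference
```

A Cantonese input text file may be needed in place of `${BIN_DIR}/../sentences.txt`; that path is kept here only because it is the one used by the removed command.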