[TTS] Specify the input data types for G2PW, test=tts (#2288)

* Fix ONNXRuntimeError: specify the data type (int64), test=tts

* Tactron2 → Tacotron2, test=doc
pull/2299/head
李子 2 years ago committed by GitHub
parent 3f9339edff
commit 5a58a27492
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@@ -67,7 +67,7 @@ WaveRNN | CSMSC |[WaveRNN-csmsc](https://github.com/PaddlePaddle/PaddleSpeech/tr
 Model Type | Dataset| Example Link | Pretrained Models
 :-------------:| :------------:| :-----: | :-----: |
 GE2E| AISHELL-3, etc. |[ge2e](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/other/ge2e)|[ge2e_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/ge2e/ge2e_ckpt_0.3.zip)
-GE2E + Tactron2| AISHELL-3 |[ge2e-tactron2-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/vc0)|[tacotron2_aishell3_ckpt_vc0_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/tacotron2/tacotron2_aishell3_ckpt_vc0_0.2.0.zip)
+GE2E + Tacotron2| AISHELL-3 |[ge2e-Tacotron2-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/vc0)|[tacotron2_aishell3_ckpt_vc0_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/tacotron2/tacotron2_aishell3_ckpt_vc0_0.2.0.zip)
 GE2E + FastSpeech2 | AISHELL-3 |[ge2e-fastspeech2-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/vc1)|[fastspeech2_nosil_aishell3_vc1_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_nosil_aishell3_vc1_ckpt_0.5.zip)

@@ -7,7 +7,7 @@ The examples in PaddleSpeech are mainly classified by datasets, the TTS datasets
 * VCTK (English multiple speakers)
 The models in PaddleSpeech TTS have the following mapping relationship:
-* tts0 - Tactron2
+* tts0 - Tacotron2
 * tts1 - TransformerTTS
 * tts2 - SpeedySpeech
 * tts3 - FastSpeech2
@@ -17,7 +17,7 @@ The models in PaddleSpeech TTS have the following mapping relationship:
 * voc3 - MultiBand MelGAN
 * voc4 - Style MelGAN
 * voc5 - HiFiGAN
-* vc0 - Tactron2 Voice Clone with GE2E
+* vc0 - Tacotron2 Voice Clone with GE2E
 * vc1 - FastSpeech2 Voice Clone with GE2E
 ## Quick Start

@@ -9,7 +9,7 @@
 The TTS models in PaddleSpeech have the following mapping relationship:
-* tts0 - Tactron2
+* tts0 - Tacotron2
 * tts1 - TransformerTTS
 * tts2 - SpeedySpeech
 * tts3 - FastSpeech2
@@ -19,7 +19,7 @@ The TTS models in PaddleSpeech have the following mapping relationship:
 * voc3 - MultiBand MelGAN
 * voc4 - Style MelGAN
 * voc5 - HiFiGAN
-* vc0 - Tactron2 Voice Clone with GE2E
+* vc0 - Tacotron2 Voice Clone with GE2E
 * vc1 - FastSpeech2 Voice Clone with GE2E
 ## Quick Start

@@ -769,7 +769,7 @@
 "```\n",
 "The README.md of each dataset describes how subdirectories correspond to models; in TTS the correspondence is as follows:\n",
 "```text\n",
-"tts0 - Tactron2\n",
+"tts0 - Tacotron2\n",
 "tts1 - TransformerTTS\n",
 "tts2 - SpeedySpeech\n",
 "tts3 - FastSpeech2\n",

@@ -1,6 +1,6 @@
 # Aishell3
-* tts0 - Tactron2
+* tts0 - Tacotron2
 * tts1 - TransformerTTS
 * tts2 - SpeedySpeech
 * tts3 - FastSpeech2
@@ -8,5 +8,5 @@
 * voc1 - Parallel WaveGAN
 * voc2 - MelGAN
 * voc3 - MultiBand MelGAN
-* vc0 - Tactron2 Voice Cloning with GE2E
+* vc0 - Tacotron2 Voice Cloning with GE2E
 * vc1 - FastSpeech2 Voice Cloning with GE2E

@@ -1,7 +1,7 @@
 # CSMSC
-* tts0 - Tactron2
+* tts0 - Tacotron2
 * tts1 - TransformerTTS
 * tts2 - SpeedySpeech
 * tts3 - FastSpeech2

@@ -1,7 +1,7 @@
 # LJSpeech
-* tts0 - Tactron2
+* tts0 - Tacotron2
 * tts1 - TransformerTTS
 * tts2 - SpeedySpeech
 * tts3 - FastSpeech2

@@ -1,7 +1,7 @@
 # VCTK
-* tts0 - Tactron2
+* tts0 - Tacotron2
 * tts1 - TransformerTTS
 * tts2 - SpeedySpeech
 * tts3 - FastSpeech2

@@ -81,12 +81,12 @@ def prepare_onnx_input(tokenizer,
         position_ids.append(position_id)
     outputs = {
-        'input_ids': np.array(input_ids),
+        'input_ids': np.array(input_ids).astype(np.int64),
-        'token_type_ids': np.array(token_type_ids),
+        'token_type_ids': np.array(token_type_ids).astype(np.int64),
-        'attention_masks': np.array(attention_masks),
+        'attention_masks': np.array(attention_masks).astype(np.int64),
         'phoneme_masks': np.array(phoneme_masks).astype(np.float32),
-        'char_ids': np.array(char_ids),
+        'char_ids': np.array(char_ids).astype(np.int64),
-        'position_ids': np.array(position_ids),
+        'position_ids': np.array(position_ids).astype(np.int64),
     }
     return outputs
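
A minimal sketch of what the hunk above changes: each input array gets an explicit dtype before it is passed to the ONNX model, instead of relying on NumPy's platform-dependent default integer type. The names `EXPECTED_DTYPES` and `cast_onnx_inputs` are hypothetical and are not part of PaddleSpeech; the dtype table only mirrors the new side of the diff. The int32 sample arrays stand in for a platform whose default NumPy integer is 32-bit, which is an assumption about how the original error arose, not something the commit states.

```python
import numpy as np

# Hypothetical dtype table, copied from the new side of the diff:
# integer id tensors become int64, the phoneme mask stays float32.
EXPECTED_DTYPES = {
    'input_ids': np.int64,
    'token_type_ids': np.int64,
    'attention_masks': np.int64,
    'phoneme_masks': np.float32,
    'char_ids': np.int64,
    'position_ids': np.int64,
}


def cast_onnx_inputs(raw_inputs):
    """Return a new dict whose arrays carry the explicit dtypes above."""
    return {
        name: np.asarray(values).astype(EXPECTED_DTYPES[name])
        for name, values in raw_inputs.items()
    }


# Sample inputs deliberately built as int32 to imitate a platform
# where np.array([...]) of Python ints does not default to int64.
raw = {
    'input_ids': np.array([[101, 2769, 102]], dtype=np.int32),
    'token_type_ids': np.array([[0, 0, 0]], dtype=np.int32),
    'attention_masks': np.array([[1, 1, 1]], dtype=np.int32),
    'phoneme_masks': [[0, 1, 1, 0]],
    'char_ids': np.array([3], dtype=np.int32),
    'position_ids': np.array([1], dtype=np.int32),
}

feed = cast_onnx_inputs(raw)
for name, array in feed.items():
    print(name, array.dtype)
```

The resulting `feed` dict has the same keys as the `outputs` dict in the hunk, so it could be handed to an inference session in the same way; casting at the point where the arrays are built, as the commit does, keeps the fix in one place.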
