Merge branch 'PaddlePaddle:develop' into develop

pull/2615/head
HuangLiangJie 3 years ago committed by GitHub
commit 832ff0e6aa

@@ -157,6 +157,7 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision
 - 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV).

 ### Recent Update
+- 👑 2022.11.01: Add [Adversarial Loss](https://arxiv.org/pdf/1907.04448.pdf) for [Chinese English mixed TTS](./examples/zh_en_tts/tts3).
 - 🔥 2022.10.26: Add [Prosody Prediction](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/other/rhy) for TTS.
 - 🎉 2022.10.21: Add [SSML](https://github.com/PaddlePaddle/PaddleSpeech/discussions/2538) for TTS Chinese Text Frontend.
 - 👑 2022.10.11: Add [Wav2vec2ASR](./examples/librispeech/asr3), wav2vec2.0 fine-tuning for ASR on LibriSpeech.
@@ -716,9 +717,9 @@ PaddleSpeech supports a series of most popular models. They are summarized in [r
 <tr>
 <td>Keyword Spotting</td>
 <td>hey-snips</td>
-<td>PANN</td>
+<td>MDTC</td>
 <td>
-<a href = "./examples/hey_snips/kws0">pann-hey-snips</a>
+<a href = "./examples/hey_snips/kws0">mdtc-hey-snips</a>
 </td>
 </tr>
 </tbody>
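For context on the corrected rows above: the hey-snips recipe linked from the table is the MDTC-based keyword spotting example. A hedged usage sketch, assuming the `KWSExecutor` Python API under `paddlespeech.cli.kws`; the input path is a placeholder 16 kHz mono recording:

```python
# Sketch only: assumes paddlespeech is installed and that KWSExecutor
# downloads a default MDTC hey-snips checkpoint on first use.
from paddlespeech.cli.kws import KWSExecutor

kws = KWSExecutor()
# './audio.wav' is a placeholder input file, not shipped with the repo.
result = kws(audio_file='./audio.wav')
print(result)  # expected to report whether the wake word was detected
```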

@@ -164,6 +164,7 @@
 ### 近期更新
+- 👑 2022.11.01: [中英文混合 TTS](./examples/zh_en_tts/tts3) 新增 [Adversarial Loss](https://arxiv.org/pdf/1907.04448.pdf) 模块。
 - 🔥 2022.10.26: TTS 新增[韵律预测](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/other/rhy)功能。
 - 🎉 2022.10.21: TTS 中文文本前端新增 [SSML](https://github.com/PaddlePaddle/PaddleSpeech/discussions/2538) 功能。
 - 👑 2022.10.11: 新增 [Wav2vec2ASR](./examples/librispeech/asr3), 在 LibriSpeech 上针对 ASR 任务对 wav2vec2.0 的 finetuning。
@@ -696,9 +697,9 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声
 </table>
-<a name="唤醒模型"></a>
-**唤醒**
+<a name="语音唤醒模型"></a>
+**语音唤醒**
 <table style="width:100%">
 <thead>
@@ -711,11 +712,11 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声
 </thead>
 <tbody>
 <tr>
-<td>唤醒</td>
+<td>语音唤醒</td>
 <td>hey-snips</td>
-<td>PANN</td>
+<td>MDTC</td>
 <td>
-<a href = "./examples/hey_snips/kws0">pann-hey-snips</a>
+<a href = "./examples/hey_snips/kws0">mdtc-hey-snips</a>
 </td>
 </tr>
 </tbody>

@@ -108,7 +108,7 @@ for epoch in range(1, epochs + 1):
 optimizer.clear_grad()

 # Calculate loss
-avg_loss = loss.numpy()[0]
+avg_loss = float(loss)

 # Calculate metrics
 preds = paddle.argmax(logits, axis=1)
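The `loss.numpy()[0]` to `float(loss)` replacement (repeated in the hunks below) appears to track newer Paddle behaviour where a scalar loss is a 0-D tensor, so indexing its NumPy view no longer applies; casting with `float()` works either way. A minimal sketch with an illustrative loss value:

```python
import paddle

# A scalar (0-D) tensor, as a typical loss reduction returns in recent Paddle versions.
loss = paddle.mean(paddle.to_tensor([0.25, 0.75]))

# Old pattern: loss.numpy()[0] assumes a 1-element 1-D array.
# Portable pattern: cast the scalar tensor directly to a Python float.
avg_loss = float(loss)
print(avg_loss)  # 0.5
```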

@@ -509,7 +509,7 @@
 " optimizer.clear_grad()\n",
 "\n",
 " # Calculate loss\n",
-" avg_loss += loss.numpy()[0]\n",
+" avg_loss += float(loss)\n",
 "\n",
 " # Calculate metrics\n",
 " preds = paddle.argmax(logits, axis=1)\n",

@@ -55,7 +55,7 @@ If you want to finetune Chinese pretrained model, you need to prepare Chinese da
 000001|ka2 er2 pu3 pei2 wai4 sun1 wan2 hua2 ti1
 ```
-Here is an example of the first 200 data of csmsc.
+Here is a Chinese data example of the first 200 data of csmsc.
 ```bash
 mkdir -p input && cd input
@@ -69,7 +69,7 @@ If you want to finetune English pretrained model, you need to prepare English da
 LJ001-0001|Printing, in the only sense with which we are at present concerned, differs from most if not from all the arts and crafts represented in the Exhibition
 ```
-Here is an example of the first 200 data of ljspeech.
+Here is an English data example of the first 200 data of ljspeech.
 ```bash
 mkdir -p input && cd input
@@ -78,7 +78,7 @@ unzip ljspeech_mini.zip
 cd ../
 ```
-If you want to finetune Chinese-English Mixed pretrained model, you need to prepare Chinese data or English data. Here is an example of the first 12 data of SSB0005 (the speaker of aishell3).
+If you want to finetune Chinese-English Mixed pretrained model, you need to prepare Chinese data or English data. Here is a Chinese data example of the first 12 data of SSB0005 (the speaker of aishell3).
 ```bash
 mkdir -p input && cd input
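The label files in these finetuning docs share a plain `utterance_id|transcription` layout (pinyin with tone numbers for csmsc, raw text for ljspeech). A minimal reading sketch; the `input/csmsc_mini/labels.txt` path is a placeholder for wherever the downloaded example data lands:

```python
from pathlib import Path

def load_labels(path: str) -> dict:
    """Read 'utt_id|transcription' lines, e.g. '000001|ka2 er2 pu3 pei2 ...'."""
    labels = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        utt_id, text = line.split("|", maxsplit=1)
        labels[utt_id] = text.strip()
    return labels

# Placeholder path; adjust to the actual labels file inside ./input.
# print(load_labels("input/csmsc_mini/labels.txt"))
```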

@@ -101,7 +101,7 @@ if __name__ == "__main__":
 optimizer.clear_grad()

 # Calculate loss
-avg_loss += loss.numpy()[0]
+avg_loss += float(loss)

 # Calculate metrics
 preds = paddle.argmax(logits, axis=1)

@@ -110,7 +110,7 @@ if __name__ == '__main__':
 optimizer.clear_grad()

 # Calculate loss
-avg_loss += loss.numpy()[0]
+avg_loss += float(loss)

 # Calculate metrics
 num_corrects += corrects

@@ -0,0 +1,10 @@
+0001 考古人员<speak>西<say-as pinyin='zang4'>藏</say-as>布达拉宫里发现一个被隐<say-as pinyin="cang2">藏</say-as>的装有宝<say-as pinyin="zang4">藏</say-as></speak>箱子。
+0002 <speak>有人询问中国银<say-as pinyin='hang2'>行</say-as>北京分<say-as pinyin='hang2 hang2'>行行</say-as>长是否叫任我<say-as pinyin='xing2'>行</say-as></speak>。
+0003 <speak>市委书记亲自<say-as pinyin='shuai4'>率</say-as>领审计员对这家公司进行财务审计,发现企业的利润<say-as pinyin='lv4'>率</say-as>数据虚假</speak>。
+0004 <speak>学生们对代<say-as pinyin='shu4'>数</say-as>理解不深刻,特别是小<say-as pinyin='shu4'>数</say-as>点,在<say-as pinyin='shu3 shu4'>数数</say-as>时容易弄错</speak>。
+0005 <speak>赵<say-as pinyin='chang2'>长</say-as>军从小学习武术,擅<say-as pinyin='chang2'>长</say-as>散打,<say-as pinyin='zhang3'>长</say-as>大后参军,担任连<say-as pinyin='zhang3'>长</say-as></speak>。
+0006 <speak>我说她<say-as pinyin='zhang3'>涨</say-as>了工资,她就<say-as pinyin='zhang4'>涨</say-as>红着脸,摇头否认</speak>。
+0007 <speak>请把这封信交<say-as pinyin='gei3'>给</say-as>团长,告诉他,前线的供<say-as pinyin='ji3'>给</say-as>一定要有保障</speak>。
+0008 <speak>矿下的<say-as pinyin='hang4'>巷</say-as>道,与北京四合院的小<say-as pinyin='xiang4'>巷</say-as>有点相似</speak>。
+0009 <speak>他常叹自己命<say-as pinyin='bo2'>薄</say-as>,几亩<say-as pinyin='bao2'>薄</say-as>田,种点<say-as pinyin='bo4'>薄</say-as>荷</speak>。
+0010 <speak>小明对天相很有研究,在<say-as pinyin='su4'>宿</say-as>舍说了一<say-as pinyin='xiu3'>宿</say-as>有关星<say-as pinyin='xiu4'>宿</say-as>的常识</speak>。
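The new test file above exercises the SSML `<say-as pinyin=...>` tag to force readings of polyphonic characters (藏, 行, 率, 长, ...). A minimal, frontend-independent sketch that pulls those pinyin overrides out of one such line with the standard library:

```python
import re

# Line 0002 from the test file above.
LINE = ("0002 <speak>有人询问中国银<say-as pinyin='hang2'>行</say-as>北京分"
        "<say-as pinyin='hang2 hang2'>行行</say-as>长是否叫任我"
        "<say-as pinyin='xing2'>行</say-as></speak>。")

# Each <say-as pinyin='...'>chars</say-as> pairs one or more characters with forced readings.
pattern = re.compile(r"""<say-as pinyin=['"]([^'"]+)['"]>([^<]+)</say-as>""")
for pinyin, chars in pattern.findall(LINE):
    print(chars, "->", pinyin)
# 行 -> hang2
# 行行 -> hang2 hang2
# 行 -> xing2
```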

@@ -13,6 +13,7 @@
 # limitations under the License.
 import paddle
 import torch
+from paddle.device.cuda import synchronize
 from parallel_wavegan.layers import residual_block
 from parallel_wavegan.layers import upsample
 from parallel_wavegan.models import parallel_wavegan as pwgan
@@ -24,7 +25,6 @@ from paddlespeech.t2s.models.parallel_wavegan import PWGGenerator
 from paddlespeech.t2s.models.parallel_wavegan import ResidualBlock
 from paddlespeech.t2s.models.parallel_wavegan import ResidualPWGDiscriminator
 from paddlespeech.t2s.utils.layer_tools import summary
-from paddlespeech.t2s.utils.profile import synchronize

 paddle.set_device("gpu:0")
 device = torch.device("cuda:0")
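For context on the import swap above: `paddle.device.cuda.synchronize()` blocks until pending GPU kernels finish, which a comparison script like this needs before reading timings. A hedged sketch of the usual pattern (the matmul workload is illustrative and assumes a CUDA-enabled Paddle build):

```python
import time

import paddle
from paddle.device.cuda import synchronize

paddle.set_device("gpu:0")  # assumes a CUDA device is available
x = paddle.randn([1024, 1024])

start = time.perf_counter()
y = paddle.matmul(x, x)
synchronize()  # wait for the asynchronous GPU kernel before stopping the clock
print(f"matmul took {time.perf_counter() - start:.4f} s")
```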
