Merge pull request #917 from PaddlePaddle/fix_doc

[doc]fix img link; rsl format;
pull/918/head
Hui Zhang 3 years ago committed by GitHub
commit 6d717a4935
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@@ -14,10 +14,11 @@ In addition, the training process and the testing process are also introduced.
 The arcitecture of the model is shown in Fig.1.
 <p align="center">
-<img src="../images/ds2onlineModel.png" width=800>
+<img src="../../images/ds2onlineModel.png" width=800>
 <br/>Fig.1 The Arcitecture of deepspeech2 online model
 </p>
 ### Data Preparation
 #### Vocabulary
 For English data, the vocabulary dictionary is composed of 26 English characters with " ' ", space, \<blank\> and \<eos\>. The \<blank\> represents the blank label in CTC, the \<unk\> represents the unknown character and the \<eos\> represents the start and the end characters. For mandarin, the vocabulary dictionary is composed of chinese characters statisticed from the training set and three additional characters are added. The added characters are \<blank\>, \<unk\> and \<eos\>. For both English and mandarin data, we set the default indexs that \<blank\>=0, \<unk\>=1 and \<eos\>= last index.
@@ -130,7 +131,7 @@ By using the command above, the training process can be started. There are 5 sta
 Using the command below, you can test the deepspeech2 online model.
 ```
 bash run.sh --stage 3 --stop_stage 5 --model_type online --conf_path conf/deepspeech2_online.yaml
 ```
 The detail commands are:
 ```
 conf_path=conf/deepspeech2_online.yaml
@@ -152,7 +153,7 @@ if [ ${stage} -le 5 ] && [ ${stop_stage} -ge 5 ]; then
 # test export ckpt avg_n
 CUDA_VISIBLE_DEVICES=0 ./local/test_export.sh ${conf_path} exp/${ckpt}/checkpoints/${avg_ckpt}.jit ${model_type}|| exit -1
 fi
 ```
 After the training process, we use stage 3,4,5 for testing process. The stage 3 is for testing the model generated in the stage 2 and provided the CER index of the test set. The stage 4 is for transforming the model from dynamic graph to static graph by using "paddle.jit" library. The stage 5 is for testing the model in static graph.
@@ -161,12 +162,13 @@ The deepspeech2 offline model is similarity to the deepspeech2 online model. The
 The arcitecture of the model is shown in Fig.2.
 <p align="center">
-<img src="../images/ds2offlineModel.png" width=800>
+<img src="../../images/ds2offlineModel.png" width=800>
 <br/>Fig.2 The Arcitecture of deepspeech2 offline model
 </p>
 For data preparation and decoder, the deepspeech2 offline model is same with the deepspeech2 online model.
 The code of encoder and decoder for deepspeech2 offline model is in:
@@ -182,7 +184,7 @@ For training and testing, the "model_type" and the "conf_path" must be set.
 # Training offline
 cd examples/aishell/s0
 bash run.sh --stage 0 --stop_stage 2 --model_type offline --conf_path conf/deepspeech2.yaml
 ```
 ```
 # Testing offline
 cd examples/aishell/s0

@@ -2,10 +2,11 @@
 Parakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle dynamic graph and includes many influential TTS models.
 <div align="center">
-<img src="docs/images/logo.png" width=300 /> <br>
+<img src="../../images/logo.png" width=300 /> <br>
 </div>
-## News <img src="./docs/images/news_icon.png" width="40"/>
+## News <img src="../../images/news_icon.png" width="40"/>
 - Oct-12-2021, Refector examples code.
 - Oct-12-2021, Parallel WaveGAN with LJSpeech. Check [examples/GANVocoder/parallelwave_gan/ljspeech](./examples/GANVocoder/parallelwave_gan/ljspeech).
 - Oct-12-2021, FastSpeech2/FastPitch with LJSpeech. Check [examples/fastspeech2/ljspeech](./examples/fastspeech2/ljspeech).

@@ -5,7 +5,8 @@
 ## Data
 | Data Subset | Duration in Seconds |
-| data/manifest.train | 1.23 ~ 14.53125 |
-| data/manifest.dev | 1.645 ~ 12.533 |
-| data/manifest.test | 1.859125 ~ 14.6999375 |
+| ------------------- | --------------------- |
+| data/manifest.train | 1.23 ~ 14.53125 |
+| data/manifest.dev | 1.645 ~ 12.533 |
+| data/manifest.test | 1.859125 ~ 14.6999375 |

@@ -2,8 +2,7 @@
 * s0 is for deepspeech2 offline
 * s1 is for transformer/conformer/U2
-* s2 is for transformer/conformer/U2 w/ kaldi feat
-need install Kaldi
+* s2 is for transformer/conformer/U2 w/ kaldi feat, need install Kaldi
 ## Data
 | Data Subset | Duration in Seconds |

@@ -1,14 +1,6 @@
 # LibriSpeech
-## Data
-| Data Subset | Duration in Seconds |
-| --- | --- |
-| data/manifest.train | 0.83s ~ 29.735s |
-| data/manifest.dev | 1.065 ~ 35.155s |
-| data/manifest.test-clean | 1.285s ~ 34.955s |
 ## Deepspeech2
 | Model | Params | release | Config | Test set | Loss | WER |
 | --- | --- | --- | --- | --- | --- | --- |
 | DeepSpeech2 | 42.96M | 2.2.0 | conf/deepspeech2.yaml + spec_aug | test-clean | 14.49190807 | 0.067283 |

@@ -1,3 +1,4 @@
 # Punctation Restoration
-Please using [PaddleSpeechTask](https://github.com/745165806/PaddleSpeechTask] to do this task.
+Please using [PaddleSpeechTask](https://github.com/745165806/PaddleSpeechTask) to do this task.
