diff --git a/docs/source/asr/deepspeech_architecture.md b/docs/source/asr/deepspeech_architecture.md
index 5a6ca886..be9471d9 100644
--- a/docs/source/asr/deepspeech_architecture.md
+++ b/docs/source/asr/deepspeech_architecture.md
@@ -14,10 +14,11 @@ In addition, the training process and the testing process are also introduced.
 The arcitecture of the model is shown in Fig.1.

-
+
 Fig.1 The Arcitecture of deepspeech2 online model

+
 ### Data Preparation
 #### Vocabulary
 For English data, the vocabulary dictionary is composed of 26 English characters with " ' ", space, \<blank\> and \<eos\>. The \<blank\> represents the blank label in CTC, the \<unk\> represents the unknown character and the \<eos\> represents the start and the end characters. For mandarin, the vocabulary dictionary is composed of chinese characters statisticed from the training set and three additional characters are added. The added characters are \<blank\>, \<unk\> and \<eos\>. For both English and mandarin data, we set the default indexs that \<blank\>=0, \<unk\>=1 and \<eos\>= last index.
@@ -130,7 +131,7 @@ By using the command above, the training process can be started. There are 5 sta
 Using the command below, you can test the deepspeech2 online model.
 ```
 bash run.sh --stage 3 --stop_stage 5 --model_type online --conf_path conf/deepspeech2_online.yaml
-```
+ ```
 The detail commands are:
 ```
 conf_path=conf/deepspeech2_online.yaml
@@ -152,7 +153,7 @@ if [ ${stage} -le 5 ] && [ ${stop_stage} -ge 5 ]; then
 # test export ckpt avg_n
 CUDA_VISIBLE_DEVICES=0 ./local/test_export.sh ${conf_path} exp/${ckpt}/checkpoints/${avg_ckpt}.jit ${model_type}|| exit -1
 fi
- ```
+```
 After the training process, we use stage 3,4,5 for testing process. The stage 3 is for testing the model generated in the stage 2 and provided the CER index of the test set. The stage 4 is for transforming the model from dynamic graph to static graph by using "paddle.jit" library. The stage 5 is for testing the model in static graph.
@@ -161,12 +162,13 @@ The deepspeech2 offline model is similarity to the deepspeech2 online model. The
 The arcitecture of the model is shown in Fig.2.

-
+
 Fig.2 The Arcitecture of deepspeech2 offline model

+
 For data preparation and decoder, the deepspeech2 offline model is same with the deepspeech2 online model.
 The code of encoder and decoder for deepspeech2 offline model is in:
@@ -182,7 +184,7 @@ For training and testing, the "model_type" and the "conf_path" must be set.
 # Training offline
 cd examples/aishell/s0
 bash run.sh --stage 0 --stop_stage 2 --model_type offline --conf_path conf/deepspeech2.yaml
-```
+ ```
 ```
 # Testing offline
 cd examples/aishell/s0
diff --git a/docs/source/tts/README.md b/docs/source/tts/README.md
index 87cac76e..18283cb2 100644
--- a/docs/source/tts/README.md
+++ b/docs/source/tts/README.md
@@ -2,10 +2,11 @@
 Parakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle dynamic graph and includes many influential TTS models.
-
+
-## News
+
+## News
 - Oct-12-2021, Refector examples code.
 - Oct-12-2021, Parallel WaveGAN with LJSpeech. Check [examples/GANVocoder/parallelwave_gan/ljspeech](./examples/GANVocoder/parallelwave_gan/ljspeech).
 - Oct-12-2021, FastSpeech2/FastPitch with LJSpeech. Check [examples/fastspeech2/ljspeech](./examples/fastspeech2/ljspeech).
diff --git a/examples/aishell/README.md b/examples/aishell/README.md
index 5e5c5ca9..82ef91da 100644
--- a/examples/aishell/README.md
+++ b/examples/aishell/README.md
@@ -5,7 +5,8 @@
 ## Data
-| Data Subset | Duration in Seconds |
-| data/manifest.train | 1.23 ~ 14.53125 |
-| data/manifest.dev | 1.645 ~ 12.533 |
-| data/manifest.test | 1.859125 ~ 14.6999375 |
+| Data Subset         | Duration in Seconds   |
+| ------------------- | --------------------- |
+| data/manifest.train | 1.23 ~ 14.53125       |
+| data/manifest.dev   | 1.645 ~ 12.533        |
+| data/manifest.test  | 1.859125 ~ 14.6999375 |
diff --git a/examples/librispeech/README.md b/examples/librispeech/README.md
index 57f506a4..72459095 100644
--- a/examples/librispeech/README.md
+++ b/examples/librispeech/README.md
@@ -1,9 +1,8 @@
 # ASR
-* s0 is for deepspeech2 offline
-* s1 is for transformer/conformer/U2
-* s2 is for transformer/conformer/U2 w/ kaldi feat
-need install Kaldi
+* s0 is for deepspeech2 offline
+* s1 is for transformer/conformer/U2
+* s2 is for transformer/conformer/U2 w/ kaldi feat, need install Kaldi
 ## Data
 | Data Subset | Duration in Seconds |
diff --git a/examples/librispeech/s0/README.md b/examples/librispeech/s0/README.md
index 11bcf5f6..77f92a2b 100644
--- a/examples/librispeech/s0/README.md
+++ b/examples/librispeech/s0/README.md
@@ -1,14 +1,6 @@
 # LibriSpeech
-## Data
-| Data Subset | Duration in Seconds |
-| --- | --- |
-| data/manifest.train | 0.83s ~ 29.735s |
-| data/manifest.dev | 1.065 ~ 35.155s |
-| data/manifest.test-clean | 1.285s ~ 34.955s |
-
 ## Deepspeech2
-
 | Model | Params | release | Config | Test set | Loss | WER |
 | --- | --- | --- | --- | --- | --- | --- |
 | DeepSpeech2 | 42.96M | 2.2.0 | conf/deepspeech2.yaml + spec_aug | test-clean | 14.49190807 | 0.067283 |
diff --git a/examples/other/punctuation_restoration/README.md b/examples/other/punctuation_restoration/README.md
index 42ae0db3..6393d8f5 100644
--- a/examples/other/punctuation_restoration/README.md
+++ b/examples/other/punctuation_restoration/README.md
@@ -1,3 +1,4 @@
 # Punctation Restoration
-Please using [PaddleSpeechTask](https://github.com/745165806/PaddleSpeechTask] to do this task.
+Please using [PaddleSpeechTask](https://github.com/745165806/PaddleSpeechTask) to do this task.
+