Merge pull request #1677 from PaddlePaddle/Jackwaterveg-patch-1

[Doc] update readme for aishell/asr0
3 years ago · 44ee5cd805
parent 86c6e1952c 75c9dc773b
commit 44ee5cd805
4 changed files with 22 additions and 84 deletions
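
The READMEs touched here write each archive name twice, once for `wget` and once for `tar`. A minimal sketch of one way to keep the two in sync, assuming a Bash shell; the `fetch_and_unpack` helper is hypothetical and is not part of this commit or of PaddleSpeech:

```bash
# URL taken from the updated aishell/asr1 README in this commit.
url="https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz"

# Strip everything up to the last "/" so the local file name is derived
# from the URL instead of being typed a second time.
archive="${url##*/}"
echo "${archive}"   # prints: asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz

# Hypothetical helper (an assumption for illustration): download an archive
# and unpack the file that wget just saved under the same name.
fetch_and_unpack() {
    wget "$1"
    tar xzvf "${1##*/}"
}

# Uncomment to actually download and unpack the model:
# fetch_and_unpack "${url}"
```

With this pattern, renaming a released archive only requires changing the URL.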
--- a/examples/aishell/asr0/README.md
+++ b/examples/aishell/asr0/README.md
@@ -151,21 +151,14 @@ avg.sh best exp/deepspeech2/checkpoints 1
 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/deepspeech2.yaml exp/deepspeech2/checkpoints/avg_1
 ```
 ## Pretrained Model
-You can get the pretrained transformer or conformer using the scripts below:
-```bash
-Deepspeech2 offline:
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/ds2.model.tar.gz
-
-Deepspeech2 online:
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/aishell_ds2_online_cer8.00_release.tar.gz
+You can get the pretrained models from [this](../../../docs/source/released_model.md).

-```
 Use the `tar` command to unpack the model, then run the test script to evaluate it.

 For example:
 ```
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/ds2.model.tar.gz
-tar xzvf ds2.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
+tar xzvf asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
 source path.sh
 # If you have already processed the data and generated the manifest file, you can skip the following 2 steps
 bash local/data.sh --stage -1 --stop_stage -1
@@ -209,8 +202,8 @@ if [ ${stage} -le 6 ] && [ ${stop_stage} -ge 6 ]; then
 ```
 you can train the model by yourself, or you can download the pretrained model by the script below:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/ds2.model.tar.gz
-tar xzvf ds2.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
+tar xzvf asr0_deepspeech2_aishell_ckpt_0.1.1.model.tar.gz
 ```
 You can download the audio demo:
 ```bash
--- a/examples/aishell/asr1/README.md
+++ b/examples/aishell/asr1/README.md
@@ -143,25 +143,14 @@ avg.sh best exp/conformer/checkpoints 20
 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
 ```
 ## Pretrained Model
-You can get the pretrained transformer or conformer using the scripts below:
+You can get the pretrained transformer or conformer from [this](../../../docs/source/released_model.md)

-```bash
-# Conformer:
-wget https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.release.tar.gz
-
-# Chunk Conformer:
-wget https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.chunk.release.tar.gz
-
-# Transformer:
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
-
-```
 Use the `tar` command to unpack the model, then run the test script to evaluate it.

 For example:
 ```
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
-tar xzvf transformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
+tar xzvf asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
 source path.sh
 # If you have already processed the data and generated the manifest file, you can skip the following 2 steps
 bash local/data.sh --stage -1 --stop_stage -1
@@ -206,7 +195,7 @@ In some situations, you want to use the trained model to do the inference for th
 ```
 you can train the model by yourself using ```bash run.sh --stage 0 --stop_stage 3```, or you can download the pretrained model through the script below:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/transformer.model.tar.gz
-tar xzvf transformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
+tar xzvf asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz
 ```
 You can download the audio demo:
--- a/examples/librispeech/asr1/README.md
+++ b/examples/librispeech/asr1/README.md
@@ -151,44 +151,22 @@ avg.sh best exp/conformer/checkpoints 20
 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
 ```
 ## Pretrained Model
-You can get the pretrained transformer or conformer using the scripts below:
-```bash
-# Conformer:
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
-# Transformer:
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/transformer.model.tar.gz
-```
+You can get the pretrained transformer or conformer from [this](../../../docs/source/released_model.md).
+
 Use the `tar` command to unpack the model, then run the test script to evaluate it.

 For example:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
-tar xzvf transformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
+tar xzvf asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
 source path.sh
 # If you have already processed the data and generated the manifest file, you can skip the following 2 steps
 bash local/data.sh --stage -1 --stop_stage -1
 bash local/data.sh --stage 2 --stop_stage 2
 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20
 ```
-The performance of the released models are shown below:
-## Conformer
-train: Epoch 70, 4 V100-32G, best avg: 20
-
-| Model     | Params  | Config              | Augmentation | Test set   | Decode method          | Loss              | WER      |
-| --------- | ------- | ------------------- | ------------ | ---------- | ---------------------- | ----------------- | -------- |
-| conformer | 47.63 M | conf/conformer.yaml | spec_aug     | test-clean | attention              | 6.433612394332886 | 0.039771 |
-| conformer | 47.63 M | conf/conformer.yaml | spec_aug     | test-clean | ctc_greedy_search      | 6.433612394332886 | 0.040342 |
-| conformer | 47.63 M | conf/conformer.yaml | spec_aug     | test-clean | ctc_prefix_beam_search | 6.433612394332886 | 0.040342 |
-| conformer | 47.63 M | conf/conformer.yaml | spec_aug     | test-clean | attention_rescoring    | 6.433612394332886 | 0.033761 |
-## Transformer
-train: Epoch 120, 4 V100-32G, 27 Day, best avg: 10
+The performance of the released models is shown [here](./RESULTS.md).

-| Model       | Params  | Config                | Augmentation | Test set   | Decode method          | Loss              | WER      |
-| ----------- | ------- | --------------------- | ------------ | ---------- | ---------------------- | ----------------- | -------- |
-| transformer | 32.52 M | conf/transformer.yaml | spec_aug     | test-clean | attention              | 6.382194232940674 | 0.049661 |
-| transformer | 32.52 M | conf/transformer.yaml | spec_aug     | test-clean | ctc_greedy_search      | 6.382194232940674 | 0.049566 |
-| transformer | 32.52 M | conf/transformer.yaml | spec_aug     | test-clean | ctc_prefix_beam_search | 6.382194232940674 | 0.049585 |
-| transformer | 32.52 M | conf/transformer.yaml | spec_aug     | test-clean | attention_rescoring    | 6.382194232940674 | 0.038135 |
 ## Stage 4: CTC Alignment 
 If you want to get the alignment between the audio and the text, you can use the ctc alignment. The code of this stage is shown below:
 ```bash
@@ -227,8 +205,8 @@ In some situations, you want to use the trained model to do the inference for th
 ```
 you can train the model by yourself using ```bash run.sh --stage 0 --stop_stage 3```, or you can download the pretrained model through the script below:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/conformer.model.tar.gz
-tar xzvf conformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
+tar xzvf asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz
 ```
 You can download the audio demo:
 ```bash
--- a/examples/librispeech/asr2/README.md
+++ b/examples/librispeech/asr2/README.md
@@ -1,4 +1,4 @@
-# Transformer/Conformer ASR with Librispeech Asr2
+# Transformer/Conformer ASR with Librispeech ASR2

 This example contains code used to train a Transformer or [Conformer](http://arxiv.org/abs/2008.03802) model with [Librispeech dataset](http://www.openslr.org/resources/12) and use some functions in kaldi.

@@ -213,17 +213,14 @@ avg.sh latest exp/transformer/checkpoints 10
 ./local/recog.sh  --ckpt_prefix exp/transformer/checkpoints/avg_10
 ```
 ## Pretrained Model
-You can get the pretrained transformer using the scripts below:
-```bash
-# Transformer:
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/transformer.model.tar.gz
-```
+You can get the pretrained models from [this](../../../docs/source/released_model.md).
+
 Use the `tar` command to unpack the model, then run the test script to evaluate it.

 For example:
 ```bash
-wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/transformer.model.tar.gz
-tar xzvf transformer.model.tar.gz
+wget https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz
+tar xzvf asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz
 source path.sh
 # If you have already processed the data and generated the manifest file, you can skip the following 2 steps
 bash local/data.sh --stage -1 --stop_stage -1
@@ -231,26 +228,7 @@ bash local/data.sh --stage 2 --stop_stage 2

 CUDA_VISIBLE_DEVICES= ./local/test.sh conf/transformer.yaml exp/ctc/checkpoints/avg_10
 ```
-The performance of the released models are shown below:
-### Transformer
-|    Model    | Params |          GPUS          |  Averaged Model  |        Config         | Augmentation |      Loss       |
-| :---------: | :----: | :--------------------: | :--------------: | :-------------------: | :----------: | :-------------: |
-| transformer | 32.52M | 8 Tesla V100-SXM2-32GB | 10-best val_loss | conf/transformer.yaml |   spec_aug   | 6.3197922706604 |
-
-#### Attention Rescore
-| Test Set   | Decode Method         | #Snt | #Wrd  | Corr | Sub  | Del  | Ins  | Err  | S.Err |
-| ---------- | --------------------- | ---- | ----- | ---- | ---- | ---- | ---- | ---- | ----- |
-| test-clean | attention             | 2620 | 52576 | 96.4 | 2.5  | 1.1  | 0.4  | 4.0  | 34.7  |
-| test-clean | ctc_greedy_search     | 2620 | 52576 | 95.9 | 3.7  | 0.4  | 0.5  | 4.6  | 48.0  |
-| test-clean | ctc_prefix_beamsearch | 2620 | 52576 | 95.9 | 3.7  | 0.4  | 0.5  | 4.6  | 47.6  |
-| test-clean | attention_rescore     | 2620 | 52576 | 96.8 | 2.9  | 0.3  | 0.4  | 3.7  | 38.0  |
-
-#### JoinCTC
-| Test Set   | Decode Method     | #Snt | #Wrd  | Corr | Sub  | Del  | Ins  | Err  | S.Err |
-| ---------- | ----------------- | ---- | ----- | ---- | ---- | ---- | ---- | ---- | ----- |
-| test-clean | join_ctc_only_att | 2620 | 52576 | 96.1 | 2.5  | 1.4  | 0.4  | 4.4  | 34.7  |
-| test-clean | join_ctc_w/o_lm   | 2620 | 52576 | 97.2 | 2.6  | 0.3  | 0.4  | 3.2  | 34.9  |
-| test-clean | join_ctc_w_lm     | 2620 | 52576 | 97.9 | 1.8  | 0.2  | 0.3  | 2.4  | 27.8  |
+The performance of the released models is shown [here](./RESULTS.md).

 Compared with [ESPNET](https://github.com/espnet/espnet/blob/master/egs/librispeech/asr1/RESULTS.md#pytorch-large-transformer-with-specaug-4-gpus--transformer-lm-4-gpus), we use 8 GPUs, but our model (aheads4-adim256) is smaller.
 ## Stage 5: CTC Alignment