From 4015676a425dab5e3f88e3dd02b54f51da4d7929 Mon Sep 17 00:00:00 2001 From: megemini Date: Fri, 29 Nov 2024 19:30:50 +0800 Subject: [PATCH] =?UTF-8?q?[Hackathon=207th]=20=E4=BF=AE=E5=A4=8D=20`libri?= =?UTF-8?q?speech`=20=E4=B8=AD=20`asr`=20=E7=9A=84=20`readme`=20(#3917)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * [Fix] librispeech asr0 readme * [Fix] librispeech asr1 readme --- examples/librispeech/asr0/README.md | 4 ++-- examples/librispeech/asr1/README.md | 10 +++++----- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/examples/librispeech/asr0/README.md b/examples/librispeech/asr0/README.md index 2d3836c6..a097dd99 100644 --- a/examples/librispeech/asr0/README.md +++ b/examples/librispeech/asr0/README.md @@ -144,7 +144,7 @@ source path.sh bash ./local/data.sh CUDA_VISIBLE_DEVICES= ./local/train.sh conf/deepspeech2.yaml deepspeech2 avg.sh best exp/deepspeech2/checkpoints 1 -CUDA_VISIBLE_DEVICES= ./local/test.sh conf/deepspeech2.yaml exp/deepspeech2/checkpoints/avg_1 +CUDA_VISIBLE_DEVICES= ./local/test.sh conf/deepspeech2.yaml conf/tuning/decode.yaml exp/deepspeech2/checkpoints/avg_1 ``` ## Stage 4: Static graph model Export This stage is to transform dygraph to static graph. @@ -185,5 +185,5 @@ wget -nc https://paddlespeech.bj.bcebos.com/datasets/single_wav/en/demo_002_en.w ``` You can train a model by yourself, then you need to prepare an audio file or use the audio demo above, please confirm the sample rate of the audio is 16K. You can get the result of the audio demo by running the script below. ```bash -CUDA_VISIBLE_DEVICES= ./local/test_wav.sh conf/deepspeech2.yaml exp/deepspeech2/checkpoints/avg_1 data/demo_002_en.wav +CUDA_VISIBLE_DEVICES= ./local/test_wav.sh conf/deepspeech2.yaml conf/tuning/decode.yaml exp/deepspeech2/checkpoints/avg_1 data/demo_002_en.wav ``` diff --git a/examples/librispeech/asr1/README.md b/examples/librispeech/asr1/README.md index ca008144..1b02698c 100644 --- a/examples/librispeech/asr1/README.md +++ b/examples/librispeech/asr1/README.md @@ -148,7 +148,7 @@ or you can run these scripts in the command line (only use CPU). bash ./local/data.sh CUDA_VISIBLE_DEVICES= ./local/train.sh conf/conformer.yaml conformer avg.sh best exp/conformer/checkpoints 20 -CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20 +CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_20 ``` ## Pretrained Model You can get the pretrained transformer or conformer from [this](../../../docs/source/released_model.md). @@ -163,7 +163,7 @@ source path.sh # If you have process the data and get the manifest fileļ¼Œ you can skip the following 2 steps bash local/data.sh --stage -1 --stop_stage -1 bash local/data.sh --stage 2 --stop_stage 2 -CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20 +CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_20 ``` The performance of the released models are shown in [here](./RESULTS.md). @@ -192,8 +192,8 @@ bash ./local/data.sh CUDA_VISIBLE_DEVICES= ./local/train.sh conf/conformer.yaml conformer avg.sh best exp/conformer/checkpoints 20 # test stage is optional -CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20 -CUDA_VISIBLE_DEVICES= ./local/align.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20 +CUDA_VISIBLE_DEVICES= ./local/test.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_20 +CUDA_VISIBLE_DEVICES= ./local/align.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_20 ``` ## Stage 5: Single Audio File Inference In some situations, you want to use the trained model to do the inference for the single audio file. You can use stage 5. The code is shown below @@ -214,5 +214,5 @@ wget -nc https://paddlespeech.bj.bcebos.com/datasets/single_wav/en/demo_002_en.w ``` You need to prepare an audio file or use the audio demo above, please confirm the sample rate of the audio is 16K. You can get the result of the audio demo by running the script below. ```bash -CUDA_VISIBLE_DEVICES= ./local/test_wav.sh conf/conformer.yaml exp/conformer/checkpoints/avg_20 data/demo_002_en.wav +CUDA_VISIBLE_DEVICES= ./local/test_wav.sh conf/conformer.yaml conf/tuning/decode.yaml exp/conformer/checkpoints/avg_20 data/demo_002_en.wav ```