From 6836634afff53d84ac78f4845f9d840f14e10ea9 Mon Sep 17 00:00:00 2001 From: zxcd <228587199@qq.com> Date: Thu, 9 Mar 2023 13:05:15 +0000 Subject: [PATCH] update model link and add readme. --- README.md | 1 + README_cn.md | 1 + docs/source/released_model.md | 2 +- examples/aishell/asr3/README.md | 8 ++++---- 4 files changed, 7 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 0cb99d1c6..5c5dc3a0f 100644 --- a/README.md +++ b/README.md @@ -178,6 +178,7 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision - 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV). ### Recent Update +- 👑 2023.03.09: Add [Wav2vec2ASR-zh](./examples/aishell/asr3). - 🎉 2023.03.07: Add [TTS ARM Linux C++ Demo](./demos/TTSArmLinux). - 🎉 2023.02.16: Add [Cantonese TTS](./examples/canton/tts3). - 🔥 2023.01.10: Add [code-switch asr CLI and Demos](./demos/speech_recognition). diff --git a/README_cn.md b/README_cn.md index 0f2adf811..fa013029c 100644 --- a/README_cn.md +++ b/README_cn.md @@ -183,6 +183,7 @@ - 🧩 级联模型应用: 作为传统语音任务的扩展,我们结合了自然语言处理、计算机视觉等任务,实现更接近实际需求的产业级应用。 ### 近期更新 +- 👑 2023.03.09: 新增 [Wav2vec2ASR-zh](./examples/aishell/asr3). - 🎉 2023.03.07: 新增 [TTS ARM Linux C++ 部署示例](./demos/TTSArmLinux)。 - 🎉 2023.02.16: 新增[粤语语音合成](./examples/canton/tts3)。 - 🔥 2023.01.10: 新增[中英混合 ASR CLI 和 Demos](./demos/speech_recognition)。 diff --git a/docs/source/released_model.md b/docs/source/released_model.md index 634be7b7f..76173e95c 100644 --- a/docs/source/released_model.md +++ b/docs/source/released_model.md @@ -25,7 +25,7 @@ Model | Pre-Train Method | Pre-Train Data | Finetune Data | Size | Descriptions [Wav2vec2-large-960h-lv60-self Model](https://paddlespeech.bj.bcebos.com/wav2vec/wav2vec2-large-960h-lv60-self.pdparams) | wav2vec2 | Librispeech and LV-60k Dataset (5.3w h) | - | 1.18 GB |Pre-trained Wav2vec2.0 Model | - | - | - | [Wav2vec2ASR-large-960h-librispeech Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr3/wav2vec2ASR-large-960h-librispeech_ckpt_1.3.1.model.tar.gz) | wav2vec2 | Librispeech and LV-60k Dataset (5.3w h) | Librispeech (960 h) | 718 MB |Encoder: Wav2vec2.0, Decoder: CTC, Decoding method: Greedy search | - | 0.0189 | [Wav2vecASR Librispeech ASR3](../../examples/librispeech/asr3) | [Wav2vec2-large-wenetspeech-self Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2-large-wenetspeech-self_ckpt_1.3.0.model.tar.gz) | wav2vec2 | Wenetspeech Dataset (1w h) | - | 714 MB |Pre-trained Wav2vec2.0 Model | - | - | - | -[Wav2vec2ASR-large-aishell1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2ASR-large-aishell1_ckpt_1.3.0.model.tar.gz) | wav2vec2 | Wenetspeech Dataset (1w h) | aishell1 (train set) | 1.17 GB |Encoder: Wav2vec2.0, Decoder: CTC, Decoding method: Greedy search | 0.0453 | - | - | +[Wav2vec2ASR-large-aishell1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2ASR-large-aishell1_ckpt_1.4.0.model.tar.gz) | wav2vec2 | Wenetspeech Dataset (1w h) | aishell1 (train set) | 1.17 GB |Encoder: Wav2vec2.0, Decoder: CTC, Decoding method: Greedy search | 0.0453 | - | - | ### Whisper Model Demo Link | Training Data | Size | Descriptions | CER | Model diff --git a/examples/aishell/asr3/README.md b/examples/aishell/asr3/README.md index e5806d621..f6fa60d7f 100644 --- a/examples/aishell/asr3/README.md +++ b/examples/aishell/asr3/README.md @@ -164,8 +164,8 @@ using the `tar` scripts to unpack the model and then you can use the script to t For example: ```bash -wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2ASR-large-aishell1_ckpt_1.3.0.model.tar.gz -tar xzvf wav2vec2ASR-large-aishell1_ckpt_1.3.0.model.tar.gz +wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2ASR-large-aishell1_ckpt_1.4.0.model.tar.gz +tar xzvf wav2vec2ASR-large-aishell1_ckpt_1.4.0.model.tar.gz source path.sh # If you have process the data and get the manifest file, you can skip the following 2 steps bash local/data.sh --stage -1 --stop_stage -1 @@ -185,8 +185,8 @@ In some situations, you want to use the trained model to do the inference for th ``` you can train the model by yourself using ```bash run.sh --stage 0 --stop_stage 3```, or you can download the pretrained model through the script below: ```bash -wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2ASR-large-aishell1_ckpt_1.3.0.model.tar.gz -tar xzvf wav2vec2ASR-large-aishell1_ckpt_1.3.0.model.tar.gz +wget https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2ASR-large-aishell1_ckpt_1.4.0.model.tar.gz +tar xzvf wav2vec2ASR-large-aishell1_ckpt_1.4.0.model.tar.gz ``` You can download the audio demo: ```bash