From dbe8cee2482f5d83417d2ece62f67672f8151011 Mon Sep 17 00:00:00 2001
From: tianhao zhang <15600919271@163.com>
Date: Thu, 13 Oct 2022 07:10:04 +0000
Subject: [PATCH] Release Wav2vec2ASR and wav2vec2.0 models; update the Recent
 Update section

---
 README.md                     | 1 +
 README_cn.md                  | 1 +
 docs/source/released_model.md | 2 ++
 3 files changed, 4 insertions(+)
diff --git a/README.md b/README.md
index 72db64b7..c05e1242 100644
--- a/README.md
+++ b/README.md
@@ -157,6 +157,7 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision
   - 🧩  *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV).
 
 ### Recent Update
+- 👑 2022.10.11: Add [Wav2vec2ASR](./examples/librispeech/asr3), which fine-tunes wav2vec2.0 for ASR on LibriSpeech.
 - 🔥 2022.09.26: Add Voice Cloning, TTS finetune, and ERNIE-SAT in [PaddleSpeech Web Demo](./demos/speech_web).
 - ⚡ 2022.09.09: Add AISHELL-3 Voice Cloning [example](./examples/aishell3/vc2) with ECAPA-TDNN speaker encoder.
 - ⚡ 2022.08.25: Release TTS [finetune](./examples/other/tts_finetune/tts3) example.
diff --git a/README_cn.md b/README_cn.md
index 725f7eda..20e2d3c8 100644
--- a/README_cn.md
+++ b/README_cn.md
@@ -179,6 +179,7 @@
 </div>
 
 ### 近期更新
+- 👑 2022.10.11: 新增 [Wav2vec2ASR](./examples/librispeech/asr3)，在 LibriSpeech 上针对 ASR 任务对 wav2vec2.0 进行 fine-tuning。
 - 🔥 2022.09.26: 新增 Voice Cloning, TTS finetune 和 ERNIE-SAT 到 [PaddleSpeech 网页应用](./demos/speech_web)。
 - ⚡ 2022.09.09: 新增基于 ECAPA-TDNN 声纹模型的 AISHELL-3 Voice Cloning [示例](./examples/aishell3/vc2)。
 - ⚡ 2022.08.25: 发布 TTS [finetune](./examples/other/tts_finetune/tts3) 示例。
diff --git a/docs/source/released_model.md b/docs/source/released_model.md
index bdac2c5b..3d51f112 100644
--- a/docs/source/released_model.md
+++ b/docs/source/released_model.md
@@ -17,6 +17,8 @@ Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER |
 [Conformer Librispeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 191 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0338 | 960 h | [Conformer Librispeech ASR1](../../examples/librispeech/asr1) | python |
 [Transformer Librispeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_transformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 131 MB  | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0381 | 960 h | [Transformer Librispeech ASR1](../../examples/librispeech/asr1) | python |
 [Transformer Librispeech ASR2 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 131 MB  | Encoder:Transformer, Decoder:Transformer, Decoding method: JoinCTC w/ LM |-| 0.0240 | 960 h | [Transformer Librispeech ASR2](../../examples/librispeech/asr2) | python |
+[Wav2vec2-large-960h-lv60-self Model](https://paddlespeech.bj.bcebos.com/wav2vec/wav2vec2-large-960h-lv60-self.pdparams) | Librispeech and LV-60k Dataset | - | 1.18 GB  | Pre-trained Wav2vec2.0 Model |-| - | 53,000 h | [Wav2vec2ASR Librispeech ASR3](../../examples/librispeech/asr3) | python |
+[Wav2vec2ASR-large-960h-librispeech Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr3/wav2vec2ASR-large-960h-librispeech_ckpt_1.3.0.model.tar.gz) | Librispeech | - | 1.18 GB  | Encoder:Wav2vec2.0, Decoder:CTC, Decoding method: Greedy search |-| 0.0189 | 960 h | [Wav2vec2ASR Librispeech ASR3](../../examples/librispeech/asr3) | python |
 
 ### Language Model based on NGram
 Language Model | Training Data | Token-based | Size | Descriptions