From 3ae013959b9e06f419fd98a74de54c90ec887c17 Mon Sep 17 00:00:00 2001
From: Jackwaterveg <87408988+Jackwaterveg@users.noreply.github.com>
Date: Mon, 16 May 2022 15:55:14 +0800
Subject: [PATCH 1/5] Update PPASR_cn.md, test=doc
---
docs/source/asr/PPASR_cn.md | 36 ++++++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/docs/source/asr/PPASR_cn.md b/docs/source/asr/PPASR_cn.md
index 1f72f1b94..6f04d1043 100644
--- a/docs/source/asr/PPASR_cn.md
+++ b/docs/source/asr/PPASR_cn.md
@@ -1,3 +1,6 @@
+(简体中文|[English](./PPASR.md))
+# PP-ASR
+
## 目录
- [1. 简介](#1)
- [2. 特点](#2)
@@ -12,7 +15,7 @@
## 1. 简介
-PP-ASR 是一个 提供 ASR 功能的工具。其提供了多种中文和英文的模型,支持模型的训练,并且支持使用命令行的方式进行模型的推理。 PP-ASR也支持流式模型的部署,以及个性化场景的部署。
+PP-ASR 是一个提供 ASR 功能的工具。其提供了多种中文和英文的模型,支持模型的训练,并且支持使用命令行的方式进行模型的推理。PP-ASR 也支持流式模型的部署,以及个性化场景的部署。
## 2. 特点
@@ -32,21 +35,23 @@ PP-ASR 的主要特点如下:
## 3.1 预训练模型
-支持的预训练模型列表:[released_model.md](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/released_model.md)。
+支持的预训练模型列表:[released_model](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/released_model.md)。
其中效果较好的模型为 Ds2 Online Wenetspeech ASR0 Model 以及 Conformer Online Wenetspeech ASR1 Model。 两个模型都支持流式 ASR。
-
+关于模型设计的部分,可以参考 AIStudio 教程:
+- [Deepspeech2](https://aistudio.baidu.com/aistudio/projectdetail/3866807)
+- [Transformer](https://aistudio.baidu.com/aistudio/projectdetail/3470110)
## 3.2 模型训练
模型的训练的参考脚本存放在 examples 中,并按照 `examples/数据集/模型` 存放,数据集主要支持 aishell 和 librispeech,模型支持 deepspeech2 模型和 u2 (conformer/transformer) 模型。
-具体的执行脚本的步骤记录在 run.sh 当中。具体可参考[这里](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell/asr1)
+具体的执行脚本的步骤记录在 run.sh 当中。具体可参考: [asr1](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell/asr1)
## 3.3 模型推理
-PPASR 支持在使用`pip install paddlespeech`后 使用命令行的方式来使用预训练模型进行推理。
+PP-ASR 支持在使用`pip install paddlespeech`后 使用命令行的方式来使用预训练模型进行推理。
具体支持的功能包括:
@@ -54,26 +59,37 @@ PPASR 支持在使用`pip install paddlespeech`后 使用命令行的方式来
- 使用管道的方式对多条音频进行预测
- 支持 RTF 的计算
-具体的使用方式可以参考[这里](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_recognition/README_cn.md)
+具体的使用方式可以参考: [speech_recognition](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_recognition/README_cn.md)
## 3.4 服务部署
-PPASR 支持流式ASR的服务部署。支持 语音识别 + 标点处理两个功能同时使用。
+PP-ASR 支持流式ASR的服务部署。支持 语音识别 + 标点处理两个功能同时使用。
-server 的 demo [链接](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/streaming_asr_server)
+server 的 demo: [streaming_asr_server](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/streaming_asr_server)

+网页上使用 asr server 的效果展示:[streaming_asr_demo_video](https://paddlespeech.readthedocs.io/en/latest/streaming_asr_demo_video.html)
+
+关于服务部署方面的更多资料,可以参考 AIStudio 教程:
+- [流式服务-模型部分](https://aistudio.baidu.com/aistudio/projectdetail/3839884)
+- [流式服务](https://aistudio.baidu.com/aistudio/projectdetail/4017905)
+
## 3.5 支持个性化场景部署
-针对个性化场景部署,提供了 特征提取(fbank) => 推理模型(打分库)=> TLG(WFST, token, lexion, grammer)的 C++ 程序。具体参考[这里](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/speechx)
+针对个性化场景部署,提供了特征提取(fbank) => 推理模型(打分库)=> TLG(WFST, token, lexion, grammer)的 C++ 程序。具体参考 [speechx](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/speechx)。 如果想快速了解和使用,可以参考: [custom_streaming_asr](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/custom_streaming_asr/README_cn.md)
+
+关于支持个性化场景部署的更多资料,可以参考 AIStudio 教程:
+- [定制化识别](https://aistudio.baidu.com/aistudio/projectdetail/4021561)
+
+
## 4. 快速开始
-关于如果使用 PPASR,可以看这里的[安装文档](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install_cn.md),其中提供了 **简单**、**中等**、**困难** 三种安装方式。如果想体验paddlespeech 的推理功能,可以用 **简单** 安装方式。
+关于如何使用 PP-ASR,可以看这里的 [install](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install_cn.md),其中提供了 **简单**、**中等**、**困难** 三种安装方式。如果想体验 paddlespeech 的推理功能,可以用 **简单** 安装方式。
From 4228de6f75f891318a691d049e82af6d3d2a752b Mon Sep 17 00:00:00 2001
From: Jackwaterveg <87408988+Jackwaterveg@users.noreply.github.com>
Date: Mon, 16 May 2022 18:13:07 +0800
Subject: [PATCH 2/5] test=asr
---
docs/source/asr/PPASR.md | 96 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 96 insertions(+)
create mode 100644 docs/source/asr/PPASR.md
diff --git a/docs/source/asr/PPASR.md b/docs/source/asr/PPASR.md
new file mode 100644
index 000000000..ef22954ab
--- /dev/null
+++ b/docs/source/asr/PPASR.md
@@ -0,0 +1,96 @@
+([简体中文](./PPASR.md)|English)
+# PP-ASR
+
+## Catalogue
+- [1. Introduction](#1)
+- [2. Characteristic](#2)
+- [3. Tutorials](#3)
+ - [3.1 Pre-trained Models](#31)
+ - [3.2 Training](#32)
+ - [3.3 Inference](#33)
+ - [3.4 Service Deployment](#34)
+ - [3.5 Customized Auto Speech Recognition and Deployment](#35)
+- [4. Quick Start](#4)
+
+
+## 1. Introduction
+
+PP-ASR is a tool that provides ASR (Automatic Speech Recognition) functionality. It provides a variety of Chinese and English models, supports model training, and supports model inference from the command line. In addition, PP-ASR supports the deployment of streaming models and customized ASR.
+
+
+## 2. Characteristic
+The basic process of ASR is shown in the figure below:
+
+
+
+The main characteristics of PP-ASR are shown below:
+- Provides pre-trained models on Chinese/English open source datasets: aishell(Chinese), wenetspeech(Chinese) and librispeech(English). The models includes deepspeech2 and conformer/transformer.
+- Support model training on Chinese/English datasets.
+- Support model inference using the command line. You can use to use `paddlespeech asr --model xxx --input xxx.wav` to use pre-trained model to do model inference.
+- Support deployment of streaming ASR server. Besides the ASR function, the server also supports the timestamp function.
+- Support customized auto speech recognition and deployment.
+
+
+## 3. Tutorials
+
+
+## 3.1 Pre-trained Models
+The list of supported pre-trained models: [released_model](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/released_model.md).
+The models that perform well are the Ds2 Online Wenetspeech ASR0 Model and the Conformer Online Wenetspeech ASR1 Model. Both models support streaming ASR.
+For more information about model design, you can refer to the aistudio tutorial:
+- [Deepspeech2](https://aistudio.baidu.com/aistudio/projectdetail/3866807)
+- [Transformer](https://aistudio.baidu.com/aistudio/projectdetail/3470110)
+
+
+## 3.2 Training
+The reference script for model training is stored in [examples](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples) and stored according to "examples/dataset/model". The dataset mainly supports aishell and librispeech. The model supports deepspeech2 and u2(conformer/transformer).
+The specific steps of executing the script are recorded in `run.sh`.
+
+For more information, you can refer to: [asr1](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell/asr1)
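+
+As a rough sketch of how a recipe is launched (the stage numbers below are assumptions and differ between recipes; check the comments inside each `run.sh` for what every stage does):
+
+```bash
+# Enter an ASR recipe directory, e.g. the aishell u2 (conformer/transformer) recipe.
+cd examples/aishell/asr1
+
+# run.sh chains data preparation, training, decoding and export;
+# the --stage/--stop_stage pair selects which of those steps to run.
+bash run.sh --stage 0 --stop_stage 3
+```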
+
+
+
+## 3.3 Inference
+
+PP-ASR supports use `paddlespeech asr --model xxx --input xxx.wav` to use pre-trained model to do model inference after install `paddlespeech` by `pip install paddlespeech`.
+
+Specific supported functions include:
+
+- Prediction of single audio
+- Use pipe to predict multiple audio
+- Support RTF calculation
+
+For specific usage, please refer to: [speech_recognition](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_recognition/README_cn.md)
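+
+A minimal sketch of the command line usage (the model tag and wav file below are placeholders; see the demo above for the models and options that are actually supported):
+
+```bash
+# Transcribe a single wav file with the default (Mandarin) model.
+paddlespeech asr --input ./zh.wav
+
+# Select a specific pre-trained model with --model (the tag here is illustrative).
+paddlespeech asr --model conformer_online_wenetspeech --lang zh --input ./zh.wav
+```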
+
+
+
+## 3.4 Service Deployment
+
+PP-ASR supports the service deployment of streaming ASR. Speech recognition and punctuation processing can be used at the same time.
+
+Demo of ASR Server: [streaming_asr_server](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/streaming_asr_server)
+
+
+
+A demo of using the ASR server on a web page: [streaming_asr_demo_video](https://paddlespeech.readthedocs.io/en/latest/streaming_asr_demo_video.html)
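+
+A hedged sketch of starting and querying the streaming server (the config file name, port and wav file follow the streaming_asr_server demo and should be treated as assumptions):
+
+```bash
+# Start the streaming ASR server with a websocket config from the demo directory.
+paddlespeech_server start --config_file ./conf/ws_conformer_wenetspeech_application.yaml
+
+# In another shell, stream a wav file to the server and print the transcription.
+paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
+```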
+
+
+For more information about service deployment, you can refer to the aistudio tutorial:
+- [Streaming service - model part](https://aistudio.baidu.com/aistudio/projectdetail/3839884)
+- [Streaming service](https://aistudio.baidu.com/aistudio/projectdetail/4017905)
+
+
+## 3.5 Customized Auto Speech Recognition and Deployment
+
+For customized auto speech recognition and deployment, PP-ASR provides a C++ pipeline of feature extraction (fbank) => inference model (scoring library) => TLG (WFST: token, lexicon, grammar). For specific usage, please refer to: [speechx](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/speechx)
+If you want to quickly use it, you can refer to: [custom_streaming_asr](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/custom_streaming_asr/README_cn.md)
+
+For more information about customized auto speech recognition and deployment, you can refer to the aistudio tutorial:
+- [Customized Auto Speech Recognition](https://aistudio.baidu.com/aistudio/projectdetail/4021561)
+
+
+
+
+## 4. Quick Start
+
+To use PP-ASR, please refer to the [install](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install_cn.md) guide, which provides three ways to install `paddlespeech`: **Easy**, **Medium** and **Hard**. If you only want to experience the inference function of paddlespeech, the **Easy** installation method is enough.
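+
+A minimal sketch of the **Easy** route (assuming a working Python and pip environment; the wav file name is a placeholder):
+
+```bash
+# Easy installation: enough for command line inference with pre-trained models.
+pip install paddlespeech
+
+# Sanity check: transcribe one wav file with a pre-trained model.
+paddlespeech asr --input ./zh.wav
+```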
From ff8b487f47f41aedc9a204c9f3a0613d6b88003d Mon Sep 17 00:00:00 2001
From: Jackwaterveg <87408988+Jackwaterveg@users.noreply.github.com>
Date: Mon, 16 May 2022 18:13:18 +0800
Subject: [PATCH 3/5] test=asr
---
docs/source/asr/PPASR_cn.md | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/docs/source/asr/PPASR_cn.md b/docs/source/asr/PPASR_cn.md
index 6f04d1043..82b1c1d37 100644
--- a/docs/source/asr/PPASR_cn.md
+++ b/docs/source/asr/PPASR_cn.md
@@ -26,7 +26,7 @@ PP-ASR 是一个 提供 ASR 功能的工具。其提供了多种中文和英文
PP-ASR 的主要特点如下:
- 提供在中/英文开源数据集 aishell (中文),wenetspeech(中文),librispeech (英文)上的预训练模型。模型包含 deepspeech2 模型以及 conformer/transformer 模型。
- 支持中/英文的模型训练功能。
-- 支持命令行方式的模型推理, `paddlespeech asr --input xxx.wav` 方式调用各个预训练模型进行推理。
+- 支持命令行方式的模型推理,可使用 `paddlespeech asr --model xxx --input xxx.wav` 方式调用各个预训练模型进行推理。
- 支持流式 ASR 的服务部署,也支持输出时间戳。
- 支持个性化场景的部署。
@@ -37,15 +37,15 @@ PP-ASR 的主要特点如下:
## 3.1 预训练模型
支持的预训练模型列表:[released_model](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/released_model.md)。
其中效果较好的模型为 Ds2 Online Wenetspeech ASR0 Model 以及 Conformer Online Wenetspeech ASR1 Model。 两个模型都支持流式 ASR。
-关于模型设计的部分,可以参考 AIStudio 教程:
+更多关于模型设计的部分,可以参考 AIStudio 教程:
- [Deepspeech2](https://aistudio.baidu.com/aistudio/projectdetail/3866807)
- [Transformer](https://aistudio.baidu.com/aistudio/projectdetail/3470110)
## 3.2 模型训练
-模型的训练的参考脚本存放在 examples 中,并按照 `examples/数据集/模型` 存放,数据集主要支持 aishell 和 librispeech,模型支持 deepspeech2 模型和 u2 (conformer/transformer) 模型。
-具体的执行脚本的步骤记录在 run.sh 当中。具体可参考: [asr1](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell/asr1)
+模型的训练的参考脚本存放在 [examples](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples) 中,并按照 `examples/数据集/模型` 存放,数据集主要支持 aishell 和 librispeech,模型支持 deepspeech2 模型和 u2 (conformer/transformer) 模型。
+具体的执行脚本的步骤记录在 `run.sh` 当中。具体可参考: [asr1](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell/asr1)
@@ -80,7 +80,8 @@ server 的 demo: [streaming_asr_server](https://github.com/PaddlePaddle/Paddle
## 3.5 支持个性化场景部署
-针对个性化场景部署,提供了特征提取(fbank) => 推理模型(打分库)=> TLG(WFST, token, lexion, grammer)的 C++ 程序。具体参考 [speechx](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/speechx)。 如果想快速了解和使用,可以参考: [custom_streaming_asr](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/custom_streaming_asr/README_cn.md)
+针对个性化场景部署,提供了特征提取(fbank) => 推理模型(打分库)=> TLG(WFST, token, lexion, grammer)的 C++ 程序。具体参考 [speechx](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/speechx)。
+如果想快速了解和使用,可以参考: [custom_streaming_asr](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/custom_streaming_asr/README_cn.md)
关于支持个性化场景部署的更多资料,可以参考 AIStudio 教程:
- [定制化识别](https://aistudio.baidu.com/aistudio/projectdetail/4021561)
From bff52147dd1189f0c835093b9468b25c707e1704 Mon Sep 17 00:00:00 2001
From: Jackwaterveg <87408988+Jackwaterveg@users.noreply.github.com>
Date: Mon, 16 May 2022 19:20:53 +0800
Subject: [PATCH 4/5] test=doc
---
docs/source/asr/PPASR.md | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/docs/source/asr/PPASR.md b/docs/source/asr/PPASR.md
index ef22954ab..3779434e3 100644
--- a/docs/source/asr/PPASR.md
+++ b/docs/source/asr/PPASR.md
@@ -1,4 +1,4 @@
-([简体中文](./PPASR.md)|English)
+([简体中文](./PPASR_cn.md)|English)
# PP-ASR
## Catalogue
@@ -24,9 +24,9 @@ The basic process of ASR is shown in the figure below:
The main characteristics of PP-ASR are shown below:
-- Provides pre-trained models on Chinese/English open source datasets: aishell(Chinese), wenetspeech(Chinese) and librispeech(English). The models includes deepspeech2 and conformer/transformer.
+- Provides pre-trained models on Chinese/English open source datasets: aishell (Chinese), wenetspeech (Chinese) and librispeech (English). The models include deepspeech2 and conformer/transformer.
- Support model training on Chinese/English datasets.
-- Support model inference using the command line. You can use to use `paddlespeech asr --model xxx --input xxx.wav` to use pre-trained model to do model inference.
+- Support model inference using the command line. You can use `paddlespeech asr --model xxx --input xxx.wav` to run inference with the pre-trained models.
- Support deployment of streaming ASR server. Besides the ASR function, the server also supports the timestamp function.
- Support customized auto speech recognition and deployment.
@@ -43,21 +43,21 @@ For more information about model design, you can refer to the aistudio tutorial:
## 3.2 Training
-The reference script for model training is stored in [examples](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples) and stored according to "examples/dataset/model". The dataset mainly supports aishell and librispeech. The model supports deepspeech2 and u2(conformer/transformer).
+The reference scripts for model training are stored in [examples](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples) and organized as `examples/dataset/model`. The datasets mainly supported are aishell and librispeech, and the supported models are deepspeech2 and u2 (conformer/transformer).
The specific steps of executing the script are recorded in `run.sh`.
-For more information, you can refer to: [asr1](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell/asr1)
+For more information, you can refer to [asr1](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell/asr1)
## 3.3 Inference
-PP-ASR supports use `paddlespeech asr --model xxx --input xxx.wav` to use pre-trained model to do model inference after install `paddlespeech` by `pip install paddlespeech`.
+PP-ASR supports using `paddlespeech asr --model xxx --input xxx.wav` to run inference with the pre-trained models from the command line, after installing `paddlespeech` via `pip install paddlespeech`.
Specific supported functions include:
- Prediction of single audio
-- Use pipe to predict multiple audio
+- Use a pipe to predict multiple audio files
- Support RTF calculation
For specific usage, please refer to: [speech_recognition](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_recognition/README_cn.md)
@@ -83,7 +83,7 @@ For more information about service deployment, you can refer to the aistudio tut
## 3.5 Customized Auto Speech Recognition and Deployment
For customized auto speech recognition and deployment, PP-ASR provides a C++ pipeline of feature extraction (fbank) => inference model (scoring library) => TLG (WFST: token, lexicon, grammar). For specific usage, please refer to: [speechx](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/speechx)
-If you want to quickly use it, you can refer to: [custom_streaming_asr](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/custom_streaming_asr/README_cn.md)
+If you want to quickly use it, you can refer to [custom_streaming_asr](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/custom_streaming_asr/README_cn.md)
For more information about customized auto speech recognition and deployment, you can refer to the aistudio tutorial:
- [Customized Auto Speech Recognition](https://aistudio.baidu.com/aistudio/projectdetail/4021561)
From 8e5f825641b83dc6f943660a55c8602bf0bf2c76 Mon Sep 17 00:00:00 2001
From: Jackwaterveg <87408988+Jackwaterveg@users.noreply.github.com>
Date: Mon, 16 May 2022 19:27:09 +0800
Subject: [PATCH 5/5] test=doc
---
docs/source/tts/PPTTS.md | 2 ++
1 file changed, 2 insertions(+)
diff --git a/docs/source/tts/PPTTS.md b/docs/source/tts/PPTTS.md
index c8534cd32..ef0baa07d 100644
--- a/docs/source/tts/PPTTS.md
+++ b/docs/source/tts/PPTTS.md
@@ -1,5 +1,7 @@
([简体中文](./PPTTS_cn.md)|English)
+# PPTTS
+
- [1. Introduction](#1)
- [2. Characteristic](#2)
- [3. Benchmark](#3)