From 57ed5cd2e0b1481c35e973ac6fd386208dcc8ad3 Mon Sep 17 00:00:00 2001
From: Hui Zhang
Date: Wed, 10 Mar 2021 12:24:49 +0800
Subject: [PATCH] Fix Doc (#544)

---
 README.md                      | 15 ++++++++++++--
 README_cn.md                   | 10 +++++++++
 docs/benchmark.md              |  2 +-
 docs/faq.md                    | 37 ++++++++++++++++++++++++++++++++++
 docs/geting_started.md         |  2 +-
 examples/aishell/README.md     |  9 +++++++++
 examples/librispeech/README.md |  9 +++++++++
 7 files changed, 80 insertions(+), 4 deletions(-)
 create mode 100644 docs/faq.md
 create mode 100644 examples/aishell/README.md
 create mode 100644 examples/librispeech/README.md

diff --git a/README.md b/README.md
index ed04d241..83d10100 100644
--- a/README.md
+++ b/README.md
@@ -4,13 +4,23 @@
 *DeepSpeech on PaddlePaddle* is an open-source implementation of end-to-end Automatic Speech Recognition (ASR) engine, with [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform. Our vision is to empower both industrial application and academic research on speech recognition, via an easy-to-use, efficient and scalable implementation, including training, inference & testing module, and demo deployment.

-For more information, please docs under `doc`.
+For more information, please see below:
+* [Install](docs/install.md)
+* [Getting Started](docs/geting_started.md)
+* [Data Preparation](docs/data_preparation.md)
+* [Data Augmentation](docs/augmentation.md)
+* [Ngram LM](docs/ngram_lm.md)
+* [Server Demo](docs/server.md)
+* [Benchmark](docs/benchmark.md)
+* [Released Model](docs/released_model.md)
+* [FAQ](docs/faq.md)
+
 ## Models
 * [Baidu's Deep Speech2](http://proceedings.mlr.press/v48/amodei16.pdf)

 ## Setup
-* python3.7
+* python 3.7
 * paddlepaddle 2.0.0

 - Run the setup script for the remaining dependencies

@@ -33,6 +43,7 @@ source tools/venv/bin/activate
 Please see [Getting Started](docs/geting_started.md) and [tiny egs](examples/tiny/README.md).
+
 ## Questions and Help

 You are welcome to submit questions and bug reports in [Github Issues](https://github.com/PaddlePaddle/DeepSpeech/issues). You are also welcome to contribute to this project.

diff --git a/README_cn.md b/README_cn.md
index d8dd0db6..ff9d3c07 100644
--- a/README_cn.md
+++ b/README_cn.md
@@ -5,6 +5,16 @@
 *DeepSpeech on PaddlePaddle*是一个采用[PaddlePaddle](https://github.com/PaddlePaddle/Paddle)平台的端到端自动语音识别(ASR)引擎的开源项目, 我们的愿景是为语音识别在工业应用和学术研究上，提供易于使用、高效和可扩展的工具，包括训练，推理，测试模块，以及 demo 部署。同时，我们还将发布一些预训练好的英语和普通话模型。

+更多信息如下:
+* [安装](docs/install.md)
+* [开始](docs/geting_started.md)
+* [数据处理](docs/data_preparation.md)
+* [数据增强](docs/augmentation.md)
+* [语言模型](docs/ngram_lm.md)
+* [服务部署](docs/server.md)
+* [Benchmark](docs/benchmark.md)
+* [Released Model](docs/released_model.md)
+* [FAQ](docs/faq.md)

 ## 模型
 * [Baidu's Deep Speech2](http://proceedings.mlr.press/v48/amodei16.pdf)

diff --git a/docs/benchmark.md b/docs/benchmark.md
index 4ef3e680..3b5f8e95 100644
--- a/docs/benchmark.md
+++ b/docs/benchmark.md
@@ -4,7 +4,7 @@
 We compare the training time with 1, 2, 4, 8 Tesla V100 GPUs (with a subset of LibriSpeech samples whose audio durations are between 6.0 and 7.0 seconds). And it shows that a **near-linear** acceleration with multiple GPUs has been achieved. In the following figure, the time (in seconds) cost for training is printed on the blue bars.

-
+
 | # of GPU | Acceleration Rate |
 | -------- | --------------: |

diff --git a/docs/faq.md b/docs/faq.md
new file mode 100644
index 00000000..dc14058c
--- /dev/null
+++ b/docs/faq.md
@@ -0,0 +1,37 @@
+# FAQ
+
+1. To what degree does audio speed perturbation affect the recognition rate?
+
+Speed perturbation improves recognition accuracy; rates of 0.9, 1.0 and 1.1 are commonly used.
+
+2. To what degree does audio volume affect the recognition rate?
+
+Training usually normalizes the volume into a fixed range; excessive fluctuation hurts training, roughly beyond 10dB ~ 20dB.
+
+3. What is the minimum amount of training data required for an acoustic model?
+
+Aishell-1 contains about 178 hours of data; the more data, the better.
+
+4. What kinds of noise or background sound affect the recognition rate?
+
+Mainly interfering speech and low signal-to-noise ratio.
+
+5. What is the length limit for a single utterance?
+
+Training utterances are usually limited to 1s ~ 6s, depending on the training configuration.
+
+6. Does background sound need to be separated out, or denoised, before recognition?
+
+Yes, it needs to be separated; the approach depends on the specific scenario.
+
+7. Does the model include VAD (voice activity detection)?
+
+VAD is a separate model or module; this model does not include that capability.
+
+8. Is long-form speech recognition supported?
+
+Audio is usually segmented by VAD before recognition.
+
+9. What hardware does the Mandarin LM Large language model require?
+
+Enough memory to hold the LM is sufficient.

diff --git a/docs/geting_started.md b/docs/geting_started.md
index fddb639a..478f3bb3 100644
--- a/docs/geting_started.md
+++ b/docs/geting_started.md
@@ -71,7 +71,7 @@ CUDA_VISIBLE_DEVICES=0 bash local/tune.sh

 The grid search will print the WER (word error rate) or CER (character error rate) at each point in the hyper-parameters space, and draw the error surface optionally. A proper hyper-parameters range should include the global minima of the error surface for WER/CER, as illustrated in the following figure.

-
+
 An example error surface for tuning on the dev-clean set of LibriSpeech
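The grid search that `local/tune.sh` performs above can be sketched as follows. This is a minimal illustration, not the repo's tuning code: `error_at` is a hypothetical stand-in for decoding the dev set and scoring WER/CER at one grid point, and the parameter names `alpha`/`beta` and their ranges are assumptions.

```python
import itertools

# Hypothetical stand-in for "decode the dev set and score it" at one grid
# point; the real tuning script measures WER/CER here. This toy error
# surface has its minimum near alpha=1.9, beta=0.3.
def error_at(alpha: float, beta: float) -> float:
    return (alpha - 1.9) ** 2 + (beta - 0.3) ** 2

# Sweep every grid point and keep the one with the lowest error,
# mirroring how the tuning script walks the hyper-parameter space.
alphas = [1.0 + 0.25 * i for i in range(9)]  # 1.0 .. 3.0
betas = [0.1 + 0.1 * i for i in range(5)]    # 0.1 .. 0.5
best = min(itertools.product(alphas, betas), key=lambda p: error_at(*p))
print(best)  # the grid point with the lowest error
```

A proper range, as the text notes, is one whose interior contains this minimum; if the best point lands on the grid boundary, the range should be widened and the search rerun.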
diff --git a/examples/aishell/README.md b/examples/aishell/README.md
new file mode 100644
index 00000000..0413d4b2
--- /dev/null
+++ b/examples/aishell/README.md
@@ -0,0 +1,9 @@
+# Aishell-1
+
+## CTC
+| Model | Config | Test set | CER |
+| --- | --- | --- | --- |
+| DeepSpeech2 | conf/deepspeech2.yaml | test | 0.078977 |
+| DeepSpeech2 | release 1.8.5 | test | 0.080447 |
+
+

diff --git a/examples/librispeech/README.md b/examples/librispeech/README.md
new file mode 100644
index 00000000..cb1ab003
--- /dev/null
+++ b/examples/librispeech/README.md
@@ -0,0 +1,9 @@
+# LibriSpeech
+
+## CTC
+| Model | Config | Test set | CER |
+| --- | --- | --- | --- |
+| DeepSpeech2 | conf/deepspeech2.yaml | test-clean | 0.073973 |
+| DeepSpeech2 | release 1.8.5 | test-clean | 0.074939 |
+
+
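The CER values reported in the tables above are character error rates: the Levenshtein edit distance between hypothesis and reference character sequences, divided by the reference length. A minimal sketch of the metric (an illustration, not the scorer this repo uses):

```python
def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance / reference length."""
    if not ref:
        return float(bool(hyp))
    # Single-row dynamic-programming Levenshtein distance.
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        row = [i]
        for j, h in enumerate(hyp, start=1):
            row.append(min(prev_row[j] + 1,                 # deletion
                           row[j - 1] + 1,                  # insertion
                           prev_row[j - 1] + (r != h)))     # substitution
        prev_row = row
    return prev_row[-1] / len(ref)

print(cer("deep speech", "deep speach"))  # one substitution over 11 characters
```

CER counts per-character edits, so for Mandarin (Aishell-1) it is the natural metric; the LibriSpeech table reports the same character-level measure rather than the more common word error rate.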