update doc (#603)

* fix doc format
* format doc
3 years ago · c6ae9857f2
parent 71e046b0ba
commit c6ae9857f2
8 changed files with 49 additions and 19 deletions
--- a/README.md
+++ b/README.md
@@ -1,12 +1,12 @@
 [中文版](README_cn.md)

-# DeepSpeech on PaddlePaddle
+# PaddlePaddle ASR toolkit

 ![License](https://img.shields.io/badge/license-Apache%202-red.svg)
 ![python version](https://img.shields.io/badge/python-3.7+-orange.svg)
 ![support os](https://img.shields.io/badge/os-linux-yellow.svg)

-*DeepSpeech on PaddlePaddle* is an open-source implementation of end-to-end Automatic Speech Recognition (ASR) engine, with [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform. Our vision is to empower both industrial application and academic research on speech recognition, via an easy-to-use, efficient and scalable implementation, including training, inference & testing module, and demo deployment.
+*PaddleASR* is an open-source implementation of end-to-end Automatic Speech Recognition (ASR) engine, with [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform. Our vision is to empower both industrial application and academic research on speech recognition, via an easy-to-use, efficient, smaller and scalable implementation, including training, inference & testing module, and deployment.


 ## Models
@@ -19,7 +19,7 @@
 ## Setup

 * python>=3.7
-* paddlepaddle>=2.0.0
+* paddlepaddle>=2.1.0

 Please see [install](docs/install.md).

--- a/README_cn.md
+++ b/README_cn.md
@@ -1,13 +1,13 @@
 [English](README.md)

-# DeepSpeech on PaddlePaddle
+# PaddlePaddle ASR toolkit

 ![License](https://img.shields.io/badge/license-Apache%202-red.svg)
 ![python version](https://img.shields.io/badge/python-3.7+-orange.svg)
 ![support os](https://img.shields.io/badge/os-linux-yellow.svg)

-*DeepSpeech on PaddlePaddle*是一个采用[PaddlePaddle](https://github.com/PaddlePaddle/Paddle)平台的端到端自动语音识别（ASR）引擎的开源项目，
-我们的愿景是为语音识别在工业应用和学术研究上，提供易于使用、高效和可扩展的工具，包括训练，推理，测试模块，以及 demo 部署。同时，我们还将发布一些预训练好的英语和普通话模型。
+*PaddleASR*是一个采用[PaddlePaddle](https://github.com/PaddlePaddle/Paddle)平台的端到端自动语音识别（ASR）引擎的开源项目，
+我们的愿景是为语音识别在工业应用和学术研究上，提供易于使用、高效、小型化和可扩展的工具，包括训练，推理，以及  部署。

 ## 模型

@@ -20,7 +20,7 @@
 ## 安装

 * python>=3.7
-* paddlepaddle>=2.0.0
+* paddlepaddle>=2.1.0

 参看 [安装](docs/install.md)。

--- a/docs/src/asr_postprocess.md
+++ b/docs/src/asr_postprocess.md
@@ -0,0 +1,5 @@
+# ASR PostProcess
+
+## Text corrector
+* [pycorrector](https://github.com/shibing624/pycorrector)
+ERNIE on paddlepaddle.
--- a/docs/src/chinese_syllable.md
+++ b/docs/src/chinese_syllable.md
@@ -1,10 +1,19 @@
 # chinese syllable

+
+
 ## Syllable
+
 * [List of Syllables in Pinyin](https://resources.allsetlearning.com/chinese/pronunciation/syllable)
-The word syllable is a term referring to the units of a word, composed on an (optional) initial, a final, and a tone.
-The word "syllable" is 音节 (yīnjié) in Chinese. Most spoken syllables in Mandarin Chinese correspond to one written Chinese character.
-There are a total of 410 common pinyin syllables.
+  The word syllable is a term referring to the units of a word, composed on an (optional) initial, a final, and a tone.
+
+  The word "syllable" is 音节 (yīnjié) in Chinese.
+
+  Most spoken syllables in Mandarin Chinese correspond to one written Chinese character.
+
+  There are a total of 410 common pinyin syllables.
+
+

 * [Rare syllable](https://resources.allsetlearning.com/chinese/pronunciation/Rare_syllable)

@@ -13,10 +22,12 @@ There are a total of 410 common pinyin syllables.
 * [Mandarin Chinese Phonetics](http://www.zein.se/patrick/chinen8p.html)

 * [chinese phonetics](https://www.easymandarin.cn/online-chinese-lessons/chinese-phonetics/)
-Chinese Characters, called “Hanzi”, are the writing symbols of the Chinese language.
-Pinyin is the Romanization of a phonetic notation for Chinese Characters.
-Each syllable is composed of three parts: initials, finals, and tones.
-In the Pinyin system there are 23 initials, 24 finals, 4 tones and a neutral tone.
+  Chinese Characters, called “Hanzi”, are the writing symbols of the Chinese language.
+  Pinyin is the Romanization of a phonetic notation for Chinese Characters.
+  Each syllable is composed of three parts: initials, finals, and tones.
+  In the Pinyin system there are 23 initials, 24 finals, 4 tones and a neutral tone.
+
+

 ## Pinyin
 * [Pinyin](https://en.wikipedia.org/wiki/Pinyin)
@@ -26,12 +37,17 @@ In the Pinyin system there are 23 initials, 24 finals, 4 tones and a neutral ton
 * [Mandarin Chinese Pinyin Table](https://www.archchinese.com/chinese_pinyin.html)
 * [Chinese Pinyin Table ](http://www.quickmandarin.com/chinesepinyintable/)

+
+
 ## Tones
 * [Four tones](https://resources.allsetlearning.com/chinese/pronunciation/Four_tones)
 * [Neutral tone](https://resources.allsetlearning.com/chinese/pronunciation/Neutral_tone)
 * [Where do the tone marks go?](http://www.pinyin.info/rules/where.html)
 * [声调符号标在哪儿？](http://www.hwjyw.com/resource/content/2010/06/04/8183.shtml)

+
+
 ## Zhuyin
+
 * [Bopomofo](https://en.wikipedia.org/wiki/Bopomofo)
 * [Zhuyin table](https://en.wikipedia.org/wiki/Zhuyin_table)
--- a/docs/src/text_front_end.md
+++ b/docs/src/text_front_end.md
@@ -1,6 +1,9 @@
 # Text Front End

+
+
 ## MMSEG
+
 * [MMSEG: A Word Identification System for Mandarin Chinese Text Based on Two Variants of the Maximum Matching Algorithm](http://technology.chtsai.org/mmseg/)
 * [`中文分词`简单高效的MMSeg](https://www.cnblogs.com/en-heng/p/5872308.html)
 * [mmseg分词算法及实现](https://blog.csdn.net/daniel_ustc/article/details/50488040)
@@ -12,5 +15,7 @@
 * [jkom-cloud/mmseg](https://github.com/jkom-cloud/mmseg)


+
 ## CScanner
+
 * [CScanner - A Chinese Lexical Scanner](http://technology.chtsai.org/cscanner/)
--- a/requirements.txt
+++ b/requirements.txt
@@ -8,4 +8,4 @@ SoundFile==0.9.0.post1
 sox
 tensorboardX
 typeguard
-yacs
+yacs
--- a/third_party/README.md
+++ b/third_party/README.md
@@ -1,4 +1,8 @@

-* [python_kaldi_features](https://github.com/ZitengWang/python_kaldi_features)
-commit: fc1bd6240c2008412ab64dc25045cd872f5e126c
-ref: https://zhuanlan.zhihu.com/p/55371926
+* [python_kaldi_features](https://github.com/ZitengWang/python_kaldi_features)  
+commit: fc1bd6240c2008412ab64dc25045cd872f5e126c  
+ref: https://zhuanlan.zhihu.com/p/55371926  
+
+* [python-pinyin](https://github.com/mozillazg/python-pinyin.git)
+  commit: 55e524aa1b7b8eec3d15c5306043c6cdd5938b03
+  licence: MIT
--- a/third_party/pymmseg-cpp/README.md
+++ b/third_party/pymmseg-cpp/README.md
@@ -73,7 +73,7 @@ mmseg.Dictionary.load_chars('customize_chars.dic')

 # REQUIREMENTS:

-* python 2.5+
+* python 3.7+
 * g++

 # INSTALLATION: