fix doc format

pull/603/head
Hui Zhang 4 years ago
parent 71e046b0ba
commit b5b32c74ed

@ -1,12 +1,12 @@
[中文版](README_cn.md)
# DeepSpeech on PaddlePaddle
# PaddlePaddle ASR toolkit
![License](https://img.shields.io/badge/license-Apache%202-red.svg)
![python version](https://img.shields.io/badge/python-3.7+-orange.svg)
![support os](https://img.shields.io/badge/os-linux-yellow.svg)
*DeepSpeech on PaddlePaddle* is an open-source implementation of an end-to-end Automatic Speech Recognition (ASR) engine on the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform. Our vision is to empower both industrial applications and academic research on speech recognition through an easy-to-use, efficient, and scalable implementation, including training, inference and testing modules, and demo deployment.
*PaddleASR* is an open-source implementation of an end-to-end Automatic Speech Recognition (ASR) engine on the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform. Our vision is to empower both industrial applications and academic research on speech recognition through an easy-to-use, efficient, smaller, and scalable implementation, including training, inference and testing modules, and deployment.
## Models
@ -19,7 +19,7 @@
## Setup
* python>=3.7
* paddlepaddle>=2.0.0
* paddlepaddle>=2.1.0
Please see [install](docs/install.md).
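
The requirements listed above (python>=3.7 and paddlepaddle>=2.1.0) can be checked up front. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the installed PaddlePaddle version string is supplied by the caller rather than looked up.

```python
import re
import sys


def parse_version(text):
    """Turn '2.1.0' or '2.1.0.post1' into a tuple of ints such as (2, 1, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'post1'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.1' compares equal to '2.1.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(paddle_version):
    """Return a list of unmet requirements (empty when everything is fine)."""
    problems = []
    if sys.version_info < (3, 7):
        problems.append("python>=3.7 is required")
    if not meets_minimum(paddle_version, "2.1.0"):
        problems.append("paddlepaddle>=2.1.0 is required, found " + paddle_version)
    return problems
```

A caller would pass in whatever version string the installed package reports and print any problems returned; comparing parsed integer tuples avoids the string-comparison trap where `"2.10.0"` sorts before `"2.9.0"`.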
@ -52,4 +52,4 @@ DeepSpeech is provided under the [Apache-2.0 License](./LICENSE).
## Acknowledgement
We depend on many open source repos. See [References](docs/src/reference.md) for more information.
We depend on many open source repos. See [References](docs/src/reference.md) for more information.

@ -1,13 +1,13 @@
[English](README.md)
# DeepSpeech on PaddlePaddle
# PaddlePaddle ASR toolkit
![License](https://img.shields.io/badge/license-Apache%202-red.svg)
![python version](https://img.shields.io/badge/python-3.7+-orange.svg)
![support os](https://img.shields.io/badge/os-linux-yellow.svg)
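*DeepSpeech on PaddlePaddle* is an open-source project for an end-to-end Automatic Speech Recognition (ASR) engine built on the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform.
Our vision is to provide easy-to-use, efficient, and scalable tools for speech recognition in industrial applications and academic research, including training, inference, and testing modules, as well as demo deployment. We will also release some pretrained English and Mandarin models.
*PaddleASR* is an open-source project for an end-to-end Automatic Speech Recognition (ASR) engine built on the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform.
Our vision is to provide easy-to-use, efficient, compact, and scalable tools for speech recognition in industrial applications and academic research, including training, inference, and deployment.
## Models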
@ -20,7 +20,7 @@
## Installation
* python>=3.7
* paddlepaddle>=2.0.0
* paddlepaddle>=2.1.0
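See [Installation](docs/install.md).
@ -50,4 +50,4 @@ DeepSpeech is provided under the [Apache-2.0 License](./LICENSE).
## Acknowledgement
We drew on several excellent repositories during development; see [References](docs/src/reference.md) for details.
We drew on several excellent repositories during development; see [References](docs/src/reference.md) for details.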

@ -0,0 +1,5 @@
# ASR PostProcess
## Text corrector
* [pycorrector](https://github.com/shibing624/pycorrector)
ERNIE on paddlepaddle.

@ -1,10 +1,19 @@
# Chinese syllable
## Syllable
* [List of Syllables in Pinyin](https://resources.allsetlearning.com/chinese/pronunciation/syllable)
The word syllable is a term referring to the units of a word, composed of an (optional) initial, a final, and a tone.
The word "syllable" is 音节 (yīnjié) in Chinese. Most spoken syllables in Mandarin Chinese correspond to one written Chinese character.
There are a total of 410 common pinyin syllables.
The word syllable is a term referring to the units of a word, composed of an (optional) initial, a final, and a tone.
The word "syllable" is 音节 (yīnjié) in Chinese.
Most spoken syllables in Mandarin Chinese correspond to one written Chinese character.
There are a total of 410 common pinyin syllables.
* [Rare syllable](https://resources.allsetlearning.com/chinese/pronunciation/Rare_syllable)
@ -13,10 +22,12 @@ There are a total of 410 common pinyin syllables.
* [Mandarin Chinese Phonetics](http://www.zein.se/patrick/chinen8p.html)
* [chinese phonetics](https://www.easymandarin.cn/online-chinese-lessons/chinese-phonetics/)
Chinese Characters, called “Hanzi”, are the writing symbols of the Chinese language.
Pinyin is the Romanization of a phonetic notation for Chinese Characters.
Each syllable is composed of three parts: initials, finals, and tones.
In the Pinyin system there are 23 initials, 24 finals, 4 tones and a neutral tone.
Chinese Characters, called “Hanzi”, are the writing symbols of the Chinese language.
Pinyin is the Romanization of a phonetic notation for Chinese Characters.
Each syllable is composed of three parts: initials, finals, and tones.
In the Pinyin system there are 23 initials, 24 finals, 4 tones and a neutral tone.
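
The initial/final/tone decomposition described above can be illustrated with a short sketch. It rests on assumptions that are not in the original notes: syllables are written in tone-number notation (for example `zhong1`), a missing digit is treated as the neutral tone and reported as `5`, and the 23 initials are taken to include `y` and `w`. The name `split_syllable` is hypothetical.

```python
# 23 initials; two-letter initials come first so 'zh' wins over 'z'.
INITIALS = [
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l",
    "g", "k", "h", "j", "q", "x", "r", "z", "c", "s",
    "y", "w",
]


def split_syllable(syllable):
    """Split a tone-numbered pinyin syllable into (initial, final, tone)."""
    tone = 5  # assumption: no trailing digit means neutral tone
    body = syllable
    if syllable and syllable[-1] in "12345":
        tone = int(syllable[-1])
        body = syllable[:-1]

    initial = ""  # the initial is optional, e.g. 'an4' has none
    for candidate in INITIALS:
        if body.startswith(candidate):
            initial = candidate
            break

    return initial, body[len(initial):], tone


print(split_syllable("zhong1"))
print(split_syllable("an4"))
print(split_syllable("ma"))
```

The sketch does no validation: it does not check that the remaining final is one of the 24 finals, so a malformed input simply yields a malformed final.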
## Pinyin
* [Pinyin](https://en.wikipedia.org/wiki/Pinyin)
@ -26,12 +37,17 @@ In the Pinyin system there are 23 initials, 24 finals, 4 tones and a neutral ton
* [Mandarin Chinese Pinyin Table](https://www.archchinese.com/chinese_pinyin.html)
* [Chinese Pinyin Table](http://www.quickmandarin.com/chinesepinyintable/)
## Tones
* [Four tones](https://resources.allsetlearning.com/chinese/pronunciation/Four_tones)
* [Neutral tone](https://resources.allsetlearning.com/chinese/pronunciation/Neutral_tone)
* [Where do the tone marks go?](http://www.pinyin.info/rules/where.html)
* [声调符号标在哪儿?](http://www.hwjyw.com/resource/content/2010/06/04/8183.shtml)
## Zhuyin
* [Bopomofo](https://en.wikipedia.org/wiki/Bopomofo)
* [Zhuyin table](https://en.wikipedia.org/wiki/Zhuyin_table)
* [Zhuyin table](https://en.wikipedia.org/wiki/Zhuyin_table)

@ -1,6 +1,9 @@
# Text Front End
## MMSEG
* [MMSEG: A Word Identification System for Mandarin Chinese Text Based on Two Variants of the Maximum Matching Algorithm](http://technology.chtsai.org/mmseg/)
* [`中文分词`简单高效的MMSeg](https://www.cnblogs.com/en-heng/p/5872308.html)
* [mmseg分词算法及实现](https://blog.csdn.net/daniel_ustc/article/details/50488040)
@ -12,5 +15,7 @@
* [jkom-cloud/mmseg](https://github.com/jkom-cloud/mmseg)
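
MMSEG builds on the maximum matching algorithm. As a rough illustration, here is a sketch of plain forward maximum matching only; it is not MMSEG itself, which adds ambiguity resolution rules on top. The function name `forward_maximum_match`, the `max_word_len` parameter, and the tiny dictionary are all hypothetical.

```python
def forward_maximum_match(text, dictionary, max_word_len=4):
    """Greedily segment `text`, always taking the longest dictionary word.

    Characters not covered by any dictionary word are emitted as
    single-character tokens.
    """
    words = []
    i = 0
    while i < len(text):
        longest = min(max_word_len, len(text) - i)
        # Try the longest candidate first and shrink until something matches.
        for size in range(longest, 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


toy_dictionary = {"研究", "研究生", "生命", "起源"}
print(forward_maximum_match("研究生命起源", toy_dictionary))
# → ['研究生', '命', '起源']
```

The greedy result picks `研究生` and strands `命`, whereas `研究 / 生命 / 起源` is the intended reading. Resolving this kind of ambiguity is what the additional rules in MMSEG are for.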
## CScanner
* [CScanner - A Chinese Lexical Scanner](http://technology.chtsai.org/cscanner/)
* [CScanner - A Chinese Lexical Scanner](http://technology.chtsai.org/cscanner/)

@ -8,4 +8,4 @@ SoundFile==0.9.0.post1
sox
tensorboardX
typeguard
yacs
yacs

@ -1,4 +1,8 @@
* [python_kaldi_features](https://github.com/ZitengWang/python_kaldi_features)
commit: fc1bd6240c2008412ab64dc25045cd872f5e126c
ref: https://zhuanlan.zhihu.com/p/55371926
* [python_kaldi_features](https://github.com/ZitengWang/python_kaldi_features)
commit: fc1bd6240c2008412ab64dc25045cd872f5e126c
ref: https://zhuanlan.zhihu.com/p/55371926
* [python-pinyin](https://github.com/mozillazg/python-pinyin.git)
commit: 55e524aa1b7b8eec3d15c5306043c6cdd5938b03
licence: MIT

@ -73,7 +73,7 @@ mmseg.Dictionary.load_chars('customize_chars.dic')
# REQUIREMENTS:
* python 2.5+
* python 3.7+
* g++
# INSTALLATION:
@ -136,4 +136,4 @@ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.