fix result; add feature list

pull/625/head
Hui Zhang 3 years ago
parent 0a3a840bee
commit 37c5324138

@ -9,12 +9,9 @@
*PaddleASR* is an open-source implementation of end-to-end Automatic Speech Recognition (ASR) engine, with [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform. Our vision is to empower both industrial application and academic research on speech recognition, via an easy-to-use, efficient, samller and scalable implementation, including training, inference & testing module, and deployment.
## Models
## Features
* [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf)
* [Transformer](https://arxiv.org/abs/1706.03762)
* [Conformer](https://arxiv.org/abs/2005.08100)
* [U2](https://arxiv.org/pdf/2012.05481.pdf)
See [feature list](doc/src/feature_list) for more information.
## Setup
@ -30,8 +27,6 @@ Please see [Getting Started](doc/src/getting_started.md) and [tiny egs](examples
## More Information
* [Install](doc/src/install.md)
* [Getting Started](doc/src/getting_started.md)
* [Data Prepration](doc/src/data_preparation.md)
* [Data Augmentation](doc/src/augmentation.md)
* [Ngram LM](doc/src/ngram_lm.md)
@ -48,8 +43,8 @@ You are welcome to submit questions in [Github Discussions](https://github.com/P
## License
DeepSpeech is provided under the [Apache-2.0 License](./LICENSE).
DeepASR is provided under the [Apache-2.0 License](./LICENSE).
## Acknowledgement
We depends on many open source repos. See [References](doc/src/reference.md) for more information.
We depends on many open source repos. See [References](doc/src/reference.md) for more information.

@ -9,12 +9,9 @@
*PaddleASR*是一个采用[PaddlePaddle](https://github.com/PaddlePaddle/Paddle)平台的端到端自动语音识别ASR引擎的开源项目
我们的愿景是为语音识别在工业应用和学术研究上,提供易于使用、高效、小型化和可扩展的工具,包括训练,推理,以及 部署。
## 模型
## 特性
* [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf)
* [Transformer](https://arxiv.org/abs/1706.03762)
* [Conformer](https://arxiv.org/abs/2005.08100)
* [U2](https://arxiv.org/pdf/2012.05481.pdf)
参看 [特性列表](doc/src/feature_list)。
## 安装
@ -26,12 +23,10 @@
## 开始
请查看 [Getting Started](doc/src/getting_started.md) 和 [tiny egs](examples/tiny/s0/README.md)。
请查看 [开始](doc/src/getting_started.md) 和 [tiny egs](examples/tiny/s0/README.md)。
## 更多信息
* [安装](doc/src/install.md)
* [开始](doc/src/getting_started.md)
* [数据处理](doc/src/data_preparation.md)
* [数据增强](doc/src/augmentation.md)
* [语言模型](doc/src/ngram_lm.md)
@ -46,8 +41,8 @@
## License
DeepSpeech遵循[Apache-2.0开源协议](./LICENSE)。
DeepASR 遵循[Apache-2.0开源协议](./LICENSE)。
## 感谢
开发中参考一些优秀的仓库,详情参见 [References](doc/src/reference.md)。
开发中参考一些优秀的仓库,详情参见 [References](doc/src/reference.md)。

@ -0,0 +1,61 @@
# Featrues
### Speech Recognition
* Offline
* [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf)
* [Transformer](https://arxiv.org/abs/1706.03762)
* [Conformer](https://arxiv.org/abs/2005.08100)
* Online
* [U2](https://arxiv.org/pdf/2012.05481.pdf)
### Language Model
* Ngram
### Decoder
* ctc greedy
* ctc prefix beam search
* greedy
* beam search
* attention rescore
### Speech Frontend
* Audio
* Auto Gain
* Feature
* kaldi fbank
* kaldi mfcc
* linear
* delta detla
### Speech Augmentation
* Audio
- Volume Perturbation
- Speed Perturbation
- Shifting Perturbation
- Online Bayesian normalization
- Noise Perturbation
- Impulse Response
* Spectrum
- SpecAugment
- Adaptive SpecAugment
### Tokenizer
* Chinese/English Character
* English Word
* Sentence Piece
### Word Segmentation
* [mmseg](http://technology.chtsai.org/mmseg/)
### Grapheme To Phoneme
* syallable
* phoneme

@ -1,6 +1,6 @@
# Aishell-1
## Deepspeech2
## Deepspeech2
| Model | release | Config | Test set | CER |
| --- | --- | --- | --- | --- |
| DeepSpeech2 | 2.1 | conf/deepspeech2.yaml | test | 0.078671 |

@ -1,14 +1,16 @@
# Aishell
## Conformer
| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
| --- | --- | --- | --- | --- | --- |
| conformer | conf/conformer.yaml | spec_aug + shift | test | attention | - | 0.059858 |
| conformer | conf/conformer.yaml | spec_aug + shift | test | ctc_greedy_search | - | 0.062311 |
| conformer | conf/conformer.yaml | spec_aug + shift | test | ctc_prefix_beam_search | - | 0.062196 |
| conformer | conf/conformer.yaml | spec_aug + shift | test | attention_rescoring | - | 0.054694 |
| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
| --- | --- | --- | --- | --- | --- | --- |
| conformer | conf/conformer.yaml | spec_aug + shift | test | attention | - | 0.059858 |
| conformer | conf/conformer.yaml | spec_aug + shift | test | ctc_greedy_search | - | 0.062311 |
| conformer | conf/conformer.yaml | spec_aug + shift | test | ctc_prefix_beam_search | - | 0.062196 |
| conformer | conf/conformer.yaml | spec_aug + shift | test | attention_rescoring | - | 0.054694 |
## Transformer
| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
| --- | --- | --- | --- | --- | --- |
| transformer | conf/transformer.yaml | spec_aug + shift | test | attention | - | - |
| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
| --- | --- | --- | --- | --- | --- | ---|
| transformer | conf/transformer.yaml | spec_aug + shift | test | attention | - | - |

@ -1,7 +1,8 @@
# LibriSpeech
## Deepspeech2
| Model | Config | Test set | WER |
| --- | --- | --- | --- |
| DeepSpeech2 | conf/deepspeech2.yaml | test-clean | 0.073973 |
| DeepSpeech2 | release 1.8.5 | test-clean | 0.074939 |
## Deepspeech2
| Model | Config | Test set | WER |
| --- | --- | --- | --- |
| DeepSpeech2 | conf/deepspeech2.yaml | test-clean | 0.073973 |
| DeepSpeech2 | release 1.8.5 | test-clean | 0.074939 |

@ -1,16 +1,18 @@
# LibriSpeech
## Conformer
| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
| --- | --- | --- | --- | --- | --- |
| conformer | conf/conformer.yaml | spec_aug + shift | test-all | attention | test-all 6.35 | 0.057117 |
| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | attention | test-all 6.35 | 0.030162 |
| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | ctc_greedy_search | test-all 6.35 | 0.037910 |
| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | ctc_prefix_beam_search | test-all 6.35 | 0.037761 |
| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | attention_rescoring | test-all 6.35 | 0.032115 |
| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
| --- | --- | --- | --- | --- | --- | --- |
| conformer | conf/conformer.yaml | spec_aug + shift | test-all | attention | test-all 6.35 | 0.057117 |
| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | attention | test-all 6.35 | 0.030162 |
| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | ctc_greedy_search | test-all 6.35 | 0.037910 |
| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | ctc_prefix_beam_search | test-all 6.35 | 0.037761 |
| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | attention_rescoring | test-all 6.35 | 0.032115 |
## Transformer
| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
| --- | --- | --- | --- | --- | --- |
| transformer | conf/transformer.yaml | spec_aug + shift | test-all | attention | test-all 6.98 | 0.066500 |
| transformer | conf/transformer.yaml | spec_aug + shift | test-clean | attention | test-all 6.98 | 0.036 |
| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
| --- | --- | --- | --- | --- | --- | --- |
| transformer | conf/transformer.yaml | spec_aug + shift | test-all | attention | test-all 6.98 | 0.066500 |
| transformer | conf/transformer.yaml | spec_aug + shift | test-clean | attention | test-all 6.98 | 0.036 |

Loading…
Cancel
Save