From 37c5324138478ab5a2dc95e4bda77deaa1ef7978 Mon Sep 17 00:00:00 2001
From: Hui Zhang
Date: Wed, 19 May 2021 14:08:04 +0800
Subject: [PATCH] fix result; add feature list

---
 README.md                         | 13 ++-----
 README_cn.md                      | 15 +++-----
 doc/src/feature_list.md           | 61 +++++++++++++++++++++++++++++++
 examples/aishell/s0/README.md     |  2 +-
 examples/aishell/s1/README.md     | 20 +++++-----
 examples/librispeech/s0/README.md | 11 +++---
 examples/librispeech/s1/README.md | 24 ++++++------
 7 files changed, 101 insertions(+), 45 deletions(-)
 create mode 100644 doc/src/feature_list.md

diff --git a/README.md b/README.md
index eb181490..f78cb59e 100644
--- a/README.md
+++ b/README.md
@@ -9,12 +9,9 @@
 
 *PaddleASR* is an open-source implementation of end-to-end Automatic Speech Recognition (ASR) engine, with [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform. Our vision is to empower both industrial application and academic research on speech recognition, via an easy-to-use, efficient, samller and scalable implementation, including training, inference & testing module, and deployment.
 
-## Models
+## Features
 
-* [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf)
-* [Transformer](https://arxiv.org/abs/1706.03762)
-* [Conformer](https://arxiv.org/abs/2005.08100)
-* [U2](https://arxiv.org/pdf/2012.05481.pdf)
+ See [feature list](doc/src/feature_list) for more information.
 
 ## Setup
 
@@ -30,8 +27,6 @@ Please see [Getting Started](doc/src/getting_started.md) and [tiny egs](examples
 
 ## More Information
 
-* [Install](doc/src/install.md)
-* [Getting Started](doc/src/getting_started.md)
 * [Data Prepration](doc/src/data_preparation.md)
 * [Data Augmentation](doc/src/augmentation.md)
 * [Ngram LM](doc/src/ngram_lm.md)
@@ -48,8 +43,8 @@ You are welcome to submit questions in [Github Discussions](https://github.com/P
 
 ## License
 
-DeepSpeech is provided under the [Apache-2.0 License](./LICENSE).
+DeepASR is provided under the [Apache-2.0 License](./LICENSE).
 
 ## Acknowledgement
 
-We depends on many open source repos. See [References](doc/src/reference.md) for more information.
+We depends on many open source repos. See [References](doc/src/reference.md) for more information.
\ No newline at end of file
diff --git a/README_cn.md b/README_cn.md
index e1a38906..b1d13e9c 100644
--- a/README_cn.md
+++ b/README_cn.md
@@ -9,12 +9,9 @@
 
 *PaddleASR*是一个采用[PaddlePaddle](https://github.com/PaddlePaddle/Paddle)平台的端到端自动语音识别(ASR)引擎的开源项目, 我们的愿景是为语音识别在工业应用和学术研究上,提供易于使用、高效、小型化和可扩展的工具,包括训练,推理,以及 部署。
 
-## 模型
+## 特性
 
-* [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf)
-* [Transformer](https://arxiv.org/abs/1706.03762)
-* [Conformer](https://arxiv.org/abs/2005.08100)
-* [U2](https://arxiv.org/pdf/2012.05481.pdf)
+ 参看 [特性列表](doc/src/feature_list)。
 
 ## 安装
 
@@ -26,12 +23,10 @@
 
 ## 开始
 
-请查看 [Getting Started](doc/src/getting_started.md) 和 [tiny egs](examples/tiny/s0/README.md)。
+请查看 [开始](doc/src/getting_started.md) 和 [tiny egs](examples/tiny/s0/README.md)。
 
 ## 更多信息
 
-* [安装](doc/src/install.md)
-* [开始](doc/src/getting_started.md)
 * [数据处理](doc/src/data_preparation.md)
 * [数据增强](doc/src/augmentation.md)
 * [语言模型](doc/src/ngram_lm.md)
@@ -46,8 +41,8 @@
 
 ## License
 
-DeepSpeech遵循[Apache-2.0开源协议](./LICENSE)。
+DeepASR 遵循[Apache-2.0开源协议](./LICENSE)。
 
 ## 感谢
 
-开发中参考一些优秀的仓库,详情参见 [References](doc/src/reference.md)。
+开发中参考一些优秀的仓库,详情参见 [References](doc/src/reference.md)。
\ No newline at end of file
diff --git a/doc/src/feature_list.md b/doc/src/feature_list.md
new file mode 100644
index 00000000..57641d5e
--- /dev/null
+++ b/doc/src/feature_list.md
@@ -0,0 +1,61 @@
+# Featrues
+
+### Speech Recognition
+
+* Offline
+ * [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf)
+ * [Transformer](https://arxiv.org/abs/1706.03762)
+ * [Conformer](https://arxiv.org/abs/2005.08100)
+
+* Online
+ * [U2](https://arxiv.org/pdf/2012.05481.pdf)
+
+### Language Model
+
+* Ngram
+
+### Decoder
+
+* ctc greedy
+* ctc prefix beam search
+* greedy
+* beam search
+* attention rescore
+
+### Speech Frontend
+
+* Audio
+ * Auto Gain
+* Feature
+ * kaldi fbank
+ * kaldi mfcc
+ * linear
+ * delta detla
+
+### Speech Augmentation
+
+* Audio
+ - Volume Perturbation
+ - Speed Perturbation
+ - Shifting Perturbation
+ - Online Bayesian normalization
+ - Noise Perturbation
+ - Impulse Response
+* Spectrum
+ - SpecAugment
+ - Adaptive SpecAugment
+
+### Tokenizer
+
+* Chinese/English Character
+* English Word
+* Sentence Piece
+
+### Word Segmentation
+
+* [mmseg](http://technology.chtsai.org/mmseg/)
+
+### Grapheme To Phoneme
+
+* syallable
+* phoneme
\ No newline at end of file
diff --git a/examples/aishell/s0/README.md b/examples/aishell/s0/README.md
index 00449879..a27d3a83 100644
--- a/examples/aishell/s0/README.md
+++ b/examples/aishell/s0/README.md
@@ -1,6 +1,6 @@
 # Aishell-1
 
-## Deepspeech2
+## Deepspeech2
 | Model | release | Config | Test set | CER |
 | --- | --- | --- | --- | --- |
 | DeepSpeech2 | 2.1 | conf/deepspeech2.yaml | test | 0.078671 |
diff --git a/examples/aishell/s1/README.md b/examples/aishell/s1/README.md
index 9bfa45c9..2048c4d5 100644
--- a/examples/aishell/s1/README.md
+++ b/examples/aishell/s1/README.md
@@ -1,14 +1,16 @@
 # Aishell
 
 ## Conformer
-| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
-| --- | --- | --- | --- | --- | --- |
-| conformer | conf/conformer.yaml | spec_aug + shift | test | attention | - | 0.059858 |
-| conformer | conf/conformer.yaml | spec_aug + shift | test | ctc_greedy_search | - | 0.062311 |
-| conformer | conf/conformer.yaml | spec_aug + shift | test | ctc_prefix_beam_search | - | 0.062196 |
-| conformer | conf/conformer.yaml | spec_aug + shift | test | attention_rescoring | - | 0.054694 |
+
+| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
+| --- | --- | --- | --- | --- | --- | --- |
+| conformer | conf/conformer.yaml | spec_aug + shift | test | attention | - | 0.059858 |
+| conformer | conf/conformer.yaml | spec_aug + shift | test | ctc_greedy_search | - | 0.062311 |
+| conformer | conf/conformer.yaml | spec_aug + shift | test | ctc_prefix_beam_search | - | 0.062196 |
+| conformer | conf/conformer.yaml | spec_aug + shift | test | attention_rescoring | - | 0.054694 |
 
 ## Transformer
-| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
-| --- | --- | --- | --- | --- | --- |
-| transformer | conf/transformer.yaml | spec_aug + shift | test | attention | - | - |
+
+| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
+| --- | --- | --- | --- | --- | --- | ---|
+| transformer | conf/transformer.yaml | spec_aug + shift | test | attention | - | - |
diff --git a/examples/librispeech/s0/README.md b/examples/librispeech/s0/README.md
index e71cc834..100a0577 100644
--- a/examples/librispeech/s0/README.md
+++ b/examples/librispeech/s0/README.md
@@ -1,7 +1,8 @@
 # LibriSpeech
 
-## Deepspeech2
-| Model | Config | Test set | WER |
-| --- | --- | --- | --- |
-| DeepSpeech2 | conf/deepspeech2.yaml | test-clean | 0.073973 |
-| DeepSpeech2 | release 1.8.5 | test-clean | 0.074939 |
+## Deepspeech2
+
+| Model | Config | Test set | WER |
+| --- | --- | --- | --- |
+| DeepSpeech2 | conf/deepspeech2.yaml | test-clean | 0.073973 |
+| DeepSpeech2 | release 1.8.5 | test-clean | 0.074939 |
diff --git a/examples/librispeech/s1/README.md b/examples/librispeech/s1/README.md
index 8fbbe9d7..73f6156d 100644
--- a/examples/librispeech/s1/README.md
+++ b/examples/librispeech/s1/README.md
@@ -1,16 +1,18 @@
 # LibriSpeech
 
 ## Conformer
-| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
-| --- | --- | --- | --- | --- | --- |
-| conformer | conf/conformer.yaml | spec_aug + shift | test-all | attention | test-all 6.35 | 0.057117 |
-| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | attention | test-all 6.35 | 0.030162 |
-| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | ctc_greedy_search | test-all 6.35 | 0.037910 |
-| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | ctc_prefix_beam_search | test-all 6.35 | 0.037761 |
-| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | attention_rescoring | test-all 6.35 | 0.032115 |
+
+| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
+| --- | --- | --- | --- | --- | --- | --- |
+| conformer | conf/conformer.yaml | spec_aug + shift | test-all | attention | test-all 6.35 | 0.057117 |
+| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | attention | test-all 6.35 | 0.030162 |
+| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | ctc_greedy_search | test-all 6.35 | 0.037910 |
+| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | ctc_prefix_beam_search | test-all 6.35 | 0.037761 |
+| conformer | conf/conformer.yaml | spec_aug + shift | test-clean | attention_rescoring | test-all 6.35 | 0.032115 |
 
 ## Transformer
-| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
-| --- | --- | --- | --- | --- | --- |
-| transformer | conf/transformer.yaml | spec_aug + shift | test-all | attention | test-all 6.98 | 0.066500 |
-| transformer | conf/transformer.yaml | spec_aug + shift | test-clean | attention | test-all 6.98 | 0.036 |
+
+| Model | Config | Augmentation| Test set | Decode method | Loss | WER |
+| --- | --- | --- | --- | --- | --- | --- |
+| transformer | conf/transformer.yaml | spec_aug + shift | test-all | attention | test-all 6.98 | 0.066500 |
+| transformer | conf/transformer.yaml | spec_aug + shift | test-clean | attention | test-all 6.98 | 0.036 |