[English](README.md)

# DeepSpeech on PaddlePaddle

![License](https://img.shields.io/badge/license-Apache%202-red.svg)
![python version](https://img.shields.io/badge/python-3.7+-orange.svg)
![support os](https://img.shields.io/badge/os-linux-yellow.svg)

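*DeepSpeech on PaddlePaddle* is an open-source implementation of an end-to-end automatic speech recognition (ASR) engine built on the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform.
Our vision is to provide easy-to-use, efficient, and scalable tools for speech recognition in both industrial applications and academic research, including training, inference, and testing modules, as well as demo deployment. We will also release some pre-trained English and Mandarin models.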

## Models

* [Baidu's Deep Speech2](http://proceedings.mlr.press/v48/amodei16.pdf)

## Installation
* python>=3.7
* paddlepaddle>=2.0.0

- Install the dependencies:

```bash
git clone https://github.com/PaddlePaddle/DeepSpeech.git
cd DeepSpeech
pushd tools; make; popd
source tools/venv/bin/activate
bash setup.sh
```

- Activate the virtual environment before running any experiment:

```bash
source tools/venv/bin/activate
```
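
The version requirements listed above can be checked once the environment is active. The following is only a minimal sketch, not part of the project's tooling: `version_ge` is a hypothetical helper defined here, it assumes GNU `sort -V` is available, and it assumes the interpreter is reachable as `python3` inside the virtual environment.

```bash
# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper for illustration; relies on GNU `sort -V`.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Check the Python interpreter against the documented minimum (3.7).
py_ver="$(python3 -c 'import platform; print(platform.python_version())')"
if version_ge "$py_ver" "3.7"; then
    echo "python ${py_ver}: OK"
else
    echo "python ${py_ver}: too old, need >= 3.7"
fi

# Check paddlepaddle against the documented minimum (2.0.0).
# The result is empty when the package cannot be imported.
paddle_ver="$(python3 -c 'import paddle; print(paddle.__version__)' 2>/dev/null || true)"
if [ -z "$paddle_ver" ]; then
    echo "paddlepaddle: not importable in this environment"
elif version_ge "$paddle_ver" "2.0.0"; then
    echo "paddlepaddle ${paddle_ver}: OK"
else
    echo "paddlepaddle ${paddle_ver}: too old, need >= 2.0.0"
fi
```

If either check reports a problem, re-running the installation steps inside the activated environment is a reasonable first thing to try.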

## Getting Started

See [Getting Started](docs/geting_started.md) and the [tiny egs](examples/tiny/README.md).

## More Information

* [Installation](docs/install.md)
* [Getting Started](docs/geting_started.md)
* [Data Preparation](docs/data_preparation.md)
* [Data Augmentation](docs/augmentation.md)
* [Language Model](docs/ngram_lm.md)
* [Server Deployment](docs/server.md)
* [Benchmark](docs/benchmark.md)
* [Released Model](docs/released_model.md)
* [FAQ](docs/faq.md)

## Questions and Help

You are welcome to submit questions and bug reports in [GitHub Issues](https://github.com/PaddlePaddle/models/issues). Contributions to this project are also welcome.

## License

DeepSpeech is released under the [Apache-2.0 license](./LICENSE).