|
|
|
@@ -6,8 +6,31 @@ PaddleSpeech is an open-source toolkit on PaddlePaddle platform for two critical
|
|
|
|
|
## What can PaddleSpeech do?
|
|
|
|
|
|
|
|
|
|
### Speech-To-Text
|
|
|
|
|
(An introduction to ASR in PaddleSpeech is needed here!)
|
|
|
|
|
PaddleSpeech ASR mainly consists of the following components:
|
|
|
|
|
- Implementation of models and commonly used neural network layers.
|
|
|
|
|
- Dataset abstraction and common data preprocessing pipelines.
|
|
|
|
|
- Ready-to-run experiments.
|
|
|
|
|
|
|
|
|
|
PaddleSpeech ASR provides you with a complete ASR pipeline, including:
|
|
|
|
|
- Data Preparation
|
|
|
|
|
- Build vocabulary
|
|
|
|
|
- Compute cepstral mean and variance normalization (CMVN)
|
|
|
|
|
- Feature extraction
|
|
|
|
|
  - linear
|
|
|
|
|
  - fbank (also supports Kaldi features)
|
|
|
|
|
  - mfcc
|
|
|
|
|
- Acoustic Models
|
|
|
|
|
  - DeepSpeech2 (online and offline)
|
|
|
|
|
  - Transformer (online and offline)
|
|
|
|
|
  - Conformer (online and offline)
|
|
|
|
|
- Decoder
|
|
|
|
|
  - CTC greedy search (used in DeepSpeech2, Transformer and Conformer)
|
|
|
|
|
  - CTC beam search (used in DeepSpeech2, Transformer and Conformer)
|
|
|
|
|
  - attention decoding (used in Transformer and Conformer)
|
|
|
|
|
  - attention rescoring (used in Transformer and Conformer)
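
To make the CMVN step in the pipeline above concrete, here is a minimal sketch in plain Python. It is not PaddleSpeech's own implementation: the function names `compute_cmvn_stats` and `apply_cmvn`, the nested-list feature layout (utterance, frame, dimension), and the `eps` floor are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Frames = Sequence[Sequence[float]]  # one utterance: frames x feature dims


def compute_cmvn_stats(utterances: Sequence[Frames]) -> Tuple[List[float], List[float]]:
    """Accumulate a global per-dimension mean and standard deviation.

    Hypothetical helper: statistics are pooled over every frame of every utterance.
    """
    dim = len(utterances[0][0])
    count = 0
    total = [0.0] * dim
    total_sq = [0.0] * dim
    for feats in utterances:
        for frame in feats:
            count += 1
            for d, value in enumerate(frame):
                total[d] += value
                total_sq[d] += value * value
    mean = [s / count for s in total]
    # Clamp at zero to guard against tiny negative values from rounding.
    var = [max(sq / count - m * m, 0.0) for sq, m in zip(total_sq, mean)]
    std = [math.sqrt(v) for v in var]
    return mean, std


def apply_cmvn(feats: Frames, mean: Sequence[float], std: Sequence[float],
               eps: float = 1e-20) -> List[List[float]]:
    """Normalize each frame to zero mean and unit variance per dimension."""
    return [
        [(value - m) / max(s, eps) for value, m, s in zip(frame, mean, std)]
        for frame in feats
    ]


# Toy data: two utterances with 2-dimensional features.
utterances = [
    [[1.0, 10.0], [3.0, 10.0]],
    [[5.0, 10.0]],
]
mean, std = compute_cmvn_stats(utterances)
normalized = [apply_cmvn(feats, mean, std) for feats in utterances]
```

In practice the statistics are usually computed once over the training set and reused at inference time, so that training and test features are normalized identically.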
|
|
|
|
|
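
Of the decoders listed above, CTC greedy search is the simplest to illustrate: take the most likely token at each frame, merge consecutive repeats, then drop the blank symbol. The sketch below is a plain-Python illustration under assumptions, not the PaddleSpeech API: the function name `ctc_greedy_decode`, the per-frame score layout, and `blank_id = 0` are hypothetical choices.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank_id: int = 0) -> List[int]:
    """Greedy (best-path) CTC decoding over per-frame token scores.

    frame_scores[t][k] is the score (probability or log-probability)
    of token k at frame t. blank_id is assumed to be 0 here.
    """
    # Step 1: pick the highest-scoring token at every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # Step 2: collapse runs of the same token into a single token.
    collapsed = [token for token, _ in groupby(best_path)]
    # Step 3: remove blanks; a blank between two identical tokens keeps both.
    return [token for token in collapsed if token != blank_id]


# Toy example with a 3-token vocabulary (0 = blank).
scores = [
    [0.1, 0.8, 0.1],    # token 1
    [0.1, 0.8, 0.1],    # token 1 (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # token 1 (kept, separated by a blank)
    [0.1, 0.1, 0.8],    # token 2
]
decoded = ctc_greedy_decode(scores)
```

Beam search follows the same collapse rule but keeps several candidate prefixes per frame instead of only the single best token.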
|
|
|
|
|
Speech-To-Text makes it simple to train an ASR model.
|
|
|
|
|
|
|
|
|
|
### Text-To-Speech
|
|
|
|
|
TTS mainly consists of the following components:
|
|
|
|
|
- Implementation of models and commonly used neural network layers.
|
|
|
|
|