diff --git a/docs/source/introduction.md b/docs/source/introduction.md
index 2f71b104..5bf279d2 100644
--- a/docs/source/introduction.md
+++ b/docs/source/introduction.md
@@ -6,8 +6,31 @@ PaddleSpeech is an open-source toolkit on PaddlePaddle platform for two critical
 ## What can PaddleSpeech do?
 ### Speech-To-Text
-(An introduce of ASR in PaddleSpeech is needed here!)
+PaddleSpeech ASR mainly consists of the following components:
+- Implementation of models and commonly used neural network layers.
+- Dataset abstraction and common data preprocessing pipelines.
+- Ready-to-run experiments.
+
+PaddleSpeech ASR provides you with a complete ASR pipeline, including:
+- Data Preparation
+  - Build vocabulary
+  - Compute cepstral mean and variance normalization (CMVN)
+  - Feature extraction
+    - Linear
+    - fbank (also supports Kaldi features)
+    - mfcc
+- Acoustic Models
+  - DeepSpeech2 (online and offline)
+  - Transformer (online and offline)
+  - Conformer (online and offline)
+- Decoder
+  - CTC greedy search (used in DeepSpeech2, Transformer and Conformer)
+  - CTC beam search (used in DeepSpeech2, Transformer and Conformer)
+  - attention decoding (used in Transformer and Conformer)
+  - attention rescoring (used in Transformer and Conformer)
+Speech-To-Text helps you train ASR models with little effort.
+
 ### Text-To-Speech
 TTS mainly consists of components below:
 - Implementation of models and commonly used neural network layers.