From 7f010f7d2209a7026af60407bda4a6846904829b Mon Sep 17 00:00:00 2001
From: Mingxue-Xu <92848346+Mingxue-Xu@users.noreply.github.com>
Date: Thu, 21 Oct 2021 17:58:55 +0800
Subject: [PATCH] Update README.md

---
 README.md | 50 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/README.md b/README.md
index 98451177f..6f83b6b40 100644
--- a/README.md
+++ b/README.md
@@ -8,9 +8,9 @@

- Getting Started
- | Tutorials
- | Models List
+ Quick Start
+ | Tutorials
+ | Models List

@@ -19,8 +19,7 @@
 ![python version](https://img.shields.io/badge/python-3.7+-orange.svg)
 ![support os](https://img.shields.io/badge/os-linux-yellow.svg)
-> Notes: 1.Here place an icon/image as the logo at the beginning like PaddleOCR or PaddleNLP; 2. Is there any idea to add [Parakeet logo](https://github.com/PaddlePaddle/Parakeet/blob/develop/docs/images/logo.png) into this .md document? 3. **It is strongly recommended to refer to [PaddleHub](https://github.com/PaddlePaddle/PaddleHub) documents.**
-
+> Notes: Is there any idea to add [Parakeet logo](https://github.com/PaddlePaddle/Parakeet/blob/develop/docs/images/logo.png) into this .md document?
-- **Text FrontEnd**: Rule based Chinese frontend.
+- **Text FrontEnd**: Rule based *Chinese* frontend.
 - **Acoustic Models**: FastSpeech2, SpeedySpeech, TransformerTTS, Tacotron2
 - **Vocoders**: Parallel WaveGAN, WaveFlow
 - **Voice Cloning**: Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis, GE2E
@@ -89,27 +88,36 @@ Base environment:
 Please see the [ASR installation](docs/source/asr/install.md) and [TTS installation](docs/source/tts/install.md) documents for all the alternatives.
-## Getting Started
+## Quick Start
 Please see [ASR getting started](docs/source/asr/getting_started.md) ([tiny test](examples/tiny/s0/README.md)) and [TTS Basic Use](/docs/source/tts/basic_usage.md).
-## Example Overview
+## Models List
+
+PaddleSpeech ASR supports a lot of mainstream models. For more information, please refer to [ASRModels](./docs/source/asr/released_model.md).
-| Task | Models | Dataset | Performance Summary | Link |
-| ---- | ---------------- | -------- | ------------------- | ---- |
-| ASR | Ds2 | Aishell | ... | [Ds2 Online Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s0/aishell.s0.ds_online.5rnn.debug.tar.gz) |
-| TTS | Parallel WaveGAN | LJSpeech | ... | [parallelwave_gan-ljspeech](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/GANVocoder/parallelwave_gan/ljspeech) |
+PaddleSpeech TTS mainly contains three modules: *Text Frontend*, *Acoustic Model* and *Vocoder*. Models for each are listed as follow:
+| Type | Dataset | Model Type | Link |
+| --------------------- | --------- | --------------------- | --------------------------------------------------------------------------------------------------------------------- |
+| Vocoder | LJSpeech | Parallel WaveGAN | [pwGAN-ljspeech](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/GANVocoder/parallelwave_gan/ljspeech) |
+| Vocoder | CSMSC | Parallel WaveGAN | [pwGAN-csmsc](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/GANVocoder/parallelwave_gan/baker) |
+| Vocoder | LJSpeech | WaveFlow | [waveflow-ljspeech](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/waveflow) |
+| Acoustic Model | LJSpeech | FastSpeech2/FastPitch | [fastspeech2-ljspeech](https://github.com/PaddlePaddle/Parakeet/blob/develop/examples/fastspeech2/ljspeech) |
+| Acoustic Model | LJSpeech | TransformerTTS | [transformer-ljspeech](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/transformer_tts/ljspeech) |
+| Acoustic Model | AISHELL-3 | FastSpeech2/FastPitch | [fastspeech2-aishell3](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/fastspeech2/aishell3) |
+| Acoustic Model | CSMSC | FastSpeech2/FastPitch | [fastspeech2-csmsc](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/fastspeech2/baker) |
+| Acoustic Model | CSMSC | Speedyspeech | [speedyspeech-csmsc](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/speedyspeech/baker) |
+| Chinese Text Frontend | BZNSYP | g2p | [chinese-fronted](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples/text_frontend) |
-For more detailed description, please refer to [ASR released models](docs/source/asr/released_model.md) and [TTS released models](docs/source/tts/released_models.md)
+## Tutorials
-## Guidelines of Pipeline
+More background information for ASR, please refer to:
 * [Data Prepration](docs/source/asr/data_preparation.md)
 * [Data Augmentation](docs/source/asr/augmentation.md)
@@ -117,6 +125,8 @@ For more detailed description, please refer to [ASR released models](docs/source
 * [Benchmark](docs/source/asr/benchmark.md)
 * [Relased Model](docs/source/asr/released_model.md)
+For TTS, [TTS Document](https://paddleparakeet.readthedocs.io/en/latest/) is a good guideline.
+
 ## FAQ and Contributing
@@ -132,6 +142,6 @@ DeepSpeech is provided under the [Apache-2.0 License](./LICENSE).
 DeepSpeech depends on many open source repos. See [References](docs/source/asr/reference.md) for more information.
- **Updates on 2021/10/20**: This [README.md](README.md) outline is not completed, especially *from section **Getting Started***. Besides, this document needs to be further adjusted with reference to PaddleHub.
+ **Updates on 2021/10/21**: This [README.md](README.md) outline is not completed, especially *from section **Quick Start***.