From 276d8cd91c2442a88351b84ef477b82df2e832aa Mon Sep 17 00:00:00 2001
From: Mingxue-Xu <92848346+Mingxue-Xu@users.noreply.github.com>
Date: Wed, 27 Oct 2021 20:33:17 +0800
Subject: [PATCH] Update README.md

---
 README.md | 49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index b07389616..7456f2867 100644
--- a/README.md
+++ b/README.md
@@ -31,7 +31,7 @@ how they can use it
 Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing module, and deployment. Besides, this toolkit also features at:
 - **Fast and Light-weight**: we provide a high-speed and ultra-lightweight model that is convenient for industrial deployment.
-- **Rule-based Chinese frontend**: we utilize plenty of Chinese datasets and corpora to enhance user experience, including CSMSC and Baidu Internal Corpus.
+- **Rule-based Chinese frontend**: our frontend contains Text Normalization (TN) and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt to Chinese contexts.
 - **Varieties of Functions that Vitalize Research**:
   - *Integration of mainstream models and datasets*: the toolkit implements modules that participate in the whole pipeline of both ASR and TTS, and uses datasets like LibriSpeech, LJSpeech, AIShell, etc. See also [model lists](#models-list) for more details.
   - *Support of ASR streaming and non-streaming data*: This toolkit contains non-streaming/streaming models like [DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf), [Transformer](https://arxiv.org/abs/1706.03762), [Conformer](https://arxiv.org/abs/2005.08100) and [U2](https://arxiv.org/pdf/2012.05481.pdf).
@@ -154,7 +154,7 @@ The current hyperlinks redirect to [Previous Parakeet](https://github.com/Paddle
       Conformer
       Librispeech
       Model
-      Encoder:Conformer, Decoder:Transformer, Decoding method: Attention
+      Encoder:Transformer, Decoder:Transformer, Decoding method: Attention
       Transformer
       Librispeech
       Model
@@ -197,7 +197,15 @@ PaddleSpeech TTS mainly contains three modules: *Text Frontend*, *Acoustic Model
-      Acoustic Model
+      Text Frontend
+      G2P
+      CSMSC
+
+      chinese-frontend
+
+
+
+      Acoustic Model
       Tacotron2
       LJSpeech
@@ -218,16 +226,20 @@ PaddleSpeech TTS mainly contains three modules: *Text Frontend*, *Acoustic Model
-      FastSpeech2
+      FastSpeech2
       AISHELL-3
       fastspeech2-aishell3
-
+
       VCTK
       fastspeech2-vctk
+
+      LJSpeech
+      fastspeech2-ljspeech
+
       CSMSC
@@ -235,7 +247,7 @@ PaddleSpeech TTS mainly contains three modules: *Text Frontend*, *Acoustic Model
-      Vocoder
+      Vocoder
       WaveFlow
       LJSpeech
@@ -243,18 +255,38 @@ PaddleSpeech TTS mainly contains three modules: *Text Frontend*, *Acoustic Model
-      Parallel WaveGAN
+      Parallel WaveGAN
       LJSpeech
       pwGAN-ljspeech
+
+      VCTK
+
+      pwGAN-vctk
+
       CSMSC
       pwGAN-csmsc
+
+      Voice Cloning
+      GE2E
+      AISHELL-3
+
+      ge2e-aishell3
+
+
+
+      GE2E
+      Tacotron2
+
+      ge2e-tacotron2-aishell3
+
+
@@ -284,4 +316,5 @@ PaddleSpeech is provided under the [Apache-2.0 License](./LICENSE).
 
 ## Acknowledgement
 
-PaddleSpeech depends on a lot of open source repositories. See [references](docs/source/asr/reference.md) for more information.
+PaddleSpeech depends on a lot of open source repos. See [references](docs/source/asr/reference.md) for more information.
+
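For readers unfamiliar with the two frontend stages named in the patch, here is a minimal sketch of what rule-based Text Normalization (TN) and tone-sandhi handling in G2P can look like. The rules and function names below are simplified illustrations written for this note, not PaddleSpeech's actual API.

```python
import re

# Minimal rule-based TN sketch: spell out ASCII digits as Chinese
# numerals, read verbatim. Real TN rules also cover dates, money,
# phone numbers, measure words, and so on.
_DIGITS = "零一二三四五六七八九"

def normalize_digits(text: str) -> str:
    """Replace each ASCII digit with its Chinese numeral."""
    return re.sub(r"\d", lambda m: _DIGITS[int(m.group())], text)

# Toy third-tone sandhi rule on numbered pinyin: when two third-tone
# syllables are adjacent, the first is realized as second tone
# (e.g. ni3 hao3 -> ni2 hao3). Real G2P additionally resolves
# polyphonic characters and other sandhi patterns.
def third_tone_sandhi(syllables):
    out = list(syllables)
    for i in range(len(out) - 1):
        if out[i].endswith("3") and out[i + 1].endswith("3"):
            out[i] = out[i][:-1] + "2"
    return out

print(normalize_digits("房间号是203"))      # -> 房间号是二零三
print(third_tone_sandhi(["ni3", "hao3"]))  # -> ['ni2', 'hao3']
```

In a full frontend these two stages run in sequence: raw text is normalized first, then converted to phonemes with sandhi applied, before being handed to the acoustic model.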