diff --git a/README.md b/README.md
index c47fcde2..66feb098 100644
--- a/README.md
+++ b/README.md
@@ -6,10 +6,11 @@

 Quick Start | Tutorials
- | Models List
+ | Models List
 ------------------------------------------------------------------------------------
+
 ![License](https://img.shields.io/badge/license-Apache%202-red.svg)
 ![python version](https://img.shields.io/badge/python-3.7+-orange.svg)
 ![support os](https://img.shields.io/badge/os-linux-yellow.svg)
@@ -24,7 +25,7 @@ from https://github.com/18F/open-source-guide/blob/18f-pages/pages/making-readme
 **PaddleSpeech** is an open-source toolkit on [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform for a variety of critical tasks in speech, with the state-of-art and influential models.
-##### Speech-To-Text
+##### Speech-to-Text
@@ -54,7 +55,7 @@ from https://github.com/18F/open-source-guide/blob/18f-pages/pages/making-readme
-##### Text-To-Speech
+##### Text-to-Speech
@@ -83,30 +84,30 @@ from https://github.com/18F/open-source-guide/blob/18f-pages/pages/making-readme
-For more synthesized audios, please refer to [PaddleSpeech Text-To-Speech samples](https://paddlespeech.readthedocs.io/en/latest/tts/demo.html).
+For more synthesized audios, please refer to [PaddleSpeech Text-to-Speech samples](https://paddlespeech.readthedocs.io/en/latest/tts/demo.html).
 Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing modules, and deployment process. To be more specific, this toolkit features at:
 - **Fast and Light-weight**: we provide high-speed and ultra-lightweight models that are convenient for industrial deployment.
 - **Rule-based Chinese frontend**: our frontend contains Text Normalization and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt Chinese context.
 - **Varieties of Functions that Vitalize both Industrial and Academia**:
-  - *Implementation of critical audio tasks*: this toolkit contains audio functions like Speech Translation, Automatic Speech Recognition, Text-To-Speech Synthesis, Voice Cloning, etc.
-  - *Integration of mainstream models and datasets*: the toolkit implements modules that participate in the whole pipeline of the speech tasks, and uses mainstream datasets like LibriSpeech, LJSpeech, AIShell, CSMSC, etc. See also [model lists](#models-list) for more details.
+  - *Implementation of critical audio tasks*: this toolkit contains audio functions like Speech Translation, Automatic Speech Recognition, Text-to-Speech Synthesis, Voice Cloning, etc.
+  - *Integration of mainstream models and datasets*: the toolkit implements modules that participate in the whole pipeline of the speech tasks, and uses mainstream datasets like LibriSpeech, LJSpeech, AIShell, CSMSC, etc. See also [model list](#model-list) for more details.
   - *Cascaded models application*: as an extension of the application of traditional audio tasks, we combine the workflows of aforementioned tasks with other fields like Natural language processing (NLP), like Punctuation Restoration.
-# Alternative Installation
+## Installation
 The base environment in this page is
 - Ubuntu 16.04
 - python>=3.7
-- paddlepaddle>=2.2.0-rc
+- paddlepaddle>=2.2.0
 If you want to set up PaddleSpeech in other environment, please see the [installation](./docs/source/install.md) documents for all the alternatives.
-# Quick Start
+## Quick Start
 Developers can have a try of our model with only a few lines of code.
-A tiny DeepSpeech2 **Speech-To-Text** model training on toy set of LibriSpeech:
+A tiny DeepSpeech2 **Speech-to-Text** model training on toy set of LibriSpeech:
 ```shell
 cd examples/tiny/s0/
@@ -149,13 +150,13 @@ python3 ${BIN_DIR}/synthesize_e2e.py \
   --phones-dict=fastspeech2_nosil_baker_ckpt_0.4/phone_id_map.txt
 ```
-If you want to try more functions like training and tuning, please see [Speech-To-Text Quick Start](./docs/source/asr/quick_start.md) and [Text-To-Speech Quick Start](./docs/source/tts/quick_start.md).
+If you want to try more functions like training and tuning, please see [Speech-to-Text Quick Start](./docs/source/asr/quick_start.md) and [Text-To-Speech Quick Start](./docs/source/tts/quick_start.md).
-# Models List
+## Model List
 PaddleSpeech supports a series of most popular models, summarized in [released models](./docs/source/released_models.md) with available pretrained models.
-Speech-To-Text module contains *Acoustic Model* and *Language Model*, with the following details:
+Speech-to-Text module contains *Acoustic Model* and *Language Model*, with the following details: