From 8ca14fe9807d3ec1561bb7fe635adbe87c5dc4b7 Mon Sep 17 00:00:00 2001
From: Mingxue-Xu <92848346+Mingxue-Xu@users.noreply.github.com>
Date: Tue, 26 Oct 2021 23:50:08 +0800
Subject: [PATCH] Update README.md

---
 README.md | 99 ++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 61 insertions(+), 38 deletions(-)

diff --git a/README.md b/README.md
index b91c74899..3aa3e6866 100644
--- a/README.md
+++ b/README.md
@@ -21,29 +21,43 @@ English | [简体中文](README_ch.md)
 ![python version](https://img.shields.io/badge/python-3.7+-orange.svg)
 ![support os](https://img.shields.io/badge/os-linux-yellow.svg)

-> Notes: Is there any idea to add [Parakeet logo](https://github.com/PaddlePaddle/Parakeet/blob/develop/docs/images/logo.png) into this .md document?
-
-**PaddleSpeech** is an open-source toolkit on [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform for two critical tasks in Speech - Automatic Speech Recognition (ASR) and Text-To-Speech Synthesis (TTS), with modules involving state-of-art and influential models.
+**PaddleSpeech** is an open-source toolkit on the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform for two critical speech tasks - **Automatic Speech Recognition (ASR)** and **Text-To-Speech Synthesis (TTS)** - with modules built on state-of-the-art and influential models.

 Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing module, and deployment. Besides, this toolkit also features at:
+- **Fast and Light-weight**: we provide a high-speed and ultra-lightweight model that is convenient for industrial deployment.
 - **Rule-based Chinese frontend**: we utilize plenty of Chinese datasets and corpora to enhance user experience, including CSMSC and Baidu Internal Corpus.
-- **Supporting of ASR streaming and non-streaming data**: This toolkit contains non-streaming models like [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf), [Transformer](https://arxiv.org/abs/1706.03762) and [Conformer](https://arxiv.org/abs/2005.08100). And for streaming models, we have [Baidu's DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf) and [U2](https://arxiv.org/pdf/2012.05481.pdf).
-- **Varieties of mainstream models**: The toolkit integrates modules that participate in the whole pipeline of both ASR and TTS, [See also model lists](#models-list).
+- **Varieties of Functions that Vitalize Research**:
+  - *Integration of mainstream models and datasets*: the toolkit implements modules covering the whole pipeline of both ASR and TTS, and uses datasets such as LibriSpeech, LJSpeech and AIShell. See the [model lists](#models-list) for more details.
+  - *Support of streaming and non-streaming ASR*: the toolkit contains non-streaming and streaming models such as [DeepSpeech2](http://proceedings.mlr.press/v48/amodei16.pdf), [Transformer](https://arxiv.org/abs/1706.03762), [Conformer](https://arxiv.org/abs/2005.08100) and [U2](https://arxiv.org/pdf/2012.05481.pdf).

-> Notes: It is better to add a brief getting started.
+Let's install PaddleSpeech with only a few lines of code!
+
+> Note: The official repository name is still DeepSpeech. 2021/10/26
+
+``` shell
+# 1. Install essential libraries and paddlepaddle first.
+# Install the system prerequisites.
+sudo apt-get install -y sox pkg-config libflac-dev libogg-dev libvorbis-dev libboost-dev swig python3-dev libsndfile1
+# Run `pip install paddlepaddle-gpu` instead if you are using a GPU.
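+# (Optional) To match the base environment listed under "Alternative Installation" below,
+# you can pin the PaddlePaddle version instead of taking the latest release:
+#   pip install paddlepaddle==2.1.2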
+pip install paddlepaddle
+
+# 2. Then install PaddleSpeech.
+git clone https://github.com/PaddlePaddle/DeepSpeech.git
+cd DeepSpeech
+pip install -e .
+```
+
 ## Table of Contents

 The contents of this README is as follow:
-
-- [Table of Contents](#table-of-contents)
-- [Installation](#installation)
+- [Alternative Installation](#installation)
 - [Quick Start](#quick-start)
 - [Models List](#models-list)
 - [Tutorials](#tutorials)
@@ -51,26 +65,38 @@ The contents of this README is as follow:
 - [License](#license)
 - [Acknowledgement](#acknowledgement)

-## Installation
-
-> Note: The installation guidance of TTS and ASR is now separated.
+## Alternative Installation

-Base environment:
-* Ubuntu 16.04
-* python>=3.7
-* paddlepaddle==2.1.2
+The base environment for this page is:
+- Ubuntu 16.04
+- python>=3.7
+- paddlepaddle==2.1.2

-Please see the [ASR installation](docs/source/asr/install.md) and [TTS installation](docs/source/tts/install.md) documents for all the alternatives.
+If you want to set up PaddleSpeech in another environment, please see the [ASR installation](docs/source/asr/install.md) and [TTS installation](docs/source/tts/install.md) documents for all the alternatives.

 ## Quick Start

-> Note: It is better to use code blocks rather than hyperlinks.
+> Note: Both the ASR and TTS tiny examples are too long and repetitive to be summarized in just a few lines of code.

-Please see [ASR getting started](docs/source/asr/getting_started.md) ([tiny test](examples/tiny/s0/README.md)) and [TTS Basic Use](/docs/source/tts/basic_usage.md).
+Try training a tiny DeepSpeech2 ASR model on a toy subset of LibriSpeech:
+
+```shell
+cd examples/tiny/s0/
+# prepare the data, then train, infer, evaluate and export the model
+bash local/data.sh
+bash local/train.sh
+bash local/infer.sh
+bash local/test.sh
+bash local/export.sh ckpt_path saved_jit_model_path
+```
+
+For more examples, please see [ASR getting started](docs/source/asr/getting_started.md) and [TTS Basic Use](/docs/source/tts/basic_usage.md); a short sketch of reloading the exported model appears after the Models List introduction below.

 ## Models List

-PaddleSpeech ASR supports a lot of mainstream models, which are summarized as follow. For more information, please refer to [ASRModels](./docs/source/asr/released_model.md).
+> Note: The ASR model list is aligned with the [acoustic models released in Paddle 2.X](https://github.com/PaddlePaddle/DeepSpeech/blob/develop/docs/source/asr/released_model.md#acoustic-model-released-in-paddle-2x).
+
+PaddleSpeech ASR supports a wide range of mainstream models, which are summarized as follows. For more information, please refer to [ASR Models](./docs/source/asr/released_model.md).
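+
+As a hedged follow-up to the Quick Start above (not the official inference script), the sketch below only checks that the model exported by `bash local/export.sh ckpt_path saved_jit_model_path` can be reloaded. It assumes the export step writes a static-graph model loadable with `paddle.jit.load`, and it reuses the placeholder path from that command; real transcription additionally needs the feature pipeline and decoder shipped in this repository.
+
+```shell
+# Reload the exported static-graph model and switch it to inference mode.
+python3 -c "
+import paddle
+model = paddle.jit.load('saved_jit_model_path')  # path prefix written by local/export.sh
+model.eval()
+print(type(model))
+"
+```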