parent e395462419
commit 3f9e30c9b3
@@ -0,0 +1,6 @@
myst-parser
recommonmark>=0.5.0
sphinx
sphinx-autobuild
sphinx-markdown-tables
sphinx_rtd_theme
@@ -1,28 +0,0 @@
# Released Models

## Acoustic Model Released in paddle 2.X
Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER | Hours of speech
:-------------:| :------------:| :-----: | -----: | :----------------- |:--------- | :---------- | :---------
[Ds2 Online Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s0/aishell.s0.ds_online.5rnn.debug.tar.gz) | Aishell Dataset | Char-based | 345 MB | 2 Conv + 5 LSTM layers with only forward direction | 0.0824 |-| 151 h
[Ds2 Offline Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s0/aishell.s0.ds2.offline.cer6p65.release.tar.gz)| Aishell Dataset | Char-based | 306 MB | 2 Conv + 3 bidirectional GRU layers| 0.065 |-| 151 h
[Conformer Online Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.chunk.release.tar.gz) | Aishell Dataset | Char-based | 283 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention + CTC | 0.0594 |-| 151 h
[Conformer Offline Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.release.tar.gz) | Aishell Dataset | Char-based | 284 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention | 0.0547 |-| 151 h
[Conformer Librispeech Model](https://deepspeech.bj.bcebos.com/release2.1/librispeech/s1/conformer.release.tar.gz) | Librispeech Dataset | Word-based | 287 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention |-| 0.0325 | 960 h
[Transformer Librispeech Model](https://deepspeech.bj.bcebos.com/release2.1/librispeech/s1/transformer.release.tar.gz) | Librispeech Dataset | Word-based | 195 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention |-| 0.0544 | 960 h

## Acoustic Model Transformed from paddle 1.8
Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER | Hours of speech
:-------------:| :------------:| :-----: | -----: | :----------------- | :---------- | :---------- | :---------
[Ds2 Offline Aishell model](https://deepspeech.bj.bcebos.com/mandarin_models/aishell_model_v1.8_to_v2.x.tar.gz)|Aishell Dataset| Char-based| 234 MB| 2 Conv + 3 bidirectional GRU layers| 0.0804 |-| 151 h|
[Ds2 Offline Librispeech model](https://deepspeech.bj.bcebos.com/eng_models/librispeech_v1.8_to_v2.x.tar.gz)|Librispeech Dataset| Word-based| 307 MB| 2 Conv + 3 bidirectional sharing weight RNN layers |-| 0.0685| 960 h|
[Ds2 Offline Baidu en8k model](https://deepspeech.bj.bcebos.com/eng_models/baidu_en8k_v1.8_to_v2.x.tar.gz)|Baidu Internal English Dataset| Word-based| 273 MB| 2 Conv + 3 bidirectional GRU layers |-| 0.0541 | 8628 h|

## Language Model Released

Language Model | Training Data | Token-based | Size | Descriptions
:-------------:| :------------:| :-----: | -----: | :-----------------
[English LM](https://deepspeech.bj.bcebos.com/en_lm/common_crawl_00.prune01111.trie.klm) | [CommonCrawl(en.00)](http://web-language-models.s3-website-us-east-1.amazonaws.com/ngrams/en/deduped/en.00.deduped.xz) | Word-based | 8.3 GB | Pruned with 0 1 1 1 1; <br/> About 1.85 billion n-grams; <br/> 'trie' binary with '-a 22 -q 8 -b 8'
[Mandarin LM Small](https://deepspeech.bj.bcebos.com/zh_lm/zh_giga.no_cna_cmn.prune01244.klm) | Baidu Internal Corpus | Char-based | 2.8 GB | Pruned with 0 1 2 4 4; <br/> About 0.13 billion n-grams; <br/> 'probing' binary with default settings
[Mandarin LM Large](https://deepspeech.bj.bcebos.com/zh_lm/zhidao_giga.klm) | Baidu Internal Corpus | Char-based | 70.4 GB | No Pruning; <br/> About 3.7 billion n-grams; <br/> 'probing' binary with default settings
@@ -0,0 +1,33 @@
# PaddleSpeech

## What is PaddleSpeech?
PaddleSpeech is an open-source toolkit on the PaddlePaddle platform for two critical tasks in speech - Speech-To-Text (Automatic Speech Recognition, ASR) and Text-To-Speech Synthesis (TTS) - with modules covering state-of-the-art and influential models.

## What can PaddleSpeech do?

### Speech-To-Text
PaddleSpeech ASR provides Speech-To-Text with end-to-end models such as DeepSpeech2, Transformer and Conformer; see the released Speech-To-Text models for the supported datasets and results.

### Text-To-Speech
PaddleSpeech TTS mainly consists of the components below:
- Implementation of models and commonly used neural network layers.
- Dataset abstraction and common data preprocessing pipelines.
- Ready-to-run experiments.

PaddleSpeech TTS provides you with a complete TTS pipeline, including:
- Text Frontend
  - Rule-based Chinese frontend.
- Acoustic Models
  - FastSpeech2
  - SpeedySpeech
  - TransformerTTS
  - Tacotron2
- Vocoders
  - Multi Band MelGAN
  - Parallel WaveGAN
  - WaveFlow
- Voice Cloning
  - Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
  - GE2E

PaddleSpeech TTS helps you train TTS models with simple commands.
@@ -0,0 +1,55 @@
# Released Models

## Speech-To-Text Models
### Acoustic Model Released in paddle 2.X
Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER | Hours of speech
:-------------:| :------------:| :-----: | -----: | :----------------- |:--------- | :---------- | :---------
[Ds2 Online Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s0/aishell.s0.ds_online.5rnn.debug.tar.gz) | Aishell Dataset | Char-based | 345 MB | 2 Conv + 5 LSTM layers with only forward direction | 0.0824 |-| 151 h
[Ds2 Offline Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s0/aishell.s0.ds2.offline.cer6p65.release.tar.gz)| Aishell Dataset | Char-based | 306 MB | 2 Conv + 3 bidirectional GRU layers| 0.065 |-| 151 h
[Conformer Online Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.chunk.release.tar.gz) | Aishell Dataset | Char-based | 283 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention + CTC | 0.0594 |-| 151 h
[Conformer Offline Aishell Model](https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.release.tar.gz) | Aishell Dataset | Char-based | 284 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention | 0.0547 |-| 151 h
[Conformer Librispeech Model](https://deepspeech.bj.bcebos.com/release2.1/librispeech/s1/conformer.release.tar.gz) | Librispeech Dataset | Word-based | 287 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention |-| 0.0325 | 960 h
[Transformer Librispeech Model](https://deepspeech.bj.bcebos.com/release2.1/librispeech/s1/transformer.release.tar.gz) | Librispeech Dataset | Word-based | 195 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention |-| 0.0544 | 960 h
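
The released acoustic models above are distributed as `.tar.gz` archives. As a minimal sketch of how one can be fetched and unpacked with only the Python standard library (the URL is taken from the table; the target directory is an arbitrary choice, not a PaddleSpeech convention):

```python
import tarfile
import urllib.request

# One of the released acoustic model archives from the table above.
URL = "https://deepspeech.bj.bcebos.com/release2.1/aishell/s1/aishell.release.tar.gz"
ARCHIVE = "aishell.release.tar.gz"

# Download the archive, then unpack it into a local directory.
urllib.request.urlretrieve(URL, ARCHIVE)
with tarfile.open(ARCHIVE, "r:gz") as tar:
    tar.extractall(path="pretrained_models/conformer_offline_aishell")
```
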
### Acoustic Model Transformed from paddle 1.8
Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER | Hours of speech
:-------------:| :------------:| :-----: | -----: | :----------------- | :---------- | :---------- | :---------
[Ds2 Offline Aishell model](https://deepspeech.bj.bcebos.com/mandarin_models/aishell_model_v1.8_to_v2.x.tar.gz)|Aishell Dataset| Char-based| 234 MB| 2 Conv + 3 bidirectional GRU layers| 0.0804 |-| 151 h|
[Ds2 Offline Librispeech model](https://deepspeech.bj.bcebos.com/eng_models/librispeech_v1.8_to_v2.x.tar.gz)|Librispeech Dataset| Word-based| 307 MB| 2 Conv + 3 bidirectional sharing weight RNN layers |-| 0.0685| 960 h|
[Ds2 Offline Baidu en8k model](https://deepspeech.bj.bcebos.com/eng_models/baidu_en8k_v1.8_to_v2.x.tar.gz)|Baidu Internal English Dataset| Word-based| 273 MB| 2 Conv + 3 bidirectional GRU layers |-| 0.0541 | 8628 h|

### Language Model Released

Language Model | Training Data | Token-based | Size | Descriptions
:-------------:| :------------:| :-----: | -----: | :-----------------
[English LM](https://deepspeech.bj.bcebos.com/en_lm/common_crawl_00.prune01111.trie.klm) | [CommonCrawl(en.00)](http://web-language-models.s3-website-us-east-1.amazonaws.com/ngrams/en/deduped/en.00.deduped.xz) | Word-based | 8.3 GB | Pruned with 0 1 1 1 1; <br/> About 1.85 billion n-grams; <br/> 'trie' binary with '-a 22 -q 8 -b 8'
[Mandarin LM Small](https://deepspeech.bj.bcebos.com/zh_lm/zh_giga.no_cna_cmn.prune01244.klm) | Baidu Internal Corpus | Char-based | 2.8 GB | Pruned with 0 1 2 4 4; <br/> About 0.13 billion n-grams; <br/> 'probing' binary with default settings
[Mandarin LM Large](https://deepspeech.bj.bcebos.com/zh_lm/zhidao_giga.klm) | Baidu Internal Corpus | Char-based | 70.4 GB | No Pruning; <br/> About 3.7 billion n-grams; <br/> 'probing' binary with default settings
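
The `.klm` files above are binary KenLM language models. As a sketch of how a downloaded model can be queried, assuming the third-party `kenlm` Python package is installed (this is not part of the PaddleSpeech API):

```python
import kenlm  # pip install kenlm

# Path to a downloaded model, e.g. the Mandarin LM Small above.
model = kenlm.Model("zh_giga.no_cna_cmn.prune01244.klm")

# log10 probability of a tokenized sentence; a char-based model expects
# space-separated characters, a word-based model space-separated words.
print(model.score("今 天 天 气 很 好", bos=True, eos=True))
```
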
## Text-To-Speech Models
### Acoustic Models
Model Type | Dataset| Example Link | Pretrained Models
:-------------:| :------------:| :-----: | :-----
Tacotron2|LJSpeech|[tacotron2-ljspeech](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/ljspeech/tts0)|[tacotron2_ljspeech_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/tacotron2_ljspeech_ckpt_0.3.zip)
TransformerTTS| LJSpeech| [transformer-ljspeech](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/ljspeech/tts1)|[transformer_tts_ljspeech_ckpt_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/transformer_tts_ljspeech_ckpt_0.4.zip)
SpeedySpeech| CSMSC | [speedyspeech-csmsc](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/csmsc/tts2) |[speedyspeech_nosil_baker_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/speedyspeech_nosil_baker_ckpt_0.5.zip)
FastSpeech2| CSMSC |[fastspeech2-csmsc](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/csmsc/tts3)|[fastspeech2_nosil_baker_ckpt_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/fastspeech2_nosil_baker_ckpt_0.4.zip)
FastSpeech2| AISHELL-3 |[fastspeech2-aishell3](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/aishell3/tts3)|[fastspeech2_nosil_aishell3_ckpt_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/fastspeech2_nosil_aishell3_ckpt_0.4.zip)
FastSpeech2| LJSpeech |[fastspeech2-ljspeech](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/ljspeech/tts3)|[fastspeech2_nosil_ljspeech_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/fastspeech2_nosil_ljspeech_ckpt_0.5.zip)
FastSpeech2| VCTK |[fastspeech2-vctk](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/vctk/tts3)|[fastspeech2_nosil_vctk_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/fastspeech2_nosil_vctk_ckpt_0.5.zip)
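
The pretrained TTS models are `.zip` archives, typically holding a checkpoint together with its configuration files. A sketch analogous to the ASR example above (URL from the table, output directory arbitrary):

```python
import io
import urllib.request
import zipfile

# One of the pretrained acoustic models from the table above.
URL = "https://paddlespeech.bj.bcebos.com/Parakeet/fastspeech2_nosil_baker_ckpt_0.4.zip"

# Download the archive into memory and unpack it locally.
data = urllib.request.urlopen(URL).read()
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    zf.extractall("pretrained_models/fastspeech2_csmsc")
```
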
### Vocoders

Model Type | Dataset| Example Link | Pretrained Models
:-------------:| :------------:| :-----: | :-----
WaveFlow| LJSpeech |[waveflow-ljspeech](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/ljspeech/voc0)|[waveflow_ljspeech_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/waveflow_ljspeech_ckpt_0.3.zip)
Parallel WaveGAN| CSMSC |[PWGAN-csmsc](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/csmsc/voc1)|[pwg_baker_ckpt_0.4.zip](https://paddlespeech.bj.bcebos.com/Parakeet/pwg_baker_ckpt_0.4.zip)
Parallel WaveGAN| LJSpeech |[PWGAN-ljspeech](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/ljspeech/voc1)|[pwg_ljspeech_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/pwg_ljspeech_ckpt_0.5.zip)
Parallel WaveGAN| VCTK |[PWGAN-vctk](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/vctk/voc1)|[pwg_vctk_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/pwg_vctk_ckpt_0.5.zip)

### Voice Cloning
Model Type | Dataset| Example Link | Pretrained Models
:-------------:| :------------:| :-----: | :-----
GE2E| AISHELL-3, etc. |[ge2e](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/other/ge2e)|[ge2e_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/ge2e_ckpt_0.3.zip)
GE2E + Tacotron2| AISHELL-3 |[ge2e-tacotron2-aishell3](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/aishell3/vc0)|[tacotron2_aishell3_ckpt_0.3.zip](https://paddlespeech.bj.bcebos.com/Parakeet/tacotron2_aishell3_ckpt_0.3.zip)
@@ -0,0 +1,7 @@
Audio Sample (PaddleSpeech TTS vs Espnet TTS)
==================

This is an audio demo page contrasting PaddleSpeech TTS and Espnet TTS; we use their respective modules (Text Frontend, Acoustic Model and Vocoder) here.
We use Espnet's released models.

FastSpeech2 + Parallel WaveGAN on CSMSC
@@ -0,0 +1,9 @@
# GAN Vocoders
This is a brief introduction to GAN vocoders; we mainly introduce the losses of the different vocoders here.

Model | Generator Loss | Discriminator Loss
:-------------:| :------------:| :-----
Parallel WaveGAN | adversarial loss <br> Multi-resolution STFT loss | adversarial loss |
MelGAN | adversarial loss <br> Feature Matching | Multi-Scale Discriminator |
Multi-Band MelGAN | adversarial loss <br> full band Multi-resolution STFT loss <br> sub band Multi-resolution STFT loss | Multi-Scale Discriminator |
HiFi GAN | adversarial loss <br> Feature Matching <br> Mel-Spectrogram Loss | Multi-Scale Discriminator <br> Multi-Period Discriminator |
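
To make the loss columns concrete, here is a minimal NumPy/librosa sketch of the multi-resolution STFT loss that appears in the Parallel WaveGAN and Multi-Band MelGAN rows: at several STFT resolutions it combines a spectral-convergence term with a log-magnitude L1 term between the generated and ground-truth waveforms. The resolution triples follow a common Parallel WaveGAN configuration and are an assumption here, not values taken from PaddleSpeech:

```python
import librosa
import numpy as np

def stft_loss(x, y, fft_size, hop_length, win_length):
    """Spectral convergence + log STFT magnitude loss at one resolution.

    x: generated waveform, y: ground-truth waveform (1-D float arrays).
    """
    X = np.abs(librosa.stft(x, n_fft=fft_size, hop_length=hop_length, win_length=win_length))
    Y = np.abs(librosa.stft(y, n_fft=fft_size, hop_length=hop_length, win_length=win_length))
    sc = np.linalg.norm(Y - X, "fro") / np.linalg.norm(Y, "fro")      # spectral convergence
    log_mag = np.mean(np.abs(np.log(Y + 1e-7) - np.log(X + 1e-7)))    # log-magnitude L1
    return sc + log_mag

def multi_resolution_stft_loss(x, y):
    # Assumed (fft_size, hop_length, win_length) triples, one per resolution.
    resolutions = [(1024, 120, 600), (2048, 240, 1200), (512, 50, 240)]
    return sum(stft_loss(x, y, *r) for r in resolutions) / len(resolutions)
```
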
@@ -1,45 +0,0 @@
.. parakeet documentation master file, created by
   sphinx-quickstart on Fri Sep 10 14:22:24 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Parakeet
====================================

``parakeet`` is a deep learning based text-to-speech toolkit built upon the ``paddlepaddle`` framework. It aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It includes many influential TTS models proposed by `Baidu Research <http://research.baidu.com>`_ and other research groups.

``parakeet`` mainly consists of the components below.

#. Implementation of models and commonly used neural network layers.
#. Dataset abstraction and common data preprocessing pipelines.
#. Ready-to-run experiments.

.. toctree::
   :maxdepth: 1
   :caption: Introduction

   introduction

.. toctree::
   :maxdepth: 1
   :caption: Getting started

   install
   basic_usage
   advanced_usage
   cn_text_frontend
   released_models

.. toctree::
   :maxdepth: 1
   :caption: Demos

   demo


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
@@ -1,47 +0,0 @@
# Installation
## Install PaddlePaddle
Parakeet requires PaddlePaddle as its backend. Note that version 2.1.2 or newer of paddle is required.

Since paddlepaddle has multiple packages depending on the device (cpu or gpu) and the dependency libraries, it is recommended to install a proper package of paddlepaddle with respect to the device and dependency library versions via `pip`.

Installing paddlepaddle with conda or building paddlepaddle from source is also supported. Please refer to [PaddlePaddle installation](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html) for more details.

Example instructions to install paddlepaddle via pip are listed below.

### PaddlePaddle with GPU
```bash
# PaddlePaddle for CUDA 10.1
python -m pip install paddlepaddle-gpu==2.1.2.post101 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
# PaddlePaddle for CUDA 10.2
python -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
# PaddlePaddle for CUDA 11.0
python -m pip install paddlepaddle-gpu==2.1.2.post110 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
# PaddlePaddle for CUDA 11.2
python -m pip install paddlepaddle-gpu==2.1.2.post112 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
```
### PaddlePaddle with CPU
```bash
python -m pip install paddlepaddle==2.1.2 -i https://mirror.baidu.com/pypi/simple
```
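Whichever package is chosen, the installation can be verified with paddle's built-in sanity check (a quick verification sketch, not a required step):

```python
import paddle

# Prints the installed version and runs a small computation on the
# available device (GPU for a GPU build, otherwise CPU).
print(paddle.__version__)
paddle.utils.run_check()
```
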
## Install libsndfile
Experiments in parakeet often involve audio and spectrum processing, thus `librosa` and `soundfile` are required. `soundfile` requires an extra C library `libsndfile`, which is not always handled by pip.

For Windows and Mac users, `libsndfile` is also installed when installing `soundfile` via pip, but for Linux users, installing `libsndfile` via the system package manager is required. Example commands for popular distributions are listed below.
```bash
# ubuntu, debian
sudo apt-get install libsndfile1
# centos, fedora
sudo yum install libsndfile
# openSUSE
sudo zypper in libsndfile
```
For any problem with the installation of soundfile, please refer to [SoundFile](https://pypi.org/project/SoundFile/).
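
A quick way to confirm that `soundfile` can actually find `libsndfile` is a write/read round trip (a sketch; the file name and sample rate are arbitrary):

```python
import numpy as np
import soundfile as sf

# Write one second of silence and read it back; this raises an error
# if libsndfile is missing or broken.
sf.write("check.wav", np.zeros(16000, dtype="float32"), 16000)
data, samplerate = sf.read("check.wav")
print(data.shape, samplerate)  # (16000,) 16000
```
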
## Install Parakeet
There are two ways to install parakeet according to the purpose of using it.

1. If you want to run experiments provided by parakeet or add new models and experiments, it is recommended to clone the project from GitHub (Parakeet) and install it in editable mode.
```bash
git clone https://github.com/PaddlePaddle/Parakeet
cd Parakeet
pip install -e .
```
@@ -1,27 +0,0 @@
# Parakeet - PAddle PARAllel text-to-speech toolKIT

## What is Parakeet?
Parakeet is a deep learning based text-to-speech toolkit built upon the paddlepaddle framework. It aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It includes many influential TTS models proposed by Baidu Research and other research groups.

## What can Parakeet do?
Parakeet mainly consists of the components below:
- Implementation of models and commonly used neural network layers.
- Dataset abstraction and common data preprocessing pipelines.
- Ready-to-run experiments.

Parakeet provides you with a complete TTS pipeline, including:
- Text Frontend
  - Rule-based Chinese frontend.
- Acoustic Models
  - FastSpeech2
  - SpeedySpeech
  - TransformerTTS
  - Tacotron2
- Vocoders
  - Parallel WaveGAN
  - WaveFlow
- Voice Cloning
  - Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
  - GE2E

Parakeet helps you train TTS models with simple commands.
@@ -1,5 +1,5 @@
# Chinese Rule Based Text Frontend
TTS system mainly includes three modules: `text frontend`, `Acoustic model` and `Vocoder`. We provide a complete Chinese text frontend module in Parakeet, see example in `Parakeet/examples/text_frontend/`.
A TTS system mainly includes three modules: `Text Frontend`, `Acoustic Model` and `Vocoder`. We provide a complete Chinese text frontend module in PaddleSpeech TTS; see the example in [examples/other/text_frontend/](https://github.com/PaddlePaddle/DeepSpeech/tree/develop/examples/other/text_frontend).

A text frontend module mainly includes:
- Text Segmentation