|
|
@@ -39,7 +39,117 @@
|
|
|
|
|
|
|
|
|
|
|
|
**PaddleSpeech** won the [NAACL2022 Best Demo Award](https://2022.naacl.org/blog/best-demo-award/); see the paper on [arXiv](https://arxiv.org/abs/2205.12007).
|
|
|
|
**PaddleSpeech** won the [NAACL2022 Best Demo Award](https://2022.naacl.org/blog/best-demo-award/); see the paper on [arXiv](https://arxiv.org/abs/2205.12007).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Demos
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
##### Speech Recognition
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<div align = "center">
|
|
|
|
|
|
|
|
<table style="width:100%">
|
|
|
|
|
|
|
|
<thead>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<th> Input Audio </th>
|
|
|
|
|
|
|
|
<th width="550"> Recognition Result </th>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
<td >I knocked at the door on the ancient side of the building.</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
<td>我认为跑步最重要的就是给我带来了身体健康。</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
##### Speech Translation (English to Chinese)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<div align = "center">
|
|
|
|
|
|
|
|
<table style="width:100%">
|
|
|
|
|
|
|
|
<thead>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<th> Input Audio </th>
|
|
|
|
|
|
|
|
<th width="550"> Translation Result </th>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
<td >我 在 这栋 建筑 的 古老 门上 敲门。</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
##### Text-to-Speech
|
|
|
|
|
|
|
|
<div align = "center">
|
|
|
|
|
|
|
|
<table style="width:100%">
|
|
|
|
|
|
|
|
<thead>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<th width="550">Input Text</th>
|
|
|
|
|
|
|
|
<th>Synthesized Audio</th>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td >Life was like a box of chocolates, you never know what you're gonna get.</td>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/Parakeet/docs/demos/tacotron2_ljspeech_waveflow_samples_0.2/sentence_1.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td >早上好,今天是2020/10/29,最低温度是-3°C。</td>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/Parakeet/docs/demos/parakeet_espnet_fs2_pwg_demo/tn_g2p/parakeet/001.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td >季姬寂,集鸡,鸡即棘鸡。棘鸡饥叽,季姬及箕稷济鸡。鸡既济,跻姬笈,季姬忌,急咭鸡,鸡急,继圾几,季姬急,即籍箕击鸡,箕疾击几伎,伎即齑,鸡叽集几基,季姬急极屐击鸡,鸡既殛,季姬激,即记《季姬击鸡记》。</td>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/Parakeet/docs/demos/jijiji.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
For more synthesized audio, see the [PaddleSpeech text-to-speech audio samples](https://paddlespeech.readthedocs.io/en/latest/tts/demo.html).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
##### Punctuation Restoration
|
|
|
|
|
|
|
|
<div align = "center">
|
|
|
|
|
|
|
|
<table style="width:100%">
|
|
|
|
|
|
|
|
<thead>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<th width="390"> Input Text </th>
|
|
|
|
|
|
|
|
<th width="390"> Output Text </th>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td>今天的天气真不错啊你下午有空吗我想约你一起去吃饭</td>
|
|
|
|
|
|
|
|
<td>今天的天气真不错啊!你下午有空吗?我想约你一起去吃饭。</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Features
|
|
|
|
### Features
|
|
|
@@ -47,7 +157,7 @@
|
|
|
|
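This project is implemented to be easy to use, efficient, flexible, and extensible, aiming to better support both industrial applications and academic research. It covers training, inference, and testing modules as well as the deployment process. Key features include: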
|
|
|
|
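This project is implemented to be easy to use, efficient, flexible, and extensible, aiming to better support both industrial applications and academic research. It covers training, inference, and testing modules as well as the deployment process. Key features include: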
|
|
|
|
- 📦 **Ease of use**: low barrier to installation; use the [CLI](#quick-start) to get started quickly.
|
|
|
|
- 📦 **Ease of use**: low barrier to installation; use the [CLI](#quick-start) to get started quickly.
|
|
|
|
- 🏆 **Aligned with SoTA**: provides fast, lightweight models that draw on cutting-edge techniques.
|
|
|
|
- 🏆 **Aligned with SoTA**: provides fast, lightweight models that draw on cutting-edge techniques.
|
|
|
|
- 🏆 **Streaming ASR and TTS systems**: industrial-grade end-to-end streaming recognition and streaming synthesis systems.
|
|
|
|
- 🏆 **Streaming ASR and TTS systems**: industrial-grade end-to-end streaming recognition and streaming synthesis systems.
|
|
|
|
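- 💯 **Rule-based Chinese frontend**: the frontend includes text normalization and grapheme-to-phoneme conversion (G2P). In addition, we use custom linguistic rules to adapt to the Chinese context.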
|
|
|
|
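- 💯 **Rule-based Chinese frontend**: the frontend includes text normalization and grapheme-to-phoneme conversion (G2P). In addition, we use custom linguistic rules to adapt to the Chinese context.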
|
|
|
|
- **Support for a range of mainstream industrial and academic functions**:
|
|
|
|
- **Support for a range of mainstream industrial and academic functions**:
|
|
|
|
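- 🛎️ Typical audio tasks: this toolkit implements audio tasks such as audio classification, speech translation, automatic speech recognition, text-to-speech, speech synthesis, speaker recognition, and keyword spotting (KWS).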
|
|
|
|
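- 🛎️ Typical audio tasks: this toolkit implements audio tasks such as audio classification, speech translation, automatic speech recognition, text-to-speech, speech synthesis, speaker recognition, and keyword spotting (KWS).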
|
|
|
@@ -82,7 +192,7 @@
|
|
|
|
|
|
|
|
|
|
|
|
### Requirements
|
|
|
|
### Requirements
|
|
|
|
+ gcc >= 4.8.5
|
|
|
|
+ gcc >= 4.8.5
|
|
|
|
+ paddlepaddle >= 2.2.0
|
|
|
|
+ paddlepaddle >= 2.3.1
|
|
|
|
+ python >= 3.7
|
|
|
|
+ python >= 3.7
|
|
|
|
+ linux (recommended), mac, windows
|
|
|
|
+ linux (recommended), mac, windows
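
The minimum versions in the list above can be checked programmatically before installing. The following is a minimal sketch and not part of PaddleSpeech: `parse_version` and `meets_minimum` are hypothetical helper names, and the comparison only reads the leading digits of each dotted component, so pre-release tags such as `rc0` are treated approximately.

```python
import re
import sys


def parse_version(text):
    """Turn '2.3.1' into (2, 3, 1); a suffix such as 'rc0' is ignored."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '3.7' and '3.7.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    current = "{}.{}.{}".format(*sys.version_info[:3])
    print("python >= 3.7:", meets_minimum(current, "3.7"))
```

The same helper could be applied to the version string reported by an installed package, for example to compare it against the `paddlepaddle` minimum.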
|
|
|
|
|
|
|
|
|
|
|
@@ -108,12 +218,14 @@ pip install pytest-runner
|
|
|
|
pip install .
|
|
|
|
pip install .
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
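For more on installation issues, such as the conda environment, the system libraries that librosa depends on, gcc environment problems, and kaldi installation, see the [installation guide](docs/source/install_cn.md). If you run into installation problems, you can comment on [#2150](https://github.com/PaddlePaddle/PaddleSpeech/issues/2150) and search for related issues there.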
|
|
|
|
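For more on installation issues, such as the conda environment, the system libraries that librosa depends on, gcc environment problems, and kaldi installation, see the [installation guide](docs/source/install_cn.md). If you run into installation problems, you can comment on [#2150](https://github.com/PaddlePaddle/PaddleSpeech/issues/2150) and search for related issues there.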
|
|
|
|
|
|
|
|
|
|
|
|
<a name="快速开始"></a>
|
|
|
|
<a name="快速开始"></a>
|
|
|
|
## Quick Start
|
|
|
|
## Quick Start
|
|
|
|
|
|
|
|
|
|
|
|
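After installation, developers can get started quickly from the command line or Python. Change `--input` to test with your own audio or text; 16k WAV audio is supported. You can also try it in `aistudio`: [PaddleSpeech API Demo](https://aistudio.baidu.com/aistudio/projectdetail/4281335?shared=1).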
|
|
|
|
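After installation, developers can get started quickly from the command line or Python. In command-line mode, change `--input` to test with your own audio or text; 16k WAV audio is supported.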
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
You can also try it quickly in `aistudio` 👉🏻 [PaddleSpeech API Demo](https://aistudio.baidu.com/aistudio/projectdetail/4281335?shared=1).
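
Because the input audio is expected to be 16k WAV, it can help to check a file's sample rate before passing it to `--input`. The following is a minimal sketch using Python's standard `wave` module. The helper names `wav_sample_rate` and `is_expected_rate` are hypothetical and not part of PaddleSpeech, and the check only covers uncompressed PCM WAV files that `wave` can read.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate (frames per second) stored in a WAV header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_expected_rate(path, expected_hz=16000):
    """Return True if the WAV file at `path` uses the expected sample rate."""
    return wav_sample_rate(path) == expected_hz
```

For example, `is_expected_rate("zh.wav")` checks the downloaded sample against 16000 Hz, and passing `expected_hz=24000` would check a synthesized output file instead.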
|
|
|
|
|
|
|
|
|
|
|
|
Download the sample test audio
|
|
|
|
Download the sample test audio
|
|
|
|
```shell
|
|
|
|
```shell
|
|
|
@@ -121,95 +233,147 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
|
|
|
|
wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav
|
|
|
|
wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Speech Recognition**
|
|
|
|
### Speech Recognition
|
|
|
|
|
|
|
|
<details><summary> (Click to expand) Open-source Chinese speech recognition</summary>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Try it from the command line
|
|
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
```shell
|
|
|
|
paddlespeech asr --lang zh --input zh.wav
|
|
|
|
paddlespeech asr --lang zh --input zh.wav
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
One-step prediction with the Python API
|
|
|
|
|
|
|
|
|
|
|
|
```python
|
|
|
|
```python
|
|
|
|
from paddlespeech.cli.asr.infer import ASRExecutor
|
|
|
|
>>> from paddlespeech.cli.asr.infer import ASRExecutor
|
|
|
|
asr = ASRExecutor()
|
|
|
|
>>> asr = ASRExecutor()
|
|
|
|
result = asr(audio_file="zh.wav")
|
|
|
|
>>> result = asr(audio_file="zh.wav")
|
|
|
|
|
|
|
|
>>> print(result)
|
|
|
|
|
|
|
|
我认为跑步最重要的就是给我带来了身体健康
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
</details>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Text-to-Speech
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<details><summary> Open-source Chinese text-to-speech</summary>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Outputs WAV audio at a 24k sample rate
|
|
|
|
|
|
|
|
|
|
|
|
**Text-to-Speech**
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Outputs WAV audio at a 24k sample rate
|
|
|
|
Try it from the command line
|
|
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
```shell
|
|
|
|
paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" --output output.wav
|
|
|
|
paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" --output output.wav
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
One-step prediction with the Python API
|
|
|
|
|
|
|
|
|
|
|
|
```python
|
|
|
|
```python
|
|
|
|
from paddlespeech.cli.tts.infer import TTSExecutor
|
|
|
|
>>> from paddlespeech.cli.tts.infer import TTSExecutor
|
|
|
|
tts = TTSExecutor()
|
|
|
|
>>> tts = TTSExecutor()
|
|
|
|
tts(text="今天天气十分不错。", output="output.wav")
|
|
|
|
>>> tts(text="今天天气十分不错。", output="output.wav")
|
|
|
|
```
|
|
|
|
```
|
|
|
|
- The text-to-speech web demo has been integrated into [Hugging Face Spaces](https://huggingface.co/spaces). See: [TTS Demo](https://huggingface.co/spaces/akhaliq/paddlespeech)
|
|
|
|
- The text-to-speech web demo has been integrated into [Hugging Face Spaces](https://huggingface.co/spaces). See: [TTS Demo](https://huggingface.co/spaces/KPatrick/PaddleSpeechTTS)
|
|
|
|
|
|
|
|
|
|
|
|
**Audio Classification**
|
|
|
|
</details>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Audio Classification
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<details><summary> An open-domain audio classification tool suited to multiple scenarios</summary>
|
|
|
|
|
|
|
|
|
|
|
|
Outputs the audio classification result
|
|
|
|
An audio classification model covering the 527 classes of the AudioSet dataset
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Try it from the command line
|
|
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
```shell
|
|
|
|
paddlespeech cls --input zh.wav
|
|
|
|
paddlespeech cls --input zh.wav
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
One-step prediction with the Python API
|
|
|
|
|
|
|
|
|
|
|
|
```python
|
|
|
|
```python
|
|
|
|
from paddlespeech.cli.cls.infer import CLSExecutor
|
|
|
|
>>> from paddlespeech.cli.cls.infer import CLSExecutor
|
|
|
|
cls = CLSExecutor()
|
|
|
|
>>> cls = CLSExecutor()
|
|
|
|
result = cls(audio_file="zh.wav")
|
|
|
|
>>> result = cls(audio_file="zh.wav")
|
|
|
|
|
|
|
|
>>> print(result)
|
|
|
|
|
|
|
|
Speech 0.9027186632156372
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Speaker Embedding Extraction**
|
|
|
|
</details>
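
The sample output above pairs a class label with a score. Assuming the result is a string of the form `<label> <score>`, as in that sample (an assumption drawn from the example, not a documented contract), it can be split into typed fields. `parse_cls_result` is a hypothetical helper name.

```python
def parse_cls_result(result):
    """Split '<label> <score>' into (label, score).

    The split is taken from the right so that labels containing
    spaces stay intact.
    """
    label, score_text = result.rsplit(" ", 1)
    return label, float(score_text)
```

Splitting from the right is a deliberate choice here, since some class names consist of more than one word.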
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Speaker Embedding Extraction
|
|
|
|
|
|
|
|
|
|
|
|
Outputs the extracted audio embedding vector
|
|
|
|
<details><summary> Industrial-grade speaker embedding extraction tool</summary>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Try it from the command line
|
|
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
```shell
|
|
|
|
paddlespeech vector --task spk --input zh.wav
|
|
|
|
paddlespeech vector --task spk --input zh.wav
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
One-step prediction with the Python API
|
|
|
|
|
|
|
|
|
|
|
|
```python
|
|
|
|
```python
|
|
|
|
from paddlespeech.cli.vector import VectorExecutor
|
|
|
|
>>> from paddlespeech.cli.vector import VectorExecutor
|
|
|
|
vec = VectorExecutor()
|
|
|
|
>>> vec = VectorExecutor()
|
|
|
|
result = vec(audio_file="zh.wav")
|
|
|
|
>>> result = vec(audio_file="zh.wav")
|
|
|
|
|
|
|
|
>>> print(result) # 187-dimensional vector
|
|
|
|
|
|
|
|
[ -0.19083306 9.474295 -14.122263 -2.0916545 0.04848729
|
|
|
|
|
|
|
|
4.9295826 1.4780062 0.3733844 10.695862 3.2697146
|
|
|
|
|
|
|
|
-4.48199 -0.6617882 -9.170393 -11.1568775 -1.2358263 ...]
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Punctuation Restoration**
|
|
|
|
</details>
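
A common way to compare two speaker embeddings is cosine similarity. The following is a minimal sketch that assumes the embeddings are available as equal-length sequences of floats. `cosine_similarity` is an illustrative helper written with the standard library only; it is not PaddleSpeech's own scoring function, and any decision threshold is left to the application.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values close to 1 indicate that two embeddings point in nearly the same direction, while values near 0 indicate little similarity.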
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Punctuation Restoration
|
|
|
|
|
|
|
|
|
|
|
|
Outputs text with punctuation added
|
|
|
|
<details><summary> One-step punctuation restoration for text; can be used together with ASR models</summary>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Try it from the command line
|
|
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
```shell
|
|
|
|
paddlespeech text --task punc --input 今天的天气真不错啊你下午有空吗我想约你一起去吃饭
|
|
|
|
paddlespeech text --task punc --input 今天的天气真不错啊你下午有空吗我想约你一起去吃饭
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
One-step prediction with the Python API
|
|
|
|
|
|
|
|
|
|
|
|
```python
|
|
|
|
```python
|
|
|
|
from paddlespeech.cli.text.infer import TextExecutor
|
|
|
|
>>> from paddlespeech.cli.text.infer import TextExecutor
|
|
|
|
text_punc = TextExecutor()
|
|
|
|
>>> text_punc = TextExecutor()
|
|
|
|
result = text_punc(text="今天的天气真不错啊你下午有空吗我想约你一起去吃饭")
|
|
|
|
>>> result = text_punc(text="今天的天气真不错啊你下午有空吗我想约你一起去吃饭")
|
|
|
|
|
|
|
|
今天的天气真不错啊!你下午有空吗?我想约你一起去吃饭。
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Speech Translation**
|
|
|
|
</details>
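
Once punctuation has been restored, the text can be split into sentences for downstream processing. The following is a minimal sketch, assuming sentence boundaries are marked by the full stop, exclamation mark, or question mark in either full-width or ASCII form. `split_sentences` is a hypothetical helper and not part of PaddleSpeech.

```python
import re

# Sentence-ending marks: full-width and ASCII variants.
_ENDINGS = "。!?!?"
_SENTENCE = re.compile("[^{0}]+[{0}]?".format(_ENDINGS))


def split_sentences(text):
    """Split punctuated text into sentences, keeping the ending mark."""
    return [piece.strip() for piece in _SENTENCE.findall(text) if piece.strip()]
```

Each returned sentence keeps its ending mark, so joining the pieces reproduces the punctuated text.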
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Speech Translation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<details><summary> End-to-end English-to-Chinese speech translation tool</summary>
|
|
|
|
|
|
|
|
|
|
|
|
Uses precompiled kaldi-related tools; only supported on Ubuntu
|
|
|
|
Uses precompiled kaldi-related tools; only supported on Ubuntu
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Try it from the command line
|
|
|
|
|
|
|
|
|
|
|
|
```shell
|
|
|
|
```shell
|
|
|
|
paddlespeech st --input en.wav
|
|
|
|
paddlespeech st --input en.wav
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
One-step prediction with the Python API
|
|
|
|
|
|
|
|
|
|
|
|
```python
|
|
|
|
```python
|
|
|
|
from paddlespeech.cli.st.infer import STExecutor
|
|
|
|
>>> from paddlespeech.cli.st.infer import STExecutor
|
|
|
|
st = STExecutor()
|
|
|
|
>>> st = STExecutor()
|
|
|
|
result = st(audio_file="en.wav")
|
|
|
|
>>> result = st(audio_file="en.wav")
|
|
|
|
|
|
|
|
['我 在 这栋 建筑 的 古老 门上 敲门 。']
|
|
|
|
```
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</details>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
For more command-line usage, see [demos](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos)
|
|
|
|
|
|
|
|
> Note: If you need to train or fine-tune a model, see [Speech Recognition](./docs/source/asr/quick_start.md) and [Text-to-Speech](./docs/source/tts/quick_start.md).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<a name="快速使用服务"></a>
|
|
|
|
<a name="快速使用服务"></a>
|
|
|
|
## Quick Start Server
|
|
|
|
## Quick Start Server
|
|
|
|
After installation, developers can quickly use the services from the command line.
|
|
|
|
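After installation, developers can start three services from the command line with a single command: speech recognition, text-to-speech, and audio classification.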
|
|
|
|
|
|
|
|
|
|
|
|
**Start the server**
|
|
|
|
**Start the server**
|
|
|
|
```shell
|
|
|
|
```shell
|
|
|
@ -587,118 +751,6 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声
|
|
|
|
|
|
|
|
|
|
|
|
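The text-to-speech module was originally called [Parakeet](https://github.com/PaddlePaddle/Parakeet) and has now been merged into this repository. If you are interested in academic research on this task, see the [TTS research overview](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/docs/source/tts#overview). In addition, the [model introduction](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/tts/models_introduction.md) is a good guide to the text-to-speech pipeline.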
|
|
|
|
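The text-to-speech module was originally called [Parakeet](https://github.com/PaddlePaddle/Parakeet) and has now been merged into this repository. If you are interested in academic research on this task, see the [TTS research overview](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/docs/source/tts#overview). In addition, the [model introduction](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/tts/models_introduction.md) is a good guide to the text-to-speech pipeline.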
|
|
|
|
|
|
|
|
|
|
|
|
## Demos
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
##### Speech Recognition
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<div align = "center">
|
|
|
|
|
|
|
|
<table style="width:100%">
|
|
|
|
|
|
|
|
<thead>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<th> Input Audio </th>
|
|
|
|
|
|
|
|
<th width="550"> Recognition Result </th>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
<td >I knocked at the door on the ancient side of the building.</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
<td>我认为跑步最重要的就是给我带来了身体健康。</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
##### Speech Translation (English to Chinese)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<div align = "center">
|
|
|
|
|
|
|
|
<table style="width:100%">
|
|
|
|
|
|
|
|
<thead>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<th> Input Audio </th>
|
|
|
|
|
|
|
|
<th width="550"> Translation Result </th>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
<td >我 在 这栋 建筑 的 古老 门上 敲门。</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
##### Text-to-Speech
|
|
|
|
|
|
|
|
<div align = "center">
|
|
|
|
|
|
|
|
<table style="width:100%">
|
|
|
|
|
|
|
|
<thead>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<th width="550">Input Text</th>
|
|
|
|
|
|
|
|
<th>Synthesized Audio</th>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td >Life was like a box of chocolates, you never know what you're gonna get.</td>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/Parakeet/docs/demos/tacotron2_ljspeech_waveflow_samples_0.2/sentence_1.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td >早上好,今天是2020/10/29,最低温度是-3°C。</td>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/Parakeet/docs/demos/parakeet_espnet_fs2_pwg_demo/tn_g2p/parakeet/001.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td >季姬寂,集鸡,鸡即棘鸡。棘鸡饥叽,季姬及箕稷济鸡。鸡既济,跻姬笈,季姬忌,急咭鸡,鸡急,继圾几,季姬急,即籍箕击鸡,箕疾击几伎,伎即齑,鸡叽集几基,季姬急极屐击鸡,鸡既殛,季姬激,即记《季姬击鸡记》。</td>
|
|
|
|
|
|
|
|
<td align = "center">
|
|
|
|
|
|
|
|
<a href="https://paddlespeech.bj.bcebos.com/Parakeet/docs/demos/jijiji.wav" rel="nofollow">
|
|
|
|
|
|
|
|
<img align="center" src="./docs/images/audio_icon.png" width="200" style="max-width: 100%;"></a><br>
|
|
|
|
|
|
|
|
</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
For more synthesized audio, see the [PaddleSpeech text-to-speech audio samples](https://paddlespeech.readthedocs.io/en/latest/tts/demo.html).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
##### Punctuation Restoration
|
|
|
|
|
|
|
|
<div align = "center">
|
|
|
|
|
|
|
|
<table style="width:100%">
|
|
|
|
|
|
|
|
<thead>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<th width="390"> Input Text </th>
|
|
|
|
|
|
|
|
<th width="390"> Output Text </th>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</thead>
|
|
|
|
|
|
|
|
<tbody>
|
|
|
|
|
|
|
|
<tr>
|
|
|
|
|
|
|
|
<td>今天的天气真不错啊你下午有空吗我想约你一起去吃饭</td>
|
|
|
|
|
|
|
|
<td>今天的天气真不错啊!你下午有空吗?我想约你一起去吃饭。</td>
|
|
|
|
|
|
|
|
</tr>
|
|
|
|
|
|
|
|
</tbody>
|
|
|
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## ⭐ Examples
|
|
|
|
## ⭐ Examples
|
|
|
|
- **[PaddleBoBo](https://github.com/JiehangXie/PaddleBoBo): uses PaddleSpeech's text-to-speech module to generate the voice of a virtual human.**
|
|
|
|
- **[PaddleBoBo](https://github.com/JiehangXie/PaddleBoBo): uses PaddleSpeech's text-to-speech module to generate the voice of a virtual human.**
|
|
|
|