**PaddleSpeech** is an open-source toolkit on the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform for a variety of critical tasks in speech and audio, with state-of-the-art and influential models.
### ⭐ Examples
- **[PaddleBoBo](https://github.com/JiehangXie/PaddleBoBo): Use PaddleSpeech TTS to generate virtual human voice.**
Four-day live course: an in-depth interpretation of PaddleSpeech!
**Courses videos and related materials: https://aistudio.baidu.com/aistudio/education/group/info/25130**
### Features
Via an easy-to-use, efficient, flexible, and scalable implementation, our vision is to empower both industrial applications and academic research, covering training, inference & testing modules, and the deployment process. To be more specific, this toolkit features:
- 📦 **Ease of Use**: low barriers to install; a [CLI](#quick-start), a [Server](#quick-start-server), and a [Streaming Server](#quick-start-streaming-server) are available to quick-start your journey.
- 🏆 **Align to the State-of-the-Art**: we provide high-speed and ultra-lightweight models, and also cutting-edge technology.
- 🏆 **Streaming ASR and TTS System**: we provide production-ready streaming ASR and streaming TTS systems.
- 💯 **Rule-based Chinese frontend**: our frontend contains Text Normalization and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt to the Chinese context (see the sketch after this list).
- 📦 **Varieties of Functions that Vitalize both Industrial and Academia**:
  - 🛎️ *Implementation of critical audio tasks*: this toolkit contains audio functions like Automatic Speech Recognition, Text-to-Speech Synthesis, Speaker Verification, Keyword Spotting, Audio Classification, and Speech Translation, etc.
- 🔬 *Integration of mainstream models and datasets*: the toolkit implements modules that participate in the whole pipeline of the speech tasks, and uses mainstream datasets like LibriSpeech, LJSpeech, AIShell, CSMSC, etc. See also [model list](#model-list) for more details.
- 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV).
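As an illustration of the rule-based frontend, the sketch below runs text normalization and G2P on a Chinese sentence. It assumes the `Frontend` class from `paddlespeech.t2s.frontend.zh_frontend` and its `get_phonemes` method; verify both against your installed version.

```python
# A minimal sketch of the rule-based Chinese frontend; the import path,
# constructor defaults, and get_phonemes signature are assumptions to
# verify against your PaddleSpeech version.
from paddlespeech.t2s.frontend.zh_frontend import Frontend

frontend = Frontend()
# Text normalization + G2P (polyphone and tone sandhi handled by rules)
phonemes = frontend.get_phonemes("我今天去超市花了120元。")
print(phonemes)
```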
- 👏🏻 2022.05.06: `Streaming ASR` with `Punctuation Restoration` and `Token Timestamp`.
- 👏🏻 2022.05.06: `Server` is available for `Speaker Verification`, and `Punctuation Restoration`.
- 👏🏻 2022.04.28: `Streaming Server` is available for `Automatic Speech Recognition` and `Text-to-Speech`.
- 👏🏻 2022.03.28: `Server` is available for `Audio Classification`, `Automatic Speech Recognition` and `Text-to-Speech`.
- 👏🏻 2022.03.28: `CLI` is available for `Speaker Verification`.
- 🤗 2021.12.14: [ASR](https://huggingface.co/spaces/KPatrick/PaddleSpeechASR) and [TTS](https://huggingface.co/spaces/KPatrick/PaddleSpeechTTS) Demos on Hugging Face Spaces are available!
- 👏🏻 2021.12.10: `CLI` is available for `Audio Classification`, `Automatic Speech Recognition`, `Speech Translation (English to Chinese)` and `Text-to-Speech`.
<!---
2021.12.14: We would like to have an online courses to introduce basics and research of speech, as well as code practice with `paddlespeech`. Please pay attention to our [Calendar](https://www.paddlepaddle.org.cn/live).
--->
### Community
- Scan the QR code below with your WeChat to join the official technical exchange group and get the bonus (more than 20 GB of learning materials, such as papers, code, and videos) and the live links of the lessons. We look forward to your participation.
We strongly recommend our users to install PaddleSpeech in **Linux** with *python>=3.7*.
Up to now, **Linux** supports the CLI for all our tasks; **Mac OSX** and **Windows** only support the PaddleSpeech CLI for Audio Classification, Speech-to-Text, and Text-to-Speech. To install `PaddleSpeech`, please see [installation](./docs/source/install.md).
For more information about server command lines, please see: [speech server demos](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/speech_server)
<aname="quickstartstreamingserver"></a>
## Quick Start Streaming Server
Developers can try the [streaming ASR](./demos/streaming_asr_server/README.md) and [streaming TTS](./demos/streaming_tts_server/README.md) servers.
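For a programmatic start, the demos also expose a Python `ServerExecutor` (the same API used later in this document). A minimal sketch, assuming you run it inside `PaddleSpeech/demos/streaming_asr_server` with that demo's config file:

```python
# Sketch: start the streaming ASR server from Python; the config file
# name is taken from the streaming ASR demo and is an assumption here.
from paddlespeech.server.bin.paddlespeech_server import ServerExecutor

server_executor = ServerExecutor()
server_executor(
    config_file="./conf/ws_conformer_wenetspeech_application.yaml",
    log_file="./log/paddlespeech.log")
```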
The Text-to-Speech module was originally called [Parakeet](https://github.com/PaddlePaddle/Parakeet) and has now been merged into this repository. If you are interested in academic research about this task, please see the [TTS research overview](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/docs/source/tts#overview). Also, [this document](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/tts/models_introduction.md) is a good guideline for the pipeline components.
To cite PaddleSpeech for research, please use the following format.
You are warmly welcome to submit questions in [discussions](https://github.com/PaddlePaddle/PaddleSpeech/discussions).
## Acknowledgement
- Many thanks to [yeyupiaoling](https://github.com/yeyupiaoling)/[PPASR](https://github.com/yeyupiaoling/PPASR)/[PaddlePaddle-DeepSpeech](https://github.com/yeyupiaoling/PaddlePaddle-DeepSpeech)/[VoiceprintRecognition-PaddlePaddle](https://github.com/yeyupiaoling/VoiceprintRecognition-PaddlePaddle)/[AudioClassification-PaddlePaddle](https://github.com/yeyupiaoling/AudioClassification-PaddlePaddle) for years of attention, constructive advice and great help.
- Many thanks to [mymagicpower](https://github.com/mymagicpower) for the Java implementation of ASR upon [short](https://github.com/mymagicpower/AIAS/tree/main/3_audio_sdks/asr_sdk) and [long](https://github.com/mymagicpower/AIAS/tree/main/3_audio_sdks/asr_long_audio_sdk) audio files.
- Many thanks to [JiehangXie](https://github.com/JiehangXie)/[PaddleBoBo](https://github.com/JiehangXie/PaddleBoBo) for developing Virtual Uploader(VUP)/Virtual YouTuber(VTuber) with PaddleSpeech TTS function.
ACS, or Audio Content Search, refers to the problem of getting keyword timestamps from automatically transcribed spoken language (speech-to-text).
This demo is an implementation of obtaining keyword timestamps in the text from a given audio file. It can be done with a single command or a few lines in Python using `PaddleSpeech`.
Currently, the search words in this demo are:
```
我
康
```
## Usage
### 1. Installation
See [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
You can choose either the medium or the hard way to install paddlespeech.
The dependencies are listed in `requirements.txt`; install them as follows:
```
pip install -r requirements.txt
```
### 2. Prepare Input File
The input of this demo should be a WAV file (`.wav`), and the sample rate must be the same as the model's.
Here are sample files for this demo that can be downloaded:
In some cases, we need to recognize specific rare words with high accuracy, e.g., address recognition in navigation apps. Customized ASR can solve those issues.
This demo is customized for expense accounts, which need to recognize rare addresses.
The scripts are in https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/speechx/examples/custom_asr
paddlespeech asr --model transformer_librispeech --lang en --input ./en.wav -v
# Chinese ASR + Punctuation Restoration
paddlespeech asr --input ./zh.wav -v | paddlespeech text --task punc -v
```
(If you don't want to see the log information, you can remove "-v". Besides, it doesn't matter if package `paddlespeech-ctcdecoders` is not found, this package is optional.)
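The same pipeline can also be driven from Python with the `paddlespeech.cli` executors. A minimal sketch, assuming `./zh.wav` is a 16 kHz Chinese speech file:

```python
# Chinese ASR followed by punctuation restoration, mirroring the CLI
# pipeline above.
from paddlespeech.cli.asr.infer import ASRExecutor
from paddlespeech.cli.text.infer import TextExecutor

asr = ASRExecutor()
raw_text = asr(audio_file="./zh.wav")      # speech -> unpunctuated text
punc = TextExecutor()
result = punc(text=raw_text, task="punc")  # restore punctuation
print(result)
```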
- `ckpt_path`: Model checkpoint. Use pretrained model when it is None. Default: `None`.
- `yes`: No additional parameters required. Once this parameter is set, it means accepting the program's requests by default, which includes transforming the audio sample rate. Default: `False`.
- `device`: Choose device to execute model inference. Default: default device of paddlepaddle in current environment.
This demo is an implementation of starting the voice service and accessing the service.
### 1. Installation
See [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.2.2** or above.
You can choose either the medium or the hard way to install paddlespeech.
### 2. Prepare config File
The configuration file can be found in `conf/application.yaml`.
Among them, `engine_list` indicates the speech engine that will be included in the service to be started, in the format of `<speech task>_<engine type>`.
At present, the speech tasks integrated by the service include: asr (speech recognition), tts (text to speech) and cls (audio classification).
Currently the engine type supports two forms: python and inference (Paddle Inference).
**Note:** If the service can be started normally in the container, but the client access IP is unreachable, you can try to replace the `host` address in the configuration file with the local IP address.
The input of ASR client demo should be a WAV file(`.wav`), and the sample rate must be the same as the model.
**Note:** The response time will be slightly longer when using the client for the first time.
- Command Line (Recommended)
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
``` bash
paddlespeech_client text --server_ip 127.0.0.1 --port 8090 --input "我认为跑步最重要的就是给我带来了身体健康"
```
Usage:
```bash
paddlespeech_client text --help
```
Arguments:
- `server_ip`: Server IP address. Default: 127.0.0.1
- `port`: Server port. Default: 8090
- `input`(required): Input text to get punctuation.
Output:
```bash
[2022-05-09 18:19:04,397] [ INFO] - The punc text: 我认为跑步最重要的就是给我带来了身体健康。
[2022-05-09 18:19:04,397] [ INFO] - Response time 0.092407 s.
```
- Python API
```python
from paddlespeech.server.bin.paddlespeech_client import TextClientExecutor
textclient_executor = TextClientExecutor()
res = textclient_executor(
input="我认为跑步最重要的就是给我带来了身体健康",
server_ip="127.0.0.1",
port=8090,)
print(res)
```
Output:
```bash
我认为跑步最重要的就是给我带来了身体健康。
```
## Models supported by the service
### ASR model
Get all models supported by the ASR service via `paddlespeech_server stats --task asr`, where static models can be used for inference with Paddle Inference.
### TTS model
Get all models supported by the TTS service via `paddlespeech_server stats --task tts`, where static models can be used for inference with Paddle Inference.
### CLS model
Get all models supported by the CLS service via `paddlespeech_server stats --task cls`, where static models can be used for inference with Paddle Inference.
### Vector model
Get all models supported by the Vector service via `paddlespeech_server stats --task vector`, where static models can be used for inference with Paddle Inference.
### Text model
Get all models supported by the Text service via `paddlespeech_server stats --task text`, where static models can be used for inference with Paddle Inference.
This demo is an implementation of starting the streaming speech service and accessing the service. It can be achieved with a single command using `paddlespeech_server` and `paddlespeech_client`, or with a few lines of code in Python.
The streaming ASR server only supports the `websocket` protocol; it does not support `http`.
## Usage
### 1. Installation
It is recommended to use **paddlepaddle 2.2.1** or above.
You can choose either the medium or the hard way to install paddlespeech.
### 2. Prepare config File
The configuration files can be found in `conf/ws_application.yaml` and `conf/ws_conformer_wenetspeech_application.yaml`.
At present, the models integrated by the service include: DeepSpeech2 and Conformer.
**Note:** The default deployment of the server is on the CPU device; it can be deployed on the GPU by modifying the `device` parameter in the service configuration file.
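- Command Line (Recommended)
The start command is not shown in this excerpt; the invocation below is an assumed one matching the config files named above.
``` bash
# in PaddleSpeech/demos/streaming_asr_server directory (assumed command)
paddlespeech_server start --config_file ./conf/ws_conformer_wenetspeech_application.yaml
```
Output:
``` bash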
/home/users/xiongxinlei/.conda/envs/paddlespeech/lib/python3.9/asyncio/base_events.py:1460: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
infos = await tasks.gather(*fs, loop=self)
/home/users/xiongxinlei/.conda/envs/paddlespeech/lib/python3.9/asyncio/base_events.py:1518: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
await tasks.sleep(0, loop=self)
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-04-21 15:52:21] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
- Python API
**Note:** The default deployment of the server is on the CPU device; it can be deployed on the GPU by modifying the `device` parameter in the service configuration file.
```python
# in PaddleSpeech/demos/streaming_asr_server directory
from paddlespeech.server.bin.paddlespeech_server import ServerExecutor

server_executor = ServerExecutor()
server_executor(
    config_file="./conf/ws_conformer_wenetspeech_application.yaml",
    log_file="./log/paddlespeech.log")
```

Output:
```bash
/home/users/xiongxinlei/.conda/envs/paddlespeech/lib/python3.9/asyncio/base_events.py:1460: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
infos = await tasks.gather(*fs, loop=self)
/home/users/xiongxinlei/.conda/envs/paddlespeech/lib/python3.9/asyncio/base_events.py:1518: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
await tasks.sleep(0, loop=self)
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-04-21 15:52:21] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-06 21:14:12,160] [ INFO] - asr websocket client finished
```
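The service can be accessed from Python through the demo's `ASROnlineClientExecutor`. A sketch follows; the audio file, IP, and port are placeholders for your deployment.

```python
# Sketch: access the streaming ASR service from Python.
from paddlespeech.server.bin.paddlespeech_client import ASROnlineClientExecutor

asrclient_executor = ASROnlineClientExecutor()
res = asrclient_executor(
    input="./zh.wav",        # 16 kHz WAV file to transcribe
    server_ip="127.0.0.1",
    port=8090,
    sample_rate=16000,
    lang="zh_cn",
    audio_format="wav")
print(res)
```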
## Punctuation service
### 1. Server usage
- Command Line
**Note:** The default deployment of the server is on the CPU device; it can be deployed on the GPU by modifying the `device` parameter in the service configuration file.
``` bash
# in PaddleSpeech/demos/streaming_asr_server directory, launch the punctuation service
# (assumed invocation; the config file name matches the Python API example below)
paddlespeech_server start --config_file ./conf/punc_application.yaml
```
Output:
``` bash
INFO: Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
[2022-05-02 17:59:34] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
```
- Python API
**Note:** The default deployment of the server is on the CPU device; it can be deployed on the GPU by modifying the `device` parameter in the service configuration file.
```python
# in PaddleSpeech/demos/streaming_asr_server directory
from paddlespeech.server.bin.paddlespeech_server import ServerExecutor
server_executor = ServerExecutor()
server_executor(
config_file="./conf/punc_application.yaml",
log_file="./log/paddlespeech.log")
```
Output:
```
[2022-05-02 18:09:02,542] [ INFO] - Create the TextEngine Instance
[2022-05-02 18:09:02,543] [ INFO] - Init the text engine
[2022-05-02 18:09:02,543] [ INFO] - Text Engine set the device: gpu:0
INFO: Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
[2022-05-02 18:09:10] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
```
### 2. Client usage
**Note:** The response time will be slightly longer when using the client for the first time.
- Command line:
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
```
paddlespeech_client text --server_ip 127.0.0.1 --port 8190 --input "我认为跑步最重要的就是给我带来了身体健康"
```
Output:
```
[2022-05-02 18:12:29,767] [ INFO] - The punc text: 我认为跑步最重要的就是给我带来了身体健康。
[2022-05-02 18:12:29,767] [ INFO] - Response time 0.096548 s.
```
- Python3 API
```python
from paddlespeech.server.bin.paddlespeech_client import TextClientExecutor
textclient_executor = TextClientExecutor()
res = textclient_executor(
input="我认为跑步最重要的就是给我带来了身体健康",
server_ip="127.0.0.1",
port=8190,)
print(res)
```
Output:
``` bash
我认为跑步最重要的就是给我带来了身体健康。
```
## Join streaming ASR and punctuation server
By default, each server is deployed on the CPU. Speech recognition and punctuation prediction can be deployed on different GPUs by modifying the `device` parameter in each service's configuration file.
We use two scripts, `streaming_asr_server.py` and `punc_server.py`, to launch the streaming speech recognition and punctuation prediction services respectively. The `websocket_client.py` script can call both services at the same time.
### 1. Start two servers
``` bash
# Note: streaming speech recognition and punctuation prediction are configured on different GPUs through their configuration files
bash server.sh
```
### 2. Call client
- Command line
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
``` bash
/home/users/xiongxinlei/.conda/envs/paddlespeech/lib/python3.9/asyncio/base_events.py:1460: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
  infos = await tasks.gather(*fs, loop=self)
/home/users/xiongxinlei/.conda/envs/paddlespeech/lib/python3.9/asyncio/base_events.py:1518: DeprecationWarning: The loop argument is deprecated since Python 3.8, and scheduled for removal in Python 3.10.
  await tasks.sleep(0, loop=self)
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-04-21 15:52:21] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
```
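From Python, both services can be called in one request. The sketch below extends the streaming client accordingly; the `punc_server_ip`/`punc_server_port` parameter names are assumptions modeled on the CLI's `--punc.server_ip` and `--punc.port` options and should be verified.

```python
# Sketch: streaming ASR plus punctuation restoration in one request;
# punc_server_ip / punc_server_port are assumed parameter names.
from paddlespeech.server.bin.paddlespeech_client import ASROnlineClientExecutor

asrclient_executor = ASROnlineClientExecutor()
res = asrclient_executor(
    input="./zh.wav",
    server_ip="127.0.0.1",
    port=8090,
    sample_rate=16000,
    lang="zh_cn",
    audio_format="wav",
    punc_server_ip="127.0.0.1",
    punc_server_port=8190)
print(res)
```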
This demo is an implementation of starting the streaming speech synthesis service and accessing the service.
### 1. Installation
See [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.2.2** or above.
You can choose either the medium or the hard way to install paddlespeech.
### 2. Prepare config File
The configuration file can be found in `conf/tts_online_application.yaml`.
- `protocol` indicates the network protocol used by the streaming TTS service. Currently, both **http and websocket** are supported.
- `engine_list` indicates the speech engine that will be included in the service to be started, in the format of `<speech task>_<engine type>`.
- This demo mainly introduces the streaming speech synthesis service, so the speech task should be set to `tts`.
- the engine type supports two forms: **online** and **online-onnx**. `online` indicates an engine that uses python for dynamic graph inference; `online-onnx` indicates an engine that uses onnxruntime for inference. The inference speed of online-onnx is faster.
- In streaming voc inference, one chunk of data is inferred at a time to achieve a streaming effect, where `voc_block` indicates the number of valid frames in the chunk and `voc_pad` indicates the number of frames added before and after voc_block in a chunk. voc_pad exists to eliminate errors caused by streaming inference and to avoid its influence on the quality of the synthesized audio.
- Both hifigan and mb_melgan support streaming voc inference.
- When the voc model is mb_melgan, with voc_pad=14 the streaming synthetic audio is consistent with the non-streaming synthetic audio; the minimum voc_pad can be set to 7, and the synthetic audio still sounds normal. If voc_pad is less than 7, the synthetic audio sounds abnormal.
- When the voc model is hifigan, with voc_pad=19 the streaming synthetic audio is consistent with the non-streaming synthetic audio; with voc_pad=14, the synthetic audio sounds normal.
- **Note:** If the service can be started normally in the container, but the client access IP is unreachable, you can try to replace the `host` address in the configuration file with the local IP address.
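To make the block/pad mechanics concrete, here is a small illustrative sketch (not PaddleSpeech's implementation): each chunk is inferred over a window padded on both sides, and only the valid `block` frames are kept from each chunk's output.

```python
# Illustrative block/pad chunking for streaming inference (a sketch, not
# the actual PaddleSpeech code).
def iter_chunks(num_frames: int, block: int, pad: int):
    """Yield (start, end, valid_start, valid_end) frame indices per chunk."""
    for valid_start in range(0, num_frames, block):
        valid_end = min(valid_start + block, num_frames)
        start = max(valid_start - pad, 0)        # left context
        end = min(valid_end + pad, num_frames)   # right context
        yield start, end, valid_start, valid_end

# Example: 100 mel frames with voc_block=36, voc_pad=14
for start, end, vs, ve in iter_chunks(100, block=36, pad=14):
    print(f"infer frames [{start}:{end}) -> keep output for [{vs}:{ve})")
```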
### 3. Streaming speech synthesis server and client using http protocol
#### 3.1 Server Usage
- Command Line (Recommended)
Start the service (the configuration file uses http by default):
- `sample_rate`: Sampling rate, choices: [0, 8000, 16000], the default is the same as the model. Default: 0
- `output`: Output wave filepath. Default: None, which means not to save the audio to the local.
- `play`: Whether to play audio, play while synthesizing, default value: False, which means not playing. **Playing audio needs to rely on the pyaudio library**.
- `spk_id`, `speed`, `volume`, and `sample_rate` do not currently take effect in the streaming speech synthesis service.
- `protocol`: Service protocol, choices: [http, websocket], default: http.
- `input`: (required): Input text to generate.
- `spk_id`: Speaker id for multi-speaker text to speech. Default: 0
- `speed`: Audio speed, the value should be set between 0 and 3. Default: 1.0
- `volume`: Audio volume, the value should be set between 0 and 3. Default: 1.0
- `sample_rate`: Sampling rate, choices: [0, 8000, 16000], the default is the same as the model. Default: 0
- `output`: Output wave filepath. Default: None, which means not to save the audio to the local.
- `play`: Whether to play audio, play while synthesizing, default value: False, which means not playing. **Playing audio needs to rely on the pyaudio library**.
- `spk_id`, `speed`, `volume`, and `sample_rate` do not currently take effect in the streaming speech synthesis service.
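Putting these arguments together, the sketch below drives the streaming TTS service through the demo's `TTSOnlineClientExecutor`; the port and input sentence are assumptions for your deployment.

```python
# Sketch: call the streaming TTS service via the Python API; port and
# input text are assumptions.
from paddlespeech.server.bin.paddlespeech_client import TTSOnlineClientExecutor

executor = TTSOnlineClientExecutor()
executor(
    input="您好，欢迎使用语音合成服务。",
    server_ip="127.0.0.1",
    port=8092,
    protocol="http",
    spk_id=0,
    output="./output.wav",
    play=False)
```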
The relevant chunk-related fields in `conf/tts_online_application.yaml` look like this:
```yaml
# The task format in the engine_list is: <speech task>_<engine type>
tts_online:
    device: 'cpu'  # set 'gpu:id' or 'cpu'
    # am_block and am_pad are only for the fastspeech2_cnndecoder_onnx model to do streaming am inference;
    # when am_pad is set to 12, the streaming synthetic audio is the same as the non-streaming synthetic audio
    am_block: 72
    am_pad: 12
    # voc_pad and voc_block are for the voc model to do streaming voc inference;
    # when the voc model is mb_melgan_csmsc and voc_pad is set to 14, the streaming synthetic audio is the same as the non-streaming synthetic audio; the minimum pad can be set to 7, and the streaming synthetic audio still sounds normal
    # when the voc model is hifigan_csmsc and voc_pad is set to 19, the streaming synthetic audio is the same as the non-streaming synthetic audio; with voc_pad set to 14, the streaming synthetic audio sounds normal
    voc_block: 36
    voc_pad: 14

tts_online-onnx:
    lang: 'zh'
    # am_block and am_pad are only for the fastspeech2_cnndecoder_onnx model to do streaming am inference;
    # when am_pad is set to 12, the streaming synthetic audio is the same as the non-streaming synthetic audio
    am_block: 72
    am_pad: 12
    # voc_pad and voc_block are for the voc model to do streaming voc inference;
    # when the voc model is mb_melgan_csmsc_onnx and voc_pad is set to 14, the streaming synthetic audio is the same as the non-streaming synthetic audio; the minimum pad can be set to 7, and the streaming synthetic audio still sounds normal
    # when the voc model is hifigan_csmsc_onnx and voc_pad is set to 19, the streaming synthetic audio is the same as the non-streaming synthetic audio; with voc_pad set to 14, the streaming synthetic audio sounds normal
    voc_block: 36
    voc_pad: 14
    # voc_upsample should be the same as n_shift in the voc config.
```
- [3.5 Customized Auto Speech Recognition and Deployment](#35)
- [4. Quick Start](#4)
<aname="1"></a>
## 1. Introduction
PP-ASR is a tool that provides ASR (Automatic Speech Recognition) functionality. It provides a variety of Chinese and English models, supports model training, and supports model inference using the command line. In addition, PP-ASR supports the deployment of streaming models and customized ASR.
<aname="2"></a>
## 2. Characteristics
The basic process of ASR is shown in the figure below:
The main characteristics of PP-ASR are shown below:
- Provides pre-trained models on Chinese/English open source datasets: aishell(Chinese), wenetspeech(Chinese) and librispeech(English). The models include deepspeech2 and conformer/transformer.
- Support model training on Chinese/English datasets.
- Support model inference using the command line. You can use `paddlespeech asr --model xxx --input xxx.wav` to run inference with a pre-trained model.
- Support deployment of a streaming ASR server. Besides the ASR function, the server supports the timestamp function.
- Support customized auto speech recognition and deployment.
<aname="3"></a>
## 3. Tutorials
<aname="31"></a>
## 3.1 Pre-trained Models
The supported pre-trained models are listed in [released_model](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/released_model.md).
The models with good performance are the Ds2 Online Wenetspeech ASR0 model and the Conformer Online Wenetspeech ASR1 model. Both models support streaming ASR.
For more information about model design, you can refer to the aistudio tutorial:
<a name="32"></a>
## 3.2 Model Training
The reference scripts for model training are stored in [examples](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples) and are organized as "examples/dataset/model". The datasets mainly supported are aishell and librispeech. The models supported are deepspeech2 and u2 (conformer/transformer).
The specific steps of executing the script are recorded in `run.sh`.
For more information, you can refer to [asr1](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell/asr1)
<aname="33"></a>
## 3.3 Inference
After installing `paddlespeech` via `pip install paddlespeech`, PP-ASR supports using `paddlespeech asr --model xxx --input xxx.wav` to run inference with pre-trained models.
Specific supported functions include:
- Prediction of a single audio file
- Using a pipe to predict multiple audio files
- Support for RTF calculation
For specific usage, please refer to: [speech_recognition](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_recognition/README_cn.md)
<aname="34"></a>
## 3.4 Service Deployment
PP-ASR supports service deployment of streaming ASR and supports using speech recognition and punctuation processing at the same time.
Demo of ASR Server: [streaming_asr_server](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/streaming_asr_server)
<a name="35"></a>
## 3.5 Customized Auto Speech Recognition and Deployment
For customized automatic speech recognition and deployment, PP-ASR provides a C++ pipeline of feature extraction (fbank) => inference model (scoring library) => TLG (WFST; token, lexicon, grammar). For specific usage, please refer to: [speechx](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/speechx)
If you want to try it quickly, you can refer to [custom_streaming_asr](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/custom_streaming_asr/README_cn.md).
For more information about customized auto speech recognition and deployment, you can refer to the aistudio tutorial:
- [Customized Auto Speech Recognition](https://aistudio.baidu.com/aistudio/projectdetail/4021561)
<aname="4"></a>
## 4. Quick Start
To use PP-ASR, see [install](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install_cn.md). It provides three methods to install `paddlespeech`: **Easy**, **Medium**, and **Hard**. If you want to experience the inference function of paddlespeech, you can use the **Easy** installation method.
Following this tutorial you can customize your dataset for the audio classification task by using `paddlespeech`.
A base class for classification datasets is `paddlespeech.audio.dataset.AudioClassificationDataset`. To customize your dataset, you should write a dataset class derived from `AudioClassificationDataset`.
Assuming you have some wave files stored in your own directory, you should prepare a meta file with the filepath and label information. For example, suppose its absolute path is `/PATH/TO/META_FILE.txt`.
Here is an example to build your custom dataset in `custom_dataset.py`:
```python
from paddlespeech.audio.datasets.dataset import AudioClassificationDataset
class CustomDataset(AudioClassificationDataset):
meta_file = '/PATH/TO/META_FILE.txt'
    # ... (the remaining members of the class are omitted in this excerpt)
```
Then you can build dataset and data loader from `CustomDataset`:
```python
import paddle
from paddlespeech.audio.features import LogMelSpectrogram

from custom_dataset import CustomDataset

# A minimal sketch (assumed usage): build the dataset and a data loader;
# LogMelSpectrogram can serve as the feature extractor during training.
train_ds = CustomDataset()
train_loader = paddle.io.DataLoader(train_ds, batch_size=4, shuffle=True)
```
| Way | Function | Support System |
|:---- |:---------------------------------- |:------------------------------- |
| Easy | (1) Use command-line functions of PaddleSpeech. <br> (2) Experience PaddleSpeech on AI Studio. | Linux, Mac (not support M1 chip), Windows (for more information about installation, see [#1195](https://github.com/PaddlePaddle/PaddleSpeech/discussions/1195)) |
| Medium | Support major functions, such as using the ready-made examples and using PaddleSpeech to train your own model. | Linux |
| Hard | Support full function of PaddleSpeech, including using the joint CTC decoder with kaldi, training an n-gram language model, Montreal-Forced-Aligner, and so on. And you are more able to be a developer! | Ubuntu |
To avoid the trouble of environment setup, running in a Docker container is highly recommended. Otherwise, if you work on `Ubuntu` with `root` privilege, you can still complete the installation.
### Choice 1: Running in Docker Container (Recommended)
Docker is an open-source tool to build, ship, and run distributed applications in an isolated environment. If you do not have a Docker environment, please refer to [Docker](https://www.docker.com/). If you will use GPU version, you also need to install [nvidia-docker](https://github.com/NVIDIA/nvidia-docker).
We provide docker images containing the latest PaddleSpeech code, and all environment and package dependencies are pre-installed. All you have to do is to **pull and run the docker image**. Then you can enjoy PaddleSpeech without any extra steps.
Now you can execute training, inference, and hyper-parameters tuning in Docker container.
Get these images and guidance from [Docker Hub](https://hub.docker.com/repository/docker/paddlecloud/paddlespeech), including CPU, GPU, and ROCm environment versions.
If you have customized requirements for automatically building docker images, you can find the details in the github repo [PaddlePaddle/PaddleCloud](https://github.com/PaddlePaddle/PaddleCloud/tree/main/tekton).
### Choice 2: Running in Ubuntu with Root Privilege
| Ernie Linear | IWLST2012_zh | [iwslt2012_punc0](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/iwslt2012/punc0) | [ernie_linear_p3_iwslt2012_zh_ckpt_0.1.1.zip](https://paddlespeech.bj.bcebos.com/text/ernie_linear_p3_iwslt2012_zh_ckpt_0.1.1.zip) |
## Speech Recognition Model from paddle 1.8
| Acoustic Model |Training Data| Token-based | Size | Descriptions | CER | WER | Hours of speech |
PP-TTS is a streaming speech synthesis system developed by PaddleSpeech. Based on the implementation of [SOTA Algorithms](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/released_model.md#text-to-speech-models), a faster inference engine is used to realize streaming speech synthesis technology to meet the needs of commercial speech interaction scenarios.
PP-TTS provides a Chinese streaming speech synthesis system based on FastSpeech2 and HiFiGAN by default:
- Text Frontend: The rule-based Chinese text frontend system is adopted to optimize Chinese text, covering text normalization, polyphone, and tone sandhi.
- Acoustic Model: The decoder of FastSpeech2 is improved so that it supports streaming synthesis.
- Vocoder: Streaming synthesis of the GAN vocoder is supported.
- Inference Engine: ONNXRuntime is used to optimize the inference of TTS models, so that the TTS system can achieve RTF < 1 on low-voltage CPUs, meeting the requirements of streaming synthesis.
<aname="2"></a>
## 2. Characteristics
- Open source leading Chinese TTS system
- Using ONNXRuntime to optimize the inference of TTS models
- The only open-source streaming TTS system
- Easy disassembly: Developers can easily replace different acoustic models and vocoders in different languages, use different inference engines (Paddle dynamic graph, PaddleInference, ONNXRuntime, etc.), and use different network services (HTTP, WebSocket)