Merge pull request #2189 from yt605155624/fix_name_bug

[doc]update server readme
pull/2200/head
TianYuan 2 years ago committed by GitHub
commit 663cfc0172
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -5,14 +5,19 @@
## Introduction
This demo is an implementation of starting the voice service and accessing the service. It can be achieved with a single command using `paddlespeech_server` and `paddlespeech_client` or a few lines of code in python.
For service interface definition, please check:
- [PaddleSpeech Server RESTful API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-RESTful-API)
## Usage
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.2.2** or above.
It is recommended to use **paddlepaddle 2.3.1** or above.
You can choose one way from easy, meduim and hard to install paddlespeech.
**If you install in simple mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
**If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
### 2. Prepare config File
The configuration file can be found in `conf/application.yaml` .
@ -47,7 +52,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `log_file`: log file. Default: ./log/paddlespeech.log
Output:
```bash
```text
[2022-02-23 11:17:32] [INFO] [server.py:64] Started server process [6384]
INFO: Waiting for application startup.
[2022-02-23 11:17:32] [INFO] [on.py:26] Waiting for application startup.
@ -55,7 +60,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
[2022-02-23 11:17:32] [INFO] [on.py:38] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-02-23 11:17:32] [INFO] [server.py:204] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
- Python API
@ -69,7 +73,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
Output:
```bash
```text
INFO: Started server process [529]
[2022-02-23 14:57:56] [INFO] [server.py:64] Started server process [529]
INFO: Waiting for application startup.
@ -78,7 +82,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
[2022-02-23 14:57:56] [INFO] [on.py:38] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-02-23 14:57:56] [INFO] [server.py:204] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
@ -106,7 +109,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `audio_format`: Audio format. Default: "wav".
Output:
```bash
```text
[2022-02-23 18:11:22,819] [ INFO] - {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'transcription': '我认为跑步最重要的就是给我带来了身体健康'}}
[2022-02-23 18:11:22,820] [ INFO] - time cost 0.689145 s.
@ -129,7 +132,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
Output:
```bash
```text
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'transcription': '我认为跑步最重要的就是给我带来了身体健康'}}
```
@ -158,12 +161,11 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `output`: Output wave filepath. Default: None, which means not to save the audio to the local.
Output:
```bash
```text
[2022-02-23 15:20:37,875] [ INFO] - {'description': 'success.'}
[2022-02-23 15:20:37,875] [ INFO] - Save synthesized audio successfully on output.wav.
[2022-02-23 15:20:37,875] [ INFO] - Audio duration: 3.612500 s.
[2022-02-23 15:20:37,875] [ INFO] - Response time: 0.348050 s.
```
- Python API
@ -189,11 +191,10 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
Output:
```bash
```text
{'description': 'success.'}
Save synthesized audio successfully on ./output.wav.
Audio duration: 3.612500 s.
```
### 6. CLS Client Usage
@ -202,7 +203,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
```
```bash
paddlespeech_client cls --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
```
@ -218,11 +219,9 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `topk`: topk scores of classification result.
Output:
```bash
```text
[2022-03-09 20:44:39,974] [ INFO] - {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'topk': 1, 'results': [{'class_name': 'Speech', 'prob': 0.9027184844017029}]}}
[2022-03-09 20:44:39,975] [ INFO] - Response time 0.104360 s.
```
- Python API
@ -240,9 +239,8 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
Output:
```bash
```text
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'topk': 1, 'results': [{'class_name': 'Speech', 'prob': 0.9027184844017029}]}}
```
@ -274,13 +272,13 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
Output:
```bash
[2022-05-25 12:25:36,165] [ INFO] - vector http client start
[2022-05-25 12:25:36,165] [ INFO] - the input audio: 85236145389.wav
[2022-05-25 12:25:36,165] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector
[2022-05-25 12:25:36,166] [ INFO] - http://127.0.0.1:8790/paddlespeech/vector
[2022-05-25 12:25:36,324] [ INFO] - The vector: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'vec': [-1.3251205682754517, 7.860682487487793, -4.620625972747803, 0.3000721037387848, 2.2648534774780273, -1.1931440830230713, 3.064713716506958, 7.673594951629639, -6.004472732543945, -12.024259567260742, -1.9496068954467773, 3.126953601837158, 1.6188379526138306, -7.638310432434082, -1.2299772500991821, -12.33833122253418, 2.1373026371002197, -5.395712375640869, 9.717328071594238, 5.675230503082275, 3.7805123329162598, 3.0597171783447266, 3.429692029953003, 8.9760103225708, 13.174124717712402, -0.5313228368759155, 8.942471504211426, 4.465109825134277, -4.426247596740723, -9.726503372192383, 8.399328231811523, 7.223917484283447, -7.435853958129883, 2.9441683292388916, -4.343039512634277, -13.886964797973633, -1.6346734762191772, -10.902740478515625, -5.311244964599609, 3.800722122192383, 3.897603750228882, -2.123077392578125, -2.3521194458007812, 4.151031017303467, -7.404866695404053, 0.13911646604537964, 2.4626107215881348, 4.96645450592041, 0.9897574186325073, 5.483975410461426, -3.3574001789093018, 10.13400650024414, -0.6120170950889587, -10.403095245361328, 4.600754261016846, 16.009349822998047, -7.78369140625, -4.194530487060547, -6.93686056137085, 1.1789555549621582, 11.490800857543945, 4.23802375793457, 9.550930976867676, 8.375045776367188, 7.508914470672607, -0.6570729613304138, -0.3005157709121704, 2.8406054973602295, 3.0828027725219727, 0.7308170199394226, 6.1483540534973145, 0.1376611888408661, -13.424735069274902, -7.746140480041504, -2.322798252105713, -8.305252075195312, 2.98791241645813, -10.99522876739502, 0.15211068093776703, -2.3820347785949707, -1.7984174489974976, 8.49562931060791, -5.852236747741699, -3.755497932434082, 0.6989710927009583, -5.270299434661865, -2.6188621520996094, -1.8828465938568115, -4.6466498374938965, 14.078543663024902, -0.5495333075523376, 10.579157829284668, -3.216050148010254, 9.349003791809082, -4.381077766418457, -11.675816535949707, -2.863020658493042, 4.5721755027771, 2.246612071990967, -4.574341773986816, 1.8610187768936157, 2.3767874240875244, 5.625787734985352, -9.784077644348145, 0.6496725678443909, -1.457950472831726, 0.4263263940811157, -4.921126365661621, -2.4547839164733887, 3.4869801998138428, -0.4265422224998474, 8.341268539428711, 1.356552004814148, 7.096688270568848, -13.102828979492188, 8.01673412322998, -7.115934371948242, 1.8699780702590942, 0.20872099697589874, 14.699383735656738, -1.0252779722213745, -2.6107232570648193, -2.5082311630249023, 8.427192687988281, 6.913852691650391, -6.29124641418457, 0.6157366037368774, 2.489687919616699, -3.4668266773223877, 9.92176342010498, 11.200815200805664, -0.19664029777050018, 7.491600513458252, -0.6231271624565125, -0.2584814429283142, -9.947997093200684, -0.9611040949821472, 1.1649218797683716, -2.1907122135162354, -1.502848744392395, -0.5192610621452332, 15.165953636169434, 2.4649462699890137, -0.998044490814209, 7.44166374206543, -2.0768048763275146, 3.5896823406219482, -7.305543422698975, -7.562084674835205, 4.32333517074585, 0.08044180274009705, -6.564010143280029, -2.314805269241333, -1.7642345428466797, -2.470881700515747, -7.6756181716918945, -9.548877716064453, -1.017755389213562, 0.1698644608259201, 2.5877134799957275, -1.8752295970916748, -0.36614322662353516, -6.049378395080566, -2.3965611457824707, -5.945338726043701, 0.9424033164978027, -13.155974388122559, -7.45780086517334, 0.14658108353614807, -3.7427968978881836, 5.841492652893066, -1.2872905731201172, 5.569431304931641, 12.570590019226074, 1.0939218997955322, 2.2142086029052734, 1.9181575775146484, 6.991420745849609, -5.888138771057129, 3.1409823894500732, -2.0036280155181885, 2.4434285163879395, 9.973138809204102, 5.036680221557617, 2.005120277404785, 2.861560344696045, 5.860223770141602, 2.917618751525879, -1.63111412525177, 2.0292205810546875, -4.070415019989014, -6.831437110900879]}}
[2022-05-25 12:25:36,324] [ INFO] - Response time 0.159053 s.
```text
[2022-05-25 12:25:36,165] [ INFO] - vector http client start
[2022-05-25 12:25:36,165] [ INFO] - the input audio: 85236145389.wav
[2022-05-25 12:25:36,165] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector
[2022-05-25 12:25:36,166] [ INFO] - http://127.0.0.1:8790/paddlespeech/vector
[2022-05-25 12:25:36,324] [ INFO] - The vector: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'vec': [-1.3251205682754517, 7.860682487487793, -4.620625972747803, 0.3000721037387848, 2.2648534774780273, -1.1931440830230713, 3.064713716506958, 7.673594951629639, -6.004472732543945, -12.024259567260742, -1.9496068954467773, 3.126953601837158, 1.6188379526138306, -7.638310432434082, -1.2299772500991821, -12.33833122253418, 2.1373026371002197, -5.395712375640869, 9.717328071594238, 5.675230503082275, 3.7805123329162598, 3.0597171783447266, 3.429692029953003, 8.9760103225708, 13.174124717712402, -0.5313228368759155, 8.942471504211426, 4.465109825134277, -4.426247596740723, -9.726503372192383, 8.399328231811523, 7.223917484283447, -7.435853958129883, 2.9441683292388916, -4.343039512634277, -13.886964797973633, -1.6346734762191772, -10.902740478515625, -5.311244964599609, 3.800722122192383, 3.897603750228882, -2.123077392578125, -2.3521194458007812, 4.151031017303467, -7.404866695404053, 0.13911646604537964, 2.4626107215881348, 4.96645450592041, 0.9897574186325073, 5.483975410461426, -3.3574001789093018, 10.13400650024414, -0.6120170950889587, -10.403095245361328, 4.600754261016846, 16.009349822998047, -7.78369140625, -4.194530487060547, -6.93686056137085, 1.1789555549621582, 11.490800857543945, 4.23802375793457, 9.550930976867676, 8.375045776367188, 7.508914470672607, -0.6570729613304138, -0.3005157709121704, 2.8406054973602295, 3.0828027725219727, 0.7308170199394226, 6.1483540534973145, 0.1376611888408661, -13.424735069274902, -7.746140480041504, -2.322798252105713, -8.305252075195312, 2.98791241645813, -10.99522876739502, 0.15211068093776703, -2.3820347785949707, -1.7984174489974976, 8.49562931060791, -5.852236747741699, -3.755497932434082, 0.6989710927009583, -5.270299434661865, -2.6188621520996094, -1.8828465938568115, -4.6466498374938965, 14.078543663024902, -0.5495333075523376, 10.579157829284668, -3.216050148010254, 9.349003791809082, -4.381077766418457, -11.675816535949707, -2.863020658493042, 4.5721755027771, 2.246612071990967, -4.574341773986816, 1.8610187768936157, 2.3767874240875244, 5.625787734985352, -9.784077644348145, 0.6496725678443909, -1.457950472831726, 0.4263263940811157, -4.921126365661621, -2.4547839164733887, 3.4869801998138428, -0.4265422224998474, 8.341268539428711, 1.356552004814148, 7.096688270568848, -13.102828979492188, 8.01673412322998, -7.115934371948242, 1.8699780702590942, 0.20872099697589874, 14.699383735656738, -1.0252779722213745, -2.6107232570648193, -2.5082311630249023, 8.427192687988281, 6.913852691650391, -6.29124641418457, 0.6157366037368774, 2.489687919616699, -3.4668266773223877, 9.92176342010498, 11.200815200805664, -0.19664029777050018, 7.491600513458252, -0.6231271624565125, -0.2584814429283142, -9.947997093200684, -0.9611040949821472, 1.1649218797683716, -2.1907122135162354, -1.502848744392395, -0.5192610621452332, 15.165953636169434, 2.4649462699890137, -0.998044490814209, 7.44166374206543, -2.0768048763275146, 3.5896823406219482, -7.305543422698975, -7.562084674835205, 4.32333517074585, 0.08044180274009705, -6.564010143280029, -2.314805269241333, -1.7642345428466797, -2.470881700515747, -7.6756181716918945, -9.548877716064453, -1.017755389213562, 0.1698644608259201, 2.5877134799957275, -1.8752295970916748, -0.36614322662353516, -6.049378395080566, -2.3965611457824707, -5.945338726043701, 0.9424033164978027, -13.155974388122559, -7.45780086517334, 0.14658108353614807, -3.7427968978881836, 5.841492652893066, -1.2872905731201172, 5.569431304931641, 12.570590019226074, 1.0939218997955322, 2.2142086029052734, 1.9181575775146484, 6.991420745849609, -5.888138771057129, 3.1409823894500732, -2.0036280155181885, 2.4434285163879395, 9.973138809204102, 5.036680221557617, 2.005120277404785, 2.861560344696045, 5.860223770141602, 2.917618751525879, -1.63111412525177, 2.0292205810546875, -4.070415019989014, -6.831437110900879]}}
[2022-05-25 12:25:36,324] [ INFO] - Response time 0.159053 s.
```
* Python API
@ -299,8 +297,8 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
Output:
``` bash
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'vec': [-1.3251205682754517, 7.860682487487793, -4.620625972747803, 0.3000721037387848, 2.2648534774780273, -1.1931440830230713, 3.064713716506958, 7.673594951629639, -6.004472732543945, -12.024259567260742, -1.9496068954467773, 3.126953601837158, 1.6188379526138306, -7.638310432434082, -1.2299772500991821, -12.33833122253418, 2.1373026371002197, -5.395712375640869, 9.717328071594238, 5.675230503082275, 3.7805123329162598, 3.0597171783447266, 3.429692029953003, 8.9760103225708, 13.174124717712402, -0.5313228368759155, 8.942471504211426, 4.465109825134277, -4.426247596740723, -9.726503372192383, 8.399328231811523, 7.223917484283447, -7.435853958129883, 2.9441683292388916, -4.343039512634277, -13.886964797973633, -1.6346734762191772, -10.902740478515625, -5.311244964599609, 3.800722122192383, 3.897603750228882, -2.123077392578125, -2.3521194458007812, 4.151031017303467, -7.404866695404053, 0.13911646604537964, 2.4626107215881348, 4.96645450592041, 0.9897574186325073, 5.483975410461426, -3.3574001789093018, 10.13400650024414, -0.6120170950889587, -10.403095245361328, 4.600754261016846, 16.009349822998047, -7.78369140625, -4.194530487060547, -6.93686056137085, 1.1789555549621582, 11.490800857543945, 4.23802375793457, 9.550930976867676, 8.375045776367188, 7.508914470672607, -0.6570729613304138, -0.3005157709121704, 2.8406054973602295, 3.0828027725219727, 0.7308170199394226, 6.1483540534973145, 0.1376611888408661, -13.424735069274902, -7.746140480041504, -2.322798252105713, -8.305252075195312, 2.98791241645813, -10.99522876739502, 0.15211068093776703, -2.3820347785949707, -1.7984174489974976, 8.49562931060791, -5.852236747741699, -3.755497932434082, 0.6989710927009583, -5.270299434661865, -2.6188621520996094, -1.8828465938568115, -4.6466498374938965, 14.078543663024902, -0.5495333075523376, 10.579157829284668, -3.216050148010254, 9.349003791809082, -4.381077766418457, -11.675816535949707, -2.863020658493042, 4.5721755027771, 2.246612071990967, -4.574341773986816, 1.8610187768936157, 2.3767874240875244, 5.625787734985352, -9.784077644348145, 0.6496725678443909, -1.457950472831726, 0.4263263940811157, -4.921126365661621, -2.4547839164733887, 3.4869801998138428, -0.4265422224998474, 8.341268539428711, 1.356552004814148, 7.096688270568848, -13.102828979492188, 8.01673412322998, -7.115934371948242, 1.8699780702590942, 0.20872099697589874, 14.699383735656738, -1.0252779722213745, -2.6107232570648193, -2.5082311630249023, 8.427192687988281, 6.913852691650391, -6.29124641418457, 0.6157366037368774, 2.489687919616699, -3.4668266773223877, 9.92176342010498, 11.200815200805664, -0.19664029777050018, 7.491600513458252, -0.6231271624565125, -0.2584814429283142, -9.947997093200684, -0.9611040949821472, 1.1649218797683716, -2.1907122135162354, -1.502848744392395, -0.5192610621452332, 15.165953636169434, 2.4649462699890137, -0.998044490814209, 7.44166374206543, -2.0768048763275146, 3.5896823406219482, -7.305543422698975, -7.562084674835205, 4.32333517074585, 0.08044180274009705, -6.564010143280029, -2.314805269241333, -1.7642345428466797, -2.470881700515747, -7.6756181716918945, -9.548877716064453, -1.017755389213562, 0.1698644608259201, 2.5877134799957275, -1.8752295970916748, -0.36614322662353516, -6.049378395080566, -2.3965611457824707, -5.945338726043701, 0.9424033164978027, -13.155974388122559, -7.45780086517334, 0.14658108353614807, -3.7427968978881836, 5.841492652893066, -1.2872905731201172, 5.569431304931641, 12.570590019226074, 1.0939218997955322, 2.2142086029052734, 1.9181575775146484, 6.991420745849609, -5.888138771057129, 3.1409823894500732, -2.0036280155181885, 2.4434285163879395, 9.973138809204102, 5.036680221557617, 2.005120277404785, 2.861560344696045, 5.860223770141602, 2.917618751525879, -1.63111412525177, 2.0292205810546875, -4.070415019989014, -6.831437110900879]}}
```text
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'vec': [-1.3251205682754517, 7.860682487487793, -4.620625972747803, 0.3000721037387848, 2.2648534774780273, -1.1931440830230713, 3.064713716506958, 7.673594951629639, -6.004472732543945, -12.024259567260742, -1.9496068954467773, 3.126953601837158, 1.6188379526138306, -7.638310432434082, -1.2299772500991821, -12.33833122253418, 2.1373026371002197, -5.395712375640869, 9.717328071594238, 5.675230503082275, 3.7805123329162598, 3.0597171783447266, 3.429692029953003, 8.9760103225708, 13.174124717712402, -0.5313228368759155, 8.942471504211426, 4.465109825134277, -4.426247596740723, -9.726503372192383, 8.399328231811523, 7.223917484283447, -7.435853958129883, 2.9441683292388916, -4.343039512634277, -13.886964797973633, -1.6346734762191772, -10.902740478515625, -5.311244964599609, 3.800722122192383, 3.897603750228882, -2.123077392578125, -2.3521194458007812, 4.151031017303467, -7.404866695404053, 0.13911646604537964, 2.4626107215881348, 4.96645450592041, 0.9897574186325073, 5.483975410461426, -3.3574001789093018, 10.13400650024414, -0.6120170950889587, -10.403095245361328, 4.600754261016846, 16.009349822998047, -7.78369140625, -4.194530487060547, -6.93686056137085, 1.1789555549621582, 11.490800857543945, 4.23802375793457, 9.550930976867676, 8.375045776367188, 7.508914470672607, -0.6570729613304138, -0.3005157709121704, 2.8406054973602295, 3.0828027725219727, 0.7308170199394226, 6.1483540534973145, 0.1376611888408661, -13.424735069274902, -7.746140480041504, -2.322798252105713, -8.305252075195312, 2.98791241645813, -10.99522876739502, 0.15211068093776703, -2.3820347785949707, -1.7984174489974976, 8.49562931060791, -5.852236747741699, -3.755497932434082, 0.6989710927009583, -5.270299434661865, -2.6188621520996094, -1.8828465938568115, -4.6466498374938965, 14.078543663024902, -0.5495333075523376, 10.579157829284668, -3.216050148010254, 9.349003791809082, -4.381077766418457, -11.675816535949707, -2.863020658493042, 4.5721755027771, 2.246612071990967, -4.574341773986816, 1.8610187768936157, 2.3767874240875244, 5.625787734985352, -9.784077644348145, 0.6496725678443909, -1.457950472831726, 0.4263263940811157, -4.921126365661621, -2.4547839164733887, 3.4869801998138428, -0.4265422224998474, 8.341268539428711, 1.356552004814148, 7.096688270568848, -13.102828979492188, 8.01673412322998, -7.115934371948242, 1.8699780702590942, 0.20872099697589874, 14.699383735656738, -1.0252779722213745, -2.6107232570648193, -2.5082311630249023, 8.427192687988281, 6.913852691650391, -6.29124641418457, 0.6157366037368774, 2.489687919616699, -3.4668266773223877, 9.92176342010498, 11.200815200805664, -0.19664029777050018, 7.491600513458252, -0.6231271624565125, -0.2584814429283142, -9.947997093200684, -0.9611040949821472, 1.1649218797683716, -2.1907122135162354, -1.502848744392395, -0.5192610621452332, 15.165953636169434, 2.4649462699890137, -0.998044490814209, 7.44166374206543, -2.0768048763275146, 3.5896823406219482, -7.305543422698975, -7.562084674835205, 4.32333517074585, 0.08044180274009705, -6.564010143280029, -2.314805269241333, -1.7642345428466797, -2.470881700515747, -7.6756181716918945, -9.548877716064453, -1.017755389213562, 0.1698644608259201, 2.5877134799957275, -1.8752295970916748, -0.36614322662353516, -6.049378395080566, -2.3965611457824707, -5.945338726043701, 0.9424033164978027, -13.155974388122559, -7.45780086517334, 0.14658108353614807, -3.7427968978881836, 5.841492652893066, -1.2872905731201172, 5.569431304931641, 12.570590019226074, 1.0939218997955322, 2.2142086029052734, 1.9181575775146484, 6.991420745849609, -5.888138771057129, 3.1409823894500732, -2.0036280155181885, 2.4434285163879395, 9.973138809204102, 5.036680221557617, 2.005120277404785, 2.861560344696045, 5.860223770141602, 2.917618751525879, -1.63111412525177, 2.0292205810546875, -4.070415019989014, -6.831437110900879]}}
```
#### 7.2 Get the score between speaker audio embedding
@ -331,13 +329,13 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
Output:
``` bash
[2022-05-25 12:33:24,527] [ INFO] - vector score http client start
[2022-05-25 12:33:24,527] [ INFO] - enroll audio: 85236145389.wav, test audio: 123456789.wav
[2022-05-25 12:33:24,528] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector/score
[2022-05-25 12:33:24,695] [ INFO] - The vector score is: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
[2022-05-25 12:33:24,696] [ INFO] - The vector: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
[2022-05-25 12:33:24,696] [ INFO] - Response time 0.168271 s.
```text
[2022-05-25 12:33:24,527] [ INFO] - vector score http client start
[2022-05-25 12:33:24,527] [ INFO] - enroll audio: 85236145389.wav, test audio: 123456789.wav
[2022-05-25 12:33:24,528] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector/score
[2022-05-25 12:33:24,695] [ INFO] - The vector score is: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
[2022-05-25 12:33:24,696] [ INFO] - The vector: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
[2022-05-25 12:33:24,696] [ INFO] - Response time 0.168271 s.
```
* Python API
@ -358,7 +356,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
Output:
``` bash
```text
[2022-05-25 12:30:14,143] [ INFO] - vector score http client start
[2022-05-25 12:30:14,143] [ INFO] - enroll audio: 85236145389.wav, test audio: 123456789.wav
[2022-05-25 12:30:14,143] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector/score
@ -389,9 +387,9 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `input`(required): Input text to get punctuation.
Output:
```bash
[2022-05-09 18:19:04,397] [ INFO] - The punc text: 我认为跑步最重要的就是给我带来了身体健康。
[2022-05-09 18:19:04,397] [ INFO] - Response time 0.092407 s.
```text
[2022-05-09 18:19:04,397] [ INFO] - The punc text: 我认为跑步最重要的就是给我带来了身体健康。
[2022-05-09 18:19:04,397] [ INFO] - Response time 0.092407 s.
```
- Python API
@ -408,11 +406,10 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
Output:
```bash
```text
我认为跑步最重要的就是给我带来了身体健康。
```
## Models supported by the service
### ASR model
Get all models supported by the ASR service via `paddlespeech_server stats --task asr`, where static models can be used for paddle inference inference.

@ -3,22 +3,29 @@
# 语音服务
## 介绍
这个 demo 是一个启动离线语音服务和访问服务的实现。它可以通过使用`paddlespeech_server` 和 `paddlespeech_client` 的单个命令或 python 的几行代码来实现。
这个 demo 是一个启动离线语音服务和访问服务的实现。它可以通过使用 `paddlespeech_server``paddlespeech_client` 的单个命令或 python 的几行代码来实现。
服务接口定义请参考:
- [PaddleSpeech Server RESTful API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-RESTful-API)
## 使用方法
### 1. 安装
请看 [安装文档](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
推荐使用 **paddlepaddle 2.2.2** 或以上版本。
你可以从简单,中等,困难几种方式中选择一种方式安装 PaddleSpeech。
**如果使用简单模式安装,需要自行准备 yaml 文件,可参考 conf 目录下的 yaml 文件。**
推荐使用 **paddlepaddle 2.3.1** 或以上版本。
你可以从简单,中等,困难 几种方式中选择一种方式安装 PaddleSpeech。
**如果使用简单模式安装,需要自行准备 yaml 文件,可参考 conf 目录下的 yaml 文件。**
### 2. 准备配置文件
配置文件可参见 `conf/application.yaml`
其中,`engine_list`表示即将启动的服务将会包含的语音引擎,格式为 <语音任务>_<引擎类型>。
目前服务集成的语音任务有: asr(语音识别)、tts(语音合成)、cls(音频分类)、vector(声纹识别)以及 text(文本处理)。
其中,`engine_list` 表示即将启动的服务将会包含的语音引擎,格式为 <语音任务>_<引擎类型>。
目前服务集成的语音任务有: asr (语音识别)、tts (语音合成)、cls (音频分类)、vector (声纹识别)以及 text (文本处理)。
目前引擎类型支持两种形式python 及 inference (Paddle Inference)
**注意:** 如果在容器里可正常启动服务,但客户端访问 ip 不可达,可尝试将配置文件中 `host` 地址换成本地 ip 地址。
@ -48,7 +55,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `log_file`: log 文件. 默认:./log/paddlespeech.log
输出:
```bash
```text
[2022-02-23 11:17:32] [INFO] [server.py:64] Started server process [6384]
INFO: Waiting for application startup.
[2022-02-23 11:17:32] [INFO] [on.py:26] Waiting for application startup.
@ -56,7 +63,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
[2022-02-23 11:17:32] [INFO] [on.py:38] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-02-23 11:17:32] [INFO] [server.py:204] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
- Python API
@ -70,7 +76,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
输出:
```bash
```text
INFO: Started server process [529]
[2022-02-23 14:57:56] [INFO] [server.py:64] Started server process [529]
INFO: Waiting for application startup.
@ -79,7 +85,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
[2022-02-23 14:57:56] [INFO] [on.py:38] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-02-23 14:57:56] [INFO] [server.py:204] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
### 4. ASR 客户端使用方法
@ -108,8 +113,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `audio_format`: 音频格式默认值wav。
输出:
```bash
```text
[2022-02-23 18:11:22,819] [ INFO] - {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'transcription': '我认为跑步最重要的就是给我带来了身体健康'}}
[2022-02-23 18:11:22,820] [ INFO] - time cost 0.689145 s.
```
@ -131,9 +135,8 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
输出:
```bash
```text
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'transcription': '我认为跑步最重要的就是给我带来了身体健康'}}
```
### 5. TTS 客户端使用方法
@ -162,7 +165,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `output`: 输出音频的路径, 默认值None表示不保存音频到本地。
输出:
```bash
```text
[2022-02-23 15:20:37,875] [ INFO] - {'description': 'success.'}
[2022-02-23 15:20:37,875] [ INFO] - Save synthesized audio successfully on output.wav.
[2022-02-23 15:20:37,875] [ INFO] - Audio duration: 3.612500 s.
@ -192,11 +195,10 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
输出:
```bash
```text
{'description': 'success.'}
Save synthesized audio successfully on ./output.wav.
Audio duration: 3.612500 s.
```
### 6. CLS 客户端使用方法
@ -207,7 +209,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
`127.0.0.1` 不能访问,则需要使用实际服务 IP 地址
```
```bash
paddlespeech_client cls --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
```
@ -223,11 +225,9 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `topk`: 分类结果的topk。
输出:
```bash
```text
[2022-03-09 20:44:39,974] [ INFO] - {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'topk': 1, 'results': [{'class_name': 'Speech', 'prob': 0.9027184844017029}]}}
[2022-03-09 20:44:39,975] [ INFO] - Response time 0.104360 s.
```
- Python API
@ -242,13 +242,11 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
port=8090,
topk=1)
print(res.json())
```
输出:
```bash
```text
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'topk': 1, 'results': [{'class_name': 'Speech', 'prob': 0.9027184844017029}]}}
```
### 7. 声纹客户端使用方法
@ -259,7 +257,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
`127.0.0.1` 不能访问,则需要使用实际服务 IP 地址
``` bash
```bash
paddlespeech_client vector --task spk --server_ip 127.0.0.1 --port 8090 --input 85236145389.wav
```
@ -275,15 +273,15 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
* task: vector 的任务可选spk或者score。默认是 spk。
* enroll: 注册音频;。
* test: 测试音频。
输出:
``` bash
[2022-05-25 12:25:36,165] [ INFO] - vector http client start
[2022-05-25 12:25:36,165] [ INFO] - the input audio: 85236145389.wav
[2022-05-25 12:25:36,165] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector
[2022-05-25 12:25:36,166] [ INFO] - http://127.0.0.1:8790/paddlespeech/vector
[2022-05-25 12:25:36,324] [ INFO] - The vector: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'vec': [-1.3251205682754517, 7.860682487487793, -4.620625972747803, 0.3000721037387848, 2.2648534774780273, -1.1931440830230713, 3.064713716506958, 7.673594951629639, -6.004472732543945, -12.024259567260742, -1.9496068954467773, 3.126953601837158, 1.6188379526138306, -7.638310432434082, -1.2299772500991821, -12.33833122253418, 2.1373026371002197, -5.395712375640869, 9.717328071594238, 5.675230503082275, 3.7805123329162598, 3.0597171783447266, 3.429692029953003, 8.9760103225708, 13.174124717712402, -0.5313228368759155, 8.942471504211426, 4.465109825134277, -4.426247596740723, -9.726503372192383, 8.399328231811523, 7.223917484283447, -7.435853958129883, 2.9441683292388916, -4.343039512634277, -13.886964797973633, -1.6346734762191772, -10.902740478515625, -5.311244964599609, 3.800722122192383, 3.897603750228882, -2.123077392578125, -2.3521194458007812, 4.151031017303467, -7.404866695404053, 0.13911646604537964, 2.4626107215881348, 4.96645450592041, 0.9897574186325073, 5.483975410461426, -3.3574001789093018, 10.13400650024414, -0.6120170950889587, -10.403095245361328, 4.600754261016846, 16.009349822998047, -7.78369140625, -4.194530487060547, -6.93686056137085, 1.1789555549621582, 11.490800857543945, 4.23802375793457, 9.550930976867676, 8.375045776367188, 7.508914470672607, -0.6570729613304138, -0.3005157709121704, 2.8406054973602295, 3.0828027725219727, 0.7308170199394226, 6.1483540534973145, 0.1376611888408661, -13.424735069274902, -7.746140480041504, -2.322798252105713, -8.305252075195312, 2.98791241645813, -10.99522876739502, 0.15211068093776703, -2.3820347785949707, -1.7984174489974976, 8.49562931060791, -5.852236747741699, -3.755497932434082, 0.6989710927009583, -5.270299434661865, -2.6188621520996094, -1.8828465938568115, -4.6466498374938965, 14.078543663024902, -0.5495333075523376, 10.579157829284668, -3.216050148010254, 9.349003791809082, -4.381077766418457, -11.675816535949707, -2.863020658493042, 4.5721755027771, 2.246612071990967, -4.574341773986816, 1.8610187768936157, 2.3767874240875244, 5.625787734985352, -9.784077644348145, 0.6496725678443909, -1.457950472831726, 0.4263263940811157, -4.921126365661621, -2.4547839164733887, 3.4869801998138428, -0.4265422224998474, 8.341268539428711, 1.356552004814148, 7.096688270568848, -13.102828979492188, 8.01673412322998, -7.115934371948242, 1.8699780702590942, 0.20872099697589874, 14.699383735656738, -1.0252779722213745, -2.6107232570648193, -2.5082311630249023, 8.427192687988281, 6.913852691650391, -6.29124641418457, 0.6157366037368774, 2.489687919616699, -3.4668266773223877, 9.92176342010498, 11.200815200805664, -0.19664029777050018, 7.491600513458252, -0.6231271624565125, -0.2584814429283142, -9.947997093200684, -0.9611040949821472, 1.1649218797683716, -2.1907122135162354, -1.502848744392395, -0.5192610621452332, 15.165953636169434, 2.4649462699890137, -0.998044490814209, 7.44166374206543, -2.0768048763275146, 3.5896823406219482, -7.305543422698975, -7.562084674835205, 4.32333517074585, 0.08044180274009705, -6.564010143280029, -2.314805269241333, -1.7642345428466797, -2.470881700515747, -7.6756181716918945, -9.548877716064453, -1.017755389213562, 0.1698644608259201, 2.5877134799957275, -1.8752295970916748, -0.36614322662353516, -6.049378395080566, -2.3965611457824707, -5.945338726043701, 0.9424033164978027, -13.155974388122559, -7.45780086517334, 0.14658108353614807, -3.7427968978881836, 5.841492652893066, -1.2872905731201172, 5.569431304931641, 12.570590019226074, 1.0939218997955322, 2.2142086029052734, 1.9181575775146484, 6.991420745849609, -5.888138771057129, 3.1409823894500732, -2.0036280155181885, 2.4434285163879395, 9.973138809204102, 5.036680221557617, 2.005120277404785, 2.861560344696045, 5.860223770141602, 2.917618751525879, -1.63111412525177, 2.0292205810546875, -4.070415019989014, -6.831437110900879]}}
[2022-05-25 12:25:36,324] [ INFO] - Response time 0.159053 s.
输出:
```text
[2022-05-25 12:25:36,165] [ INFO] - vector http client start
[2022-05-25 12:25:36,165] [ INFO] - the input audio: 85236145389.wav
[2022-05-25 12:25:36,165] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector
[2022-05-25 12:25:36,166] [ INFO] - http://127.0.0.1:8790/paddlespeech/vector
[2022-05-25 12:25:36,324] [ INFO] - The vector: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'vec': [-1.3251205682754517, 7.860682487487793, -4.620625972747803, 0.3000721037387848, 2.2648534774780273, -1.1931440830230713, 3.064713716506958, 7.673594951629639, -6.004472732543945, -12.024259567260742, -1.9496068954467773, 3.126953601837158, 1.6188379526138306, -7.638310432434082, -1.2299772500991821, -12.33833122253418, 2.1373026371002197, -5.395712375640869, 9.717328071594238, 5.675230503082275, 3.7805123329162598, 3.0597171783447266, 3.429692029953003, 8.9760103225708, 13.174124717712402, -0.5313228368759155, 8.942471504211426, 4.465109825134277, -4.426247596740723, -9.726503372192383, 8.399328231811523, 7.223917484283447, -7.435853958129883, 2.9441683292388916, -4.343039512634277, -13.886964797973633, -1.6346734762191772, -10.902740478515625, -5.311244964599609, 3.800722122192383, 3.897603750228882, -2.123077392578125, -2.3521194458007812, 4.151031017303467, -7.404866695404053, 0.13911646604537964, 2.4626107215881348, 4.96645450592041, 0.9897574186325073, 5.483975410461426, -3.3574001789093018, 10.13400650024414, -0.6120170950889587, -10.403095245361328, 4.600754261016846, 16.009349822998047, -7.78369140625, -4.194530487060547, -6.93686056137085, 1.1789555549621582, 11.490800857543945, 4.23802375793457, 9.550930976867676, 8.375045776367188, 7.508914470672607, -0.6570729613304138, -0.3005157709121704, 2.8406054973602295, 3.0828027725219727, 0.7308170199394226, 6.1483540534973145, 0.1376611888408661, -13.424735069274902, -7.746140480041504, -2.322798252105713, -8.305252075195312, 2.98791241645813, -10.99522876739502, 0.15211068093776703, -2.3820347785949707, -1.7984174489974976, 8.49562931060791, -5.852236747741699, -3.755497932434082, 0.6989710927009583, -5.270299434661865, -2.6188621520996094, -1.8828465938568115, -4.6466498374938965, 14.078543663024902, -0.5495333075523376, 10.579157829284668, -3.216050148010254, 9.349003791809082, -4.381077766418457, -11.675816535949707, -2.863020658493042, 4.5721755027771, 2.246612071990967, -4.574341773986816, 1.8610187768936157, 2.3767874240875244, 5.625787734985352, -9.784077644348145, 0.6496725678443909, -1.457950472831726, 0.4263263940811157, -4.921126365661621, -2.4547839164733887, 3.4869801998138428, -0.4265422224998474, 8.341268539428711, 1.356552004814148, 7.096688270568848, -13.102828979492188, 8.01673412322998, -7.115934371948242, 1.8699780702590942, 0.20872099697589874, 14.699383735656738, -1.0252779722213745, -2.6107232570648193, -2.5082311630249023, 8.427192687988281, 6.913852691650391, -6.29124641418457, 0.6157366037368774, 2.489687919616699, -3.4668266773223877, 9.92176342010498, 11.200815200805664, -0.19664029777050018, 7.491600513458252, -0.6231271624565125, -0.2584814429283142, -9.947997093200684, -0.9611040949821472, 1.1649218797683716, -2.1907122135162354, -1.502848744392395, -0.5192610621452332, 15.165953636169434, 2.4649462699890137, -0.998044490814209, 7.44166374206543, -2.0768048763275146, 3.5896823406219482, -7.305543422698975, -7.562084674835205, 4.32333517074585, 0.08044180274009705, -6.564010143280029, -2.314805269241333, -1.7642345428466797, -2.470881700515747, -7.6756181716918945, -9.548877716064453, -1.017755389213562, 0.1698644608259201, 2.5877134799957275, -1.8752295970916748, -0.36614322662353516, -6.049378395080566, -2.3965611457824707, -5.945338726043701, 0.9424033164978027, -13.155974388122559, -7.45780086517334, 0.14658108353614807, -3.7427968978881836, 5.841492652893066, -1.2872905731201172, 5.569431304931641, 12.570590019226074, 1.0939218997955322, 2.2142086029052734, 1.9181575775146484, 6.991420745849609, -5.888138771057129, 3.1409823894500732, -2.0036280155181885, 2.4434285163879395, 9.973138809204102, 5.036680221557617, 2.005120277404785, 2.861560344696045, 5.860223770141602, 2.917618751525879, -1.63111412525177, 2.0292205810546875, -4.070415019989014, -6.831437110900879]}}
[2022-05-25 12:25:36,324] [ INFO] - Response time 0.159053 s.
```
* Python API
@ -301,9 +299,8 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
输出:
``` bash
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'vec': [-1.3251205682754517, 7.860682487487793, -4.620625972747803, 0.3000721037387848, 2.2648534774780273, -1.1931440830230713, 3.064713716506958, 7.673594951629639, -6.004472732543945, -12.024259567260742, -1.9496068954467773, 3.126953601837158, 1.6188379526138306, -7.638310432434082, -1.2299772500991821, -12.33833122253418, 2.1373026371002197, -5.395712375640869, 9.717328071594238, 5.675230503082275, 3.7805123329162598, 3.0597171783447266, 3.429692029953003, 8.9760103225708, 13.174124717712402, -0.5313228368759155, 8.942471504211426, 4.465109825134277, -4.426247596740723, -9.726503372192383, 8.399328231811523, 7.223917484283447, -7.435853958129883, 2.9441683292388916, -4.343039512634277, -13.886964797973633, -1.6346734762191772, -10.902740478515625, -5.311244964599609, 3.800722122192383, 3.897603750228882, -2.123077392578125, -2.3521194458007812, 4.151031017303467, -7.404866695404053, 0.13911646604537964, 2.4626107215881348, 4.96645450592041, 0.9897574186325073, 5.483975410461426, -3.3574001789093018, 10.13400650024414, -0.6120170950889587, -10.403095245361328, 4.600754261016846, 16.009349822998047, -7.78369140625, -4.194530487060547, -6.93686056137085, 1.1789555549621582, 11.490800857543945, 4.23802375793457, 9.550930976867676, 8.375045776367188, 7.508914470672607, -0.6570729613304138, -0.3005157709121704, 2.8406054973602295, 3.0828027725219727, 0.7308170199394226, 6.1483540534973145, 0.1376611888408661, -13.424735069274902, -7.746140480041504, -2.322798252105713, -8.305252075195312, 2.98791241645813, -10.99522876739502, 0.15211068093776703, -2.3820347785949707, -1.7984174489974976, 8.49562931060791, -5.852236747741699, -3.755497932434082, 0.6989710927009583, -5.270299434661865, -2.6188621520996094, -1.8828465938568115, -4.6466498374938965, 14.078543663024902, -0.5495333075523376, 10.579157829284668, -3.216050148010254, 9.349003791809082, -4.381077766418457, -11.675816535949707, -2.863020658493042, 4.5721755027771, 2.246612071990967, -4.574341773986816, 1.8610187768936157, 2.3767874240875244, 5.625787734985352, -9.784077644348145, 0.6496725678443909, -1.457950472831726, 0.4263263940811157, -4.921126365661621, -2.4547839164733887, 3.4869801998138428, -0.4265422224998474, 8.341268539428711, 1.356552004814148, 7.096688270568848, -13.102828979492188, 8.01673412322998, -7.115934371948242, 1.8699780702590942, 0.20872099697589874, 14.699383735656738, -1.0252779722213745, -2.6107232570648193, -2.5082311630249023, 8.427192687988281, 6.913852691650391, -6.29124641418457, 0.6157366037368774, 2.489687919616699, -3.4668266773223877, 9.92176342010498, 11.200815200805664, -0.19664029777050018, 7.491600513458252, -0.6231271624565125, -0.2584814429283142, -9.947997093200684, -0.9611040949821472, 1.1649218797683716, -2.1907122135162354, -1.502848744392395, -0.5192610621452332, 15.165953636169434, 2.4649462699890137, -0.998044490814209, 7.44166374206543, -2.0768048763275146, 3.5896823406219482, -7.305543422698975, -7.562084674835205, 4.32333517074585, 0.08044180274009705, -6.564010143280029, -2.314805269241333, -1.7642345428466797, -2.470881700515747, -7.6756181716918945, -9.548877716064453, -1.017755389213562, 0.1698644608259201, 2.5877134799957275, -1.8752295970916748, -0.36614322662353516, -6.049378395080566, -2.3965611457824707, -5.945338726043701, 0.9424033164978027, -13.155974388122559, -7.45780086517334, 0.14658108353614807, -3.7427968978881836, 5.841492652893066, -1.2872905731201172, 5.569431304931641, 12.570590019226074, 1.0939218997955322, 2.2142086029052734, 1.9181575775146484, 6.991420745849609, -5.888138771057129, 3.1409823894500732, -2.0036280155181885, 2.4434285163879395, 9.973138809204102, 5.036680221557617, 2.005120277404785, 2.861560344696045, 5.860223770141602, 2.917618751525879, -1.63111412525177, 2.0292205810546875, -4.070415019989014, -6.831437110900879]}}
```text
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'vec': [-1.3251205682754517, 7.860682487487793, -4.620625972747803, 0.3000721037387848, 2.2648534774780273, -1.1931440830230713, 3.064713716506958, 7.673594951629639, -6.004472732543945, -12.024259567260742, -1.9496068954467773, 3.126953601837158, 1.6188379526138306, -7.638310432434082, -1.2299772500991821, -12.33833122253418, 2.1373026371002197, -5.395712375640869, 9.717328071594238, 5.675230503082275, 3.7805123329162598, 3.0597171783447266, 3.429692029953003, 8.9760103225708, 13.174124717712402, -0.5313228368759155, 8.942471504211426, 4.465109825134277, -4.426247596740723, -9.726503372192383, 8.399328231811523, 7.223917484283447, -7.435853958129883, 2.9441683292388916, -4.343039512634277, -13.886964797973633, -1.6346734762191772, -10.902740478515625, -5.311244964599609, 3.800722122192383, 3.897603750228882, -2.123077392578125, -2.3521194458007812, 4.151031017303467, -7.404866695404053, 0.13911646604537964, 2.4626107215881348, 4.96645450592041, 0.9897574186325073, 5.483975410461426, -3.3574001789093018, 10.13400650024414, -0.6120170950889587, -10.403095245361328, 4.600754261016846, 16.009349822998047, -7.78369140625, -4.194530487060547, -6.93686056137085, 1.1789555549621582, 11.490800857543945, 4.23802375793457, 9.550930976867676, 8.375045776367188, 7.508914470672607, -0.6570729613304138, -0.3005157709121704, 2.8406054973602295, 3.0828027725219727, 0.7308170199394226, 6.1483540534973145, 0.1376611888408661, -13.424735069274902, -7.746140480041504, -2.322798252105713, -8.305252075195312, 2.98791241645813, -10.99522876739502, 0.15211068093776703, -2.3820347785949707, -1.7984174489974976, 8.49562931060791, -5.852236747741699, -3.755497932434082, 0.6989710927009583, -5.270299434661865, -2.6188621520996094, -1.8828465938568115, -4.6466498374938965, 14.078543663024902, -0.5495333075523376, 10.579157829284668, -3.216050148010254, 9.349003791809082, -4.381077766418457, -11.675816535949707, -2.863020658493042, 4.5721755027771, 2.246612071990967, -4.574341773986816, 1.8610187768936157, 2.3767874240875244, 5.625787734985352, -9.784077644348145, 0.6496725678443909, -1.457950472831726, 0.4263263940811157, -4.921126365661621, -2.4547839164733887, 3.4869801998138428, -0.4265422224998474, 8.341268539428711, 1.356552004814148, 7.096688270568848, -13.102828979492188, 8.01673412322998, -7.115934371948242, 1.8699780702590942, 0.20872099697589874, 14.699383735656738, -1.0252779722213745, -2.6107232570648193, -2.5082311630249023, 8.427192687988281, 6.913852691650391, -6.29124641418457, 0.6157366037368774, 2.489687919616699, -3.4668266773223877, 9.92176342010498, 11.200815200805664, -0.19664029777050018, 7.491600513458252, -0.6231271624565125, -0.2584814429283142, -9.947997093200684, -0.9611040949821472, 1.1649218797683716, -2.1907122135162354, -1.502848744392395, -0.5192610621452332, 15.165953636169434, 2.4649462699890137, -0.998044490814209, 7.44166374206543, -2.0768048763275146, 3.5896823406219482, -7.305543422698975, -7.562084674835205, 4.32333517074585, 0.08044180274009705, -6.564010143280029, -2.314805269241333, -1.7642345428466797, -2.470881700515747, -7.6756181716918945, -9.548877716064453, -1.017755389213562, 0.1698644608259201, 2.5877134799957275, -1.8752295970916748, -0.36614322662353516, -6.049378395080566, -2.3965611457824707, -5.945338726043701, 0.9424033164978027, -13.155974388122559, -7.45780086517334, 0.14658108353614807, -3.7427968978881836, 5.841492652893066, -1.2872905731201172, 5.569431304931641, 12.570590019226074, 1.0939218997955322, 2.2142086029052734, 1.9181575775146484, 6.991420745849609, -5.888138771057129, 3.1409823894500732, -2.0036280155181885, 2.4434285163879395, 9.973138809204102, 5.036680221557617, 2.005120277404785, 2.861560344696045, 5.860223770141602, 2.917618751525879, -1.63111412525177, 2.0292205810546875, -4.070415019989014, -6.831437110900879]}}
```
#### 7.2 音频声纹打分
@ -332,19 +329,18 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
* test: 测试音频。
输出:
``` bash
[2022-05-25 12:33:24,527] [ INFO] - vector score http client start
[2022-05-25 12:33:24,527] [ INFO] - enroll audio: 85236145389.wav, test audio: 123456789.wav
[2022-05-25 12:33:24,528] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector/score
[2022-05-25 12:33:24,695] [ INFO] - The vector score is: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
[2022-05-25 12:33:24,696] [ INFO] - The vector: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
[2022-05-25 12:33:24,696] [ INFO] - Response time 0.168271 s.
```text
[2022-05-25 12:33:24,527] [ INFO] - vector score http client start
[2022-05-25 12:33:24,527] [ INFO] - enroll audio: 85236145389.wav, test audio: 123456789.wav
[2022-05-25 12:33:24,528] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector/score
[2022-05-25 12:33:24,695] [ INFO] - The vector score is: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
[2022-05-25 12:33:24,696] [ INFO] - The vector: {'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
[2022-05-25 12:33:24,696] [ INFO] - Response time 0.168271 s.
```
* Python API
``` python
```python
from paddlespeech.server.bin.paddlespeech_client import VectorClientExecutor
vectorclient_executor = VectorClientExecutor()
@ -359,8 +355,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
```
输出:
``` bash
```text
[2022-05-25 12:30:14,143] [ INFO] - vector score http client start
[2022-05-25 12:30:14,143] [ INFO] - enroll audio: 85236145389.wav, test audio: 123456789.wav
[2022-05-25 12:30:14,143] [ INFO] - endpoint: http://127.0.0.1:8790/paddlespeech/vector/score
@ -368,7 +363,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
{'success': True, 'code': 200, 'message': {'description': 'success'}, 'result': {'score': 0.45332613587379456}}
```
### 8. 标点预测
**注意:** 初次使用客户端时响应时间会略长
@ -391,9 +385,9 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
- `input`(必须输入): 用于标点预测的文本内容。
输出:
```bash
[2022-05-09 18:19:04,397] [ INFO] - The punc text: 我认为跑步最重要的就是给我带来了身体健康。
[2022-05-09 18:19:04,397] [ INFO] - Response time 0.092407 s.
```text
[2022-05-09 18:19:04,397] [ INFO] - The punc text: 我认为跑步最重要的就是给我带来了身体健康。
[2022-05-09 18:19:04,397] [ INFO] - Response time 0.092407 s.
```
- Python API
@ -406,11 +400,10 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
server_ip="127.0.0.1",
port=8090,)
print(res)
```
输出:
```bash
```text
我认为跑步最重要的就是给我带来了身体健康。
```
@ -419,10 +412,10 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee
通过 `paddlespeech_server stats --task asr` 获取 ASR 服务支持的所有模型,其中静态模型可用于 paddle inference 推理。
### TTS 支持的模型
通过 `paddlespeech_server stats --task tts` 获取 TTS 服务支持的所有模型,其中静态模型可用于 paddle inference 推理。
通过 `paddlespeech_server stats --task tts` 获取 TTS 服务支持的所有模型,其中静态模型可用于 paddle inference 推理。
### CLS 支持的模型
通过 `paddlespeech_server stats --task cls` 获取 CLS 服务支持的所有模型,其中静态模型可用于 paddle inference 推理。
通过 `paddlespeech_server stats --task cls` 获取 CLS 服务支持的所有模型,其中静态模型可用于 paddle inference 推理。
### Vector 支持的模型
通过 `paddlespeech_server stats --task vector` 获取 Vector 服务支持的所有模型。

@ -8,7 +8,7 @@ http://0.0.0.0:8010/docs
### 【POST】/asr/offline
说明上传16k,16bit wav文件返回 offline 语音识别模型识别结果
说明:上传 16k, 16bit wav 文件,返回 offline 语音识别模型识别结果
返回: JSON
@ -26,11 +26,11 @@ http://0.0.0.0:8010/docs
### 【POST】/asr/offlinefile
说明上传16k,16bit wav文件返回 offline 语音识别模型识别结果 + wav数据的base64
说明上传16k,16bit wav文件返回 offline 语音识别模型识别结果 + wav 数据的 base64
返回: JSON
前端接口: 音频文件识别(播放这段base64还原后记得添加wav头采样率16k, int16添加后才能播放)
前端接口: 音频文件识别(播放这段base64还原后记得添加 wav 头,采样率 16k, int16添加后才能播放)
示例:
@ -48,7 +48,7 @@ http://0.0.0.0:8010/docs
### 【POST】/asr/collectEnv
说明: 通过采集环境噪音上传16k, int16 wav文件来生成后台VAD的能量阈值 返回阈值结果
说明: 通过采集环境噪音,上传 16k, int16 wav 文件,来生成后台 VAD 的能量阈值, 返回阈值结果
前端接口ASR-环境采样
@ -64,9 +64,9 @@ http://0.0.0.0:8010/docs
### 【GET】/asr/stopRecord
说明:通过 GET 请求 /asr/stopRecord, 后台停止接收 offlineStream 中通过 WS协议 上传的数据
说明:通过 GET 请求 /asr/stopRecord, 后台停止接收 offlineStream 中通过 WS 协议 上传的数据
前端接口:语音聊天-暂停录音获取NLP播放TTS时暂停
前端接口:语音聊天-暂停录音(获取 NLP播放 TTS 时暂停)
返回: JSON
@ -80,9 +80,9 @@ http://0.0.0.0:8010/docs
### 【GET】/asr/resumeRecord
说明:通过 GET 请求 /asr/resumeRecord, 后台停止接收 offlineStream 中通过 WS协议 上传的数据
说明:通过 GET 请求 /asr/resumeRecord, 后台停止接收 offlineStream 中通过 WS 协议 上传的数据
前端接口:语音聊天-恢复录音TTS播放完毕时告诉后台恢复录音
前端接口:语音聊天-恢复录音( TTS 播放完毕时,告诉后台恢复录音)
返回: JSON
@ -100,16 +100,16 @@ http://0.0.0.0:8010/docs
前端接口:语音聊天-开始录音,持续将麦克风语音传给后端,后端推送语音识别结果
返回后端返回识别结果offline模型识别结果 由WS推送
返回后端返回识别结果offline 模型识别结果, 由WS推送
### 【Websocket】/ws/asr/onlineStream
说明:通过 WS 协议,将前端音频持续上传到后台,前端采集 16kInt16 类型的PCM片段持续上传到后端
说明:通过 WS 协议,将前端音频持续上传到后台,前端采集 16kInt16 类型的 PCM 片段,持续上传到后端
前端接口ASR-流式识别开始录音,持续将麦克风语音传给后端,后端推送语音识别结果
返回后端返回识别结果online模型识别结果 由WS推送
返回后端返回识别结果online 模型识别结果, 由 WS 推送
## NLP
@ -202,7 +202,7 @@ http://0.0.0.0:8010/docs
### 【POST】/tts/offline
说明获取TTS离线模型音频
说明:获取 TTS 离线模型音频
前端接口TTS-端到端合成
@ -272,7 +272,7 @@ curl -X 'POST' \
### 【POST】/vpr/recog
说明:声纹识别,识别文件,提取文件的声纹信息做比对 音频 16k, int 16 wav格式
说明:声纹识别,识别文件,提取文件的声纹信息做比对 音频 16k, int 16 wav 格式
前端接口:声纹识别-上传音频,返回声纹识别结果
@ -383,9 +383,9 @@ curl -X 'GET' \
### 【GET】/vpr/database64
说明: 根据 vpr_id 获取用户vpr时注册使用音频转换成 16k, int16 类型的数组返回base64编码
说明: 根据 vpr_id 获取用户 vpr 时注册使用音频转换成 16k, int16 类型的数组,返回 base64 编码
前端接口:声纹识别-获取vpr对应的音频注意播放时需要添加 wav头16k,int16, 可参考tts播放时添加wav的方式注意更改采样率
前端接口:声纹识别-获取 vpr 对应的音频(注意:播放时需要添加 wav头16k,int16, 可参考 tts 播放时添加 wav 的方式,注意更改采样率)
访问示例:
@ -401,6 +401,4 @@ curl -X 'GET' \
"code": 0,
"result":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
"message": "ok"
```
```

@ -1,16 +1,16 @@
# Paddle Speech Demo
PaddleSpeechDemo是一个以PaddleSpeech的语音交互功能为主体开发的Demo展示项目用于帮助大家更好的上手PaddleSpeech以及使用PaddleSpeech构建自己的应用。
PaddleSpeechDemo 是一个以 PaddleSpeech 的语音交互功能为主体开发的 Demo 展示项目,用于帮助大家更好的上手 PaddleSpeech 以及使用 PaddleSpeech 构建自己的应用。
智能语音交互部分使用PaddleSpeech对话以及信息抽取部分使用PaddleNLP网页前端展示部分基于Vue3进行开发
智能语音交互部分使用 PaddleSpeech对话以及信息抽取部分使用 PaddleNLP网页前端展示部分基于 Vue3 进行开发
主要功能:
+ 语音聊天PaddleSpeech的语音识别能力+语音合成能力对话部分基于PaddleNLP的闲聊功能
+ 声纹识别PaddleSpeech的声纹识别功能展示
+ 语音聊天PaddleSpeech 的语音识别能力+语音合成能力,对话部分基于 PaddleNLP 的闲聊功能
+ 声纹识别PaddleSpeech 的声纹识别功能展示
+ 语音识别:支持【实时语音识别】,【端到端识别】,【音频文件识别】三种模式
+ 语音合成:支持【流式合成】与【端到端合成】两种方式
+ 语音指令基于PaddleSpeech的语音识别能力与PaddleNLP的信息抽取实现交通费的智能报销
+ 语音指令:基于 PaddleSpeech 的语音识别能力与 PaddleNLP 的信息抽取,实现交通费的智能报销
运行效果:
@ -32,23 +32,21 @@ cd model
wget https://bj.bcebos.com/paddlenlp/applications/speech-cmd-analysis/finetune/model_state.pdparams
```
### 前端环境安装
前端依赖node.js 需要提前安装确保npm可用npm测试版本8.3.1,建议下载[官网](https://nodejs.org/en/)稳定版的node.js
前端依赖 `node.js` ,需要提前安装,确保 `npm` 可用,`npm` 测试版本 `8.3.1`,建议下载[官网](https://nodejs.org/en/)稳定版的 `node.js`
```
# 进入前端目录
cd web_client
# 安装yarn已经安装可跳过
# 安装 `yarn`,已经安装可跳过
npm install -g yarn
# 使用yarn安装前端依赖
yarn install
```
## 启动服务
### 开启后端服务
@ -66,18 +64,18 @@ cd web_client
yarn dev --port 8011
```
默认配置下前端中配置的后台地址信息是localhost确保后端服务器和打开页面的游览器在同一台机器上不在一台机器的配置方式见下方的FAQ【后端如果部署在其它机器或者别的端口如何修改】
默认配置下,前端中配置的后台地址信息是 localhost确保后端服务器和打开页面的游览器在同一台机器上不在一台机器的配置方式见下方的 FAQ【后端如果部署在其它机器或者别的端口如何修改】
## FAQ
#### Q: 如何安装node.js
A node.js的安装可以参考[【菜鸟教程】](https://www.runoob.com/nodejs/nodejs-install-setup.html), 确保npm可用
A node.js的安装可以参考[【菜鸟教程】](https://www.runoob.com/nodejs/nodejs-install-setup.html), 确保 npm 可用
#### Q后端如果部署在其它机器或者别的端口如何修改
A后端的配置地址有分散在两个文件中
修改第一个文件`PaddleSpeechWebClient/vite.config.js`
修改第一个文件 `PaddleSpeechWebClient/vite.config.js`
```
server: {
@ -92,7 +90,7 @@ server: {
}
```
修改第二个文件`PaddleSpeechWebClient/src/api/API.js`Websocket代理配置失败所以需要在这个文件中修改
修改第二个文件 `PaddleSpeechWebClient/src/api/API.js` Websocket 代理配置失败,所以需要在这个文件中修改)
```
// websocket (这里改成后端所在的接口)
@ -107,9 +105,6 @@ A这里主要是游览器安全策略的限制需要配置游览器后重
chrome设置地址: chrome://flags/#unsafely-treat-insecure-origin-as-secure
## 参考资料
vue实现录音参考资料https://blog.csdn.net/qq_41619796/article/details/107865602#t1

@ -7,13 +7,18 @@ This demo is an implementation of starting the streaming speech service and acce
Streaming ASR server only support `websocket` protocol, and doesn't support `http` protocol.
服务接口定义请参考:
- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
## Usage
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.2.1** or above.
It is recommended to use **paddlepaddle 2.3.1** or above.
You can choose one way from easy, meduim and hard to install paddlespeech.
**If you install in simple mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
**If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to
### 2. Prepare config File
The configuration file can be found in `conf/ws_application.yaml``conf/ws_conformer_wenetspeech_application.yaml`.
@ -48,28 +53,28 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
- `log_file`: log file. Default: `./log/paddlespeech.log`
Output:
```bash
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1. 0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```text
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1. 0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
- Python API
@ -85,28 +90,28 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
```
Output:
```bash
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1. 0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```text
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1. 0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
@ -117,7 +122,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
```
```bash
paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
```
@ -126,6 +131,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
```bash
paddlespeech_client asr_online --help
```
Arguments:
- `server_ip`: server ip. Default: 127.0.0.1
- `port`: server port. Default: 8090
@ -137,75 +143,74 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
- `punc.server_port`: punctuation server port. Default: None.
Output:
```bash
[2022-05-06 21:10:35,598] [ INFO] - Start to do streaming asr client
[2022-05-06 21:10:35,600] [ INFO] - asr websocket client start
[2022-05-06 21:10:35,600] [ INFO] - endpoint: ws://127.0.0.1:8390/paddlespeech/asr/streaming
[2022-05-06 21:10:35,600] [ INFO] - start to process the wavscp: ./zh.wav
[2022-05-06 21:10:35,670] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
[2022-05-06 21:10:35,699] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,713] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,726] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,738] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,750] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,762] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,774] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,786] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,387] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,398] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,407] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,416] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,425] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,434] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,442] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,930] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,938] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,946] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,954] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,962] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,970] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,977] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,985] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:37,484] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,492] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,500] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,508] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,517] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,525] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,532] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:38,050] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,058] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,066] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,073] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,081] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,089] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,097] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,105] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,630] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,639] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,647] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,655] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,663] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,671] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,679] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:39,216] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,224] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,232] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,240] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,248] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,256] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,264] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,272] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,885] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,896] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,905] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,915] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,924] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,934] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:44,827] [ INFO] - client final receive msg={'status': 'ok', 'signal': 'finished', 'result': '我认为跑步最重要的就是给我带来了身体健康', 'times': [{'w': '我', 'bg': 0.0, 'ed': 0.7000000000000001}, {'w': '认', 'bg': 0.7000000000000001, 'ed': 0.84}, {'w': '为', 'bg': 0.84, 'ed': 1.0}, {'w': '跑', 'bg': 1.0, 'ed': 1.18}, {'w': '步', 'bg': 1.18, 'ed': 1.36}, {'w': '最', 'bg': 1.36, 'ed': 1.5}, {'w': '重', 'bg': 1.5, 'ed': 1.6400000000000001}, {'w': '要', 'bg': 1.6400000000000001, 'ed': 1.78}, {'w': '的', 'bg': 1.78, 'ed': 1.9000000000000001}, {'w': '就', 'bg': 1.9000000000000001, 'ed': 2.06}, {'w': '是', 'bg': 2.06, 'ed': 2.62}, {'w': '给', 'bg': 2.62, 'ed': 3.16}, {'w': '我', 'bg': 3.16, 'ed': 3.3200000000000003}, {'w': '带', 'bg': 3.3200000000000003, 'ed': 3.48}, {'w': '来', 'bg': 3.48, 'ed': 3.62}, {'w': '了', 'bg': 3.62, 'ed': 3.7600000000000002}, {'w': '身', 'bg': 3.7600000000000002, 'ed': 3.9}, {'w': '体', 'bg': 3.9, 'ed': 4.0600000000000005}, {'w': '健', 'bg': 4.0600000000000005, 'ed': 4.26}, {'w': '康', 'bg': 4.26, 'ed': 4.96}]}
[2022-05-06 21:10:44,827] [ INFO] - audio duration: 4.9968125, elapsed time: 9.225094079971313, RTF=1.846195765794957
[2022-05-06 21:10:44,828] [ INFO] - asr websocket client finished : 我认为跑步最重要的就是给我带来了身体健康
```text
[2022-05-06 21:10:35,598] [ INFO] - Start to do streaming asr client
[2022-05-06 21:10:35,600] [ INFO] - asr websocket client start
[2022-05-06 21:10:35,600] [ INFO] - endpoint: ws://127.0.0.1:8390/paddlespeech/asr/streaming
[2022-05-06 21:10:35,600] [ INFO] - start to process the wavscp: ./zh.wav
[2022-05-06 21:10:35,670] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
[2022-05-06 21:10:35,699] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,713] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,726] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,738] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,750] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,762] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,774] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,786] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,387] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,398] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,407] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,416] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,425] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,434] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,442] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,930] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,938] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,946] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,954] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,962] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,970] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,977] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,985] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:37,484] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,492] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,500] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,508] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,517] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,525] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,532] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:38,050] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,058] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,066] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,073] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,081] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,089] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,097] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,105] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,630] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,639] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,647] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,655] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,663] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,671] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,679] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:39,216] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,224] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,232] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,240] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,248] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,256] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,264] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,272] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,885] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,896] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,905] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,915] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,924] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,934] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:44,827] [ INFO] - client final receive msg={'status': 'ok', 'signal': 'finished', 'result': '我认为跑步最重要的就是给我带来了身体健康', 'times': [{'w': '我', 'bg': 0.0, 'ed': 0.7000000000000001}, {'w': '认', 'bg': 0.7000000000000001, 'ed': 0.84}, {'w': '为', 'bg': 0.84, 'ed': 1.0}, {'w': '跑', 'bg': 1.0, 'ed': 1.18}, {'w': '步', 'bg': 1.18, 'ed': 1.36}, {'w': '最', 'bg': 1.36, 'ed': 1.5}, {'w': '重', 'bg': 1.5, 'ed': 1.6400000000000001}, {'w': '要', 'bg': 1.6400000000000001, 'ed': 1.78}, {'w': '的', 'bg': 1.78, 'ed': 1.9000000000000001}, {'w': '就', 'bg': 1.9000000000000001, 'ed': 2.06}, {'w': '是', 'bg': 2.06, 'ed': 2.62}, {'w': '给', 'bg': 2.62, 'ed': 3.16}, {'w': '我', 'bg': 3.16, 'ed': 3.3200000000000003}, {'w': '带', 'bg': 3.3200000000000003, 'ed': 3.48}, {'w': '来', 'bg': 3.48, 'ed': 3.62}, {'w': '了', 'bg': 3.62, 'ed': 3.7600000000000002}, {'w': '身', 'bg': 3.7600000000000002, 'ed': 3.9}, {'w': '体', 'bg': 3.9, 'ed': 4.0600000000000005}, {'w': '健', 'bg': 4.0600000000000005, 'ed': 4.26}, {'w': '康', 'bg': 4.26, 'ed': 4.96}]}
[2022-05-06 21:10:44,827] [ INFO] - audio duration: 4.9968125, elapsed time: 9.225094079971313, RTF=1.846195765794957
[2022-05-06 21:10:44,828] [ INFO] - asr websocket client finished : 我认为跑步最重要的就是给我带来了身体健康
```
- Python API
@ -224,7 +229,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
```
Output:
```bash
```text
[2022-05-06 21:14:03,137] [ INFO] - asr websocket client start
[2022-05-06 21:14:03,137] [ INFO] - endpoint: ws://127.0.0.1:8390/paddlespeech/asr/streaming
[2022-05-06 21:14:03,149] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
@ -299,12 +304,11 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
- Command Line
**Note:** The default deployment of the server is on the 'CPU' device, which can be deployed on the 'GPU' by modifying the 'device' parameter in the service configuration file.
``` bash
```bash
In PaddleSpeech/demos/streaming_asr_server directory to lanuch punctuation service
paddlespeech_server start --config_file conf/punc_application.yaml
```
Usage:
```bash
paddlespeech_server start --help
@ -316,7 +320,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
Output:
``` bash
```text
[2022-05-02 17:59:26,285] [ INFO] - Create the TextEngine Instance
[2022-05-02 17:59:26,285] [ INFO] - Init the text engine
[2022-05-02 17:59:26,285] [ INFO] - Text Engine set the device: gpu:0
@ -348,26 +352,26 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
log_file="./log/paddlespeech.log")
```
Output:
```
[2022-05-02 18:09:02,542] [ INFO] - Create the TextEngine Instance
[2022-05-02 18:09:02,543] [ INFO] - Init the text engine
[2022-05-02 18:09:02,543] [ INFO] - Text Engine set the device: gpu:0
[2022-05-02 18:09:02,545] [ INFO] - File /home/users/xiongxinlei/.paddlespeech/models/ernie_linear_p3_wudao-punc-zh/ernie_linear_p3_wudao-punc-zh.tar.gz md5 checking...
[2022-05-02 18:09:06,919] [ INFO] - Use pretrained model stored in: /home/users/xiongxinlei/.paddlespeech/models/ernie_linear_p3_wudao-punc-zh/ernie_linear_p3_wudao-punc-zh.tar
W0502 18:09:07.523002 22615 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 10.2, Runtime API Version: 10.2
W0502 18:09:07.527882 22615 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2022-05-02 18:09:10,900] [ INFO] - Already cached /home/users/xiongxinlei/.paddlenlp/models/ernie-1.0/vocab.txt
[2022-05-02 18:09:10,913] [ INFO] - Init the text engine successfully
INFO: Started server process [22615]
[2022-05-02 18:09:10] [INFO] [server.py:75] Started server process [22615]
INFO: Waiting for application startup.
[2022-05-02 18:09:10] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-02 18:09:10] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
[2022-05-02 18:09:10] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
```
Output:
```text
[2022-05-02 18:09:02,542] [ INFO] - Create the TextEngine Instance
[2022-05-02 18:09:02,543] [ INFO] - Init the text engine
[2022-05-02 18:09:02,543] [ INFO] - Text Engine set the device: gpu:0
[2022-05-02 18:09:02,545] [ INFO] - File /home/users/xiongxinlei/.paddlespeech/models/ernie_linear_p3_wudao-punc-zh/ernie_linear_p3_wudao-punc-zh.tar.gz md5 checking...
[2022-05-02 18:09:06,919] [ INFO] - Use pretrained model stored in: /home/users/xiongxinlei/.paddlespeech/models/ernie_linear_p3_wudao-punc-zh/ernie_linear_p3_wudao-punc-zh.tar
W0502 18:09:07.523002 22615 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 10.2, Runtime API Version: 10.2
W0502 18:09:07.527882 22615 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2022-05-02 18:09:10,900] [ INFO] - Already cached /home/users/xiongxinlei/.paddlenlp/models/ernie-1.0/vocab.txt
[2022-05-02 18:09:10,913] [ INFO] - Init the text engine successfully
INFO: Started server process [22615]
[2022-05-02 18:09:10] [INFO] [server.py:75] Started server process [22615]
INFO: Waiting for application startup.
[2022-05-02 18:09:10] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-02 18:09:10] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
[2022-05-02 18:09:10] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
```
### 2. Client usage
**Note** The response time will be slightly longer when using the client for the first time
@ -376,17 +380,17 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
```
```bash
paddlespeech_client text --server_ip 127.0.0.1 --port 8190 --input "我认为跑步最重要的就是给我带来了身体健康"
```
Output
```
```text
[2022-05-02 18:12:29,767] [ INFO] - The punc text: 我认为跑步最重要的就是给我带来了身体健康。
[2022-05-02 18:12:29,767] [ INFO] - Response time 0.096548 s.
```
- Python3 API
- Python API
```python
from paddlespeech.server.bin.paddlespeech_client import TextClientExecutor
@ -400,11 +404,10 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
```
Output:
``` bash
```text
我认为跑步最重要的就是给我带来了身体健康。
```
## Join streaming asr and punctuation server
By default, each server is deployed on the 'CPU' device and speech recognition and punctuation prediction can be deployed on different 'GPU' by modifying the' device 'parameter in the service configuration file respectively.
@ -413,7 +416,7 @@ We use `streaming_ asr_server.py` and `punc_server.py` two services to lanuch st
### 1. Start two server
``` bash
```bash
Note: streaming speech recognition and punctuation prediction are configured on different graphics cards through configuration files
bash server.sh
```
@ -423,11 +426,11 @@ bash server.sh
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
```
```bash
paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8290 --punc.server_ip 127.0.0.1 --punc.port 8190 --input ./zh.wav
```
Output:
```
```text
[2022-05-07 11:21:47,060] [ INFO] - asr websocket client start
[2022-05-07 11:21:47,060] [ INFO] - endpoint: ws://127.0.0.1:8490/paddlespeech/asr/streaming
[2022-05-07 11:21:47,080] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
@ -501,11 +504,11 @@ bash server.sh
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
```
```bash
python3 websocket_client.py --server_ip 127.0.0.1 --port 8290 --punc.server_ip 127.0.0.1 --punc.port 8190 --wavfile ./zh.wav
```
Output:
```
```text
[2022-05-07 11:11:02,984] [ INFO] - Start to do streaming asr client
[2022-05-07 11:11:02,985] [ INFO] - asr websocket client start
[2022-05-07 11:11:02,985] [ INFO] - endpoint: ws://127.0.0.1:8490/paddlespeech/asr/streaming
@ -574,5 +577,3 @@ bash server.sh
[2022-05-07 11:11:18,915] [ INFO] - audio duration: 4.9968125, elapsed time: 15.928460597991943, RTF=3.187724293835709
[2022-05-07 11:11:18,916] [ INFO] - asr websocket client finished : 我认为跑步最重要的就是给我带来了身体健康
```

@ -3,16 +3,21 @@
# 流式语音识别服务
## 介绍
这个demo是一个启动流式语音服务和访问服务的实现。 它可以通过使用`paddlespeech_server` 和 `paddlespeech_client`的单个命令或 python 的几行代码来实现。
这个 demo 是一个启动流式语音服务和访问服务的实现。 它可以通过使用 `paddlespeech_server``paddlespeech_client` 的单个命令或 python 的几行代码来实现。
**流式语音识别服务只支持 `weboscket` 协议,不支持 `http` 协议。**
服务接口定义请参考:
- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
## 使用方法
### 1. 安装
安装 PaddleSpeech 的详细过程请看 [安装文档](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md)。
推荐使用 **paddlepaddle 2.2.1** 或以上版本。
你可以从简单, 中等,困难 几种种方式中选择一种方式安装 PaddleSpeech。
推荐使用 **paddlepaddle 2.3.1** 或以上版本。
你可以从简单,中等,困难 几种方式中选择一种方式安装 PaddleSpeech。
**如果使用简单模式安装,需要自行准备 yaml 文件,可参考 conf 目录下的 yaml 文件。**
### 2. 准备配置文件
@ -26,7 +31,6 @@
* conformer: `conf/ws_conformer_wenetspeech_application.yaml`
这个 ASR client 的输入应该是一个 WAV 文件(`.wav`),并且采样率必须与模型的采样率相同。
可以下载此 ASR client的示例音频
@ -54,28 +58,28 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
- `log_file`: log 文件. 默认:`./log/paddlespeech.log`
输出:
```bash
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1. 0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```text
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1. 0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
- Python API
@ -90,29 +94,29 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
log_file="./log/paddlespeech.log")
```
输出
```bash
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1. 0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
输出:
```text
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1. 0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/ chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
### 4. ASR 客户端使用方法
@ -120,98 +124,97 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
**注意:** 初次使用客户端时响应时间会略长
- 命令行 (推荐使用)
`127.0.0.1` 不能访问,则需要使用实际服务 IP 地址
`127.0.0.1` 不能访问,则需要使用实际服务 IP 地址
```
paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
```
```bash
paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
```
使用帮助:
```bash
paddlespeech_client asr_online --help
```
使用帮助:
```bash
paddlespeech_client asr_online --help
```
参数:
- `server_ip`: 服务端ip地址默认: 127.0.0.1。
- `port`: 服务端口,默认: 8090。
- `input`(必须输入): 用于识别的音频文件。
- `sample_rate`: 音频采样率默认值16000。
- `lang`: 模型语言默认值zh_cn。
- `audio_format`: 音频格式默认值wav。
- `punc.server_ip` 标点预测服务的ip。默认是None。
- `punc.server_port` 标点预测服务的端口port。默认是None。
参数:
- `server_ip`: 服务端ip地址默认: 127.0.0.1。
- `port`: 服务端口,默认: 8090。
- `input`(必须输入): 用于识别的音频文件。
- `sample_rate`: 音频采样率默认值16000。
- `lang`: 模型语言默认值zh_cn。
- `audio_format`: 音频格式默认值wav。
- `punc.server_ip` 标点预测服务的ip。默认是None。
- `punc.server_port` 标点预测服务的端口port。默认是None。
输出:
```bash
[2022-05-06 21:10:35,598] [ INFO] - Start to do streaming asr client
[2022-05-06 21:10:35,600] [ INFO] - asr websocket client start
[2022-05-06 21:10:35,600] [ INFO] - endpoint: ws://127.0.0.1:8390/paddlespeech/asr/streaming
[2022-05-06 21:10:35,600] [ INFO] - start to process the wavscp: ./zh.wav
[2022-05-06 21:10:35,670] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
[2022-05-06 21:10:35,699] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,713] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,726] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,738] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,750] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,762] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,774] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,786] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,387] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,398] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,407] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,416] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,425] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,434] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,442] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,930] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,938] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,946] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,954] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,962] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,970] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,977] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,985] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:37,484] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,492] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,500] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,508] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,517] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,525] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,532] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:38,050] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,058] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,066] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,073] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,081] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,089] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,097] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,105] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,630] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,639] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,647] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,655] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,663] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,671] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,679] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:39,216] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,224] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,232] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,240] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,248] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,256] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,264] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,272] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,885] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,896] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,905] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,915] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,924] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,934] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:44,827] [ INFO] - client final receive msg={'status': 'ok', 'signal': 'finished', 'result': '我认为跑步最重要的就是给我带来了身体健康', 'times': [{'w': '我', 'bg': 0.0, 'ed': 0.7000000000000001}, {'w': '认', 'bg': 0.7000000000000001, 'ed': 0.84}, {'w': '为', 'bg': 0.84, 'ed': 1.0}, {'w': '跑', 'bg': 1.0, 'ed': 1.18}, {'w': '步', 'bg': 1.18, 'ed': 1.36}, {'w': '最', 'bg': 1.36, 'ed': 1.5}, {'w': '重', 'bg': 1.5, 'ed': 1.6400000000000001}, {'w': '要', 'bg': 1.6400000000000001, 'ed': 1.78}, {'w': '的', 'bg': 1.78, 'ed': 1.9000000000000001}, {'w': '就', 'bg': 1.9000000000000001, 'ed': 2.06}, {'w': '是', 'bg': 2.06, 'ed': 2.62}, {'w': '给', 'bg': 2.62, 'ed': 3.16}, {'w': '我', 'bg': 3.16, 'ed': 3.3200000000000003}, {'w': '带', 'bg': 3.3200000000000003, 'ed': 3.48}, {'w': '来', 'bg': 3.48, 'ed': 3.62}, {'w': '了', 'bg': 3.62, 'ed': 3.7600000000000002}, {'w': '身', 'bg': 3.7600000000000002, 'ed': 3.9}, {'w': '体', 'bg': 3.9, 'ed': 4.0600000000000005}, {'w': '健', 'bg': 4.0600000000000005, 'ed': 4.26}, {'w': '康', 'bg': 4.26, 'ed': 4.96}]}
[2022-05-06 21:10:44,827] [ INFO] - audio duration: 4.9968125, elapsed time: 9.225094079971313, RTF=1.846195765794957
[2022-05-06 21:10:44,828] [ INFO] - asr websocket client finished : 我认为跑步最重要的就是给我带来了身体健康
输出:
```text
[2022-05-06 21:10:35,598] [ INFO] - Start to do streaming asr client
[2022-05-06 21:10:35,600] [ INFO] - asr websocket client start
[2022-05-06 21:10:35,600] [ INFO] - endpoint: ws://127.0.0.1:8390/paddlespeech/asr/streaming
[2022-05-06 21:10:35,600] [ INFO] - start to process the wavscp: ./zh.wav
[2022-05-06 21:10:35,670] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
[2022-05-06 21:10:35,699] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,713] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,726] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,738] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,750] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,762] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,774] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:35,786] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,387] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,398] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,407] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,416] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,425] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,434] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,442] [ INFO] - client receive msg={'result': ''}
[2022-05-06 21:10:36,930] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,938] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,946] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,954] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,962] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,970] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,977] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:36,985] [ INFO] - client receive msg={'result': '我认为跑'}
[2022-05-06 21:10:37,484] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,492] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,500] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,508] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,517] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,525] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:37,532] [ INFO] - client receive msg={'result': '我认为跑步最重要的'}
[2022-05-06 21:10:38,050] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,058] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,066] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,073] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,081] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,089] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,097] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,105] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是'}
[2022-05-06 21:10:38,630] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,639] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,647] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,655] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,663] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,671] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:38,679] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给'}
[2022-05-06 21:10:39,216] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,224] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,232] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,240] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,248] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,256] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,264] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,272] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了'}
[2022-05-06 21:10:39,885] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,896] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,905] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,915] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,924] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:39,934] [ INFO] - client receive msg={'result': '我认为跑步最重要的就是给我带来了身体健康'}
[2022-05-06 21:10:44,827] [ INFO] - client final receive msg={'status': 'ok', 'signal': 'finished', 'result': '我认为跑步最重要的就是给我带来了身体健康', 'times': [{'w': '我', 'bg': 0.0, 'ed': 0.7000000000000001}, {'w': '认', 'bg': 0.7000000000000001, 'ed': 0.84}, {'w': '为', 'bg': 0.84, 'ed': 1.0}, {'w': '跑', 'bg': 1.0, 'ed': 1.18}, {'w': '步', 'bg': 1.18, 'ed': 1.36}, {'w': '最', 'bg': 1.36, 'ed': 1.5}, {'w': '重', 'bg': 1.5, 'ed': 1.6400000000000001}, {'w': '要', 'bg': 1.6400000000000001, 'ed': 1.78}, {'w': '的', 'bg': 1.78, 'ed': 1.9000000000000001}, {'w': '就', 'bg': 1.9000000000000001, 'ed': 2.06}, {'w': '是', 'bg': 2.06, 'ed': 2.62}, {'w': '给', 'bg': 2.62, 'ed': 3.16}, {'w': '我', 'bg': 3.16, 'ed': 3.3200000000000003}, {'w': '带', 'bg': 3.3200000000000003, 'ed': 3.48}, {'w': '来', 'bg': 3.48, 'ed': 3.62}, {'w': '了', 'bg': 3.62, 'ed': 3.7600000000000002}, {'w': '身', 'bg': 3.7600000000000002, 'ed': 3.9}, {'w': '体', 'bg': 3.9, 'ed': 4.0600000000000005}, {'w': '健', 'bg': 4.0600000000000005, 'ed': 4.26}, {'w': '康', 'bg': 4.26, 'ed': 4.96}]}
[2022-05-06 21:10:44,827] [ INFO] - audio duration: 4.9968125, elapsed time: 9.225094079971313, RTF=1.846195765794957
[2022-05-06 21:10:44,828] [ INFO] - asr websocket client finished : 我认为跑步最重要的就是给我带来了身体健康
```
- Python API
@ -230,7 +233,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
```
输出:
```bash
```text
[2022-05-06 21:14:03,137] [ INFO] - asr websocket client start
[2022-05-06 21:14:03,137] [ INFO] - endpoint: ws://127.0.0.1:8390/paddlespeech/asr/streaming
[2022-05-06 21:14:03,149] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
@ -297,34 +300,29 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
[2022-05-06 21:14:12,159] [ INFO] - audio duration: 4.9968125, elapsed time: 9.019973039627075, RTF=1.8051453881103354
[2022-05-06 21:14:12,160] [ INFO] - asr websocket client finished
```
## 标点预测
### 1. 服务端使用方法
- 命令行
**注意:** 默认部署在 `cpu` 设备上,可以通过修改服务配置文件中 `device` 参数部署在 `gpu` 上。
``` bash
在 PaddleSpeech/demos/streaming_asr_server 目录下启动标点预测服务
```bash
# 在 PaddleSpeech/demos/streaming_asr_server 目录下启动标点预测服务
paddlespeech_server start --config_file conf/punc_application.yaml
```
使用方法:
使用方法:
```bash
paddlespeech_server start --help
```
参数
参数:
- `config_file`: 服务的配置文件。
- `log_file`: log 文件。
输出
``` bash
输出:
```text
[2022-05-02 17:59:26,285] [ INFO] - Create the TextEngine Instance
[2022-05-02 17:59:26,285] [ INFO] - Init the text engine
[2022-05-02 17:59:26,285] [ INFO] - Text Engine set the device: gpu:0
@ -356,26 +354,26 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
log_file="./log/paddlespeech.log")
```
输出
```
[2022-05-02 18:09:02,542] [ INFO] - Create the TextEngine Instance
[2022-05-02 18:09:02,543] [ INFO] - Init the text engine
[2022-05-02 18:09:02,543] [ INFO] - Text Engine set the device: gpu:0
[2022-05-02 18:09:02,545] [ INFO] - File /home/users/xiongxinlei/.paddlespeech/models/ernie_linear_p3_wudao-punc-zh/ernie_linear_p3_wudao-punc-zh.tar.gz md5 checking...
[2022-05-02 18:09:06,919] [ INFO] - Use pretrained model stored in: /home/users/xiongxinlei/.paddlespeech/models/ernie_linear_p3_wudao-punc-zh/ernie_linear_p3_wudao-punc-zh.tar
W0502 18:09:07.523002 22615 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 10.2, Runtime API Version: 10.2
W0502 18:09:07.527882 22615 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2022-05-02 18:09:10,900] [ INFO] - Already cached /home/users/xiongxinlei/.paddlenlp/models/ernie-1.0/vocab.txt
[2022-05-02 18:09:10,913] [ INFO] - Init the text engine successfully
INFO: Started server process [22615]
[2022-05-02 18:09:10] [INFO] [server.py:75] Started server process [22615]
INFO: Waiting for application startup.
[2022-05-02 18:09:10] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-02 18:09:10] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
[2022-05-02 18:09:10] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
```
输出:
```text
[2022-05-02 18:09:02,542] [ INFO] - Create the TextEngine Instance
[2022-05-02 18:09:02,543] [ INFO] - Init the text engine
[2022-05-02 18:09:02,543] [ INFO] - Text Engine set the device: gpu:0
[2022-05-02 18:09:02,545] [ INFO] - File /home/users/xiongxinlei/.paddlespeech/models/ernie_linear_p3_wudao-punc-zh/ernie_linear_p3_wudao-punc-zh.tar.gz md5 checking...
[2022-05-02 18:09:06,919] [ INFO] - Use pretrained model stored in: /home/users/xiongxinlei/.paddlespeech/models/ernie_linear_p3_wudao-punc-zh/ernie_linear_p3_wudao-punc-zh.tar
W0502 18:09:07.523002 22615 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 10.2, Runtime API Version: 10.2
W0502 18:09:07.527882 22615 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2022-05-02 18:09:10,900] [ INFO] - Already cached /home/users/xiongxinlei/.paddlenlp/models/ernie-1.0/vocab.txt
[2022-05-02 18:09:10,913] [ INFO] - Init the text engine successfully
INFO: Started server process [22615]
[2022-05-02 18:09:10] [INFO] [server.py:75] Started server process [22615]
INFO: Waiting for application startup.
[2022-05-02 18:09:10] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-02 18:09:10] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
[2022-05-02 18:09:10] [INFO] [server.py:206] Uvicorn running on http://0.0.0.0:8190 (Press CTRL+C to quit)
```
### 2. 标点预测客户端使用方法
**注意:** 初次使用客户端时响应时间会略长
@ -384,17 +382,17 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
`127.0.0.1` 不能访问,则需要使用实际服务 IP 地址
```
paddlespeech_client text --server_ip 127.0.0.1 --port 8190 --input "我认为跑步最重要的就是给我带来了身体健康"
```
输出
```bash
paddlespeech_client text --server_ip 127.0.0.1 --port 8190 --input "我认为跑步最重要的就是给我带来了身体健康"
```
输出:
```text
[2022-05-02 18:12:29,767] [ INFO] - The punc text: 我认为跑步最重要的就是给我带来了身体健康。
[2022-05-02 18:12:29,767] [ INFO] - Response time 0.096548 s.
```
- Python3 API
- Python API
```python
from paddlespeech.server.bin.paddlespeech_client import TextClientExecutor
@ -407,12 +405,11 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
print(res)
```
输出
``` bash
输出:
```text
我认为跑步最重要的就是给我带来了身体健康。
```
## 联合流式语音识别和标点预测
**注意:** 默认部署在 `cpu` 设备上,可以通过修改服务配置文件中 `device` 参数将语音识别和标点预测部署在不同的 `gpu` 上。
@ -420,7 +417,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
### 1. 启动服务
``` bash
```bash
注意:流式语音识别和标点预测通过配置文件配置到不同的显卡上
bash server.sh
```
@ -430,11 +427,11 @@ bash server.sh
`127.0.0.1` 不能访问,则需要使用实际服务 IP 地址
```
```bash
paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8290 --punc.server_ip 127.0.0.1 --punc.port 8190 --input ./zh.wav
```
输出
```
输出:
```text
[2022-05-07 11:21:47,060] [ INFO] - asr websocket client start
[2022-05-07 11:21:47,060] [ INFO] - endpoint: ws://127.0.0.1:8490/paddlespeech/asr/streaming
[2022-05-07 11:21:47,080] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
@ -508,11 +505,11 @@ bash server.sh
`127.0.0.1` 不能访问,则需要使用实际服务 IP 地址
```
```bash
python3 websocket_client.py --server_ip 127.0.0.1 --port 8290 --punc.server_ip 127.0.0.1 --punc.port 8190 --wavfile ./zh.wav
```
输出
```
输出:
```text
[2022-05-07 11:11:02,984] [ INFO] - Start to do streaming asr client
[2022-05-07 11:11:02,985] [ INFO] - asr websocket client start
[2022-05-07 11:11:02,985] [ INFO] - endpoint: ws://127.0.0.1:8490/paddlespeech/asr/streaming
@ -581,5 +578,3 @@ bash server.sh
[2022-05-07 11:11:18,915] [ INFO] - audio duration: 4.9968125, elapsed time: 15.928460597991943, RTF=3.187724293835709
[2022-05-07 11:11:18,916] [ INFO] - asr websocket client finished : 我认为跑步最重要的就是给我带来了身体健康
```

@ -5,15 +5,19 @@
## Introduction
This demo is an implementation of starting the streaming speech synthesis service and accessing the service. It can be achieved with a single command using `paddlespeech_server` and `paddlespeech_client` or a few lines of code in python.
For service interface definition, please check:
- [PaddleSpeech Server RESTful API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-RESTful-API)
- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
## Usage
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.2.2** or above.
It is recommended to use **paddlepaddle 2.3.1** or above.
You can choose one way from easy, meduim and hard to install paddlespeech.
**If you install in simple mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
**If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
### 2. Prepare config File
The configuration file can be found in `conf/tts_online_application.yaml`.
@ -29,11 +33,10 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
- Both hifigan and mb_melgan support streaming voc inference.
- When the voc model is mb_melgan, when voc_pad=14, the synthetic audio for streaming inference is consistent with the non-streaming synthetic audio; the minimum voc_pad can be set to 7, and the synthetic audio has no abnormal hearing. If the voc_pad is less than 7, the synthetic audio sounds abnormal.
- When the voc model is hifigan, when voc_pad=19, the streaming inference synthetic audio is consistent with the non-streaming synthetic audio; when voc_pad=14, the synthetic audio has no abnormal hearing.
- Pad calculation method of streaming vocoder in PaddleSpeech: [AIStudio tutorial](https://aistudio.baidu.com/aistudio/projectdetail/4151335)
- Inference speed: mb_melgan > hifigan; Audio quality: mb_melgan < hifigan
- **Note:** If the service can be started normally in the container, but the client access IP is unreachable, you can try to replace the `host` address in the configuration file with the local IP address.
### 3. Streaming speech synthesis server and client using http protocol
#### 3.1 Server Usage
- Command Line (Recommended)
@ -53,7 +56,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
- `log_file`: log file. Default: ./log/paddlespeech.log
Output:
```bash
```text
[2022-04-24 20:05:27,887] [ INFO] - The first response time of the 0 warm up: 1.0123658180236816 s
[2022-04-24 20:05:28,038] [ INFO] - The first response time of the 1 warm up: 0.15108466148376465 s
[2022-04-24 20:05:28,191] [ INFO] - The first response time of the 2 warm up: 0.15317344665527344 s
@ -79,8 +82,8 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
log_file="./log/paddlespeech.log")
```
Output:
```bash
Output:
```text
[2022-04-24 21:00:16,934] [ INFO] - The first response time of the 0 warm up: 1.268730878829956 s
[2022-04-24 21:00:17,046] [ INFO] - The first response time of the 1 warm up: 0.11168622970581055 s
[2022-04-24 21:00:17,151] [ INFO] - The first response time of the 2 warm up: 0.10413002967834473 s
@ -93,8 +96,6 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
[2022-04-24 21:00:17] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-24 21:00:17] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```
#### 3.2 Streaming TTS client Usage
@ -125,7 +126,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
- Currently, only the single-speaker model is supported in the code, so `spk_id` does not take effect. Streaming TTS does not support changing sample rate, variable speed and volume.
Output:
```bash
```text
[2022-04-24 21:08:18,559] [ INFO] - tts http client start
[2022-04-24 21:08:21,702] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-24 21:08:21,703] [ INFO] - 首包响应0.18863153457641602 s
@ -154,7 +155,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
```
Output:
```bash
```text
[2022-04-24 21:11:13,798] [ INFO] - tts http client start
[2022-04-24 21:11:16,800] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-24 21:11:16,801] [ INFO] - 首包响应0.18234872817993164 s
@ -164,7 +165,6 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
[2022-04-24 21:11:16,837] [ INFO] - 音频保存至:./output.wav
```
### 4. Streaming speech synthesis server and client using websocket protocol
#### 4.1 Server Usage
- Command Line (Recommended)
@ -184,21 +184,19 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
- `log_file`: log file. Default: ./log/paddlespeech.log
Output:
```bash
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
INFO: Started server process [17600]
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
INFO: Waiting for application startup.
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```text
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
INFO: Started server process [17600]
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
INFO: Waiting for application startup.
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```
- Python API
@ -212,20 +210,19 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
```
Output:
```bash
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
INFO: Started server process [23466]
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
INFO: Waiting for application startup.
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```text
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
INFO: Started server process [23466]
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
INFO: Waiting for application startup.
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```
#### 4.2 Streaming TTS client Usage
@ -258,7 +255,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
Output:
```bash
```text
[2022-04-27 10:21:04,262] [ INFO] - tts websocket client start
[2022-04-27 10:21:04,496] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-27 10:21:04,496] [ INFO] - 首包响应0.2124948501586914 s
@ -266,7 +263,6 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
[2022-04-27 10:21:07,484] [ INFO] - 音频时长3.825 s
[2022-04-27 10:21:07,484] [ INFO] - RTF: 0.8363677006141812
[2022-04-27 10:21:07,516] [ INFO] - 音频保存至output.wav
```
- Python API
@ -283,21 +279,15 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
spk_id=0,
output="./output.wav",
play=False)
```
Output:
```bash
[2022-04-27 10:22:48,852] [ INFO] - tts websocket client start
[2022-04-27 10:22:49,080] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-27 10:22:49,080] [ INFO] - 首包响应0.21017956733703613 s
[2022-04-27 10:22:52,100] [ INFO] - 尾包响应3.2304444313049316 s
[2022-04-27 10:22:52,101] [ INFO] - 音频时长3.825 s
[2022-04-27 10:22:52,101] [ INFO] - RTF: 0.8445606356352762
[2022-04-27 10:22:52,134] [ INFO] - 音频保存至:./output.wav
```text
[2022-04-27 10:22:48,852] [ INFO] - tts websocket client start
[2022-04-27 10:22:49,080] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-27 10:22:49,080] [ INFO] - 首包响应0.21017956733703613 s
[2022-04-27 10:22:52,100] [ INFO] - 尾包响应3.2304444313049316 s
[2022-04-27 10:22:52,101] [ INFO] - 音频时长3.825 s
[2022-04-27 10:22:52,101] [ INFO] - RTF: 0.8445606356352762
[2022-04-27 10:22:52,134] [ INFO] - 音频保存至:./output.wav
```

@ -3,15 +3,19 @@
# 流式语音合成服务
## 介绍
这个demo是一个启动流式语音合成服务和访问该服务的实现。 它可以通过使用`paddlespeech_server` 和 `paddlespeech_client`的单个命令或 python 的几行代码来实现。
这个 demo 是一个启动流式语音合成服务和访问该服务的实现。 它可以通过使用 `paddlespeech_server``paddlespeech_client` 的单个命令或 python 的几行代码来实现。
服务接口定义请参考:
- [PaddleSpeech Server RESTful API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-RESTful-API)
- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
## 使用方法
### 1. 安装
请看 [安装文档](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
推荐使用 **paddlepaddle 2.2.2** 或以上版本。
你可以从简单,中等,困难几种方式中选择一种方式安装 PaddleSpeech。
推荐使用 **paddlepaddle 2.3.1** 或以上版本。
你可以从简单,中等,困难 几种方式中选择一种方式安装 PaddleSpeech。
**如果使用简单模式安装,需要自行准备 yaml 文件,可参考 conf 目录下的 yaml 文件。**
### 2. 准备配置文件
@ -20,19 +24,20 @@
- `engine_list` 表示即将启动的服务将会包含的语音引擎,格式为 <语音任务>_<引擎类型>。
- 该 demo 主要介绍流式语音合成服务,因此语音任务应设置为 tts。
- 目前引擎类型支持两种形式:**online** 表示使用python进行动态图推理的引擎**online-onnx** 表示使用 onnxruntime 进行推理的引擎。其中online-onnx 的推理速度更快。
- 流式 TTS 引擎的 AM 模型支持:**fastspeech2 以及fastspeech2_cnndecoder**; Voc 模型支持:**hifigan, mb_melgan**
- 流式 TTS 引擎的 AM 模型支持:**fastspeech2 以及 fastspeech2_cnndecoder**; Voc 模型支持:**hifigan, mb_melgan**
- 流式 am 推理中,每次会对一个 chunk 的数据进行推理以达到流式的效果。其中 `am_block` 表示 chunk 中的有效帧数,`am_pad` 表示一个 chunk 中 am_block 前后各加的帧数。am_pad 的存在用于消除流式推理产生的误差,避免由流式推理对合成音频质量的影响。
- fastspeech2 不支持流式 am 推理,因此 am_pad 与 m_block 对它无效
- fastspeech2 不支持流式 am 推理,因此 am_pad 与 am_block 对它无效
- fastspeech2_cnndecoder 支持流式推理,当 am_pad=12 时,流式推理合成音频与非流式合成音频一致
- 流式 voc 推理中,每次会对一个 chunk 的数据进行推理以达到流式的效果。其中 `voc_block` 表示chunk中的有效帧数`voc_pad` 表示一个 chunk 中 voc_block 前后各加的帧数。voc_pad 的存在用于消除流式推理产生的误差,避免由流式推理对合成音频质量的影响。
- 流式 voc 推理中,每次会对一个 chunk 的数据进行推理以达到流式的效果。其中 `voc_block` 表示 chunk 中的有效帧数,`voc_pad` 表示一个 chunk 中 voc_block 前后各加的帧数。voc_pad 的存在用于消除流式推理产生的误差,避免由流式推理对合成音频质量的影响。
- hifigan, mb_melgan 均支持流式 voc 推理
- 当 voc 模型为 mb_melgan当 voc_pad=14 时流式推理合成音频与非流式合成音频一致voc_pad 最小可以设置为7合成音频听感上没有异常若 voc_pad 小于7合成音频听感上存在异常。
- 当 voc 模型为 hifigan当 voc_pad=19 时,流式推理合成音频与非流式合成音频一致;当 voc_pad=14 时,合成音频听感上没有异常。
- PaddleSpeech 中流式声码器 Pad 计算方法: [AIStudio 教程](https://aistudio.baidu.com/aistudio/projectdetail/4151335)
- 推理速度mb_melgan > hifigan; 音频质量mb_melgan < hifigan
- **注意:** 如果在容器里可正常启动服务,但客户端访问 ip 不可达,可尝试将配置文件中 `host` 地址换成本地 ip 地址。
### 3. 使用http协议的流式语音合成服务端及客户端使用方法
### 3. 使用 http 协议的流式语音合成服务端及客户端使用方法
#### 3.1 服务端使用方法
- 命令行 (推荐使用)
@ -51,7 +56,7 @@
- `log_file`: log 文件. 默认:./log/paddlespeech.log
输出:
```bash
```text
[2022-04-24 20:05:27,887] [ INFO] - The first response time of the 0 warm up: 1.0123658180236816 s
[2022-04-24 20:05:28,038] [ INFO] - The first response time of the 1 warm up: 0.15108466148376465 s
[2022-04-24 20:05:28,191] [ INFO] - The first response time of the 2 warm up: 0.15317344665527344 s
@ -64,7 +69,6 @@
[2022-04-24 20:05:28] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-24 20:05:28] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```
- Python API
@ -77,8 +81,8 @@
log_file="./log/paddlespeech.log")
```
输出
```bash
输出:
```text
[2022-04-24 21:00:16,934] [ INFO] - The first response time of the 0 warm up: 1.268730878829956 s
[2022-04-24 21:00:17,046] [ INFO] - The first response time of the 1 warm up: 0.11168622970581055 s
[2022-04-24 21:00:17,151] [ INFO] - The first response time of the 2 warm up: 0.10413002967834473 s
@ -91,8 +95,6 @@
[2022-04-24 21:00:17] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-24 21:00:17] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```
#### 3.2 客户端使用方法
@ -124,7 +126,7 @@
输出:
```bash
```text
[2022-04-24 21:08:18,559] [ INFO] - tts http client start
[2022-04-24 21:08:21,702] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-24 21:08:21,703] [ INFO] - 首包响应0.18863153457641602 s
@ -162,9 +164,8 @@
[2022-04-24 21:11:16,802] [ INFO] - RTF: 0.7846773683635238
[2022-04-24 21:11:16,837] [ INFO] - 音频保存至:./output.wav
```
### 4. 使用websocket协议的流式语音合成服务端及客户端使用方法
### 4. 使用 websocket 协议的流式语音合成服务端及客户端使用方法
#### 4.1 服务端使用方法
- 命令行 (推荐使用)
首先修改配置文件 `conf/tts_online_application.yaml` **`protocol` 设置为 `websocket`**。
@ -183,21 +184,19 @@
- `log_file`: log 文件. 默认:./log/paddlespeech.log
输出:
```bash
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
INFO: Started server process [17600]
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
INFO: Waiting for application startup.
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```text
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
INFO: Started server process [17600]
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
INFO: Waiting for application startup.
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```
- Python API
@ -210,27 +209,26 @@
log_file="./log/paddlespeech.log")
```
输出:
```bash
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
INFO: Started server process [23466]
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
INFO: Waiting for application startup.
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
输出:
```text
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
INFO: Started server process [23466]
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
INFO: Waiting for application startup.
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
```
#### 4.2 客户端使用方法
- 命令行 (推荐使用)
访问 websocket 流式TTS服务
访问 websocket 流式 TTS 服务:
`127.0.0.1` 不能访问,则需要使用实际服务 IP 地址
@ -255,9 +253,8 @@
- 目前代码中只支持单说话人的模型,因此 spk_id 的选择并不生效。流式 TTS 不支持更换采样率,变速和变音量等功能。
输出:
```bash
```text
[2022-04-27 10:21:04,262] [ INFO] - tts websocket client start
[2022-04-27 10:21:04,496] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-27 10:21:04,496] [ INFO] - 首包响应0.2124948501586914 s
@ -265,7 +262,6 @@
[2022-04-27 10:21:07,484] [ INFO] - 音频时长3.825 s
[2022-04-27 10:21:07,484] [ INFO] - RTF: 0.8363677006141812
[2022-04-27 10:21:07,516] [ INFO] - 音频保存至output.wav
```
- Python API
@ -282,11 +278,10 @@
spk_id=0,
output="./output.wav",
play=False)
```
输出:
```bash
```text
[2022-04-27 10:22:48,852] [ INFO] - tts websocket client start
[2022-04-27 10:22:49,080] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-27 10:22:49,080] [ INFO] - 首包响应0.21017956733703613 s
@ -294,8 +289,4 @@
[2022-04-27 10:22:52,101] [ INFO] - 音频时长3.825 s
[2022-04-27 10:22:52,101] [ INFO] - RTF: 0.8445606356352762
[2022-04-27 10:22:52,134] [ INFO] - 音频保存至:./output.wav
```

Loading…
Cancel
Save