Merge pull request #2510 from Zth9730/u2pp_jit_export

[s2t] use reverse_weight in decode.yaml
Hui Zhang 2 years ago committed by GitHub
commit c6f9764ed6
GPG Key ID: 4AEE18F83AFDEB23

@ -183,19 +183,19 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision
## Installation
We strongly recommend our users to install PaddleSpeech in **Linux** with *python>=3.7* and *paddlepaddle>=2.3.1*.
We strongly recommend our users to install PaddleSpeech in **Linux** with *python>=3.7* and *paddlepaddle>=2.4rc*.
### **Dependency Introduction**
+ gcc >= 4.8.5
+ paddlepaddle >= 2.3.1
+ paddlepaddle >= 2.4rc
+ python >= 3.7
+ OS support: Linux(recommend), Windows, Mac OSX
PaddleSpeech depends on paddlepaddle. For installation, please refer to the official website of [paddlepaddle](https://www.paddlepaddle.org.cn/en) and choose according to your own machine. Here is an example of the cpu version.
```bash
pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
pip install paddlepaddle==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple
```
There are two quick installation methods for PaddleSpeech, one is pip installation, and the other is source code compilation (recommended).
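
The raised minimum (`paddlepaddle >= 2.4rc`) can be checked before installing PaddleSpeech. The following is a minimal sketch, not part of this PR: `parse_version` and `meets_minimum` are hypothetical helpers, and the parsing only covers version strings of the shapes that appear in this diff.

```python
import re


def parse_version(text):
    """Turn a string such as '2.4.0rc0.post112' into a sortable tuple."""
    m = re.match(r'(\d+)\.(\d+)(?:\.(\d+))?(?:rc(\d+))?', text)
    if m is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, rc = m.groups()
    # A final release sorts after every release candidate of the same version.
    rc_rank = int(rc) if rc is not None else float('inf')
    return (int(major), int(minor), int(patch or 0), rc_rank)


def meets_minimum(installed, minimum='2.4.0rc0'):
    """Return True when `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


for candidate in ('2.3.1', '2.4.0rc0', '2.4.0rc0.post112', '2.4.0'):
    print(candidate, meets_minimum(candidate))
```

Note that a develop build reports `0.0.0.post102`, which this check rejects, so a real check would need to treat develop builds separately.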

@ -215,14 +215,14 @@
### Dependencies
+ gcc >= 4.8.5
+ paddlepaddle >= 2.3.1
+ paddlepaddle >= 2.4rc
+ python >= 3.7
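+ linux (recommended), mac, windows
PaddleSpeech depends on paddlepaddle. For installation, refer to the [paddlepaddle official website](https://www.paddlepaddle.org.cn/) and choose according to your own machine. Here is an example of the cpu version; install other versions as your machine requires.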
```shell
pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
pip install paddlepaddle==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple
```
There are two quick installation methods for PaddleSpeech: pip installation and source code compilation (recommended).

@ -13,7 +13,7 @@ For service interface definition, please check:
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.3.1** or above.
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose one of the easy, medium, and hard ways to install paddlespeech.

@ -14,7 +14,7 @@
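### 1. Installation
See the [installation document](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.3.1** or above.
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose one of the easy, medium, and hard ways to install PaddleSpeech.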

@ -14,7 +14,7 @@ Streaming ASR server only support `websocket` protocol, and doesn't support `htt
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.3.1** or above.
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose one of the easy, medium, and hard ways to install paddlespeech.

@ -14,7 +14,7 @@
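### 1. Installation
For the detailed process of installing PaddleSpeech, see the [installation document](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.3.1** or above.
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose one of the easy, medium, and hard ways to install PaddleSpeech.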

@ -13,7 +13,7 @@ For service interface definition, please check:
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.3.1** or above.
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose one of the easy, medium, and hard ways to install paddlespeech.

@ -12,7 +12,7 @@
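### 1. Installation
See the [installation document](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.3.1** or above.
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose one of the easy, medium, and hard ways to install PaddleSpeech.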

@ -62,7 +62,7 @@ RUN mkdir -p ~/.pip && echo "[global]" > ~/.pip/pip.conf && \
echo "index-url=https://mirror.baidu.com/pypi/simple" >> ~/.pip/pip.conf && \
echo "trusted-host=mirror.baidu.com" >> ~/.pip/pip.conf && \
python3 -m pip install --upgrade pip && \
pip install paddlepaddle-gpu==2.3.1.post112 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html && \
pip install paddlepaddle-gpu==2.4.0rc0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html && \
rm -rf ~/.cache/pip
RUN git clone https://github.com/PaddlePaddle/PaddleSpeech.git && cd PaddleSpeech && \

@ -58,7 +58,7 @@ pip install pytest-runner -i https://pypi.tuna.tsinghua.edu.cn/simple
```
Then you can use the following commands:
```bash
pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
pip install paddlepaddle==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple
pip install paddlespeech -i https://pypi.tuna.tsinghua.edu.cn/simple
```
> If you encounter problems downloading **nltk_data** while using paddlespeech, it may be due to a poor network connection. We suggest downloading the [nltk_data](https://paddlespeech.bj.bcebos.com/Parakeet/tools/nltk_data.tar.gz) we provide and extracting it to your `${HOME}`.
@ -117,9 +117,13 @@ conda install -y -c gcc_linux-64=8.4.0 gxx_linux-64=8.4.0
```
(Tip: Do not use the last script if you want to install the **Hard** way):
### Install PaddlePaddle
You can choose the `PaddlePaddle` version based on your system. For example, for CUDA 10.2, CuDNN7.5 install paddlepaddle-gpu 2.3.1:
You can choose the `PaddlePaddle` version based on your system. For example, for CUDA 10.2, CuDNN7.5 install paddlepaddle-gpu 2.4rc:
```bash
python3 -m pip install paddlepaddle-gpu==2.3.1 -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle-gpu==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple
```
You can also install the develop version of paddlepaddle. For example, for CUDA 10.2, CuDNN7.5 install paddlepaddle-gpu develop:
```bash
python3 -m pip install paddlepaddle-gpu==0.0.0.post102 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
```
### Install PaddleSpeech
You can install `paddlespeech` with the following command, then you can use the `ready-made` examples in `paddlespeech`:
@ -180,9 +184,13 @@ Some users may fail to install `kaldiio` due to the default download source, you
```bash
pip install pytest-runner -i https://pypi.tuna.tsinghua.edu.cn/simple
```
Make sure you have GPU and the paddlepaddle version is right. For example, for CUDA 10.2, CuDNN7.5 install paddle 2.3.1:
Make sure you have GPU and the paddlepaddle version is right. For example, for CUDA 10.2, CuDNN7.5 install paddle 2.4rc:
```bash
python3 -m pip install paddlepaddle-gpu==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple
```
You can also install the develop version of paddlepaddle. For example, for CUDA 10.2, CuDNN7.5 install paddlepaddle-gpu develop:
```bash
python3 -m pip install paddlepaddle-gpu==2.3.1 -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle-gpu==0.0.0.post102 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
```
### Install PaddleSpeech in Developing Mode
```bash

@ -55,7 +55,7 @@ pip install pytest-runner -i https://pypi.tuna.tsinghua.edu.cn/simple
```
Then you can use the following commands:
```bash
pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
pip install paddlepaddle==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple
pip install paddlespeech -i https://pypi.tuna.tsinghua.edu.cn/simple
```
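> If you encounter problems downloading **nltk_data** while using paddlespeech, it may be due to a poor network connection. We suggest downloading the [nltk_data](https://paddlespeech.bj.bcebos.com/Parakeet/tools/nltk_data.tar.gz) we provide and extracting it to your `${HOME}` directory.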
@ -111,9 +111,13 @@ conda install -y -c gcc_linux-64=8.4.0 gxx_linux-64=8.4.0
```
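(Tip: Do not use the last command if you want to install the **hard** way)
### Install PaddlePaddle
You can choose the PaddlePaddle version based on your system configuration. For example, for CUDA 10.2 and CuDNN7.5, you can install paddlepaddle-gpu 2.3.1:
You can choose the PaddlePaddle version based on your system configuration. For example, for CUDA 10.2 and CuDNN7.5, you can install paddlepaddle-gpu 2.4rc: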
```bash
python3 -m pip install paddlepaddle-gpu==2.3.1 -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle-gpu==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple
```
You can also install the develop version of PaddlePaddle. For example, for CUDA 10.2 and CuDNN7.5, you can install paddlepaddle-gpu develop:
```bash
python3 -m pip install paddlepaddle-gpu==0.0.0.post102 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
```
### Install PaddleSpeech
Finally, install `paddlespeech`, so that you can use the existing examples in `paddlespeech`:
@ -168,9 +172,13 @@ conda activate tools/venv
conda install -y -c conda-forge sox libsndfile swig bzip2 libflac bc
```
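### Install PaddlePaddle
Make sure your system has a GPU and uses the correct version of paddlepaddle. For example, for CUDA 10.2 and CuDNN7.5, you can install paddlepaddle-gpu 2.3.1:
Make sure your system has a GPU and uses the correct version of paddlepaddle. For example, for CUDA 10.2 and CuDNN7.5, you can install paddlepaddle-gpu 2.4rc: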
```bash
python3 -m pip install paddlepaddle-gpu==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple
```
You can also install the develop version of PaddlePaddle. For example, for CUDA 10.2 and CuDNN7.5, you can install paddlepaddle-gpu develop:
```bash
python3 -m pip install paddlepaddle-gpu==2.3.1 -i https://mirror.baidu.com/pypi/simple
python3 -m pip install paddlepaddle-gpu==0.0.0.post102 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
```
### Install PaddleSpeech in Developing Mode
Some users may fail to install kaldiio because of the default download source; installing pytest-runner first is recommended:

@ -9,7 +9,7 @@ Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER |
[Ds2 Online Aishell ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_online_aishell_fbank161_ckpt_0.2.1.model.tar.gz) | Aishell Dataset | Char-based | 491 MB | 2 Conv + 5 LSTM layers | 0.0666 |-| 151 h | [D2 Online Aishell ASR0](../../examples/aishell/asr0) | onnx/inference/python |
[Ds2 Offline Aishell ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_offline_aishell_ckpt_1.0.1.model.tar.gz)| Aishell Dataset | Char-based | 1.4 GB | 2 Conv + 5 bidirectional LSTM layers| 0.0554 |-| 151 h | [Ds2 Offline Aishell ASR0](../../examples/aishell/asr0) | inference/python |
[Conformer Online Wenetspeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz) | WenetSpeech Dataset | Char-based | 457 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring| 0.11 (test\_net) 0.1879 (test\_meeting) |-| 10000 h |- | python |
[Conformer U2PP Online Wenetspeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.1.1.model.tar.gz) | WenetSpeech Dataset | Char-based | 476 MB | Encoder:Conformer, Decoder:BiTransformer, Decoding method: Attention rescoring| 0.047198 (aishell test\_-1) 0.059212 (aishell test\_16) |-| 10000 h |- | python |
[Conformer U2PP Online Wenetspeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.1.4.model.tar.gz) | WenetSpeech Dataset | Char-based | 476 MB | Encoder:Conformer, Decoder:BiTransformer, Decoding method: Attention rescoring| 0.047198 (aishell test\_-1) 0.059212 (aishell test\_16) |-| 10000 h |- | python |
[Conformer Online Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_chunk_conformer_aishell_ckpt_0.2.0.model.tar.gz) | Aishell Dataset | Char-based | 189 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring| 0.0544 |-| 151 h | [Conformer Online Aishell ASR1](../../examples/aishell/asr1) | python |
[Conformer Offline Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_conformer_aishell_ckpt_1.0.1.model.tar.gz) | Aishell Dataset | Char-based | 189 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0460 |-| 151 h | [Conformer Offline Aishell ASR1](../../examples/aishell/asr1) | python |
[Transformer Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz) | Aishell Dataset | Char-based | 128 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0523 || 151 h | [Transformer Aishell ASR1](../../examples/aishell/asr1) | python |

@ -71,9 +71,9 @@ asr_dynamic_pretrained_models = {
"conformer_u2pp_wenetspeech-zh-16k": {
'1.1': {
'url':
'https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.1.1.model.tar.gz',
'https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.1.3.model.tar.gz',
'md5':
'eae678c04ed3b3f89672052fdc0c5e10',
'662b347e1d2131b7a4dc5398365e2134',
'cfg_path':
'model.yaml',
'ckpt_path':
@ -91,9 +91,9 @@ asr_dynamic_pretrained_models = {
"conformer_u2pp_online_wenetspeech-zh-16k": {
'1.1': {
'url':
'https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.1.2.model.tar.gz',
'https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.1.4.model.tar.gz',
'md5':
'925d047e9188dea7f421a718230c9ae3',
'3100fc1eac5779486cab859366992d0b',
'cfg_path':
'model.yaml',
'ckpt_path':
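
Each registry entry pairs a checkpoint `url` with an `md5`, so a changed archive needs a changed digest, as in the two hunks above. How PaddleSpeech itself verifies downloads is not shown in this diff; the following is only a sketch of such a check with the standard library, where `md5_of_file` and `matches_registry` are hypothetical helper names and the demo file stands in for a downloaded archive.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_registry(path, entry):
    """Compare a local file against the 'md5' field of a registry-style entry."""
    return md5_of_file(path) == entry['md5']


# Demo with a tiny temporary file instead of a real model archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
entry = {'md5': '5d41402abc4b2a76b9719d911017c592'}  # MD5 of b"hello"
ok = matches_registry(tmp.name, entry)
os.remove(tmp.name)
print(ok)
```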

@ -39,7 +39,6 @@ class U2Infer():
self.preprocess_conf = config.preprocess_config
self.preprocess_args = {"train": False}
self.preprocessing = Transformation(self.preprocess_conf)
self.reverse_weight = getattr(config.model_conf, 'reverse_weight', 0.0)
self.text_feature = TextFeaturizer(
unit_type=config.unit_type,
vocab=config.vocab_filepath,
@ -81,6 +80,7 @@ class U2Infer():
xs = paddle.to_tensor(feat, dtype='float32').unsqueeze(0)
decode_config = self.config.decode
logger.info(f"decode cfg: {decode_config}")
reverse_weight = getattr(decode_config, 'reverse_weight', 0.0)
result_transcripts = self.model.decode(
xs,
ilen,
@ -91,7 +91,7 @@ class U2Infer():
decoding_chunk_size=decode_config.decoding_chunk_size,
num_decoding_left_chunks=decode_config.num_decoding_left_chunks,
simulate_streaming=decode_config.simulate_streaming,
reverse_weight=decode_config.reverse_weight)
reverse_weight=reverse_weight)
rsl = result_transcripts[0][0]
utt = Path(self.audio_file).name
logger.info(f"hyp: {utt} {rsl}")
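
The change above reads `reverse_weight` from the decode config with a fallback, so a `decode.yaml` written before this option existed still works. A minimal sketch of that lookup follows, using `types.SimpleNamespace` as a stand-in for the real config object (an assumption made only to keep the example self-contained); `get_reverse_weight` is a hypothetical helper name.

```python
from types import SimpleNamespace


def get_reverse_weight(decode_config, default=0.0):
    """Return decode_config.reverse_weight, or `default` when the key is absent."""
    return getattr(decode_config, 'reverse_weight', default)


# Stand-ins for the decode section of a config; the values are illustrative.
old_style = SimpleNamespace(beam_size=10, ctc_weight=0.5)
new_style = SimpleNamespace(beam_size=10, ctc_weight=0.5, reverse_weight=0.3)

print(get_reverse_weight(old_style))  # 0.0
print(get_reverse_weight(new_style))  # 0.3
```

With the default of 0.0, a config that omits the key makes no use of the right-to-left decoder score.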

@ -79,6 +79,7 @@ class U2Infer():
xs = paddle.to_tensor(feat, dtype='float32').unsqueeze(0)
decode_config = self.config.decode
logger.info(f"decode cfg: {decode_config}")
reverse_weight = getattr(decode_config, 'reverse_weight', 0.0)
result_transcripts = self.model.decode(
xs,
ilen,
@ -89,7 +90,7 @@ class U2Infer():
decoding_chunk_size=decode_config.decoding_chunk_size,
num_decoding_left_chunks=decode_config.num_decoding_left_chunks,
simulate_streaming=decode_config.simulate_streaming,
reverse_weight=decode_config.reverse_weight)
reverse_weight=reverse_weight)
rsl = result_transcripts[0][0]
utt = Path(self.audio_file).name
logger.info(f"hyp: {utt} {result_transcripts[0][0]}")

@ -337,6 +337,7 @@ class U2Tester(U2Trainer):
errors_sum, len_refs, num_ins = 0.0, 0, 0
errors_func = error_rate.char_errors if decode_config.error_rate_type == 'cer' else error_rate.word_errors
error_rate_func = error_rate.cer if decode_config.error_rate_type == 'cer' else error_rate.wer
reverse_weight = getattr(decode_config, 'reverse_weight', 0.0)
start_time = time.time()
target_transcripts = self.id2token(texts, texts_len, self.text_feature)
@ -351,7 +352,7 @@ class U2Tester(U2Trainer):
decoding_chunk_size=decode_config.decoding_chunk_size,
num_decoding_left_chunks=decode_config.num_decoding_left_chunks,
simulate_streaming=decode_config.simulate_streaming,
reverse_weight=decode_config.reverse_weight)
reverse_weight=reverse_weight)
decode_time = time.time() - start_time
for utt, target, result, rec_tids in zip(

@ -580,6 +580,7 @@ class PaddleASRConnectionHanddler:
self.update_result()
beam_size = self.ctc_decode_config.beam_size
reverse_weight = getattr(self.ctc_decode_config, 'reverse_weight', 0.0)
hyps = self.searcher.get_hyps()
if hyps is None or len(hyps) == 0:
logger.info("No Hyps!")
@ -613,7 +614,7 @@ class PaddleASRConnectionHanddler:
# ctc score in ln domain
# (beam_size, max_hyps_len, vocab_size)
decoder_out, r_decoder_out = self.model.forward_attention_decoder(
hyps_pad, hyps_lens, self.encoder_out, self.model.reverse_weight)
hyps_pad, hyps_lens, self.encoder_out, reverse_weight)
decoder_out = decoder_out.numpy()
# r_decoder_out will be 0.0, if reverse_weight is 0.0 or decoder is a
@ -631,13 +632,12 @@ class PaddleASRConnectionHanddler:
# last decoder output token is `eos`, for the last decoder input token.
score += decoder_out[i][len(hyp[0])][self.model.eos]
if self.model.reverse_weight > 0:
if reverse_weight > 0:
r_score = 0.0
for j, w in enumerate(hyp[0]):
r_score += r_decoder_out[i][len(hyp[0]) - j - 1][w]
r_score += r_decoder_out[i][len(hyp[0])][self.model.eos]
score = score * (1 - self.model.reverse_weight
) + r_score * self.model.reverse_weight
score = score * (1 - reverse_weight) + r_score * reverse_weight
# add ctc score (which in ln domain)
score += hyp[1] * self.ctc_decode_config.ctc_weight
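
The rescoring in this hunk interpolates the left-to-right and right-to-left decoder scores with `reverse_weight`, then adds the weighted CTC score. Below is a standalone sketch of that arithmetic for a single hypothesis. The function name `rescore` and the nested-list inputs are assumptions for illustration, and the left-to-right loop is assumed to mirror the right-to-left loop shown in the diff.

```python
def rescore(decoder_out, r_decoder_out, hyp, ctc_score, eos,
            reverse_weight=0.0, ctc_weight=0.0):
    """Combine attention-decoder and CTC scores for one hypothesis.

    decoder_out and r_decoder_out hold log-probabilities with shape
    (len(hyp) + 1, vocab_size), given here as nested lists.
    """
    n = len(hyp)
    # Left-to-right score: one token per position, then `eos`.
    score = sum(decoder_out[j][w] for j, w in enumerate(hyp))
    score += decoder_out[n][eos]
    if reverse_weight > 0:
        # Right-to-left score: the same tokens read in reverse order.
        r_score = sum(r_decoder_out[n - j - 1][w] for j, w in enumerate(hyp))
        r_score += r_decoder_out[n][eos]
        score = score * (1 - reverse_weight) + r_score * reverse_weight
    # The CTC score is already in the log domain.
    return score + ctc_score * ctc_weight


# Toy example: vocabulary of 3 tokens, eos id 2, hypothesis [0, 1].
dec = [[-1.0, -2.0, -3.0], [-2.0, -1.0, -3.0], [-3.0, -3.0, -0.5]]
r_dec = [[-4.0, -1.0, -3.0], [-1.0, -4.0, -3.0], [-3.0, -3.0, -1.0]]

forward_only = rescore(dec, r_dec, [0, 1], ctc_score=-2.0, eos=2,
                       reverse_weight=0.0, ctc_weight=0.5)
bidirectional = rescore(dec, r_dec, [0, 1], ctc_score=-2.0, eos=2,
                        reverse_weight=0.5, ctc_weight=0.5)
print(forward_only, bidirectional)  # -3.5 -3.75
```

Passing `reverse_weight` as an argument, rather than reading it from the model, matches the direction of this PR: the weight becomes a decode-time setting.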

@ -11,8 +11,8 @@ This document introduces a client for streaming asr service: microphone
### 1. Install
Refer to [Install](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
**paddlepaddle 2.2.1** or above.
It is recommended to use **paddlepaddle 2.2.1** or above.
**paddlepaddle 2.4rc** or above.
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose either the medium or the hard way to install paddlespeech.

@ -10,7 +10,7 @@
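### 1. Installation
See the [installation document](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.2.1** or above.
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose either the medium or the hard way to install PaddleSpeech.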
