U2/U2++ Streaming ASR

A C++ deployment example for the PaddleSpeech/examples/wenetspeech/asr1 recipe. The model is a static model produced by export; for how to export the model, please see here. If you want to use an already exported model, run.sh will download it; the model link is in run.sh.

This example demonstrates how to use the u2/u2++ model to recognize wav files and compute the CER (character error rate). We use AISHELL-1 as test data.
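For reference, CER is conventionally the character-level edit distance between the reference and the recognized text, divided by the number of reference characters. The Python sketch below illustrates that definition only; it is not the scoring script invoked by run.sh, and the names `edit_distance` and `cer` are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps) -> float:
    """Corpus-level CER: total edit distance / total reference characters."""
    refs, hyps = list(refs), list(hyps)
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of utterances")
    total_len = sum(len(r) for r in refs)
    if total_len == 0:
        raise ValueError("reference transcripts are empty")
    total_err = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_err / total_len
```

Summing errors over the whole test set before dividing weights long utterances more heavily than averaging per-utterance rates would; the sketch assumes the corpus-level form.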

Testing with Aishell Test Data

Source path.sh

. path.sh

The SpeechX binaries are under the directory shown by `echo $SPEECHX_BUILD`; for more information, please see path.sh.

Download dataset and model

./run.sh --stop_stage 0

Process `cmvn` and compute features

./run.sh --stage 1 --stop_stage 1

If you only want to convert the cmvn file format, you can use this command:

./local/feat.sh --stage 1 --stop_stage 1

Decoding using `feature` input

./run.sh --stage 2 --stop_stage 2

Decoding using `wav` input

./run.sh --stage 3 --stop_stage 3

This stage uses u2_recognizer_main to recognize wav files.

The input is an scp file, which looks like this:

# head data/split1/1/aishell_test.scp 
BAC009S0764W0121        /workspace/PaddleSpeech/runtime/examples/u2pp_ol/wenetspeech/data/test/S0764/BAC009S0764W0121.wav
BAC009S0764W0122        /workspace/PaddleSpeech/runtime/examples/u2pp_ol/wenetspeech/data/test/S0764/BAC009S0764W0122.wav
...
BAC009S0764W0125        /workspace/PaddleSpeech/runtime/examples/u2pp_ol/wenetspeech/data/test/S0764/BAC009S0764W0125.wav
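Each line of the scp file holds an utterance key, whitespace, and the wav path. A minimal Python sketch for loading and sanity-checking such a file, assuming one entry per line and no whitespace inside the key (`read_scp` is a hypothetical helper, not part of SpeechX):

```python
def read_scp(path):
    """Return a dict mapping utterance key -> wav path from an scp file."""
    entries = {}
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split on the first run of whitespace: key, then the path.
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                raise ValueError(
                    f"{path}:{lineno}: expected 'key path', got {line!r}")
            key, wav = parts
            if key in entries:
                raise ValueError(f"{path}:{lineno}: duplicate key {key!r}")
            entries[key] = wav
    return entries
```

Checking for malformed lines and duplicate keys up front can make a bad scp file easier to diagnose than a failure inside the decoder.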

If you want to recognize a single wav, you can create an scp file like this:

key  path/to/wav/file
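An scp file in the form above can also be generated from a list of wav paths. The Python sketch below assumes the file name without its extension is an acceptable utterance key and that the paths contain nothing the reader would reject; `write_scp` is a hypothetical helper, not part of SpeechX.

```python
from pathlib import Path


def write_scp(wav_paths, scp_path):
    """Write one 'key absolute_path' line per wav; return the line count."""
    lines = []
    seen = set()
    for wav in map(Path, wav_paths):
        key = wav.stem  # e.g. BAC009S0764W0121 for BAC009S0764W0121.wav
        if key in seen:
            raise ValueError(f"duplicate key {key!r}")
        seen.add(key)
        lines.append(f"{key} {wav.resolve()}")
    text = "".join(line + "\n" for line in lines)
    Path(scp_path).write_text(text, encoding="utf-8")
    return len(lines)
```

For a single recording, `write_scp(["path/to/wav/file.wav"], "one.scp")` would produce a one-line file whose key is `file`.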

Then pass it to the u2_recognizer_main binary through the --wav_rspecifier= parameter. For the meaning of the other flags, please see the help:

u2_recognizer_main --help

For an example of using the u2_recognizer_main binary, please see local/recognizer.sh.

Decoding with `wav` input using a quantized model

local/recognizer_quant.sh is the same as local/recognizer.sh, but uses the quantized model.

Results

Please see here.