Aishell - Deepspeech2 Streaming

How to run

bash run.sh

Results

CTC Prefix Beam Search w/o LM

Overall -> 16.14 % N=104612 C=88190 S=16110 D=312 I=465
Mandarin -> 16.14 % N=104612 C=88190 S=16110 D=312 I=465
Other -> 0.00 % N=0 C=0 S=0 D=0 I=0
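
The percentages in these result lines are consistent with the usual character error rate, (S + D + I) / N, where N is the number of reference tokens and C, S, D, I are the correct, substituted, deleted and inserted counts. The README does not say which scoring tool produced them, so the following is only a minimal sketch for checking a line by hand. The helper names `parse_result` and `error_rate` are hypothetical, and returning 0.0 when N is 0 is an assumption that matches the "Other" rows.

```python
import re


def parse_result(line: str) -> dict:
    """Extract the N/C/S/D/I counts from one result line (hypothetical helper)."""
    return {key: int(value) for key, value in re.findall(r"\b([NCSDI])=(\d+)", line)}


def error_rate(counts: dict) -> float:
    """Return (S + D + I) / N as a percentage.

    Assumption: an empty reference (N == 0) is reported as 0.0,
    which matches the "Other -> 0.00 % N=0" rows.
    """
    if counts["N"] == 0:
        return 0.0
    errors = counts["S"] + counts["D"] + counts["I"]
    return 100.0 * errors / counts["N"]


line = "Overall -> 16.14 % N=104612 C=88190 S=16110 D=312 I=465"
counts = parse_result(line)
print(f"{error_rate(counts):.2f} %")  # prints "16.14 %"
```

The same counts also satisfy C = N - S - D, which is a quick sanity check when copying results between runs.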

CTC Prefix Beam Search w/ LM

LM: zh_giga.no_cna_cmn.prune01244.klm

Overall -> 7.86 % N=104768 C=96865 S=7573 D=330 I=327
Mandarin -> 7.86 % N=104768 C=96865 S=7573 D=330 I=327
Other -> 0.00 % N=0 C=0 S=0 D=0 I=0

CTC WFST

LM: aishell train, --acoustic_scale=1.2

Overall -> 11.14 % N=103017 C=93363 S=9583 D=71 I=1819
Mandarin -> 11.14 % N=103017 C=93363 S=9583 D=71 I=1818
Other -> 0.00 % N=0 C=0 S=0 D=0 I=1

LM: wenetspeech, --acoustic_scale=1.5

Overall -> 10.93 % N=104765 C=93410 S=9780 D=1575 I=95
Mandarin -> 10.93 % N=104762 C=93410 S=9779 D=1573 I=95
Other -> 100.00 % N=3 C=0 S=1 D=2 I=0

fbank

bash run_fbank.sh

CTC Prefix Beam Search w/o LM

Overall -> 10.44 % N=104765 C=94194 S=10174 D=397 I=369
Mandarin -> 10.44 % N=104762 C=94194 S=10171 D=397 I=369
Other -> 100.00 % N=3 C=0 S=3 D=0 I=0

CTC Prefix Beam Search w/ LM

LM: zh_giga.no_cna_cmn.prune01244.klm

Overall -> 5.82 % N=104765 C=99386 S=4944 D=435 I=720
Mandarin -> 5.82 % N=104762 C=99386 S=4941 D=435 I=720
English -> 0.00 % N=0 C=0 S=0 D=0 I=0

CTC WFST

LM: aishell train

Overall -> 9.58 % N=104765 C=94817 S=4326 D=5622 I=84
Mandarin -> 9.57 % N=104762 C=94817 S=4325 D=5620 I=84
Other -> 100.00 % N=3 C=0 S=1 D=2 I=0

Build TLG graph

bash run_build_tlg.sh