diff --git a/docs/src/text_front_end.md b/docs/src/text_front_end.md
new file mode 100644
index 000000000..a8cde1750
--- /dev/null
+++ b/docs/src/text_front_end.md
@@ -0,0 +1,15 @@
+# Text Front End
+
+## MMSEG
+* [MMSEG: A Word Identification System for Mandarin Chinese Text Based on Two Variants of the Maximum Matching Algorithm](http://technology.chtsai.org/mmseg/)
+* [`中文分词`简单高效的MMSeg](https://www.cnblogs.com/en-heng/p/5872308.html)
+* [mmseg分词算法及实现](https://blog.csdn.net/daniel_ustc/article/details/50488040)
+* [Mmseg算法](https://www.jianshu.com/p/e4ae8d194487)
+* [浅谈中文分词](http://www.isnowfy.com/introduction-to-chinese-segmentation/)
+
+* [ustcdane/mmseg](https://github.com/ustcdane/mmseg)
+* [jkom-cloud/mmseg](https://github.com/jkom-cloud/mmseg)
+
+
+## CScanner
+* [CScanner - A Chinese Lexical Scanner](http://technology.chtsai.org/cscanner/)
diff --git a/examples/aishell/s0/.gitignore b/examples/aishell/s0/.gitignore
new file mode 100644
index 000000000..b7fa0dd7c
--- /dev/null
+++ b/examples/aishell/s0/.gitignore
@@ -0,0 +1,3 @@
+exp
+data
+*log
diff --git a/examples/aishell/s0/conf/deepspeech2.yaml b/examples/aishell/s0/conf/deepspeech2.yaml
index 835cf58b0..02c68df9c 100644
--- a/examples/aishell/s0/conf/deepspeech2.yaml
+++ b/examples/aishell/s0/conf/deepspeech2.yaml
@@ -3,7 +3,7 @@ data:
   train_manifest: data/manifest.train
   dev_manifest: data/manifest.dev
   test_manifest: data/manifest.test
-  mean_std_filepath: data/mean_std.npz
+  mean_std_filepath: data/mean_std.json
   vocab_filepath: data/vocab.txt
   augmentation_config: conf/augmentation.json
   batch_size: 64 # one gpu
diff --git a/examples/librispeech/s0/conf/deepspeech2.yaml b/examples/librispeech/s0/conf/deepspeech2.yaml
index 32496428f..688f0cba9 100644
--- a/examples/librispeech/s0/conf/deepspeech2.yaml
+++ b/examples/librispeech/s0/conf/deepspeech2.yaml
@@ -3,7 +3,7 @@ data:
   train_manifest: data/manifest.train
   dev_manifest: data/manifest.dev-clean
   test_manifest: data/manifest.test-clean
-  mean_std_filepath: data/mean_std.npz
+  mean_std_filepath: data/mean_std.json
   vocab_filepath: data/vocab.txt
   augmentation_config: conf/augmentation.json
   batch_size: 20