From fafdeac32198a43bdeb97eb38be5b68dc96fc9f1 Mon Sep 17 00:00:00 2001 From: Hui Zhang Date: Thu, 13 May 2021 11:05:25 +0800 Subject: [PATCH] add more speech doc --- docs/src/asr_postprocess.md | 15 +++++++++------ docs/src/dataset.md | 16 ++++++++++++++++ docs/src/reference.md | 1 + docs/src/server.md | 4 ++-- docs/src/speech_synthesis.md | 2 +- docs/src/text_front_end.md | 13 ++++++++++++- tools/Makefile | 2 +- 7 files changed, 42 insertions(+), 11 deletions(-) create mode 100644 docs/src/dataset.md diff --git a/docs/src/asr_postprocess.md b/docs/src/asr_postprocess.md index 772bf8b25..8c18efd49 100644 --- a/docs/src/asr_postprocess.md +++ b/docs/src/asr_postprocess.md @@ -1,8 +1,9 @@ # ASR PostProcess -* Text Corrector -* Text Filter -* Add Punctuation +1. [Text Segmentation](text_front_end#text segmentation) +2. Text Corrector +3. Add Punctuation +4. Text Filter @@ -10,6 +11,7 @@ * [pycorrector](https://github.com/shibing624/pycorrector) 本项目重点解决其中的谐音、混淆音、形似字错误、中文拼音全拼、语法错误带来的纠错任务。PS:[网友源码解读](https://zhuanlan.zhihu.com/p/138981644) +* DeepCorrection [1](https://praneethbedapudi.medium.com/deepcorrection-1-sentence-segmentation-of-unpunctuated-text-a1dbc0db4e98) [2](https://praneethbedapudi.medium.com/deepcorrection2-automatic-punctuation-restoration-ac4a837d92d9) [3](https://praneethbedapudi.medium.com/deepcorrection-3-spell-correction-and-simple-grammar-correction-d033a52bc11d) [4](https://praneethbedapudi.medium.com/deepsegment-2-0-multilingual-text-segmentation-with-vector-alignment-fd76ce62194f) @@ -88,12 +90,13 @@ -## Text Filter +## Add Punctuation -* 敏感词(黄暴、涉政、违法违禁等) +* DeepCorrection [1](https://praneethbedapudi.medium.com/deepcorrection-1-sentence-segmentation-of-unpunctuated-text-a1dbc0db4e98) [2](https://praneethbedapudi.medium.com/deepcorrection2-automatic-punctuation-restoration-ac4a837d92d9) [3](https://praneethbedapudi.medium.com/deepcorrection-3-spell-correction-and-simple-grammar-correction-d033a52bc11d) 
[4](https://praneethbedapudi.medium.com/deepsegment-2-0-multilingual-text-segmentation-with-vector-alignment-fd76ce62194f) +## Text Filter +* 敏感词(黄暴、涉政、违法违禁等) -## Add Punctuation diff --git a/docs/src/dataset.md b/docs/src/dataset.md new file mode 100644 index 000000000..76e54bc4f --- /dev/null +++ b/docs/src/dataset.md @@ -0,0 +1,16 @@ +# Dataset + +## Text + +* [Tatoeba](https://tatoeba.org/cmn) + + **Tatoeba is a collection of sentences and translations.** It's collaborative, open, free and even addictive. An open data initiative aimed at translation and speech recognition. + + + +## Speech + +* [Tatoeba](https://tatoeba.org/cmn) + + **Tatoeba is a collection of sentences and translations.** It's collaborative, open, free and even addictive. An open data initiative aimed at translation and speech recognition. + diff --git a/docs/src/reference.md b/docs/src/reference.md index 69ff6ab88..b492fcaf2 100644 --- a/docs/src/reference.md +++ b/docs/src/reference.md @@ -1,3 +1,4 @@ # Reference * [wenet](https://github.com/mobvoi/wenet) + diff --git a/docs/src/server.md b/docs/src/server.md index 019ebcfa4..bc8b62dc5 100644 --- a/docs/src/server.md +++ b/docs/src/server.md @@ -25,10 +25,10 @@ Then to start the client, please run this in another console: ```bash CUDA_VISIBLE_DEVICES=0 bash local/client.sh -``` +``` Now, in the client console, press the `whitespace` key, hold, and start speaking. Until finishing your utterance, release the key to let the speech-to-text results shown in the console. To quit the client, just press `ESC` key. Notice that `deepspeech/exps/deepspeech2/deploy/client.py` must be run on a machine with a microphone device, while `deepspeech/exps/deepspeech2/deploy/server.py` could be run on one without any audio recording hardware, e.g. any remote server machine. Just be careful to set the `host_ip` and `host_port` argument with the actual accessible IP address and port, if the server and client are running with two separate machines. 
Nothing should be done if they are running on one single machine. -Please also refer to `examples/aishell/local/server.sh`, which will first download a pre-trained Chinese model (trained with AISHELL1) and then start the demo server with the model. With running `examples/aishell/local/client.sh`, you can speak Chinese to test it. If you would like to try some other models, just update `--checkpoint_path` argument in the script.   +Please also refer to `examples/aishell/local/server.sh`, which will first download a pre-trained Chinese model (trained with AISHELL1) and then start the demo server with the model. With running `examples/aishell/local/client.sh`, you can speak Chinese to test it. If you would like to try some other models, just update `--checkpoint_path` argument in the script.   \ No newline at end of file diff --git a/docs/src/speech_synthesis.md b/docs/src/speech_synthesis.md index dc36b911b..3b0a904c5 100644 --- a/docs/src/speech_synthesis.md +++ b/docs/src/speech_synthesis.md @@ -142,4 +142,4 @@ HMM 应用到 TTS 这里和 ASR 还是有些区别的。主要参考的论文是 * https://slyne.github.io/%E5%85%AC%E5%BC%80%E8%AF%BE/2020/09/26/TTS/ * https://slyne.github.io/%E5%85%AC%E5%BC%80%E8%AF%BE/2020/10/25/TTS2/ -* https://slyne.github.io/%E5%85%AC%E5%BC%80%E8%AF%BE/2020/12/04/TTS6/ +* https://slyne.github.io/%E5%85%AC%E5%BC%80%E8%AF%BE/2020/12/04/TTS6/ \ No newline at end of file diff --git a/docs/src/text_front_end.md b/docs/src/text_front_end.md index 5d53f5137..7bc367102 100644 --- a/docs/src/text_front_end.md +++ b/docs/src/text_front_end.md @@ -1,5 +1,16 @@ # Text Front End + + +## Text Segmentation + +Various libraries, including popular ones such as NLTK, spaCy, and Stanford CoreNLP, provide excellent, easy-to-use functions for sentence segmentation. 
+ +* [nnsplit](https://github.com/bminixhofer/nnsplit) +* [DeepSegment](https://github.com/notAI-tech/deepsegment) [blog](http://bpraneeth.com/projects/deepsegment) [1](https://praneethbedapudi.medium.com/deepcorrection-1-sentence-segmentation-of-unpunctuated-text-a1dbc0db4e98) [2](https://praneethbedapudi.medium.com/deepcorrection2-automatic-punctuation-restoration-ac4a837d92d9) [3](https://praneethbedapudi.medium.com/deepcorrection-3-spell-correction-and-simple-grammar-correction-d033a52bc11d) [4](https://praneethbedapudi.medium.com/deepsegment-2-0-multilingual-text-segmentation-with-vector-alignment-fd76ce62194f) + + + ## Text Normalization(文本正则) 文本正则化 文本正则化主要是讲非标准词(NSW)进行转化,比如: @@ -136,4 +147,4 @@ TN: 基于规则的方法 ## Reference -* [Text Front End](https://slyne.github.io/%E5%85%AC%E5%BC%80%E8%AF%BE/2020/10/03/TTS1/) +* [Text Front End](https://slyne.github.io/%E5%85%AC%E5%BC%80%E8%AF%BE/2020/10/03/TTS1/) \ No newline at end of file diff --git a/tools/Makefile b/tools/Makefile index ef721c2b8..ea57cd2c0 100644 --- a/tools/Makefile +++ b/tools/Makefile @@ -1,4 +1,4 @@ -PYTHON:= python3.7 +PYTHON:= python3.8 .PHONY: all clean all: virtualenv