diff --git a/.github/CODE_OF_CONDUCT.md b/.github/CODE_OF_CONDUCT.md deleted file mode 100644 index 33d53d9f5..000000000 --- a/.github/CODE_OF_CONDUCT.md +++ /dev/null @@ -1,77 +0,0 @@ -# Contributor Covenant Code of Conduct - -## Our Pledge - -In the interest of fostering an open and welcoming environment, we as -contributors and maintainers pledge to making participation in our project and -our community a harassment-free experience for everyone, regardless of age, body -size, disability, ethnicity, sex characteristics, gender identity and expression, -level of experience, education, socio-economic status, nationality, personal -appearance, race, religion, or sexual identity and orientation. - -## Our Standards - -Examples of behavior that contributes to creating a positive environment -include: - -* Using welcoming and inclusive language -* Being respectful of differing viewpoints and experiences -* Gracefully accepting constructive criticism -* Focusing on what is best for the community -* Showing empathy towards other community members - -Examples of unacceptable behavior by participants include: - -* The use of sexualized language or imagery and unwelcome sexual attention or - advances -* Racial or political allusions -* Trolling, insulting/derogatory comments, and personal or political attacks -* Public or private harassment -* Publishing others' private information, such as a physical or electronic - address, without explicit permission -* Other conduct which could reasonably be considered inappropriate in a - professional setting - -## Our Responsibilities - -Project maintainers are responsible for clarifying the standards of acceptable -behavior and are expected to take appropriate and fair corrective action in -response to any instances of unacceptable behavior. - -Project maintainers have the right and responsibility to remove, edit, or -reject comments, commits, code, wiki edits, issues, and other contributions -that are not aligned to this Code of Conduct, or to ban temporarily or -permanently any contributor for other behaviors that they deem inappropriate, -threatening, offensive, or harmful. - -## Scope - -This Code of Conduct applies both within project spaces and in public spaces -when an individual is representing the project or its community. Examples of -representing a project or community include using an official project e-mail -address, posting via an official social media account, or acting as an appointed -representative at an online or offline event. Representation of a project may be -further defined and clarified by project maintainers. - -## Enforcement - -Instances of abusive, harassing, or otherwise unacceptable behavior may be -reported by contacting the project team at paddlespeech@baidu.com. All -complaints will be reviewed and investigated and will result in a response that -is deemed necessary and appropriate to the circumstances. The project team is -obligated to maintain confidentiality with regard to the reporter of an incident. -Further details of specific enforcement policies may be posted separately. - -Project maintainers who do not follow or enforce the Code of Conduct in good -faith may face temporary or permanent repercussions as determined by other -members of the project's leadership. 
- -## Attribution - -This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, -available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html - -[homepage]: https://www.contributor-covenant.org - -For answers to common questions about this code of conduct, see -https://www.contributor-covenant.org/faq diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md deleted file mode 100644 index 1ff473308..000000000 --- a/.github/CONTRIBUTING.md +++ /dev/null @@ -1,30 +0,0 @@ -# 💡 paddlespeech 提交代码须知 - -### Discussed in https://github.com/PaddlePaddle/PaddleSpeech/discussions/1326 - -
- -Originally posted by **yt605155624** January 12, 2022 -1. 写完代码之后可以用我们的 pre-commit 检查一下代码格式,注意只改自己修改的代码的格式即可,其他的代码有可能也被改了格式,不要 add 就好 -``` -pip install pre-commit -pre-commit run --file 你修改的代码 -``` -2. 提交 commit 中增加必要信息跳过不必要的 CI -- 提交 asr 相关代码 -```text -git commit -m "xxxxxx, test=asr" -``` -- 提交 tts 相关代码 -```text -git commit -m "xxxxxx, test=tts" -``` -- 仅修改文档 -```text -git commit -m "xxxxxx, test=doc" -``` -注意: -1. 虽然跳过了 CI,但是还要先排队排到才能跳过,所以非自己方向看到 pending 不要着急 🤣 -2. 在 `git commit --amend` 的时候才加 `test=xxx` 可能不太有效 -3. 一个 pr 多次提交 commit 注意每次都要加 `test=xxx`,因为每个 commit 都会触发 CI -4. 删除 python 环境中已经安装好的 paddlespeech,否则可能会影响 import paddlespeech 的顺序
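The deleted guide above amounts to two habits: lint only the files you touched, and tag every commit message (`test=asr` / `test=tts` / `test=doc`) so CI pipelines outside your area are skipped. A minimal sketch of that loop in Python, assuming `git` and `pre-commit` are on `PATH`; the helper name is illustrative, and current pre-commit releases spell the flag `--files`:

```python
# Sketch of the contribution loop described above (assumptions noted).
import subprocess

def run(*cmd: str) -> None:
    subprocess.run(cmd, check=True)

def check_and_commit(message: str, ci_tag: str = "test=tts") -> None:
    # Lint only the files changed in this working tree.
    changed = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    if not changed:
        return
    run("pre-commit", "run", "--files", *changed)
    run("git", "add", *changed)
    # Tag every commit: each commit triggers CI, and the tag
    # skips the pipelines that do not match your change.
    run("git", "commit", "-m", f"{message}, {ci_tag}")
```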
diff --git a/.github/ISSUE_TEMPLATE/bug-report-tts.md b/.github/ISSUE_TEMPLATE/bug-report-tts.md index e2322c239..64b33c32e 100644 --- a/.github/ISSUE_TEMPLATE/bug-report-tts.md +++ b/.github/ISSUE_TEMPLATE/bug-report-tts.md @@ -3,6 +3,7 @@ name: "\U0001F41B TTS Bug Report" about: Create a report to help us improve title: "[TTS]XXXX" labels: Bug, T2S +assignees: yt605155624 --- diff --git a/.github/stale.yml b/.github/stale.yml index 6b0da9b98..da19b6606 100644 --- a/.github/stale.yml +++ b/.github/stale.yml @@ -6,8 +6,7 @@ daysUntilClose: 30 exemptLabels: - Roadmap - Bug - - feature request - - Tips + - New Feature # Label to use when marking an issue as stale staleLabel: Stale # Comment to post when marking an issue as stale. Set to `false` to disable @@ -18,4 +17,4 @@ markComment: > unmarkComment: false # Comment to post when closing a stale issue. Set to `false` to disable closeComment: > - This issue is closed. Please re-open if needed. + This issue is closed. Please re-open if needed. \ No newline at end of file diff --git a/.gitignore b/.gitignore index 4a0c43312..75f56b604 100644 --- a/.gitignore +++ b/.gitignore @@ -15,7 +15,6 @@ *.egg-info build *output/ -.history audio/dist/ audio/fc_patch/ diff --git a/.pre-commit-hooks/copyright-check.hook b/.pre-commit-hooks/copyright-check.hook index 5a409e062..761edbc01 100644 --- a/.pre-commit-hooks/copyright-check.hook +++ b/.pre-commit-hooks/copyright-check.hook @@ -19,7 +19,7 @@ import subprocess import platform COPYRIGHT = ''' -Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved. +Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. @@ -128,4 +128,4 @@ def main(argv=None): if __name__ == '__main__': - exit(main()) + exit(main()) \ No newline at end of file diff --git a/README.md b/README.md index 9ed823116..0a12ec049 100644 --- a/README.md +++ b/README.md @@ -97,47 +97,26 @@ - Life was like a box of chocolates, you never know what you're gonna get. + Life was like a box of chocolates, you never know what you're gonna get.
- 早上好,今天是2020/10/29,最低温度是-3°C。 + 早上好,今天是2020/10/29,最低温度是-3°C。
- 季姬寂,集鸡,鸡即棘鸡。棘鸡饥叽,季姬及箕稷济鸡。鸡既济,跻姬笈,季姬忌,急咭鸡,鸡急,继圾几,季姬急,即籍箕击鸡,箕疾击几伎,伎即齑,鸡叽集几基,季姬急极屐击鸡,鸡既殛,季姬激,即记《季姬击鸡记》。 + 季姬寂,集鸡,鸡即棘鸡。棘鸡饥叽,季姬及箕稷济鸡。鸡既济,跻姬笈,季姬忌,急咭鸡,鸡急,继圾几,季姬急,即籍箕击鸡,箕疾击几伎,伎即齑,鸡叽集几基,季姬急极屐击鸡,鸡既殛,季姬激,即记《季姬击鸡记》。
- 大家好,我是 parrot 虚拟老师,我们来读一首诗,我与春风皆过客,I and the spring breeze are passing by,你携秋水揽星河,you take the autumn water to take the galaxy。
- 宜家唔系事必要你讲,但系你所讲嘅说话将会变成呈堂证供。
- 各个国家有各个国家嘅国歌
- - @@ -178,24 +157,16 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision - 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV). ### Recent Update -- 🔥 2023.04.06: Add [subtitle file (.srt format) generation example](./demos/streaming_asr_server). -- 🔥 2023.03.14: Add SVS(Singing Voice Synthesis) examples with Opencpop dataset, including [DiffSinger](./examples/opencpop/svs1)、[PWGAN](./examples/opencpop/voc1) and [HiFiGAN](./examples/opencpop/voc5), the effect is continuously optimized. -- 👑 2023.03.09: Add [Wav2vec2ASR-zh](./examples/aishell/asr3). -- 🎉 2023.03.07: Add [TTS ARM Linux C++ Demo (with C++ Chinese Text Frontend)](./demos/TTSArmLinux). -- 🔥 2023.03.03 Add Voice Conversion [StarGANv2-VC synthesize pipeline](./examples/vctk/vc3). -- 🎉 2023.02.16: Add [Cantonese TTS](./examples/canton/tts3). -- 🔥 2023.01.10: Add [code-switch asr CLI and Demos](./demos/speech_recognition). -- 👑 2023.01.06: Add [code-switch asr tal_cs recipe](./examples/tal_cs/asr1/). -- 🎉 2022.12.02: Add [end-to-end Prosody Prediction pipeline](./examples/csmsc/tts3_rhy) (including using prosody labels in Acoustic Model). -- 🎉 2022.11.30: Add [TTS Android Demo](./demos/TTSAndroid). +- 🎉 2022.12.02: Add [end-to-end Prosody Prediction pipeline](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/csmsc/tts3_rhy) (including using prosody labels in Acoustic Model). +- 🎉 2022.11.30: Add [TTS Android Demo](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/TTSAndroid). - 🤗 2022.11.28: PP-TTS and PP-ASR demos are available in [AIStudio](https://aistudio.baidu.com/aistudio/modelsoverview) and [official website of paddlepaddle](https://www.paddlepaddle.org.cn/models). - 👑 2022.11.18: Add [Whisper CLI and Demos](https://github.com/PaddlePaddle/PaddleSpeech/pull/2640), support multi language recognition and translation. -- 🔥 2022.11.18: Add [Wav2vec2 CLI and Demos](./demos/speech_ssl), Support ASR and Feature Extraction. +- 🔥 2022.11.18: Add [Wav2vec2 CLI and Demos](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_ssl), Support ASR and Feature Extraction. - 🎉 2022.11.17: Add [male voice for TTS](https://github.com/PaddlePaddle/PaddleSpeech/pull/2660). - 🔥 2022.11.07: Add [U2/U2++ C++ High Performance Streaming ASR Deployment](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/runtime/examples/u2pp_ol/wenetspeech). - 👑 2022.11.01: Add [Adversarial Loss](https://arxiv.org/pdf/1907.04448.pdf) for [Chinese English mixed TTS](./examples/zh_en_tts/tts3). -- 🔥 2022.10.26: Add [Prosody Prediction](./examples/other/rhy) for TTS. +- 🔥 2022.10.26: Add [Prosody Prediction](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/other/rhy) for TTS. - 🎉 2022.10.21: Add [SSML](https://github.com/PaddlePaddle/PaddleSpeech/discussions/2538) for TTS Chinese Text Frontend. - 👑 2022.10.11: Add [Wav2vec2ASR-en](./examples/librispeech/asr3), wav2vec2.0 fine-tuning for ASR on LibriSpeech. - 🔥 2022.09.26: Add Voice Cloning, TTS finetune, and [ERNIE-SAT](https://arxiv.org/abs/2211.03545) in [PaddleSpeech Web Demo](./demos/speech_web). @@ -209,16 +180,16 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision - 🎉 2022.06.22: All TTS models support ONNX format. - 🍀 2022.06.17: Add [PaddleSpeech Web Demo](./demos/speech_web). 
- 👑 2022.05.13: Release [PP-ASR](./docs/source/asr/PPASR.md)、[PP-TTS](./docs/source/tts/PPTTS.md)、[PP-VPR](docs/source/vpr/PPVPR.md). -- 👏🏻 2022.05.06: `PaddleSpeech Streaming Server` is available for `Streaming ASR` with `Punctuation Restoration` and `Token Timestamp` and `Text-to-Speech`. -- 👏🏻 2022.05.06: `PaddleSpeech Server` is available for `Audio Classification`, `Automatic Speech Recognition` and `Text-to-Speech`, `Speaker Verification` and `Punctuation Restoration`. -- 👏🏻 2022.03.28: `PaddleSpeech CLI` is available for `Speaker Verification`. -- 👏🏻 2021.12.10: `PaddleSpeech CLI` is available for `Audio Classification`, `Automatic Speech Recognition`, `Speech Translation (English to Chinese)` and `Text-to-Speech`. +- 👏🏻 2022.05.06: `PaddleSpeech Streaming Server` is available for `Streaming ASR` with `Punctuation Restoration` and `Token Timestamp` and `Text-to-Speech`. +- 👏🏻 2022.05.06: `PaddleSpeech Server` is available for `Audio Classification`, `Automatic Speech Recognition` and `Text-to-Speech`, `Speaker Verification` and `Punctuation Restoration`. +- 👏🏻 2022.03.28: `PaddleSpeech CLI` is available for `Speaker Verification`. +- 👏🏻 2021.12.10: `PaddleSpeech CLI` is available for `Audio Classification`, `Automatic Speech Recognition`, `Speech Translation (English to Chinese)` and `Text-to-Speech`. ### Community - Scan the QR code below with your Wechat, you can access to official technical exchange group and get the bonus ( more than 20GB learning materials, such as papers, codes and videos ) and the live link of the lessons. Look forward to your participation.
[Wechat group QR code image]
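The update list above repeatedly points to the `PaddleSpeech CLI`; a minimal sketch of the matching Python entry point, where the module path follows the project documentation and `zh.wav` is a placeholder recording:

```python
# Minimal ASR sketch via the CLI executor; a default Mandarin
# model is downloaded on first use. "zh.wav" is a placeholder file.
from paddlespeech.cli.asr.infer import ASRExecutor

asr = ASRExecutor()
text = asr(audio_file="zh.wav")
print(text)
```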
## Installation @@ -579,14 +550,14 @@ PaddleSpeech supports a series of most popular models. They are summarized in [r - Text Frontend -   - - tn / g2p - + Text Frontend +   + + tn / g2p + - Acoustic Model + Acoustic Model Tacotron2 LJSpeech / CSMSC @@ -621,13 +592,6 @@ PaddleSpeech supports a series of most popular models. They are summarized in [r ERNIE-SAT-vctk / ERNIE-SAT-aishell3 / ERNIE-SAT-zh_en - - DiffSinger - Opencpop - - DiffSinger-opencpop - - Vocoder WaveFlow @@ -638,9 +602,9 @@ PaddleSpeech supports a series of most popular models. They are summarized in [r Parallel WaveGAN - LJSpeech / VCTK / CSMSC / AISHELL-3 / Opencpop + LJSpeech / VCTK / CSMSC / AISHELL-3 - PWGAN-ljspeech / PWGAN-vctk / PWGAN-csmsc / PWGAN-aishell3 / PWGAN-opencpop + PWGAN-ljspeech / PWGAN-vctk / PWGAN-csmsc / PWGAN-aishell3 @@ -659,9 +623,9 @@ PaddleSpeech supports a series of most popular models. They are summarized in [r HiFiGAN - LJSpeech / VCTK / CSMSC / AISHELL-3 / Opencpop + LJSpeech / VCTK / CSMSC / AISHELL-3 - HiFiGAN-ljspeech / HiFiGAN-vctk / HiFiGAN-csmsc / HiFiGAN-aishell3 / HiFiGAN-opencpop + HiFiGAN-ljspeech / HiFiGAN-vctk / HiFiGAN-csmsc / HiFiGAN-aishell3 @@ -1021,16 +985,10 @@ You are warmly welcome to submit questions in [discussions](https://github.com/P - Many thanks to [vpegasus](https://github.com/vpegasus)/[xuesebot](https://github.com/vpegasus/xuesebot) for developing a rasa chatbot,which is able to speak and listen thanks to PaddleSpeech. - Many thanks to [chenkui164](https://github.com/chenkui164)/[FastASR](https://github.com/chenkui164/FastASR) for the C++ inference implementation of PaddleSpeech ASR. - Many thanks to [heyudage](https://github.com/heyudage)/[VoiceTyping](https://github.com/heyudage/VoiceTyping) for the real-time voice typing tool implementation of PaddleSpeech ASR streaming services. -- Many thanks to [EscaticZheng](https://github.com/EscaticZheng)/[ps3.9wheel-install](https://github.com/EscaticZheng/ps3.9wheel-install) for the python3.9 prebuilt wheel for PaddleSpeech installation in Windows without Viusal Studio. + Besides, PaddleSpeech depends on a lot of open source repositories. See [references](./docs/source/reference.md) for more information. -- Many thanks to [chinobing](https://github.com/chinobing)/[FastAPI-PaddleSpeech-Audio-To-Text](https://github.com/chinobing/FastAPI-PaddleSpeech-Audio-To-Text) for converting audio to text based on FastAPI and PaddleSpeech. -- Many thanks to [MistEO](https://github.com/MistEO)/[Pallas-Bot](https://github.com/MistEO/Pallas-Bot) for QQ bot based on PaddleSpeech TTS. ## License PaddleSpeech is provided under the [Apache-2.0 License](./LICENSE). - -## Stargazers over time - -[![Stargazers over time](https://starchart.cc/PaddlePaddle/PaddleSpeech.svg)](https://starchart.cc/PaddlePaddle/PaddleSpeech) diff --git a/README_cn.md b/README_cn.md index 8b98b61ce..5cc156c9f 100644 --- a/README_cn.md +++ b/README_cn.md @@ -122,27 +122,6 @@
- 大家好,我是 parrot 虚拟老师,我们来读一首诗,我与春风皆过客,I and the spring breeze are passing by,你携秋水揽星河,you take the autumn water to take the galaxy。
- 宜家唔系事必要你讲,但系你所讲嘅说话将会变成呈堂证供。
- 各个国家有各个国家嘅国歌
- - @@ -182,24 +161,18 @@ - 🔬 主流模型及数据集: 本工具包实现了参与整条语音任务流水线的各个模块,并且采用了主流数据集如 LibriSpeech、LJSpeech、AIShell、CSMSC,详情请见 [模型列表](#model-list)。 - 🧩 级联模型应用: 作为传统语音任务的扩展,我们结合了自然语言处理、计算机视觉等任务,实现更接近实际需求的产业级应用。 + + ### 近期更新 -- 👑 2023.04.06: 新增 [srt格式字幕生成功能](./demos/streaming_asr_server)。 -- 🔥 2023.03.14: 新增基于 Opencpop 数据集的 SVS (歌唱合成) 示例,包含 [DiffSinger](./examples/opencpop/svs1)、[PWGAN](./examples/opencpop/voc1) 和 [HiFiGAN](./examples/opencpop/voc5),效果持续优化中。 -- 👑 2023.03.09: 新增 [Wav2vec2ASR-zh](./examples/aishell/asr3)。 -- 🎉 2023.03.07: 新增 [TTS ARM Linux C++ 部署示例 (包含 C++ 中文文本前端模块)](./demos/TTSArmLinux)。 -- 🔥 2023.03.03: 新增声音转换模型 [StarGANv2-VC 合成流程](./examples/vctk/vc3)。 -- 🎉 2023.02.16: 新增[粤语语音合成](./examples/canton/tts3)。 -- 🔥 2023.01.10: 新增[中英混合 ASR CLI 和 Demos](./demos/speech_recognition)。 -- 👑 2023.01.06: 新增 [ASR 中英混合 tal_cs 训练推理流程](./examples/tal_cs/asr1/)。 -- 🎉 2022.12.02: 新增[端到端韵律预测全流程](./examples/csmsc/tts3_rhy) (包含在声学模型中使用韵律标签)。 -- 🎉 2022.11.30: 新增 [TTS Android 部署示例](./demos/TTSAndroid)。 +- 🎉 2022.12.02: 新增 [端到端韵律预测全流程](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/csmsc/tts3_rhy) (包含在声学模型中使用韵律标签)。 +- 🎉 2022.11.30: 新增 [TTS Android 部署示例](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/TTSAndroid)。 - 🤗 2022.11.28: PP-TTS and PP-ASR 示例可在 [AIStudio](https://aistudio.baidu.com/aistudio/modelsoverview) 和[飞桨官网](https://www.paddlepaddle.org.cn/models)体验! - 👑 2022.11.18: 新增 [Whisper CLI 和 Demos](https://github.com/PaddlePaddle/PaddleSpeech/pull/2640), 支持多种语言的识别与翻译。 -- 🔥 2022.11.18: 新增 [Wav2vec2 CLI 和 Demos](./demos/speech_ssl), 支持 ASR 和特征提取。 +- 🔥 2022.11.18: 新增 [Wav2vec2 CLI 和 Demos](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_ssl), 支持 ASR 和 特征提取. - 🎉 2022.11.17: TTS 新增[高质量男性音色](https://github.com/PaddlePaddle/PaddleSpeech/pull/2660)。 -- 🔥 2022.11.07: 新增 [U2/U2++ 高性能流式 ASR C++ 部署](./speechx/examples/u2pp_ol/wenetspeech)。 +- 🔥 2022.11.07: 新增 [U2/U2++ 高性能流式 ASR C++ 部署](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/speechx/examples/u2pp_ol/wenetspeech)。 - 👑 2022.11.01: [中英文混合 TTS](./examples/zh_en_tts/tts3) 新增 [Adversarial Loss](https://arxiv.org/pdf/1907.04448.pdf) 模块。 -- 🔥 2022.10.26: TTS 新增[韵律预测](./develop/examples/other/rhy)功能。 +- 🔥 2022.10.26: TTS 新增[韵律预测](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/other/rhy)功能。 - 🎉 2022.10.21: TTS 中文文本前端新增 [SSML](https://github.com/PaddlePaddle/PaddleSpeech/discussions/2538) 功能。 - 👑 2022.10.11: 新增 [Wav2vec2ASR-en](./examples/librispeech/asr3), 在 LibriSpeech 上针对 ASR 任务对 wav2vec2.0 的 finetuning。 - 🔥 2022.09.26: 新增 Voice Cloning, TTS finetune 和 [ERNIE-SAT](https://arxiv.org/abs/2211.03545) 到 [PaddleSpeech 网页应用](./demos/speech_web)。 @@ -227,7 +200,7 @@ 微信扫描二维码关注公众号,点击“马上报名”填写问卷加入官方交流群,获得更高效的问题答疑,与各行各业开发者充分交流,期待您的加入。
[Wechat group QR code image]
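The table that follows splits 语音合成 (TTS) into a text frontend, an acoustic model, and a vocoder; the high-level executor hides that three-stage chain behind one call. A hedged sketch, where the `am`/`voc` identifiers mirror the acoustic-model and vocoder rows of the table and are assumed values rather than a fixed spec:

```python
# One-call TTS sketch; frontend -> acoustic model -> vocoder runs
# internally. Model names follow the table below (assumed values).
from paddlespeech.cli.tts.infer import TTSExecutor

tts = TTSExecutor()
tts(
    text="今天天气十分不错。",
    am="fastspeech2_csmsc",  # acoustic model stage
    voc="hifigan_csmsc",     # vocoder stage
    output="output.wav",
)
```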
@@ -578,50 +551,43 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声 tn / g2p - - - 声学模型 + + + 声学模型 Tacotron2 LJSpeech / CSMSC tacotron2-ljspeech / tacotron2-csmsc - - + + Transformer TTS LJSpeech transformer-ljspeech - - + + SpeedySpeech CSMSC speedyspeech-csmsc - - + + FastSpeech2 LJSpeech / VCTK / CSMSC / AISHELL-3 / ZH_EN / finetune fastspeech2-ljspeech / fastspeech2-vctk / fastspeech2-csmsc / fastspeech2-aishell3 / fastspeech2-zh_en / fastspeech2-finetune - - + + ERNIE-SAT VCTK / AISHELL-3 / ZH_EN ERNIE-SAT-vctk / ERNIE-SAT-aishell3 / ERNIE-SAT-zh_en - - - DiffSinger - Opencpop - - DiffSinger-opencpop - - + 声码器 WaveFlow @@ -632,9 +598,9 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声 Parallel WaveGAN - LJSpeech / VCTK / CSMSC / AISHELL-3 / Opencpop + LJSpeech / VCTK / CSMSC / AISHELL-3 - PWGAN-ljspeech / PWGAN-vctk / PWGAN-csmsc / PWGAN-aishell3 / PWGAN-opencpop + PWGAN-ljspeech / PWGAN-vctk / PWGAN-csmsc / PWGAN-aishell3 @@ -653,9 +619,9 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声 HiFiGAN - LJSpeech / VCTK / CSMSC / AISHELL-3 / Opencpop + LJSpeech / VCTK / CSMSC / AISHELL-3 - HiFiGAN-ljspeech / HiFiGAN-vctk / HiFiGAN-csmsc / HiFiGAN-aishell3 / HiFiGAN-opencpop + HiFiGAN-ljspeech / HiFiGAN-vctk / HiFiGAN-csmsc / HiFiGAN-aishell3 @@ -712,7 +678,6 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声 - **声音分类** @@ -1021,19 +986,13 @@ PaddleSpeech 的 **语音合成** 主要包含三个模块:文本前端、声 - 非常感谢 [awmmmm](https://github.com/awmmmm) 提供 fastspeech2 aishell3 conformer 预训练模型。 - 非常感谢 [phecda-xu](https://github.com/phecda-xu)/[PaddleDubbing](https://github.com/phecda-xu/PaddleDubbing) 基于 PaddleSpeech 的 TTS 模型搭建带 GUI 操作界面的配音工具。 - 非常感谢 [jerryuhoo](https://github.com/jerryuhoo)/[VTuberTalk](https://github.com/jerryuhoo/VTuberTalk) 基于 PaddleSpeech 的 TTS GUI 界面和基于 ASR 制作数据集的相关代码。 + - 非常感谢 [vpegasus](https://github.com/vpegasus)/[xuesebot](https://github.com/vpegasus/xuesebot) 基于 PaddleSpeech 的 ASR 与 TTS 设计的可听、说对话机器人。 - 非常感谢 [chenkui164](https://github.com/chenkui164)/[FastASR](https://github.com/chenkui164/FastASR) 对 PaddleSpeech 的 ASR 进行 C++ 推理实现。 - 非常感谢 [heyudage](https://github.com/heyudage)/[VoiceTyping](https://github.com/heyudage/VoiceTyping) 基于 PaddleSpeech 的 ASR 流式服务实现的实时语音输入法工具。 -- 非常感谢 [EscaticZheng](https://github.com/EscaticZheng)/[ps3.9wheel-install](https://github.com/EscaticZheng/ps3.9wheel-install) 对PaddleSpeech在Windows下的安装提供了无需Visua Studio,基于python3.9的预编译依赖安装包。 -- 非常感谢 [chinobing](https://github.com/chinobing)/[FastAPI-PaddleSpeech-Audio-To-Text](https://github.com/chinobing/FastAPI-PaddleSpeech-Audio-To-Text) 利用 FastAPI 实现 PaddleSpeech 语音转文字,文件上传、分割、转换进度显示、后台更新任务并以 csv 格式输出。 -- 非常感谢 [MistEO](https://github.com/MistEO)/[Pallas-Bot](https://github.com/MistEO/Pallas-Bot) 基于 PaddleSpeech TTS 的 QQ Bot 项目。 此外,PaddleSpeech 依赖于许多开源存储库。有关更多信息,请参阅 [references](./docs/source/reference.md)。 ## License PaddleSpeech 在 [Apache-2.0 许可](./LICENSE) 下提供。 - -## Stargazers over time - -[![Stargazers over time](https://starchart.cc/PaddlePaddle/PaddleSpeech.svg)](https://starchart.cc/PaddlePaddle/PaddleSpeech) diff --git a/audio/CMakeLists.txt b/audio/CMakeLists.txt index 021e24477..d9ae63cd2 100644 --- a/audio/CMakeLists.txt +++ b/audio/CMakeLists.txt @@ -41,18 +41,24 @@ option(BUILD_PADDLEAUDIO_PYTHON_EXTENSION "Build Python extension" ON) # cmake set(CMAKE_MODULE_PATH "${CMAKE_MODULE_PATH};${PROJECT_SOURCE_DIR}/cmake;${PROJECT_SOURCE_DIR}/cmake/external") +if (NOT MSVC) + find_package(GFortranLibs REQUIRED) + include(FortranCInterface) + include(FindGFortranLibs REQUIRED) +endif() + # fc_patch dir set(FETCHCONTENT_QUIET off) 
get_filename_component(fc_patch "fc_patch" REALPATH BASE_DIR "${CMAKE_SOURCE_DIR}") set(FETCHCONTENT_BASE_DIR ${fc_patch}) set(THIRD_PARTY_PATH ${fc_patch}) +include(openblas) + set(PYBIND11_PYTHON_VERSION ${PY_VERSION}) include(cmake/pybind.cmake) include_directories(${PYTHON_INCLUDE_DIR}) -include_directories(${CMAKE_CURRENT_SOURCE_DIR}/paddleaudio/third_party/) - # packages find_package(Python3 COMPONENTS Interpreter Development) diff --git a/audio/README.md b/audio/README.md index d42d41229..bfd8625f0 100644 --- a/audio/README.md +++ b/audio/README.md @@ -2,22 +2,33 @@ 安装方式: pip install paddleaudio -目前支持的平台:Linux, Mac, Windows +目前支持的平台:Linux: ## Environment ## Build wheel -cmd: python setup.py bdist_wheel Linux test build whl environment: +* docker - `registry.baidubce.com/paddlepaddle/paddle:2.2.2` * os - Ubuntu 16.04.7 LTS -* gcc/g++ - 8.2.0 +* gcc/g++/gfortran - 8.2.0 * cmake - 3.18.0 (need install) +* [How to Install Docker](https://docs.docker.com/engine/install/) +* [A Docker Tutorial for Beginners](https://docker-curriculum.com/) + +1. First to launch docker container. + +``` +docker run --privileged --net=host --ipc=host -it --rm -v $PWD:/workspace --name=dev registry.baidubce.com/paddlepaddle/paddle:2.2.2 /bin/bash +``` +2. python setup.py bdist_wheel + MAC:test build whl envrioment: * os -* gcc/g++ 12.2.0 +* gcc/g++/gfortran 12.2.0 * cpu Intel Xeon E5 x86_64 Windows: -not support paddleaudio C++ extension lib (sox io, kaldi native fbank) +not support: paddleaudio C++ extension lib (sox io, kaldi native fbank) +python setup.py bdist_wheel diff --git a/audio/paddleaudio/CMakeLists.txt b/audio/paddleaudio/CMakeLists.txt index c6b43c780..dbf2bd3eb 100644 --- a/audio/paddleaudio/CMakeLists.txt +++ b/audio/paddleaudio/CMakeLists.txt @@ -1,3 +1,19 @@ add_subdirectory(third_party) add_subdirectory(src) + +if (APPLE) + file(COPY ${GFORTRAN_LIBRARIES_DIR}/libgcc_s.1.1.dylib + DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/lib) +endif(APPLE) + +if (UNIX AND NOT APPLE) + file(COPY ${GFORTRAN_LIBRARIES_DIR}/libgfortran.so.5 + DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/lib FOLLOW_SYMLINK_CHAIN) + + file(COPY ${GFORTRAN_LIBRARIES_DIR}/libquadmath.so.0 + DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/lib FOLLOW_SYMLINK_CHAIN) + + file(COPY ${GFORTRAN_LIBRARIES_DIR}/libgcc_s.so.1 + DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/lib FOLLOW_SYMLINK_CHAIN) +endif() diff --git a/audio/paddleaudio/_internal/module_utils.py b/audio/paddleaudio/_internal/module_utils.py index becd23cd8..7b3230de9 100644 --- a/audio/paddleaudio/_internal/module_utils.py +++ b/audio/paddleaudio/_internal/module_utils.py @@ -67,11 +67,8 @@ def deprecated(direction: str, version: Optional[str]=None): def is_kaldi_available(): - try: - from paddleaudio import _paddleaudio - return True - except Exception: - return False + return is_module_available("paddleaudio._paddleaudio") + def requires_kaldi(): if is_kaldi_available(): @@ -131,11 +128,9 @@ def requires_soundfile(): def is_sox_available(): - try: - from paddleaudio import _paddleaudio - return True - except Exception: + if platform.system() == "Windows": # not support sox in windows return False + return is_module_available("paddleaudio._paddleaudio") def requires_sox(): diff --git a/audio/paddleaudio/backends/soundfile_backend.py b/audio/paddleaudio/backends/soundfile_backend.py index 9195ea097..ae7b5b52d 100644 --- a/audio/paddleaudio/backends/soundfile_backend.py +++ b/audio/paddleaudio/backends/soundfile_backend.py @@ -191,7 +191,7 @@ def soundfile_save(y: np.ndarray, sr: int, file: 
os.PathLike) -> None: if sr <= 0: raise ParameterError( - f'Sample rate should be larger than 0, received sr = {sr}') + f'Sample rate should be larger than 0, recieved sr = {sr}') if y.dtype not in ['int16', 'int8']: warnings.warn( diff --git a/audio/paddleaudio/kaldi/__init__.py b/audio/paddleaudio/kaldi/__init__.py index a0ae644d1..f951e280a 100644 --- a/audio/paddleaudio/kaldi/__init__.py +++ b/audio/paddleaudio/kaldi/__init__.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. from .kaldi import fbank -#from .kaldi import pitch +from .kaldi import pitch diff --git a/audio/paddleaudio/kaldi/kaldi.py b/audio/paddleaudio/kaldi/kaldi.py index 0f080de04..16969d772 100644 --- a/audio/paddleaudio/kaldi/kaldi.py +++ b/audio/paddleaudio/kaldi/kaldi.py @@ -16,6 +16,7 @@ from paddleaudio._internal import module_utils __all__ = [ 'fbank', + 'pitch', ] @@ -32,6 +33,8 @@ def fbank( round_to_power_of_two: bool=True, blackman_coeff: float=0.42, snip_edges: bool=True, + allow_downsample: bool=False, + allow_upsample: bool=False, max_feature_vectors: int=-1, num_bins: int=23, low_freq: float=20, @@ -59,6 +62,8 @@ def fbank( frame_opts.round_to_power_of_two = round_to_power_of_two frame_opts.blackman_coeff = blackman_coeff frame_opts.snip_edges = snip_edges + frame_opts.allow_downsample = allow_downsample + frame_opts.allow_upsample = allow_upsample frame_opts.max_feature_vectors = max_feature_vectors mel_opts.num_bins = num_bins @@ -80,48 +85,48 @@ def fbank( return feat -#@module_utils.requires_kaldi() -#def pitch(wav, -#samp_freq: int=16000, -#frame_shift_ms: float=10.0, -#frame_length_ms: float=25.0, -#preemph_coeff: float=0.0, -#min_f0: int=50, -#max_f0: int=400, -#soft_min_f0: float=10.0, -#penalty_factor: float=0.1, -#lowpass_cutoff: int=1000, -#resample_freq: int=4000, -#delta_pitch: float=0.005, -#nccf_ballast: int=7000, -#lowpass_filter_width: int=1, -#upsample_filter_width: int=5, -#max_frames_latency: int=0, -#frames_per_chunk: int=0, -#simulate_first_pass_online: bool=False, -#recompute_frame: int=500, -#nccf_ballast_online: bool=False, -#snip_edges: bool=True): -#pitch_opts = paddleaudio._paddleaudio.PitchExtractionOptions() -#pitch_opts.samp_freq = samp_freq -#pitch_opts.frame_shift_ms = frame_shift_ms -#pitch_opts.frame_length_ms = frame_length_ms -#pitch_opts.preemph_coeff = preemph_coeff -#pitch_opts.min_f0 = min_f0 -#pitch_opts.max_f0 = max_f0 -#pitch_opts.soft_min_f0 = soft_min_f0 -#pitch_opts.penalty_factor = penalty_factor -#pitch_opts.lowpass_cutoff = lowpass_cutoff -#pitch_opts.resample_freq = resample_freq -#pitch_opts.delta_pitch = delta_pitch -#pitch_opts.nccf_ballast = nccf_ballast -#pitch_opts.lowpass_filter_width = lowpass_filter_width -#pitch_opts.upsample_filter_width = upsample_filter_width -#pitch_opts.max_frames_latency = max_frames_latency -#pitch_opts.frames_per_chunk = frames_per_chunk -#pitch_opts.simulate_first_pass_online = simulate_first_pass_online -#pitch_opts.recompute_frame = recompute_frame -#pitch_opts.nccf_ballast_online = nccf_ballast_online -#pitch_opts.snip_edges = snip_edges -#pitch = paddleaudio._paddleaudio.ComputeKaldiPitch(pitch_opts, wav) -#return pitch +@module_utils.requires_kaldi() +def pitch(wav, + samp_freq: int=16000, + frame_shift_ms: float=10.0, + frame_length_ms: float=25.0, + preemph_coeff: float=0.0, + min_f0: int=50, + max_f0: int=400, + soft_min_f0: float=10.0, + penalty_factor: float=0.1, + lowpass_cutoff: int=1000, + resample_freq: int=4000, + delta_pitch: 
float=0.005, + nccf_ballast: int=7000, + lowpass_filter_width: int=1, + upsample_filter_width: int=5, + max_frames_latency: int=0, + frames_per_chunk: int=0, + simulate_first_pass_online: bool=False, + recompute_frame: int=500, + nccf_ballast_online: bool=False, + snip_edges: bool=True): + pitch_opts = paddleaudio._paddleaudio.PitchExtractionOptions() + pitch_opts.samp_freq = samp_freq + pitch_opts.frame_shift_ms = frame_shift_ms + pitch_opts.frame_length_ms = frame_length_ms + pitch_opts.preemph_coeff = preemph_coeff + pitch_opts.min_f0 = min_f0 + pitch_opts.max_f0 = max_f0 + pitch_opts.soft_min_f0 = soft_min_f0 + pitch_opts.penalty_factor = penalty_factor + pitch_opts.lowpass_cutoff = lowpass_cutoff + pitch_opts.resample_freq = resample_freq + pitch_opts.delta_pitch = delta_pitch + pitch_opts.nccf_ballast = nccf_ballast + pitch_opts.lowpass_filter_width = lowpass_filter_width + pitch_opts.upsample_filter_width = upsample_filter_width + pitch_opts.max_frames_latency = max_frames_latency + pitch_opts.frames_per_chunk = frames_per_chunk + pitch_opts.simulate_first_pass_online = simulate_first_pass_online + pitch_opts.recompute_frame = recompute_frame + pitch_opts.nccf_ballast_online = nccf_ballast_online + pitch_opts.snip_edges = snip_edges + pitch = paddleaudio._paddleaudio.ComputeKaldiPitch(pitch_opts, wav) + return pitch diff --git a/audio/paddleaudio/src/CMakeLists.txt b/audio/paddleaudio/src/CMakeLists.txt index 21e0f170d..fb6f32092 100644 --- a/audio/paddleaudio/src/CMakeLists.txt +++ b/audio/paddleaudio/src/CMakeLists.txt @@ -52,7 +52,7 @@ if(BUILD_KALDI) list( APPEND LIBPADDLEAUDIO_LINK_LIBRARIES - kaldi-native-fbank-core + libkaldi ) list( APPEND @@ -92,6 +92,14 @@ define_library( "${LIBPADDLEAUDIO_COMPILE_DEFINITIONS}" ) +if (APPLE) + add_custom_command(TARGET libpaddleaudio POST_BUILD COMMAND install_name_tool -change "${GFORTRAN_LIBRARIES_DIR}/libgcc_s.1.1.dylib" "@loader_path/libgcc_s.1.1.dylib" libpaddleaudio.so) +endif(APPLE) + +if (UNIX AND NOT APPLE) + set_target_properties(libpaddleaudio PROPERTIES INSTALL_RPATH "$ORIGIN") +endif() + if (APPLE) set(AUDIO_LIBRARY libpaddleaudio CACHE INTERNAL "") else() @@ -199,3 +207,11 @@ define_extension( # ) # endif() endif() + +if (APPLE) + add_custom_command(TARGET _paddleaudio POST_BUILD COMMAND install_name_tool -change "${GFORTRAN_LIBRARIES_DIR}/libgcc_s.1.1.dylib" "@loader_path/lib/libgcc_s.1.1.dylib" _paddleaudio.so) +endif(APPLE) + +if (UNIX AND NOT APPLE) + set_target_properties(_paddleaudio PROPERTIES INSTALL_RPATH "$ORIGIN/lib") +endif() diff --git a/audio/paddleaudio/src/pybind/kaldi/feature_common.h b/audio/paddleaudio/src/pybind/kaldi/feature_common.h index 6571fa3eb..05522bb7e 100644 --- a/audio/paddleaudio/src/pybind/kaldi/feature_common.h +++ b/audio/paddleaudio/src/pybind/kaldi/feature_common.h @@ -16,7 +16,7 @@ #include "pybind11/pybind11.h" #include "pybind11/numpy.h" -#include "kaldi-native-fbank/csrc/feature-window.h" +#include "feat/feature-window.h" namespace paddleaudio { namespace kaldi { @@ -28,18 +28,18 @@ class StreamingFeatureTpl { public: typedef typename F::Options Options; StreamingFeatureTpl(const Options& opts); - bool ComputeFeature(const std::vector& wav, - std::vector* feats); - void Reset() { remained_wav_.resize(0); } + bool ComputeFeature(const ::kaldi::VectorBase<::kaldi::BaseFloat>& wav, + ::kaldi::Vector<::kaldi::BaseFloat>* feats); + void Reset() { remained_wav_.Resize(0); } int Dim() { return computer_.Dim(); } private: - bool Compute(const std::vector& waves, - std::vector* feats); + bool 
Compute(const ::kaldi::Vector<::kaldi::BaseFloat>& waves, + ::kaldi::Vector<::kaldi::BaseFloat>* feats); Options opts_; - knf::FeatureWindowFunction window_function_; - std::vector remained_wav_; + ::kaldi::FeatureWindowFunction window_function_; + ::kaldi::Vector<::kaldi::BaseFloat> remained_wav_; F computer_; }; diff --git a/audio/paddleaudio/src/pybind/kaldi/feature_common_inl.h b/audio/paddleaudio/src/pybind/kaldi/feature_common_inl.h index 985d586fe..c894b9775 100644 --- a/audio/paddleaudio/src/pybind/kaldi/feature_common_inl.h +++ b/audio/paddleaudio/src/pybind/kaldi/feature_common_inl.h @@ -12,6 +12,7 @@ // See the License for the specific language governing permissions and // limitations under the License. +#include "base/kaldi-common.h" namespace paddleaudio { namespace kaldi { @@ -24,29 +25,24 @@ StreamingFeatureTpl::StreamingFeatureTpl(const Options& opts) template bool StreamingFeatureTpl::ComputeFeature( - const std::vector& wav, - std::vector* feats) { + const ::kaldi::VectorBase<::kaldi::BaseFloat>& wav, + ::kaldi::Vector<::kaldi::BaseFloat>* feats) { // append remaned waves - int wav_len = wav.size(); + ::kaldi::int32 wav_len = wav.Dim(); if (wav_len == 0) return false; - int left_len = remained_wav_.size(); - std::vector waves(left_len + wav_len); - std::memcpy(waves.data(), - remained_wav_.data(), - left_len * sizeof(float)); - std::memcpy(waves.data() + left_len, - wav.data(), - wav_len * sizeof(float)); + ::kaldi::int32 left_len = remained_wav_.Dim(); + ::kaldi::Vector<::kaldi::BaseFloat> waves(left_len + wav_len); + waves.Range(0, left_len).CopyFromVec(remained_wav_); + waves.Range(left_len, wav_len).CopyFromVec(wav); // cache remaned waves - knf::FrameExtractionOptions frame_opts = computer_.GetFrameOptions(); - int num_frames = knf::NumFrames(waves.size(), frame_opts); - int frame_shift = frame_opts.WindowShift(); - int left_samples = waves.size() - frame_shift * num_frames; - remained_wav_.resize(left_samples); - std::memcpy(remained_wav_.data(), - waves.data() + frame_shift * num_frames, - left_samples * sizeof(float)); + ::kaldi::FrameExtractionOptions frame_opts = computer_.GetFrameOptions(); + ::kaldi::int32 num_frames = ::kaldi::NumFrames(waves.Dim(), frame_opts); + ::kaldi::int32 frame_shift = frame_opts.WindowShift(); + ::kaldi::int32 left_samples = waves.Dim() - frame_shift * num_frames; + remained_wav_.Resize(left_samples); + remained_wav_.CopyFromVec( + waves.Range(frame_shift * num_frames, left_samples)); // compute speech feature Compute(waves, feats); @@ -55,39 +51,40 @@ bool StreamingFeatureTpl::ComputeFeature( // Compute feat template -bool StreamingFeatureTpl::Compute(const std::vector& waves, - std::vector* feats) { - const knf::FrameExtractionOptions& frame_opts = computer_.GetFrameOptions(); - int num_samples = waves.size(); - int frame_length = frame_opts.WindowSize(); - int sample_rate = frame_opts.samp_freq; +bool StreamingFeatureTpl::Compute( + const ::kaldi::Vector<::kaldi::BaseFloat>& waves, + ::kaldi::Vector<::kaldi::BaseFloat>* feats) { + ::kaldi::BaseFloat vtln_warp = 1.0; + const ::kaldi::FrameExtractionOptions& frame_opts = + computer_.GetFrameOptions(); + ::kaldi::int32 num_samples = waves.Dim(); + ::kaldi::int32 frame_length = frame_opts.WindowSize(); + ::kaldi::int32 sample_rate = frame_opts.samp_freq; if (num_samples < frame_length) { - return true; + return false; } - int num_frames = knf::NumFrames(num_samples, frame_opts); - feats->resize(num_frames * Dim()); + ::kaldi::int32 num_frames = ::kaldi::NumFrames(num_samples, 
frame_opts); + feats->Resize(num_frames * Dim()); - std::vector window; + ::kaldi::Vector<::kaldi::BaseFloat> window; bool need_raw_log_energy = computer_.NeedRawLogEnergy(); - for (int frame = 0; frame < num_frames; frame++) { - std::fill(window.begin(), window.end(), 0); - float raw_log_energy = 0.0; - float vtln_warp = 1.0; - knf::ExtractWindow(0, - waves, - frame, - frame_opts, - window_function_, - &window, - need_raw_log_energy ? &raw_log_energy : NULL); + for (::kaldi::int32 frame = 0; frame < num_frames; frame++) { + ::kaldi::BaseFloat raw_log_energy = 0.0; + ::kaldi::ExtractWindow(0, + waves, + frame, + frame_opts, + window_function_, + &window, + need_raw_log_energy ? &raw_log_energy : NULL); - std::vector this_feature(computer_.Dim()); - computer_.Compute( - raw_log_energy, vtln_warp, &window, this_feature.data()); - std::memcpy(feats->data() + frame * Dim(), - this_feature.data(), - sizeof(float) * Dim()); + ::kaldi::Vector<::kaldi::BaseFloat> this_feature(computer_.Dim(), + ::kaldi::kUndefined); + computer_.Compute(raw_log_energy, vtln_warp, &window, &this_feature); + ::kaldi::SubVector<::kaldi::BaseFloat> output_row( + feats->Data() + frame * Dim(), Dim()); + output_row.CopyFromVec(this_feature); } return true; } diff --git a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature.cc b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature.cc index 83df454c5..40e3786e8 100644 --- a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature.cc +++ b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature.cc @@ -13,16 +13,16 @@ // limitations under the License. #include "paddleaudio/src/pybind/kaldi/kaldi_feature.h" -//#include "feat/pitch-functions.h" +#include "feat/pitch-functions.h" namespace paddleaudio { namespace kaldi { bool InitFbank( - knf::FrameExtractionOptions frame_opts, - knf::MelBanksOptions mel_opts, + ::kaldi::FrameExtractionOptions frame_opts, + ::kaldi::MelBanksOptions mel_opts, FbankOptions fbank_opts) { - knf::FbankOptions opts; + ::kaldi::FbankOptions opts; opts.frame_opts = frame_opts; opts.mel_opts = mel_opts; opts.use_energy = fbank_opts.use_energy; @@ -41,8 +41,8 @@ py::array_t ComputeFbankStreaming(const py::array_t& wav) { } py::array_t ComputeFbank( - knf::FrameExtractionOptions frame_opts, - knf::MelBanksOptions mel_opts, + ::kaldi::FrameExtractionOptions frame_opts, + ::kaldi::MelBanksOptions mel_opts, FbankOptions fbank_opts, const py::array_t& wav) { InitFbank(frame_opts, mel_opts, fbank_opts); @@ -55,21 +55,21 @@ void ResetFbank() { paddleaudio::kaldi::KaldiFeatureWrapper::GetInstance()->ResetFbank(); } -//py::array_t ComputeKaldiPitch( - //const ::kaldi::PitchExtractionOptions& opts, - //const py::array_t& wav) { - //py::buffer_info info = wav.request(); - //::kaldi::SubVector<::kaldi::BaseFloat> input_wav((float*)info.ptr, info.size); +py::array_t ComputeKaldiPitch( + const ::kaldi::PitchExtractionOptions& opts, + const py::array_t& wav) { + py::buffer_info info = wav.request(); + ::kaldi::SubVector<::kaldi::BaseFloat> input_wav((float*)info.ptr, info.size); - //::kaldi::Matrix<::kaldi::BaseFloat> features; - //::kaldi::ComputeKaldiPitch(opts, input_wav, &features); - //auto result = py::array_t({features.NumRows(), features.NumCols()}); - //for (int row_idx = 0; row_idx < features.NumRows(); ++row_idx) { - //std::memcpy(result.mutable_data(row_idx), features.Row(row_idx).Data(), - //sizeof(float)*features.NumCols()); - //} - //return result; -//} + ::kaldi::Matrix<::kaldi::BaseFloat> features; + ::kaldi::ComputeKaldiPitch(opts, input_wav, &features); + auto result = 
py::array_t({features.NumRows(), features.NumCols()}); + for (int row_idx = 0; row_idx < features.NumRows(); ++row_idx) { + std::memcpy(result.mutable_data(row_idx), features.Row(row_idx).Data(), + sizeof(float)*features.NumCols()); + } + return result; +} } // namespace kaldi } // namespace paddleaudio diff --git a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature.h b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature.h index 031ec863b..e059c52c1 100644 --- a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature.h +++ b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature.h @@ -19,7 +19,7 @@ #include #include "paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.h" -//#include "feat/pitch-functions.h" +#include "feat/pitch-functions.h" namespace py = pybind11; @@ -42,13 +42,13 @@ struct FbankOptions{ }; bool InitFbank( - knf::FrameExtractionOptions frame_opts, - knf::MelBanksOptions mel_opts, + ::kaldi::FrameExtractionOptions frame_opts, + ::kaldi::MelBanksOptions mel_opts, FbankOptions fbank_opts); py::array_t ComputeFbank( - knf::FrameExtractionOptions frame_opts, - knf::MelBanksOptions mel_opts, + ::kaldi::FrameExtractionOptions frame_opts, + ::kaldi::MelBanksOptions mel_opts, FbankOptions fbank_opts, const py::array_t& wav); @@ -56,9 +56,9 @@ py::array_t ComputeFbankStreaming(const py::array_t& wav); void ResetFbank(); -//py::array_t ComputeKaldiPitch( - //const ::kaldi::PitchExtractionOptions& opts, - //const py::array_t& wav); +py::array_t ComputeKaldiPitch( + const ::kaldi::PitchExtractionOptions& opts, + const py::array_t& wav); } // namespace kaldi } // namespace paddleaudio diff --git a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.cc b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.cc index 8b8ff18be..79558046b 100644 --- a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.cc +++ b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.cc @@ -22,7 +22,7 @@ KaldiFeatureWrapper* KaldiFeatureWrapper::GetInstance() { return &instance; } -bool KaldiFeatureWrapper::InitFbank(knf::FbankOptions opts) { +bool KaldiFeatureWrapper::InitFbank(::kaldi::FbankOptions opts) { fbank_.reset(new Fbank(opts)); return true; } @@ -30,18 +30,21 @@ bool KaldiFeatureWrapper::InitFbank(knf::FbankOptions opts) { py::array_t KaldiFeatureWrapper::ComputeFbank( const py::array_t wav) { py::buffer_info info = wav.request(); - std::vector input_wav((float*)info.ptr, (float*)info.ptr + info.size); + ::kaldi::SubVector<::kaldi::BaseFloat> input_wav((float*)info.ptr, info.size); - std::vector feats; + ::kaldi::Vector<::kaldi::BaseFloat> feats; bool flag = fbank_->ComputeFeature(input_wav, &feats); - if (flag == false || feats.size() == 0) return py::array_t(); - auto result = py::array_t(feats.size()); + if (flag == false || feats.Dim() == 0) return py::array_t(); + auto result = py::array_t(feats.Dim()); py::buffer_info xs = result.request(); + std::cout << std::endl; float* res_ptr = (float*)xs.ptr; - std::memcpy(res_ptr, feats.data(), sizeof(float)*feats.size()); - std::vector shape{static_cast(feats.size() / Dim()), - static_cast(Dim())}; - return result.reshape(shape); + for (int idx = 0; idx < feats.Dim(); ++idx) { + *res_ptr = feats(idx); + res_ptr++; + } + + return result.reshape({feats.Dim() / Dim(), Dim()}); } } // namesapce kaldi diff --git a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.h b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.h index daad2d587..bee1eee02 100644 --- a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.h +++ 
b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.h @@ -14,18 +14,20 @@ #pragma once -#include "paddleaudio/third_party/kaldi-native-fbank/csrc/feature-fbank.h" +#include "base/kaldi-common.h" +#include "feat/feature-fbank.h" + #include "paddleaudio/src/pybind/kaldi/feature_common.h" namespace paddleaudio { namespace kaldi { -typedef StreamingFeatureTpl Fbank; +typedef StreamingFeatureTpl<::kaldi::FbankComputer> Fbank; class KaldiFeatureWrapper { public: static KaldiFeatureWrapper* GetInstance(); - bool InitFbank(knf::FbankOptions opts); + bool InitFbank(::kaldi::FbankOptions opts); py::array_t ComputeFbank(const py::array_t wav); int Dim() { return fbank_->Dim(); } void ResetFbank() { fbank_->Reset(); } diff --git a/audio/paddleaudio/src/pybind/pybind.cpp b/audio/paddleaudio/src/pybind/pybind.cpp index 510712034..692e80995 100644 --- a/audio/paddleaudio/src/pybind/pybind.cpp +++ b/audio/paddleaudio/src/pybind/pybind.cpp @@ -2,7 +2,7 @@ #ifdef INCLUDE_KALDI #include "paddleaudio/src/pybind/kaldi/kaldi_feature.h" -#include "paddleaudio/third_party/kaldi-native-fbank/csrc/feature-fbank.h" +#include "paddleaudio/third_party/kaldi/feat/feature-fbank.h" #endif #ifdef INCLUDE_SOX @@ -89,51 +89,53 @@ PYBIND11_MODULE(_paddleaudio, m) { #ifdef INCLUDE_KALDI m.def("ComputeFbank", &paddleaudio::kaldi::ComputeFbank, "compute fbank"); - //py::class_(m, "PitchExtractionOptions") - //.def(py::init<>()) - //.def_readwrite("samp_freq", &kaldi::PitchExtractionOptions::samp_freq) - //.def_readwrite("frame_shift_ms", &kaldi::PitchExtractionOptions::frame_shift_ms) - //.def_readwrite("frame_length_ms", &kaldi::PitchExtractionOptions::frame_length_ms) - //.def_readwrite("preemph_coeff", &kaldi::PitchExtractionOptions::preemph_coeff) - //.def_readwrite("min_f0", &kaldi::PitchExtractionOptions::min_f0) - //.def_readwrite("max_f0", &kaldi::PitchExtractionOptions::max_f0) - //.def_readwrite("soft_min_f0", &kaldi::PitchExtractionOptions::soft_min_f0) - //.def_readwrite("penalty_factor", &kaldi::PitchExtractionOptions::penalty_factor) - //.def_readwrite("lowpass_cutoff", &kaldi::PitchExtractionOptions::lowpass_cutoff) - //.def_readwrite("resample_freq", &kaldi::PitchExtractionOptions::resample_freq) - //.def_readwrite("delta_pitch", &kaldi::PitchExtractionOptions::delta_pitch) - //.def_readwrite("nccf_ballast", &kaldi::PitchExtractionOptions::nccf_ballast) - //.def_readwrite("lowpass_filter_width", &kaldi::PitchExtractionOptions::lowpass_filter_width) - //.def_readwrite("upsample_filter_width", &kaldi::PitchExtractionOptions::upsample_filter_width) - //.def_readwrite("max_frames_latency", &kaldi::PitchExtractionOptions::max_frames_latency) - //.def_readwrite("frames_per_chunk", &kaldi::PitchExtractionOptions::frames_per_chunk) - //.def_readwrite("simulate_first_pass_online", &kaldi::PitchExtractionOptions::simulate_first_pass_online) - //.def_readwrite("recompute_frame", &kaldi::PitchExtractionOptions::recompute_frame) - //.def_readwrite("nccf_ballast_online", &kaldi::PitchExtractionOptions::nccf_ballast_online) - //.def_readwrite("snip_edges", &kaldi::PitchExtractionOptions::snip_edges); - //m.def("ComputeKaldiPitch", &paddleaudio::kaldi::ComputeKaldiPitch, "compute kaldi pitch"); - py::class_(m, "FrameExtractionOptions") + py::class_(m, "PitchExtractionOptions") + .def(py::init<>()) + .def_readwrite("samp_freq", &kaldi::PitchExtractionOptions::samp_freq) + .def_readwrite("frame_shift_ms", &kaldi::PitchExtractionOptions::frame_shift_ms) + .def_readwrite("frame_length_ms", 
&kaldi::PitchExtractionOptions::frame_length_ms) + .def_readwrite("preemph_coeff", &kaldi::PitchExtractionOptions::preemph_coeff) + .def_readwrite("min_f0", &kaldi::PitchExtractionOptions::min_f0) + .def_readwrite("max_f0", &kaldi::PitchExtractionOptions::max_f0) + .def_readwrite("soft_min_f0", &kaldi::PitchExtractionOptions::soft_min_f0) + .def_readwrite("penalty_factor", &kaldi::PitchExtractionOptions::penalty_factor) + .def_readwrite("lowpass_cutoff", &kaldi::PitchExtractionOptions::lowpass_cutoff) + .def_readwrite("resample_freq", &kaldi::PitchExtractionOptions::resample_freq) + .def_readwrite("delta_pitch", &kaldi::PitchExtractionOptions::delta_pitch) + .def_readwrite("nccf_ballast", &kaldi::PitchExtractionOptions::nccf_ballast) + .def_readwrite("lowpass_filter_width", &kaldi::PitchExtractionOptions::lowpass_filter_width) + .def_readwrite("upsample_filter_width", &kaldi::PitchExtractionOptions::upsample_filter_width) + .def_readwrite("max_frames_latency", &kaldi::PitchExtractionOptions::max_frames_latency) + .def_readwrite("frames_per_chunk", &kaldi::PitchExtractionOptions::frames_per_chunk) + .def_readwrite("simulate_first_pass_online", &kaldi::PitchExtractionOptions::simulate_first_pass_online) + .def_readwrite("recompute_frame", &kaldi::PitchExtractionOptions::recompute_frame) + .def_readwrite("nccf_ballast_online", &kaldi::PitchExtractionOptions::nccf_ballast_online) + .def_readwrite("snip_edges", &kaldi::PitchExtractionOptions::snip_edges); + m.def("ComputeKaldiPitch", &paddleaudio::kaldi::ComputeKaldiPitch, "compute kaldi pitch"); + py::class_(m, "FrameExtractionOptions") .def(py::init<>()) - .def_readwrite("samp_freq", &knf::FrameExtractionOptions::samp_freq) - .def_readwrite("frame_shift_ms", &knf::FrameExtractionOptions::frame_shift_ms) - .def_readwrite("frame_length_ms", &knf::FrameExtractionOptions::frame_length_ms) - .def_readwrite("dither", &knf::FrameExtractionOptions::dither) - .def_readwrite("preemph_coeff", &knf::FrameExtractionOptions::preemph_coeff) - .def_readwrite("remove_dc_offset", &knf::FrameExtractionOptions::remove_dc_offset) - .def_readwrite("window_type", &knf::FrameExtractionOptions::window_type) - .def_readwrite("round_to_power_of_two", &knf::FrameExtractionOptions::round_to_power_of_two) - .def_readwrite("blackman_coeff", &knf::FrameExtractionOptions::blackman_coeff) - .def_readwrite("snip_edges", &knf::FrameExtractionOptions::snip_edges) - .def_readwrite("max_feature_vectors", &knf::FrameExtractionOptions::max_feature_vectors); - py::class_(m, "MelBanksOptions") + .def_readwrite("samp_freq", &kaldi::FrameExtractionOptions::samp_freq) + .def_readwrite("frame_shift_ms", &kaldi::FrameExtractionOptions::frame_shift_ms) + .def_readwrite("frame_length_ms", &kaldi::FrameExtractionOptions::frame_length_ms) + .def_readwrite("dither", &kaldi::FrameExtractionOptions::dither) + .def_readwrite("preemph_coeff", &kaldi::FrameExtractionOptions::preemph_coeff) + .def_readwrite("remove_dc_offset", &kaldi::FrameExtractionOptions::remove_dc_offset) + .def_readwrite("window_type", &kaldi::FrameExtractionOptions::window_type) + .def_readwrite("round_to_power_of_two", &kaldi::FrameExtractionOptions::round_to_power_of_two) + .def_readwrite("blackman_coeff", &kaldi::FrameExtractionOptions::blackman_coeff) + .def_readwrite("snip_edges", &kaldi::FrameExtractionOptions::snip_edges) + .def_readwrite("allow_downsample", &kaldi::FrameExtractionOptions::allow_downsample) + .def_readwrite("allow_upsample", &kaldi::FrameExtractionOptions::allow_upsample) + 
.def_readwrite("max_feature_vectors", &kaldi::FrameExtractionOptions::max_feature_vectors); + py::class_(m, "MelBanksOptions") .def(py::init<>()) - .def_readwrite("num_bins", &knf::MelBanksOptions::num_bins) - .def_readwrite("low_freq", &knf::MelBanksOptions::low_freq) - .def_readwrite("high_freq", &knf::MelBanksOptions::high_freq) - .def_readwrite("vtln_low", &knf::MelBanksOptions::vtln_low) - .def_readwrite("vtln_high", &knf::MelBanksOptions::vtln_high) - .def_readwrite("debug_mel", &knf::MelBanksOptions::debug_mel) - .def_readwrite("htk_mode", &knf::MelBanksOptions::htk_mode); + .def_readwrite("num_bins", &kaldi::MelBanksOptions::num_bins) + .def_readwrite("low_freq", &kaldi::MelBanksOptions::low_freq) + .def_readwrite("high_freq", &kaldi::MelBanksOptions::high_freq) + .def_readwrite("vtln_low", &kaldi::MelBanksOptions::vtln_low) + .def_readwrite("vtln_high", &kaldi::MelBanksOptions::vtln_high) + .def_readwrite("debug_mel", &kaldi::MelBanksOptions::debug_mel) + .def_readwrite("htk_mode", &kaldi::MelBanksOptions::htk_mode); py::class_(m, "FbankOptions") .def(py::init<>()) diff --git a/audio/paddleaudio/third_party/CMakeLists.txt b/audio/paddleaudio/third_party/CMakeLists.txt index 4b85bada0..43288f39b 100644 --- a/audio/paddleaudio/third_party/CMakeLists.txt +++ b/audio/paddleaudio/third_party/CMakeLists.txt @@ -11,6 +11,5 @@ endif() # kaldi ################################################################################ if (BUILD_KALDI) - include_directories(${CMAKE_CURRENT_SOURCE_DIR}) - add_subdirectory(kaldi-native-fbank/csrc) -endif() + add_subdirectory(kaldi) +endif() \ No newline at end of file diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/CMakeLists.txt b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/CMakeLists.txt deleted file mode 100644 index 176607fc0..000000000 --- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/CMakeLists.txt +++ /dev/null @@ -1,22 +0,0 @@ -include_directories(${CMAKE_CURRENT_SOURCE_DIR}/../../) -add_library(kaldi-native-fbank-core - feature-fbank.cc - feature-functions.cc - feature-window.cc - fftsg.c - log.cc - mel-computations.cc - rfft.cc -) -# We are using std::call_once() in log.h,which requires us to link with -pthread -if(NOT WIN32) - target_link_libraries(kaldi-native-fbank-core -pthread) -endif() - -if(KNF_HAVE_EXECINFO_H) - target_compile_definitions(kaldi-native-fbank-core PRIVATE KNF_HAVE_EXECINFO_H=1) -endif() - -if(KNF_HAVE_CXXABI_H) - target_compile_definitions(kaldi-native-fbank-core PRIVATE KNF_HAVE_CXXABI_H=1) -endif() diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-fbank.cc b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-fbank.cc deleted file mode 100644 index 740ee17e9..000000000 --- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-fbank.cc +++ /dev/null @@ -1,117 +0,0 @@ -/** - * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang) - * - * See LICENSE for clarification regarding multiple authors - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
- * See the License for the specific language governing permissions and - * limitations under the License. - */ - -// This file is copied/modified from kaldi/src/feat/feature-fbank.cc -// -#include "kaldi-native-fbank/csrc/feature-fbank.h" - -#include - -#include "kaldi-native-fbank/csrc/feature-functions.h" - -namespace knf { - -static void Sqrt(float *in_out, int32_t n) { - for (int32_t i = 0; i != n; ++i) { - in_out[i] = std::sqrt(in_out[i]); - } -} - -std::ostream &operator<<(std::ostream &os, const FbankOptions &opts) { - os << opts.ToString(); - return os; -} - -FbankComputer::FbankComputer(const FbankOptions &opts) - : opts_(opts), rfft_(opts.frame_opts.PaddedWindowSize()) { - if (opts.energy_floor > 0.0f) { - log_energy_floor_ = logf(opts.energy_floor); - } - - // We'll definitely need the filterbanks info for VTLN warping factor 1.0. - // [note: this call caches it.] - GetMelBanks(1.0f); -} - -FbankComputer::~FbankComputer() { - for (auto iter = mel_banks_.begin(); iter != mel_banks_.end(); ++iter) - delete iter->second; -} - -const MelBanks *FbankComputer::GetMelBanks(float vtln_warp) { - MelBanks *this_mel_banks = nullptr; - - // std::map::iterator iter = mel_banks_.find(vtln_warp); - auto iter = mel_banks_.find(vtln_warp); - if (iter == mel_banks_.end()) { - this_mel_banks = new MelBanks(opts_.mel_opts, opts_.frame_opts, vtln_warp); - mel_banks_[vtln_warp] = this_mel_banks; - } else { - this_mel_banks = iter->second; - } - return this_mel_banks; -} - -void FbankComputer::Compute(float signal_raw_log_energy, float vtln_warp, - std::vector *signal_frame, float *feature) { - const MelBanks &mel_banks = *(GetMelBanks(vtln_warp)); - - KNF_CHECK_EQ(signal_frame->size(), opts_.frame_opts.PaddedWindowSize()); - - // Compute energy after window function (not the raw one). - if (opts_.use_energy && !opts_.raw_energy) { - signal_raw_log_energy = std::log( - std::max(InnerProduct(signal_frame->data(), signal_frame->data(), - signal_frame->size()), - std::numeric_limits::epsilon())); - } - rfft_.Compute(signal_frame->data()); // signal_frame is modified in-place - ComputePowerSpectrum(signal_frame); - - // Use magnitude instead of power if requested. - if (!opts_.use_power) { - Sqrt(signal_frame->data(), signal_frame->size() / 2 + 1); - } - - int32_t mel_offset = ((opts_.use_energy && !opts_.htk_compat) ? 1 : 0); - - // Its length is opts_.mel_opts.num_bins - float *mel_energies = feature + mel_offset; - - // Sum with mel filter banks over the power spectrum - mel_banks.Compute(signal_frame->data(), mel_energies); - - if (opts_.use_log_fbank) { - // Avoid log of zero (which should be prevented anyway by dithering). - for (int32_t i = 0; i != opts_.mel_opts.num_bins; ++i) { - auto t = std::max(mel_energies[i], std::numeric_limits::epsilon()); - mel_energies[i] = std::log(t); - } - } - - // Copy energy as first value (or the last, if htk_compat == true). - if (opts_.use_energy) { - if (opts_.energy_floor > 0.0 && signal_raw_log_energy < log_energy_floor_) { - signal_raw_log_energy = log_energy_floor_; - } - int32_t energy_index = opts_.htk_compat ? 
opts_.mel_opts.num_bins : 0; - feature[energy_index] = signal_raw_log_energy; - } -} - -} // namespace knf diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-fbank.h b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-fbank.h deleted file mode 100644 index 0ef3fac0d..000000000 --- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-fbank.h +++ /dev/null @@ -1,132 +0,0 @@ -/** - * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang) - * - * See LICENSE for clarification regarding multiple authors - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -// This file is copied/modified from kaldi/src/feat/feature-fbank.h - -#ifndef KALDI_NATIVE_FBANK_CSRC_FEATURE_FBANK_H_ -#define KALDI_NATIVE_FBANK_CSRC_FEATURE_FBANK_H_ - -#include - -#include "kaldi-native-fbank/csrc/feature-window.h" -#include "kaldi-native-fbank/csrc/mel-computations.h" -#include "kaldi-native-fbank/csrc/rfft.h" - -namespace knf { - -struct FbankOptions { - FrameExtractionOptions frame_opts; - MelBanksOptions mel_opts; - // append an extra dimension with energy to the filter banks - bool use_energy = false; - float energy_floor = 0.0f; // active iff use_energy==true - - // If true, compute log_energy before preemphasis and windowing - // If false, compute log_energy after preemphasis ans windowing - bool raw_energy = true; // active iff use_energy==true - - // If true, put energy last (if using energy) - // If false, put energy first - bool htk_compat = false; // active iff use_energy==true - - // if true (default), produce log-filterbank, else linear - bool use_log_fbank = true; - - // if true (default), use power in filterbank - // analysis, else magnitude. - bool use_power = true; - - FbankOptions() { mel_opts.num_bins = 23; } - - std::string ToString() const { - std::ostringstream os; - os << "frame_opts: \n"; - os << frame_opts << "\n"; - os << "\n"; - - os << "mel_opts: \n"; - os << mel_opts << "\n"; - - os << "use_energy: " << use_energy << "\n"; - os << "energy_floor: " << energy_floor << "\n"; - os << "raw_energy: " << raw_energy << "\n"; - os << "htk_compat: " << htk_compat << "\n"; - os << "use_log_fbank: " << use_log_fbank << "\n"; - os << "use_power: " << use_power << "\n"; - return os.str(); - } -}; - -std::ostream &operator<<(std::ostream &os, const FbankOptions &opts); - -class FbankComputer { - public: - using Options = FbankOptions; - - explicit FbankComputer(const FbankOptions &opts); - ~FbankComputer(); - - int32_t Dim() const { - return opts_.mel_opts.num_bins + (opts_.use_energy ? 1 : 0); - } - - // if true, compute log_energy_pre_window but after dithering and dc removal - bool NeedRawLogEnergy() const { return opts_.use_energy && opts_.raw_energy; } - - const FrameExtractionOptions &GetFrameOptions() const { - return opts_.frame_opts; - } - - const FbankOptions &GetOptions() const { return opts_; } - - /** - Function that computes one frame of features from - one frame of signal. 
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-functions.cc b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-functions.cc
deleted file mode 100644
index 00ae4c798..000000000
--- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-functions.cc
+++ /dev/null
@@ -1,49 +0,0 @@
-/**
- * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang)
- *
- * See LICENSE for clarification regarding multiple authors
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-// This file is copied/modified from kaldi/src/feat/feature-functions.cc
-
-#include "kaldi-native-fbank/csrc/feature-functions.h"
-
-#include <cmath>
-#include <cstdint>
-
-namespace knf {
-
-void ComputePowerSpectrum(std::vector<float> *complex_fft) {
-  int32_t dim = complex_fft->size();
-
-  // now we have in complex_fft, first half of complex spectrum
-  // it's stored as [real0, realN/2, real1, im1, real2, im2, ...]
-
-  float *p = complex_fft->data();
-  int32_t half_dim = dim / 2;
-  float first_energy = p[0] * p[0];
-  float last_energy = p[1] * p[1];  // handle this special case
-
-  for (int32_t i = 1; i < half_dim; ++i) {
-    float real = p[i * 2];
-    float im = p[i * 2 + 1];
-    p[i] = real * real + im * im;
-  }
-  p[0] = first_energy;
-  p[half_dim] = last_energy;  // Will actually never be used, and anyway
-  // if the signal has been bandlimited sensibly this should be zero.
-}
-
-}  // namespace knf
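The packed layout consumed above is worth restating: after the RFFT, the buffer holds [Re(0), Re(N/2), Re(1), Im(1), Re(2), Im(2), ...], and ComputePowerSpectrum() overwrites its first N/2 + 1 slots with bin energies. An equivalent standalone sketch with more explicit names (PackedRfftPowerSpectrum is a hypothetical helper, not part of the library):

    #include <vector>

    void PackedRfftPowerSpectrum(std::vector<float> *v) {
      float *p = v->data();
      int n = static_cast<int>(v->size());
      int half = n / 2;
      float dc = p[0] * p[0];       // bin 0 is purely real
      float nyquist = p[1] * p[1];  // so is bin N/2, stored in slot 1
      for (int i = 1; i < half; ++i) {
        // reads stay ahead of writes because 2 * i > i for i >= 1
        p[i] = p[2 * i] * p[2 * i] + p[2 * i + 1] * p[2 * i + 1];
      }
      p[0] = dc;
      p[half] = nyquist;
    }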
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-functions.h b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-functions.h
deleted file mode 100644
index 852d0612c..000000000
--- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-functions.h
+++ /dev/null
@@ -1,38 +0,0 @@
-/**
- * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang)
- *
- * See LICENSE for clarification regarding multiple authors
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-// This file is copied/modified from kaldi/src/feat/feature-functions.h
-#ifndef KALDI_NATIVE_FBANK_CSRC_FEATURE_FUNCTIONS_H
-#define KALDI_NATIVE_FBANK_CSRC_FEATURE_FUNCTIONS_H
-
-#include <vector>
-namespace knf {
-
-// ComputePowerSpectrum converts a complex FFT (as produced by the FFT
-// functions in csrc/rfft.h) into a power spectrum.  If the complex FFT is a
-// vector of size n (representing half of the complex FFT of a real signal of
-// size n, as described there), this function computes in the first (n/2) + 1
-// elements of it, the energies of the fft bins from zero to the Nyquist
-// frequency.  Contents of the remaining (n/2) - 1 elements are undefined at
-// output.
-
-void ComputePowerSpectrum(std::vector<float> *complex_fft);
-
-}  // namespace knf
-
-#endif  // KALDI_NATIVE_FBANK_CSRC_FEATURE_FUNCTIONS_H
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-window.cc b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-window.cc
deleted file mode 100644
index b86a2c3d2..000000000
--- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-window.cc
+++ /dev/null
@@ -1,236 +0,0 @@
-// kaldi-native-fbank/csrc/feature-window.cc
-//
-// Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang)
-
-// This file is copied/modified from kaldi/src/feat/feature-window.cc
-
-#include "kaldi-native-fbank/csrc/feature-window.h"
-
-#include <cmath>
-#include <vector>
-
-#ifndef M_2PI
-#define M_2PI 6.283185307179586476925286766559005
-#endif
-
-namespace knf {
-
-std::ostream &operator<<(std::ostream &os, const FrameExtractionOptions &opts) {
-  os << opts.ToString();
-  return os;
-}
-
-FeatureWindowFunction::FeatureWindowFunction(const FrameExtractionOptions &opts)
-    : window_(opts.WindowSize()) {
-  int32_t frame_length = opts.WindowSize();
-  KNF_CHECK_GT(frame_length, 0);
-
-  float *window_data = window_.data();
-
-  double a = M_2PI / (frame_length - 1);
-  for (int32_t i = 0; i < frame_length; i++) {
-    double i_fl = static_cast<double>(i);
-    if (opts.window_type == "hanning") {
-      window_data[i] = 0.5 - 0.5 * cos(a * i_fl);
-    } else if (opts.window_type == "sine") {
-      // when you are checking wikipedia, please
-      // note that 0.5 * a = M_PI/(frame_length-1)
-      window_data[i] = sin(0.5 * a * i_fl);
-    } else if (opts.window_type == "hamming") {
-      window_data[i] = 0.54 - 0.46 * cos(a * i_fl);
-    } else if (opts.window_type ==
-               "povey") {  // like hamming but goes to zero at edges.
-      window_data[i] = pow(0.5 - 0.5 * cos(a * i_fl), 0.85);
-    } else if (opts.window_type == "rectangular") {
-      window_data[i] = 1.0;
-    } else if (opts.window_type == "blackman") {
-      window_data[i] = opts.blackman_coeff - 0.5 * cos(a * i_fl) +
-                       (0.5 - opts.blackman_coeff) * cos(2 * a * i_fl);
-    } else {
-      KNF_LOG(FATAL) << "Invalid window type " << opts.window_type;
-    }
-  }
-}
-
-void FeatureWindowFunction::Apply(float *wave) const {
-  int32_t window_size = window_.size();
-  const float *p = window_.data();
-  for (int32_t k = 0; k != window_size; ++k) {
-    wave[k] *= p[k];
-  }
-}
-
-int64_t FirstSampleOfFrame(int32_t frame, const FrameExtractionOptions &opts) {
-  int64_t frame_shift = opts.WindowShift();
-  if (opts.snip_edges) {
-    return frame * frame_shift;
-  } else {
-    int64_t midpoint_of_frame = frame_shift * frame + frame_shift / 2,
-            beginning_of_frame = midpoint_of_frame - opts.WindowSize() / 2;
-    return beginning_of_frame;
-  }
-}
-
-int32_t NumFrames(int64_t num_samples, const FrameExtractionOptions &opts,
-                  bool flush /*= true*/) {
-  int64_t frame_shift = opts.WindowShift();
-  int64_t frame_length = opts.WindowSize();
-  if (opts.snip_edges) {
-    // with --snip-edges=true (the default), we use a HTK-like approach to
-    // determining the number of frames -- all frames have to fit completely
-    // into the waveform, and the first frame begins at sample zero.
-    if (num_samples < frame_length)
-      return 0;
-    else
-      return (1 + ((num_samples - frame_length) / frame_shift));
-    // You can understand the expression above as follows: 'num_samples -
-    // frame_length' is how much room we have to shift the frame within the
-    // waveform; 'frame_shift' is how much we shift it each time; and the
-    // ratio is how many times we can shift it (integer arithmetic rounds
-    // down).
-  } else {
-    // if --snip-edges=false, the number of frames is determined by rounding
-    // the (file-length / frame-shift) to the nearest integer.  The point of
-    // this formula is to make the number of frames an obvious and predictable
-    // function of the frame shift and signal length, which makes many
-    // segmentation-related questions simpler.
-    //
-    // Because integer division in C++ rounds toward zero, we add (half the
-    // frame-shift minus epsilon) before dividing, to have the effect of
-    // rounding towards the closest integer.
-    int32_t num_frames = (num_samples + (frame_shift / 2)) / frame_shift;
-
-    if (flush) return num_frames;
-
-    // note: 'end' always means the last plus one, i.e. one past the last.
-    int64_t end_sample_of_last_frame =
-        FirstSampleOfFrame(num_frames - 1, opts) + frame_length;
-
-    // the following code is optimized more for clarity than efficiency.
-    // If flush == false, we can't output frames that extend past the end
-    // of the signal.
-    while (num_frames > 0 && end_sample_of_last_frame > num_samples) {
-      num_frames--;
-      end_sample_of_last_frame -= frame_shift;
-    }
-    return num_frames;
-  }
-}
-
-void ExtractWindow(int64_t sample_offset, const std::vector<float> &wave,
-                   int32_t f, const FrameExtractionOptions &opts,
-                   const FeatureWindowFunction &window_function,
-                   std::vector<float> *window,
-                   float *log_energy_pre_window /*= nullptr*/) {
-  KNF_CHECK(sample_offset >= 0 && wave.size() != 0);
-
-  int32_t frame_length = opts.WindowSize();
-  int32_t frame_length_padded = opts.PaddedWindowSize();
-
-  int64_t num_samples = sample_offset + wave.size();
-  int64_t start_sample = FirstSampleOfFrame(f, opts);
-  int64_t end_sample = start_sample + frame_length;
-
-  if (opts.snip_edges) {
-    KNF_CHECK(start_sample >= sample_offset && end_sample <= num_samples);
-  } else {
-    KNF_CHECK(sample_offset == 0 || start_sample >= sample_offset);
-  }
-
-  if (window->size() != frame_length_padded) {
-    window->resize(frame_length_padded);
-  }
-
-  // wave_start and wave_end are start and end indexes into 'wave', for the
-  // piece of wave that we're trying to extract.
-  int32_t wave_start = int32_t(start_sample - sample_offset);
-  int32_t wave_end = wave_start + frame_length;
-
-  if (wave_start >= 0 && wave_end <= wave.size()) {
-    // the normal case -- no edge effects to consider.
-    std::copy(wave.begin() + wave_start,
-              wave.begin() + wave_start + frame_length, window->data());
-  } else {
-    // Deal with any end effects by reflection, if needed.  This code will
-    // only be reached for about two frames per utterance, so we don't concern
-    // ourselves excessively with efficiency.
-    int32_t wave_dim = wave.size();
-    for (int32_t s = 0; s < frame_length; ++s) {
-      int32_t s_in_wave = s + wave_start;
-      while (s_in_wave < 0 || s_in_wave >= wave_dim) {
-        // reflect around the beginning or end of the wave.
-        // e.g. -1 -> 0, -2 -> 1.
-        // dim -> dim - 1, dim + 1 -> dim - 2.
-        // the code supports repeated reflections, although this
-        // would only be needed in pathological cases.
-        if (s_in_wave < 0)
-          s_in_wave = -s_in_wave - 1;
-        else
-          s_in_wave = 2 * wave_dim - 1 - s_in_wave;
-      }
-      (*window)[s] = wave[s_in_wave];
-    }
-  }
-
-  ProcessWindow(opts, window_function, window->data(), log_energy_pre_window);
-}
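// The reflection rule above, restated in isolation (ReflectIndex is a
// hypothetical helper, not part of the deleted file): it maps any sample
// index back into [0, dim), repeating the reflection in pathological cases.
//
//   int ReflectIndex(int s, int dim) {
//     while (s < 0 || s >= dim) {
//       if (s < 0) s = -s - 1;         // -1 -> 0, -2 -> 1
//       else s = 2 * dim - 1 - s;      // dim -> dim - 1, dim + 1 -> dim - 2
//     }
//     return s;
//   }
//
// e.g. with dim == 8: ReflectIndex(-2, 8) == 1, ReflectIndex(8, 8) == 7,
// ReflectIndex(9, 8) == 6.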
-
-static void RemoveDcOffset(float *d, int32_t n) {
-  float sum = 0;
-  for (int32_t i = 0; i != n; ++i) {
-    sum += d[i];
-  }
-
-  float mean = sum / n;
-
-  for (int32_t i = 0; i != n; ++i) {
-    d[i] -= mean;
-  }
-}
-
-float InnerProduct(const float *a, const float *b, int32_t n) {
-  float sum = 0;
-  for (int32_t i = 0; i != n; ++i) {
-    sum += a[i] * b[i];
-  }
-  return sum;
-}
-
-static void Preemphasize(float *d, int32_t n, float preemph_coeff) {
-  if (preemph_coeff == 0.0) {
-    return;
-  }
-
-  KNF_CHECK(preemph_coeff >= 0.0 && preemph_coeff <= 1.0);
-
-  for (int32_t i = n - 1; i > 0; --i) {
-    d[i] -= preemph_coeff * d[i - 1];
-  }
-  d[0] -= preemph_coeff * d[0];
-}
-
-void ProcessWindow(const FrameExtractionOptions &opts,
-                   const FeatureWindowFunction &window_function, float *window,
-                   float *log_energy_pre_window /*= nullptr*/) {
-  int32_t frame_length = opts.WindowSize();
-
-  // TODO(fangjun): Remove dither
-  KNF_CHECK_EQ(opts.dither, 0);
-
-  if (opts.remove_dc_offset) {
-    RemoveDcOffset(window, frame_length);
-  }
-
-  if (log_energy_pre_window != NULL) {
-    float energy = std::max(InnerProduct(window, window, frame_length),
-                            std::numeric_limits<float>::epsilon());
-    *log_energy_pre_window = std::log(energy);
-  }
-
-  if (opts.preemph_coeff != 0.0) {
-    Preemphasize(window, frame_length, opts.preemph_coeff);
-  }
-
-  window_function.Apply(window);
-}
-
-}  // namespace knf
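ProcessWindow() fixes the order of operations: DC offset removal, then the optional raw log-energy, then pre-emphasis, then the window itself. The pre-emphasis step is the only slightly surprising one, since the first sample is emphasized against itself. A minimal sketch (Preemphasized is a hypothetical helper mirroring Preemphasize() above):

    #include <vector>

    // y[i] = x[i] - c * x[i-1] for i > 0, and y[0] = x[0] - c * x[0].
    std::vector<float> Preemphasized(std::vector<float> x, float c) {
      if (x.empty()) return x;
      for (size_t i = x.size() - 1; i > 0; --i) x[i] -= c * x[i - 1];
      x[0] -= c * x[0];
      return x;
    }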
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-window.h b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-window.h
deleted file mode 100644
index a33844f4c..000000000
--- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/feature-window.h
+++ /dev/null
@@ -1,178 +0,0 @@
-// kaldi-native-fbank/csrc/feature-window.h
-//
-// Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang)
-
-// This file is copied/modified from kaldi/src/feat/feature-window.h
-
-#ifndef KALDI_NATIVE_FEAT_CSRC_FEATURE_WINDOW_H_
-#define KALDI_NATIVE_FEAT_CSRC_FEATURE_WINDOW_H_
-
-#include <sstream>
-#include <string>
-#include <vector>
-
-#include "kaldi-native-fbank/csrc/log.h"
-
-namespace knf {
-
-inline int32_t RoundUpToNearestPowerOfTwo(int32_t n) {
-  // copied from kaldi/src/base/kaldi-math.cc
-  KNF_CHECK_GT(n, 0);
-  n--;
-  n |= n >> 1;
-  n |= n >> 2;
-  n |= n >> 4;
-  n |= n >> 8;
-  n |= n >> 16;
-  return n + 1;
-}
-
-struct FrameExtractionOptions {
-  float samp_freq = 16000;
-  float frame_shift_ms = 10.0f;   // in milliseconds.
-  float frame_length_ms = 25.0f;  // in milliseconds.
-  float dither = 1.0f;            // Amount of dithering, 0.0 means no dither.
-  float preemph_coeff = 0.97f;    // Preemphasis coefficient.
-  bool remove_dc_offset = true;   // Subtract mean of wave before FFT.
-  std::string window_type = "povey";  // e.g. Hamming window
-  // May be "hamming", "rectangular", "povey", "hanning", "sine", "blackman"
-  // "povey" is a window I made to be similar to Hamming but to go to zero at
-  // the edges, it's pow((0.5 - 0.5*cos(n/N*2*pi)), 0.85).  I just don't think
-  // the Hamming window makes sense as a windowing function.
-  bool round_to_power_of_two = true;
-  float blackman_coeff = 0.42f;
-  bool snip_edges = true;
-  // bool allow_downsample = false;
-  // bool allow_upsample = false;
-
-  // Used for streaming feature extraction. It indicates the number
-  // of feature frames to keep in the recycling vector. -1 means to
-  // keep all feature frames.
-  int32_t max_feature_vectors = -1;
-
-  int32_t WindowShift() const {
-    return static_cast<int32_t>(samp_freq * 0.001f * frame_shift_ms);
-  }
-  int32_t WindowSize() const {
-    return static_cast<int32_t>(samp_freq * 0.001f * frame_length_ms);
-  }
-  int32_t PaddedWindowSize() const {
-    return (round_to_power_of_two ? RoundUpToNearestPowerOfTwo(WindowSize())
-                                  : WindowSize());
-  }
-  std::string ToString() const {
-    std::ostringstream os;
-#define KNF_PRINT(x) os << #x << ": " << x << "\n"
-    KNF_PRINT(samp_freq);
-    KNF_PRINT(frame_shift_ms);
-    KNF_PRINT(frame_length_ms);
-    KNF_PRINT(dither);
-    KNF_PRINT(preemph_coeff);
-    KNF_PRINT(remove_dc_offset);
-    KNF_PRINT(window_type);
-    KNF_PRINT(round_to_power_of_two);
-    KNF_PRINT(blackman_coeff);
-    KNF_PRINT(snip_edges);
-    // KNF_PRINT(allow_downsample);
-    // KNF_PRINT(allow_upsample);
-    KNF_PRINT(max_feature_vectors);
-#undef KNF_PRINT
-    return os.str();
-  }
-};
-
-std::ostream &operator<<(std::ostream &os, const FrameExtractionOptions &opts);
-
-class FeatureWindowFunction {
- public:
-  FeatureWindowFunction() = default;
-  explicit FeatureWindowFunction(const FrameExtractionOptions &opts);
-  /**
-   * @param wave Pointer to a 1-D array of shape [window_size].
-   *             It is modified in-place: wave[i] = wave[i] * window_[i].
-   */
-  void Apply(float *wave) const;
-
- private:
-  std::vector<float> window_;  // of size opts.WindowSize()
-};
-
-int64_t FirstSampleOfFrame(int32_t frame, const FrameExtractionOptions &opts);
-
-/**
-   This function returns the number of frames that we can extract from a wave
-   file with the given number of samples in it (assumed to have the same
-   sampling rate as specified in 'opts').
-
-   @param [in] num_samples The number of samples in the wave file.
-   @param [in] opts     The frame-extraction options class
-
-   @param [in] flush   True if we are asserting that this number of samples
-            is 'all there is', false if we are expecting more data to possibly
-            come in.  This only makes a difference to the answer if
-            opts.snip_edges == false.  For offline feature extraction you
-            always want flush == true.  In an online-decoding context, once
-            you know (or decide) that no more data is coming in, you'd call
-            it with flush == true at the end to flush out any remaining data.
-*/
-int32_t NumFrames(int64_t num_samples, const FrameExtractionOptions &opts,
-                  bool flush = true);
-
-/*
-  ExtractWindow() extracts a windowed frame of waveform (possibly with a
-  power-of-two, padded size, depending on the config), including all the
-  processing done by ProcessWindow().
-
-  @param [in] sample_offset  If 'wave' is not the entire waveform, but
-                   part of it to the left has been discarded, then the
-                   number of samples prior to 'wave' that we have
-                   already discarded.  Set this to zero if you are
-                   processing the entire waveform in one piece, or
-                   if you get 'no matching function' compilation
-                   errors when updating the code.
-  @param [in] wave  The waveform
-  @param [in] f     The frame index to be extracted, with
-                    0 <= f < NumFrames(sample_offset + wave.Dim(), opts, true)
-  @param [in] opts  The options class to be used
-  @param [in] window_function  The windowing function, as derived from the
-                    options class.
-  @param [out] window  The windowed, possibly-padded waveform to be
-                    extracted.  Will be resized as needed.
-  @param [out] log_energy_pre_window  If non-NULL, the log-energy of
-                    the signal prior to pre-emphasis and multiplying by
-                    the windowing function will be written to here.
-*/
-void ExtractWindow(int64_t sample_offset, const std::vector<float> &wave,
-                   int32_t f, const FrameExtractionOptions &opts,
-                   const FeatureWindowFunction &window_function,
-                   std::vector<float> *window,
-                   float *log_energy_pre_window = nullptr);
-
-/**
-  This function does all the windowing steps after actually
-  extracting the windowed signal: depending on the
-  configuration, it does dithering, dc offset removal,
-  preemphasis, and multiplication by the windowing function.
-   @param [in] opts  The options class to be used
-   @param [in] window_function  The windowing function -- should have
-                    been initialized using 'opts'.
-   @param [in,out] window  A vector of size opts.WindowSize().  Note:
-       it will typically be a sub-vector of a larger vector of size
-       opts.PaddedWindowSize(), with the remaining samples zero,
-       as the FFT code is more efficient if it operates on data with
-       power-of-two size.
-   @param [out] log_energy_pre_window If non-NULL, then after dithering and
-       DC offset removal, this function will write to this pointer the log of
-       the total energy (i.e. sum-squared) of the frame.
- */
-void ProcessWindow(const FrameExtractionOptions &opts,
-                   const FeatureWindowFunction &window_function, float *window,
-                   float *log_energy_pre_window = nullptr);
-
-// Compute the inner product of two vectors
-float InnerProduct(const float *a, const float *b, int32_t n);
-
-}  // namespace knf
-
-#endif  // KALDI_NATIVE_FEAT_CSRC_FEATURE_WINDOW_H_
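The two snip_edges policies give slightly different frame counts. For one second of 16 kHz audio with the default 25 ms / 10 ms options (window = 400 samples, shift = 160), the formulas in feature-window.cc work out as below; the numbers are a worked example, not library output:

    #include <cstdint>
    #include <cstdio>

    int main() {
      int64_t num_samples = 16000, window = 400, shift = 160;

      // snip_edges == true: every frame must fit entirely inside the signal.
      int64_t snipped =
          num_samples < window ? 0 : 1 + (num_samples - window) / shift;

      // snip_edges == false (with flush): round samples / shift to nearest.
      int64_t padded = (num_samples + shift / 2) / shift;

      std::printf("%lld %lld\n", (long long)snipped, (long long)padded);  // 98 100
      return 0;
    }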
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/fftsg.c b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/fftsg.c
deleted file mode 100644
index ec8217a2b..000000000
--- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/fftsg.c
+++ /dev/null
@@ -1,3271 +0,0 @@
-/* This file is copied from
- * https://www.kurims.kyoto-u.ac.jp/~ooura/fft.html
- */
-/*
-Fast Fourier/Cosine/Sine Transform
-    dimension   :one
-    data length :power of 2
-    decimation  :frequency
-    radix       :split-radix
-    data        :inplace
-    table       :use
-functions
-    cdft: Complex Discrete Fourier Transform
-    rdft: Real Discrete Fourier Transform
-    ddct: Discrete Cosine Transform
-    ddst: Discrete Sine Transform
-    dfct: Cosine Transform of RDFT (Real Symmetric DFT)
-    dfst: Sine Transform of RDFT (Real Anti-symmetric DFT)
-function prototypes
-    void cdft(int, int, double *, int *, double *);
-    void rdft(int, int, double *, int *, double *);
-    void ddct(int, int, double *, int *, double *);
-    void ddst(int, int, double *, int *, double *);
-    void dfct(int, double *, double *, int *, double *);
-    void dfst(int, double *, double *, int *, double *);
-macro definitions
-    USE_CDFT_PTHREADS : default=not defined
-        CDFT_THREADS_BEGIN_N  : must be >= 512, default=8192
-        CDFT_4THREADS_BEGIN_N : must be >= 512, default=65536
-    USE_CDFT_WINTHREADS : default=not defined
-        CDFT_THREADS_BEGIN_N  : must be >= 512, default=32768
-        CDFT_4THREADS_BEGIN_N : must be >= 512, default=524288
-
-
-    -------- Complex DFT (Discrete Fourier Transform) --------
-    [definition]
-        <case1>
-            X[k] = sum_j=0^n-1 x[j]*exp(2*pi*i*j*k/n), 0<=k<n
-        <case2>
-            X[k] = sum_j=0^n-1 x[j]*exp(-2*pi*i*j*k/n), 0<=k<n
-        (notes: sum_j=0^n-1 is a summation from j=0 to n-1)
-    [usage]
-        <case1>
-            ip[0] = 0; // first time only
-            cdft(2*n, 1, a, ip, w);
-        <case2>
-            ip[0] = 0; // first time only
-            cdft(2*n, -1, a, ip, w);
-    [parameters]
-        2*n          :data length (int)
-                      n >= 1, n = power of 2
-        a[0...2*n-1] :input/output data (double *)
-                      input data
-                          a[2*j] = Re(x[j]),
-                          a[2*j+1] = Im(x[j]), 0<=j<n
-        ip[0...*]    :work area for bit reversal (int *)
-                      length of ip >= 2+sqrt(n)
-                      strictly,
-                      length of ip >=
-                          2+(1<<(int)(log(n+0.5)/log(2))/2).
-                      ip[0],ip[1] are pointers of the cos/sin table.
-        w[0...n/2-1] :cos/sin table (double *)
-                      w[],ip[] are initialized if ip[0] == 0.
-    [remark]
-        Inverse of
-            cdft(2*n, -1, a, ip, w);
-        is
-            cdft(2*n, 1, a, ip, w);
-            for (j = 0; j <= 2 * n - 1; j++) {
-                a[j] *= 1.0 / n;
-            }
-        .
-
-
-    -------- Real DFT / Inverse of Real DFT --------
-    [definition]
-        <case1> RDFT
-            R[k] = sum_j=0^n-1 a[j]*cos(2*pi*j*k/n), 0<=k<=n/2
-            I[k] = sum_j=0^n-1 a[j]*sin(2*pi*j*k/n), 0<k<n/2
-        <case2> IRDFT (excluding scale)
-            a[k] = (R[0] + R[n/2]*cos(pi*k))/2 +
-                   sum_j=1^n/2-1 R[j]*cos(2*pi*j*k/n) +
-                   sum_j=1^n/2-1 I[j]*sin(2*pi*j*k/n), 0<=k<n
-    [usage]
-        <case1>
-            ip[0] = 0; // first time only
-            rdft(n, 1, a, ip, w);
-        <case2>
-            ip[0] = 0; // first time only
-            rdft(n, -1, a, ip, w);
-    [parameters]
-        n            :data length (int)
-                      n >= 2, n = power of 2
-        a[0...n-1]   :input/output data (double *)
-                      <case1>
-                          output data
-                              a[2*k] = R[k], 0<=k<n/2
-                              a[2*k+1] = I[k], 0<k<n/2
-                              a[1] = R[n/2]
-                      <case2>
-                          input data
-                              a[2*j] = R[j], 0<=j<n/2
-                              a[2*j+1] = I[j], 0<j<n/2
-                              a[1] = R[n/2]
-        ip[0...*]    :work area for bit reversal (int *)
-                      length of ip >= 2+sqrt(n/2)
-                      strictly,
-                      length of ip >=
-                          2+(1<<(int)(log(n/2+0.5)/log(2))/2).
-                      ip[0],ip[1] are pointers of the cos/sin table.
-        w[0...n/2-1] :cos/sin table (double *)
-                      w[],ip[] are initialized if ip[0] == 0.
-    [remark]
-        Inverse of
-            rdft(n, 1, a, ip, w);
-        is
-            rdft(n, -1, a, ip, w);
-            for (j = 0; j <= n - 1; j++) {
-                a[j] *= 2.0 / n;
-            }
-        .
-
-
-    -------- DCT (Discrete Cosine Transform) / Inverse of DCT --------
-    [definition]
-        <case1> IDCT (excluding scale)
-            C[k] = sum_j=0^n-1 a[j]*cos(pi*j*(k+1/2)/n), 0<=k<n
-        <case2> DCT
-            C[k] = sum_j=0^n-1 a[j]*cos(pi*(j+1/2)*k/n), 0<=k<n
-    [usage]
-        <case1>
-            ip[0] = 0; // first time only
-            ddct(n, 1, a, ip, w);
-        <case2>
-            ip[0] = 0; // first time only
-            ddct(n, -1, a, ip, w);
-    [parameters]
-        n              :data length (int)
-                        n >= 2, n = power of 2
-        a[0...n-1]     :input/output data (double *)
-                        output data
-                            a[k] = C[k], 0<=k<n
-        ip[0...*]      :work area for bit reversal (int *)
-                        length of ip >= 2+sqrt(n/2)
-                        strictly,
-                        length of ip >=
-                            2+(1<<(int)(log(n/2+0.5)/log(2))/2).
-                        ip[0],ip[1] are pointers of the cos/sin table.
-        w[0...n*5/4-1] :cos/sin table (double *)
-                        w[],ip[] are initialized if ip[0] == 0.
-    [remark]
-        Inverse of
-            ddct(n, -1, a, ip, w);
-        is
-            a[0] *= 0.5;
-            ddct(n, 1, a, ip, w);
-            for (j = 0; j <= n - 1; j++) {
-                a[j] *= 2.0 / n;
-            }
-        .
-
-
-    -------- DST (Discrete Sine Transform) / Inverse of DST --------
-    [definition]
-        <case1> IDST (excluding scale)
-            S[k] = sum_j=1^n A[j]*sin(pi*j*(k+1/2)/n), 0<=k<n
-        <case2> DST
-            S[k] = sum_j=0^n-1 a[j]*sin(pi*(j+1/2)*k/n), 0<k<=n
-    [usage]
-        <case1>
-            ip[0] = 0; // first time only
-            ddst(n, 1, a, ip, w);
-        <case2>
-            ip[0] = 0; // first time only
-            ddst(n, -1, a, ip, w);
-    [parameters]
-        n              :data length (int)
-                        n >= 2, n = power of 2
-        a[0...n-1]     :input/output data (double *)
-                        <case1>
-                            input data
-                                a[j] = A[j], 0<j<n
-                                a[0] = A[n]
-                            output data
-                                a[k] = S[k], 0<=k<n
-                        <case2>
-                            output data
-                                a[k] = S[k], 0<k<n
-                                a[0] = S[n]
-        ip[0...*]      :work area for bit reversal (int *)
-                        length of ip >= 2+sqrt(n/2)
-                        strictly,
-                        length of ip >=
-                            2+(1<<(int)(log(n/2+0.5)/log(2))/2).
-                        ip[0],ip[1] are pointers of the cos/sin table.
-        w[0...n*5/4-1] :cos/sin table (double *)
-                        w[],ip[] are initialized if ip[0] == 0.
-    [remark]
-        Inverse of
-            ddst(n, -1, a, ip, w);
-        is
-            a[0] *= 0.5;
-            ddst(n, 1, a, ip, w);
-            for (j = 0; j <= n - 1; j++) {
-                a[j] *= 2.0 / n;
-            }
-        .
-
-
-    -------- Cosine Transform of RDFT (Real Symmetric DFT) --------
-    [definition]
-        C[k] = sum_j=0^n a[j]*cos(pi*j*k/n), 0<=k<=n
-    [usage]
-        ip[0] = 0; // first time only
-        dfct(n, a, t, ip, w);
-    [parameters]
-        n              :data length - 1 (int)
-                        n >= 2, n = power of 2
-        a[0...n]       :input/output data (double *)
-                        output data
-                            a[k] = C[k], 0<=k<=n
-        t[0...n/2]     :work area (double *)
-        ip[0...*]      :work area for bit reversal (int *)
-                        length of ip >= 2+sqrt(n/4)
-                        strictly,
-                        length of ip >=
-                            2+(1<<(int)(log(n/4+0.5)/log(2))/2).
-                        ip[0],ip[1] are pointers of the cos/sin table.
-        w[0...n*5/8-1] :cos/sin table (double *)
-                        w[],ip[] are initialized if ip[0] == 0.
-    [remark]
-        Inverse of
-            a[0] *= 0.5;
-            a[n] *= 0.5;
-            dfct(n, a, t, ip, w);
-        is
-            a[0] *= 0.5;
-            a[n] *= 0.5;
-            dfct(n, a, t, ip, w);
-            for (j = 0; j <= n; j++) {
-                a[j] *= 2.0 / n;
-            }
-        .
-
-
-    -------- Sine Transform of RDFT (Real Anti-symmetric DFT) --------
-    [definition]
-        S[k] = sum_j=1^n-1 a[j]*sin(pi*j*k/n), 0<k<n
-    [usage]
-        ip[0] = 0; // first time only
-        dfst(n, a, t, ip, w);
-    [parameters]
-        n              :data length + 1 (int)
-                        n >= 2, n = power of 2
-        a[0...n-1]     :input/output data (double *)
-                        output data
-                            a[k] = S[k], 0<k<n
-                        (a[0] is used for work area)
-        t[0...n/2-1]   :work area (double *)
-        ip[0...*]      :work area for bit reversal (int *)
-                        length of ip >= 2+sqrt(n/4)
-                        strictly,
-                        length of ip >=
-                            2+(1<<(int)(log(n/4+0.5)/log(2))/2).
-                        ip[0],ip[1] are pointers of the cos/sin table.
-        w[0...n*5/8-1] :cos/sin table (double *)
-                        w[],ip[] are initialized if ip[0] == 0.
-    [remark]
-        Inverse of
-            dfst(n, a, t, ip, w);
-        is
-            dfst(n, a, t, ip, w);
-            for (j = 1; j <= n - 1; j++) {
-                a[j] *= 2.0 / n;
-            }
-        .
-
-
-Appendix :
-    The cos/sin table is recalculated when a larger table is required.
-    w[] and ip[] are compatible with all routines.
-*/
-
-
-void cdft(int n, int isgn, double *a, int *ip, double *w) {
-    void makewt(int nw, int *ip, double *w);
-    void cftfsub(int n, double *a, int *ip, int nw, double *w);
-    void cftbsub(int n, double *a, int *ip, int nw, double *w);
-    int nw;
-
-    nw = ip[0];
-    if (n > (nw << 2)) {
-        nw = n >> 2;
-        makewt(nw, ip, w);
-    }
-    if (isgn >= 0) {
-        cftfsub(n, a, ip, nw, w);
-    } else {
-        cftbsub(n, a, ip, nw, w);
-    }
-}
-
-
-void rdft(int n, int isgn, double *a, int *ip, double *w) {
-    void makewt(int nw, int *ip, double *w);
-    void makect(int nc, int *ip, double *c);
-    void cftfsub(int n, double *a, int *ip, int nw, double *w);
-    void cftbsub(int n, double *a, int *ip, int nw, double *w);
-    void rftfsub(int n, double *a, int nc, double *c);
-    void rftbsub(int n, double *a, int nc, double *c);
-    int nw, nc;
-    double xi;
-
-    nw = ip[0];
-    if (n > (nw << 2)) {
-        nw = n >> 2;
-        makewt(nw, ip, w);
-    }
-    nc = ip[1];
-    if (n > (nc << 2)) {
-        nc = n >> 2;
-        makect(nc, ip, w + nw);
-    }
-    if (isgn >= 0) {
-        if (n > 4) {
-            cftfsub(n, a, ip, nw, w);
-            rftfsub(n, a, nc, w + nw);
-        } else if (n == 4) {
-            cftfsub(n, a, ip, nw, w);
-        }
-        xi = a[0] - a[1];
-        a[0] += a[1];
-        a[1] = xi;
-    } else {
-        a[1] = 0.5 * (a[0] - a[1]);
-        a[0] -= a[1];
-        if (n > 4) {
-            rftbsub(n, a, nc, w + nw);
-            cftbsub(n, a, ip, nw, w);
-        } else if (n == 4) {
-            cftbsub(n, a, ip, nw, w);
-        }
-    }
-}
-
-
-void ddct(int n, int isgn, double *a, int *ip, double *w) {
-    void makewt(int nw, int *ip, double *w);
-    void makect(int nc, int *ip, double *c);
-    void cftfsub(int n, double *a, int *ip, int nw, double *w);
-    void cftbsub(int n, double *a, int *ip, int nw, double *w);
-    void rftfsub(int n, double *a, int nc, double *c);
-    void rftbsub(int n, double *a, int nc, double *c);
-    void dctsub(int n, double *a, int nc, double *c);
-    int j, nw, nc;
-    double xr;
-
-    nw = ip[0];
-    if (n > (nw << 2)) {
-        nw = n >> 2;
-        makewt(nw, ip, w);
-    }
-    nc = ip[1];
-    if (n > nc) {
-        nc = n;
-        makect(nc, ip, w + nw);
-    }
-    if (isgn < 0) {
-        xr = a[n - 1];
-        for (j = n - 2; j >= 2; j -= 2) {
-            a[j + 1] = a[j] - a[j - 1];
-            a[j] += a[j - 1];
-        }
-        a[1] = a[0] - xr;
-        a[0] += xr;
-        if (n > 4) {
-            rftbsub(n, a, nc, w + nw);
-            cftbsub(n, a, ip, nw, w);
-        } else if (n == 4) {
-            cftbsub(n, a, ip, nw, w);
-        }
-    }
-    dctsub(n, a, nc, w + nw);
-    if (isgn >= 0) {
-        if (n > 4) {
-            cftfsub(n, a, ip, nw, w);
-            rftfsub(n, a, nc, w + nw);
-        } else if (n == 4) {
-            cftfsub(n, a, ip, nw, w);
-        }
-        xr = a[0] - a[1];
-        a[0] += a[1];
-        for (j = 2; j < n; j += 2) {
-            a[j - 1] = a[j] - a[j + 1];
-            a[j] += a[j + 1];
-
} - a[n - 1] = xr; - } -} - - -void ddst(int n, int isgn, double *a, int *ip, double *w) { - void makewt(int nw, int *ip, double *w); - void makect(int nc, int *ip, double *c); - void cftfsub(int n, double *a, int *ip, int nw, double *w); - void cftbsub(int n, double *a, int *ip, int nw, double *w); - void rftfsub(int n, double *a, int nc, double *c); - void rftbsub(int n, double *a, int nc, double *c); - void dstsub(int n, double *a, int nc, double *c); - int j, nw, nc; - double xr; - - nw = ip[0]; - if (n > (nw << 2)) { - nw = n >> 2; - makewt(nw, ip, w); - } - nc = ip[1]; - if (n > nc) { - nc = n; - makect(nc, ip, w + nw); - } - if (isgn < 0) { - xr = a[n - 1]; - for (j = n - 2; j >= 2; j -= 2) { - a[j + 1] = -a[j] - a[j - 1]; - a[j] -= a[j - 1]; - } - a[1] = a[0] + xr; - a[0] -= xr; - if (n > 4) { - rftbsub(n, a, nc, w + nw); - cftbsub(n, a, ip, nw, w); - } else if (n == 4) { - cftbsub(n, a, ip, nw, w); - } - } - dstsub(n, a, nc, w + nw); - if (isgn >= 0) { - if (n > 4) { - cftfsub(n, a, ip, nw, w); - rftfsub(n, a, nc, w + nw); - } else if (n == 4) { - cftfsub(n, a, ip, nw, w); - } - xr = a[0] - a[1]; - a[0] += a[1]; - for (j = 2; j < n; j += 2) { - a[j - 1] = -a[j] - a[j + 1]; - a[j] -= a[j + 1]; - } - a[n - 1] = -xr; - } -} - - -void dfct(int n, double *a, double *t, int *ip, double *w) { - void makewt(int nw, int *ip, double *w); - void makect(int nc, int *ip, double *c); - void cftfsub(int n, double *a, int *ip, int nw, double *w); - void rftfsub(int n, double *a, int nc, double *c); - void dctsub(int n, double *a, int nc, double *c); - int j, k, l, m, mh, nw, nc; - double xr, xi, yr, yi; - - nw = ip[0]; - if (n > (nw << 3)) { - nw = n >> 3; - makewt(nw, ip, w); - } - nc = ip[1]; - if (n > (nc << 1)) { - nc = n >> 1; - makect(nc, ip, w + nw); - } - m = n >> 1; - yi = a[m]; - xi = a[0] + a[n]; - a[0] -= a[n]; - t[0] = xi - yi; - t[m] = xi + yi; - if (n > 2) { - mh = m >> 1; - for (j = 1; j < mh; j++) { - k = m - j; - xr = a[j] - a[n - j]; - xi = a[j] + a[n - j]; - yr = a[k] - a[n - k]; - yi = a[k] + a[n - k]; - a[j] = xr; - a[k] = yr; - t[j] = xi - yi; - t[k] = xi + yi; - } - t[mh] = a[mh] + a[n - mh]; - a[mh] -= a[n - mh]; - dctsub(m, a, nc, w + nw); - if (m > 4) { - cftfsub(m, a, ip, nw, w); - rftfsub(m, a, nc, w + nw); - } else if (m == 4) { - cftfsub(m, a, ip, nw, w); - } - a[n - 1] = a[0] - a[1]; - a[1] = a[0] + a[1]; - for (j = m - 2; j >= 2; j -= 2) { - a[2 * j + 1] = a[j] + a[j + 1]; - a[2 * j - 1] = a[j] - a[j + 1]; - } - l = 2; - m = mh; - while (m >= 2) { - dctsub(m, t, nc, w + nw); - if (m > 4) { - cftfsub(m, t, ip, nw, w); - rftfsub(m, t, nc, w + nw); - } else if (m == 4) { - cftfsub(m, t, ip, nw, w); - } - a[n - l] = t[0] - t[1]; - a[l] = t[0] + t[1]; - k = 0; - for (j = 2; j < m; j += 2) { - k += l << 2; - a[k - l] = t[j] - t[j + 1]; - a[k + l] = t[j] + t[j + 1]; - } - l <<= 1; - mh = m >> 1; - for (j = 0; j < mh; j++) { - k = m - j; - t[j] = t[m + k] - t[m + j]; - t[k] = t[m + k] + t[m + j]; - } - t[mh] = t[m + mh]; - m = mh; - } - a[l] = t[0]; - a[n] = t[2] - t[1]; - a[0] = t[2] + t[1]; - } else { - a[1] = a[0]; - a[2] = t[0]; - a[0] = t[1]; - } -} - - -void dfst(int n, double *a, double *t, int *ip, double *w) { - void makewt(int nw, int *ip, double *w); - void makect(int nc, int *ip, double *c); - void cftfsub(int n, double *a, int *ip, int nw, double *w); - void rftfsub(int n, double *a, int nc, double *c); - void dstsub(int n, double *a, int nc, double *c); - int j, k, l, m, mh, nw, nc; - double xr, xi, yr, yi; - - nw = ip[0]; - if (n > (nw << 3)) { - nw = n >> 
3;
-        makewt(nw, ip, w);
-    }
-    nc = ip[1];
-    if (n > (nc << 1)) {
-        nc = n >> 1;
-        makect(nc, ip, w + nw);
-    }
-    if (n > 2) {
-        m = n >> 1;
-        mh = m >> 1;
-        for (j = 1; j < mh; j++) {
-            k = m - j;
-            xr = a[j] + a[n - j];
-            xi = a[j] - a[n - j];
-            yr = a[k] + a[n - k];
-            yi = a[k] - a[n - k];
-            a[j] = xr;
-            a[k] = yr;
-            t[j] = xi + yi;
-            t[k] = xi - yi;
-        }
-        t[0] = a[mh] - a[n - mh];
-        a[mh] += a[n - mh];
-        a[0] = a[m];
-        dstsub(m, a, nc, w + nw);
-        if (m > 4) {
-            cftfsub(m, a, ip, nw, w);
-            rftfsub(m, a, nc, w + nw);
-        } else if (m == 4) {
-            cftfsub(m, a, ip, nw, w);
-        }
-        a[n - 1] = a[1] - a[0];
-        a[1] = a[0] + a[1];
-        for (j = m - 2; j >= 2; j -= 2) {
-            a[2 * j + 1] = a[j] - a[j + 1];
-            a[2 * j - 1] = -a[j] - a[j + 1];
-        }
-        l = 2;
-        m = mh;
-        while (m >= 2) {
-            dstsub(m, t, nc, w + nw);
-            if (m > 4) {
-                cftfsub(m, t, ip, nw, w);
-                rftfsub(m, t, nc, w + nw);
-            } else if (m == 4) {
-                cftfsub(m, t, ip, nw, w);
-            }
-            a[n - l] = t[1] - t[0];
-            a[l] = t[0] + t[1];
-            k = 0;
-            for (j = 2; j < m; j += 2) {
-                k += l << 2;
-                a[k - l] = -t[j] - t[j + 1];
-                a[k + l] = t[j] - t[j + 1];
-            }
-            l <<= 1;
-            mh = m >> 1;
-            for (j = 1; j < mh; j++) {
-                k = m - j;
-                t[j] = t[m + k] + t[m + j];
-                t[k] = t[m + k] - t[m + j];
-            }
-            t[0] = t[m + mh];
-            m = mh;
-        }
-        a[l] = t[0];
-    }
-    a[0] = 0;
-}
-
-
-/* -------- initializing routines -------- */
-
-
-#include <math.h>
-
-void makewt(int nw, int *ip, double *w) {
-    void makeipt(int nw, int *ip);
-    int j, nwh, nw0, nw1;
-    double delta, wn4r, wk1r, wk1i, wk3r, wk3i;
-
-    ip[0] = nw;
-    ip[1] = 1;
-    if (nw > 2) {
-        nwh = nw >> 1;
-        delta = atan(1.0) / nwh;
-        wn4r = cos(delta * nwh);
-        w[0] = 1;
-        w[1] = wn4r;
-        if (nwh == 4) {
-            w[2] = cos(delta * 2);
-            w[3] = sin(delta * 2);
-        } else if (nwh > 4) {
-            makeipt(nw, ip);
-            w[2] = 0.5 / cos(delta * 2);
-            w[3] = 0.5 / cos(delta * 6);
-            for (j = 4; j < nwh; j += 4) {
-                w[j] = cos(delta * j);
-                w[j + 1] = sin(delta * j);
-                w[j + 2] = cos(3 * delta * j);
-                w[j + 3] = -sin(3 * delta * j);
-            }
-        }
-        nw0 = 0;
-        while (nwh > 2) {
-            nw1 = nw0 + nwh;
-            nwh >>= 1;
-            w[nw1] = 1;
-            w[nw1 + 1] = wn4r;
-            if (nwh == 4) {
-                wk1r = w[nw0 + 4];
-                wk1i = w[nw0 + 5];
-                w[nw1 + 2] = wk1r;
-                w[nw1 + 3] = wk1i;
-            } else if (nwh > 4) {
-                wk1r = w[nw0 + 4];
-                wk3r = w[nw0 + 6];
-                w[nw1 + 2] = 0.5 / wk1r;
-                w[nw1 + 3] = 0.5 / wk3r;
-                for (j = 4; j < nwh; j += 4) {
-                    wk1r = w[nw0 + 2 * j];
-                    wk1i = w[nw0 + 2 * j + 1];
-                    wk3r = w[nw0 + 2 * j + 2];
-                    wk3i = w[nw0 + 2 * j + 3];
-                    w[nw1 + j] = wk1r;
-                    w[nw1 + j + 1] = wk1i;
-                    w[nw1 + j + 2] = wk3r;
-                    w[nw1 + j + 3] = wk3i;
-                }
-            }
-            nw0 = nw1;
-        }
-    }
-}
-
-
-void makeipt(int nw, int *ip) {
-    int j, l, m, m2, p, q;
-
-    ip[2] = 0;
-    ip[3] = 16;
-    m = 2;
-    for (l = nw; l > 32; l >>= 2) {
-        m2 = m << 1;
-        q = m2 << 3;
-        for (j = m; j < m2; j++) {
-            p = ip[j] << 2;
-            ip[m + j] = p;
-            ip[m2 + j] = p + q;
-        }
-        m = m2;
-    }
-}
-
-
-void makect(int nc, int *ip, double *c) {
-    int j, nch;
-    double delta;
-
-    ip[1] = nc;
-    if (nc > 1) {
-        nch = nc >> 1;
-        delta = atan(1.0) / nch;
-        c[0] = cos(delta * nch);
-        c[nch] = 0.5 * c[0];
-        for (j = 1; j < nch; j++) {
-            c[j] = 0.5 * cos(delta * j);
-            c[nc - j] = 0.5 * sin(delta * j);
-        }
-    }
-}
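The ip[0] = 0 convention documented above is the whole initialization contract: makewt()/makect() rebuild the trigonometric tables on the first call and are skipped afterwards as long as the tables are large enough. A minimal forward-transform sketch, assuming fftsg.c is compiled as C and linked in, with generously sized work areas:

    extern "C" void rdft(int n, int isgn, double *a, int *ip, double *w);

    int main() {
      const int n = 8;                         // power of two
      double a[n] = {1, 0, 0, 0, 0, 0, 0, 0};  // unit impulse
      int ip[16];                              // >= 2 + sqrt(n / 2)
      double w[8];                             // >= n / 2
      ip[0] = 0;                               // first time only: build tables
      rdft(n, 1, a, ip, w);
      // a[] now holds R[0], R[n/2], then interleaved (R[k], I[k]) pairs.
      return 0;
    }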
-
-
-/* -------- child routines -------- */
-
-
-#ifdef USE_CDFT_PTHREADS
-#define USE_CDFT_THREADS
-#ifndef CDFT_THREADS_BEGIN_N
-#define CDFT_THREADS_BEGIN_N 8192
-#endif
-#ifndef CDFT_4THREADS_BEGIN_N
-#define CDFT_4THREADS_BEGIN_N 65536
-#endif
-#include <pthread.h>
-#include <stdio.h>
-#include <stdlib.h>
-#define cdft_thread_t pthread_t
-#define cdft_thread_create(thp, func, argp)                   \
-    {                                                         \
-        if (pthread_create(thp, NULL, func, (void *)argp) != 0) { \
-            fprintf(stderr, "cdft thread error\n");           \
-            exit(1);                                          \
-        }                                                     \
-    }
-#define cdft_thread_wait(th)                    \
-    {                                           \
-        if (pthread_join(th, NULL) != 0) {      \
-            fprintf(stderr, "cdft thread error\n"); \
-            exit(1);                            \
-        }                                       \
-    }
-#endif /* USE_CDFT_PTHREADS */
-
-
-#ifdef USE_CDFT_WINTHREADS
-#define USE_CDFT_THREADS
-#ifndef CDFT_THREADS_BEGIN_N
-#define CDFT_THREADS_BEGIN_N 32768
-#endif
-#ifndef CDFT_4THREADS_BEGIN_N
-#define CDFT_4THREADS_BEGIN_N 524288
-#endif
-#include <windows.h>
-#include <stdio.h>
-#include <stdlib.h>
-#define cdft_thread_t HANDLE
-#define cdft_thread_create(thp, func, argp)                                \
-    {                                                                      \
-        DWORD thid;                                                        \
-        *(thp) = CreateThread(                                             \
-            NULL, 0, (LPTHREAD_START_ROUTINE)func, (LPVOID)argp, 0, &thid); \
-        if (*(thp) == 0) {                                                 \
-            fprintf(stderr, "cdft thread error\n");                        \
-            exit(1);                                                       \
-        }                                                                  \
-    }
-#define cdft_thread_wait(th)              \
-    {                                     \
-        WaitForSingleObject(th, INFINITE); \
-        CloseHandle(th);                  \
-    }
-#endif /* USE_CDFT_WINTHREADS */
-
-
-void cftfsub(int n, double *a, int *ip, int nw, double *w) {
-    void bitrv2(int n, int *ip, double *a);
-    void bitrv216(double *a);
-    void bitrv208(double *a);
-    void cftf1st(int n, double *a, double *w);
-    void cftrec4(int n, double *a, int nw, double *w);
-    void cftleaf(int n, int isplt, double *a, int nw, double *w);
-    void cftfx41(int n, double *a, int nw, double *w);
-    void cftf161(double *a, double *w);
-    void cftf081(double *a, double *w);
-    void cftf040(double *a);
-    void cftx020(double *a);
-#ifdef USE_CDFT_THREADS
-    void cftrec4_th(int n, double *a, int nw, double *w);
-#endif /* USE_CDFT_THREADS */
-
-    if (n > 8) {
-        if (n > 32) {
-            cftf1st(n, a, &w[nw - (n >> 2)]);
-#ifdef USE_CDFT_THREADS
-            if (n > CDFT_THREADS_BEGIN_N) {
-                cftrec4_th(n, a, nw, w);
-            } else
-#endif /* USE_CDFT_THREADS */
-            if (n > 512) {
-                cftrec4(n, a, nw, w);
-            } else if (n > 128) {
-                cftleaf(n, 1, a, nw, w);
-            } else {
-                cftfx41(n, a, nw, w);
-            }
-            bitrv2(n, ip, a);
-        } else if (n == 32) {
-            cftf161(a, &w[nw - 8]);
-            bitrv216(a);
-        } else {
-            cftf081(a, w);
-            bitrv208(a);
-        }
-    } else if (n == 8) {
-        cftf040(a);
-    } else if (n == 4) {
-        cftx020(a);
-    }
-}
-
-
-void cftbsub(int n, double *a, int *ip, int nw, double *w) {
-    void bitrv2conj(int n, int *ip, double *a);
-    void bitrv216neg(double *a);
-    void bitrv208neg(double *a);
-    void cftb1st(int n, double *a, double *w);
-    void cftrec4(int n, double *a, int nw, double *w);
-    void cftleaf(int n, int isplt, double *a, int nw, double *w);
-    void cftfx41(int n, double *a, int nw, double *w);
-    void cftf161(double *a, double *w);
-    void cftf081(double *a, double *w);
-    void cftb040(double *a);
-    void cftx020(double *a);
-#ifdef USE_CDFT_THREADS
-    void cftrec4_th(int n, double *a, int nw, double *w);
-#endif /* USE_CDFT_THREADS */
-
-    if (n > 8) {
-        if (n > 32) {
-            cftb1st(n, a, &w[nw - (n >> 2)]);
-#ifdef USE_CDFT_THREADS
-            if (n > CDFT_THREADS_BEGIN_N) {
-                cftrec4_th(n, a, nw, w);
-            } else
-#endif /* USE_CDFT_THREADS */
-            if (n > 512) {
-                cftrec4(n, a, nw, w);
-            } else if (n > 128) {
-                cftleaf(n, 1, a, nw, w);
-            } else {
-                cftfx41(n, a, nw, w);
-            }
-            bitrv2conj(n, ip, a);
-        } else if (n == 32) {
-            cftf161(a, &w[nw - 8]);
-            bitrv216neg(a);
-        } else {
-            cftf081(a, w);
-            bitrv208neg(a);
-        }
-    } else if (n == 8) {
-        cftb040(a);
-    } else if (n == 4) {
-        cftx020(a);
-    }
-}
-
-
-void bitrv2(int n, int *ip, double *a) {
-    int j, j1, k, k1, l, m, nh, nm;
-    double xr, xi, yr, yi;
-
-    m = 1;
-    for (l = n >> 2; l > 8; l >>= 2) {
-        m <<= 1;
-    }
-    nh = n >> 1;
-    nm = 4 * m;
-    if (l == 8) {
-        for (k = 0; k < m; k++) {
-            for (j = 0; j < k; j++) {
-                j1 =
4 * j + 2 * ip[m + k]; - k1 = 4 * k + 2 * ip[m + j]; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 -= nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nh; - k1 += 2; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 += nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += 2; - k1 += nh; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 -= nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nh; - k1 -= 2; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 += nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - } - k1 = 4 * k + 2 * ip[m + k]; - j1 = k1 + 2; - k1 += nh; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 -= nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= 2; - k1 -= nh; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nh + 2; - k1 += nh + 2; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nh - nm; - k1 += 2 * nm - 2; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - } - } else { - for (k = 0; k < m; k++) { - for (j = 0; j < k; j++) { - j1 = 4 * j + ip[m 
+ k]; - k1 = 4 * k + ip[m + j]; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nh; - k1 += 2; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += 2; - k1 += nh; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nh; - k1 -= 2; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - } - k1 = 4 * k + ip[m + k]; - j1 = k1 + 2; - k1 += nh; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += nm; - xr = a[j1]; - xi = a[j1 + 1]; - yr = a[k1]; - yi = a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - } - } -} - - -void bitrv2conj(int n, int *ip, double *a) { - int j, j1, k, k1, l, m, nh, nm; - double xr, xi, yr, yi; - - m = 1; - for (l = n >> 2; l > 8; l >>= 2) { - m <<= 1; - } - nh = n >> 1; - nm = 4 * m; - if (l == 8) { - for (k = 0; k < m; k++) { - for (j = 0; j < k; j++) { - j1 = 4 * j + 2 * ip[m + k]; - k1 = 4 * k + 2 * ip[m + j]; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 -= nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nh; - k1 += 2; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 += nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += 2; - k1 += nh; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 -= nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = 
yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nh; - k1 -= 2; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 += nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - } - k1 = 4 * k + 2 * ip[m + k]; - j1 = k1 + 2; - k1 += nh; - a[j1 - 1] = -a[j1 - 1]; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - a[k1 + 3] = -a[k1 + 3]; - j1 += nm; - k1 += 2 * nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 -= nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= 2; - k1 -= nh; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nh + 2; - k1 += nh + 2; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nh - nm; - k1 += 2 * nm - 2; - a[j1 - 1] = -a[j1 - 1]; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - a[k1 + 3] = -a[k1 + 3]; - } - } else { - for (k = 0; k < m; k++) { - for (j = 0; j < k; j++) { - j1 = 4 * j + ip[m + k]; - k1 = 4 * k + ip[m + j]; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nh; - k1 += 2; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += 2; - k1 += nh; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 += nm; - k1 += nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nh; - k1 -= 2; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - j1 -= nm; - k1 -= nm; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - } - k1 = 4 * k + ip[m + k]; - j1 = k1 + 2; - k1 += nh; - a[j1 - 1] = -a[j1 - 1]; - xr = a[j1]; - xi = -a[j1 + 1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - a[k1 + 3] = -a[k1 + 3]; - j1 += nm; - k1 += nm; - a[j1 - 1] = -a[j1 - 1]; - xr = a[j1]; - xi = -a[j1 + 
1]; - yr = a[k1]; - yi = -a[k1 + 1]; - a[j1] = yr; - a[j1 + 1] = yi; - a[k1] = xr; - a[k1 + 1] = xi; - a[k1 + 3] = -a[k1 + 3]; - } - } -} - - -void bitrv216(double *a) { - double x1r, x1i, x2r, x2i, x3r, x3i, x4r, x4i, x5r, x5i, x7r, x7i, x8r, x8i, - x10r, x10i, x11r, x11i, x12r, x12i, x13r, x13i, x14r, x14i; - - x1r = a[2]; - x1i = a[3]; - x2r = a[4]; - x2i = a[5]; - x3r = a[6]; - x3i = a[7]; - x4r = a[8]; - x4i = a[9]; - x5r = a[10]; - x5i = a[11]; - x7r = a[14]; - x7i = a[15]; - x8r = a[16]; - x8i = a[17]; - x10r = a[20]; - x10i = a[21]; - x11r = a[22]; - x11i = a[23]; - x12r = a[24]; - x12i = a[25]; - x13r = a[26]; - x13i = a[27]; - x14r = a[28]; - x14i = a[29]; - a[2] = x8r; - a[3] = x8i; - a[4] = x4r; - a[5] = x4i; - a[6] = x12r; - a[7] = x12i; - a[8] = x2r; - a[9] = x2i; - a[10] = x10r; - a[11] = x10i; - a[14] = x14r; - a[15] = x14i; - a[16] = x1r; - a[17] = x1i; - a[20] = x5r; - a[21] = x5i; - a[22] = x13r; - a[23] = x13i; - a[24] = x3r; - a[25] = x3i; - a[26] = x11r; - a[27] = x11i; - a[28] = x7r; - a[29] = x7i; -} - - -void bitrv216neg(double *a) { - double x1r, x1i, x2r, x2i, x3r, x3i, x4r, x4i, x5r, x5i, x6r, x6i, x7r, x7i, - x8r, x8i, x9r, x9i, x10r, x10i, x11r, x11i, x12r, x12i, x13r, x13i, - x14r, x14i, x15r, x15i; - - x1r = a[2]; - x1i = a[3]; - x2r = a[4]; - x2i = a[5]; - x3r = a[6]; - x3i = a[7]; - x4r = a[8]; - x4i = a[9]; - x5r = a[10]; - x5i = a[11]; - x6r = a[12]; - x6i = a[13]; - x7r = a[14]; - x7i = a[15]; - x8r = a[16]; - x8i = a[17]; - x9r = a[18]; - x9i = a[19]; - x10r = a[20]; - x10i = a[21]; - x11r = a[22]; - x11i = a[23]; - x12r = a[24]; - x12i = a[25]; - x13r = a[26]; - x13i = a[27]; - x14r = a[28]; - x14i = a[29]; - x15r = a[30]; - x15i = a[31]; - a[2] = x15r; - a[3] = x15i; - a[4] = x7r; - a[5] = x7i; - a[6] = x11r; - a[7] = x11i; - a[8] = x3r; - a[9] = x3i; - a[10] = x13r; - a[11] = x13i; - a[12] = x5r; - a[13] = x5i; - a[14] = x9r; - a[15] = x9i; - a[16] = x1r; - a[17] = x1i; - a[18] = x14r; - a[19] = x14i; - a[20] = x6r; - a[21] = x6i; - a[22] = x10r; - a[23] = x10i; - a[24] = x2r; - a[25] = x2i; - a[26] = x12r; - a[27] = x12i; - a[28] = x4r; - a[29] = x4i; - a[30] = x8r; - a[31] = x8i; -} - - -void bitrv208(double *a) { - double x1r, x1i, x3r, x3i, x4r, x4i, x6r, x6i; - - x1r = a[2]; - x1i = a[3]; - x3r = a[6]; - x3i = a[7]; - x4r = a[8]; - x4i = a[9]; - x6r = a[12]; - x6i = a[13]; - a[2] = x4r; - a[3] = x4i; - a[6] = x6r; - a[7] = x6i; - a[8] = x1r; - a[9] = x1i; - a[12] = x3r; - a[13] = x3i; -} - - -void bitrv208neg(double *a) { - double x1r, x1i, x2r, x2i, x3r, x3i, x4r, x4i, x5r, x5i, x6r, x6i, x7r, x7i; - - x1r = a[2]; - x1i = a[3]; - x2r = a[4]; - x2i = a[5]; - x3r = a[6]; - x3i = a[7]; - x4r = a[8]; - x4i = a[9]; - x5r = a[10]; - x5i = a[11]; - x6r = a[12]; - x6i = a[13]; - x7r = a[14]; - x7i = a[15]; - a[2] = x7r; - a[3] = x7i; - a[4] = x3r; - a[5] = x3i; - a[6] = x5r; - a[7] = x5i; - a[8] = x1r; - a[9] = x1i; - a[10] = x6r; - a[11] = x6i; - a[12] = x2r; - a[13] = x2i; - a[14] = x4r; - a[15] = x4i; -} - - -void cftf1st(int n, double *a, double *w) { - int j, j0, j1, j2, j3, k, m, mh; - double wn4r, csc1, csc3, wk1r, wk1i, wk3r, wk3i, wd1r, wd1i, wd3r, wd3i; - double x0r, x0i, x1r, x1i, x2r, x2i, x3r, x3i, y0r, y0i, y1r, y1i, y2r, y2i, - y3r, y3i; - - mh = n >> 3; - m = 2 * mh; - j1 = m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[0] + a[j2]; - x0i = a[1] + a[j2 + 1]; - x1r = a[0] - a[j2]; - x1i = a[1] - a[j2 + 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - a[0] = x0r + x2r; - a[1] = 
x0i + x2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i - x2i; - a[j2] = x1r - x3i; - a[j2 + 1] = x1i + x3r; - a[j3] = x1r + x3i; - a[j3 + 1] = x1i - x3r; - wn4r = w[1]; - csc1 = w[2]; - csc3 = w[3]; - wd1r = 1; - wd1i = 0; - wd3r = 1; - wd3i = 0; - k = 0; - for (j = 2; j < mh - 2; j += 4) { - k += 4; - wk1r = csc1 * (wd1r + w[k]); - wk1i = csc1 * (wd1i + w[k + 1]); - wk3r = csc3 * (wd3r + w[k + 2]); - wk3i = csc3 * (wd3i + w[k + 3]); - wd1r = w[k]; - wd1i = w[k + 1]; - wd3r = w[k + 2]; - wd3i = w[k + 3]; - j1 = j + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j] + a[j2]; - x0i = a[j + 1] + a[j2 + 1]; - x1r = a[j] - a[j2]; - x1i = a[j + 1] - a[j2 + 1]; - y0r = a[j + 2] + a[j2 + 2]; - y0i = a[j + 3] + a[j2 + 3]; - y1r = a[j + 2] - a[j2 + 2]; - y1i = a[j + 3] - a[j2 + 3]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - y2r = a[j1 + 2] + a[j3 + 2]; - y2i = a[j1 + 3] + a[j3 + 3]; - y3r = a[j1 + 2] - a[j3 + 2]; - y3i = a[j1 + 3] - a[j3 + 3]; - a[j] = x0r + x2r; - a[j + 1] = x0i + x2i; - a[j + 2] = y0r + y2r; - a[j + 3] = y0i + y2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i - x2i; - a[j1 + 2] = y0r - y2r; - a[j1 + 3] = y0i - y2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - a[j2] = wk1r * x0r - wk1i * x0i; - a[j2 + 1] = wk1r * x0i + wk1i * x0r; - x0r = y1r - y3i; - x0i = y1i + y3r; - a[j2 + 2] = wd1r * x0r - wd1i * x0i; - a[j2 + 3] = wd1r * x0i + wd1i * x0r; - x0r = x1r + x3i; - x0i = x1i - x3r; - a[j3] = wk3r * x0r + wk3i * x0i; - a[j3 + 1] = wk3r * x0i - wk3i * x0r; - x0r = y1r + y3i; - x0i = y1i - y3r; - a[j3 + 2] = wd3r * x0r + wd3i * x0i; - a[j3 + 3] = wd3r * x0i - wd3i * x0r; - j0 = m - j; - j1 = j0 + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j0] + a[j2]; - x0i = a[j0 + 1] + a[j2 + 1]; - x1r = a[j0] - a[j2]; - x1i = a[j0 + 1] - a[j2 + 1]; - y0r = a[j0 - 2] + a[j2 - 2]; - y0i = a[j0 - 1] + a[j2 - 1]; - y1r = a[j0 - 2] - a[j2 - 2]; - y1i = a[j0 - 1] - a[j2 - 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - y2r = a[j1 - 2] + a[j3 - 2]; - y2i = a[j1 - 1] + a[j3 - 1]; - y3r = a[j1 - 2] - a[j3 - 2]; - y3i = a[j1 - 1] - a[j3 - 1]; - a[j0] = x0r + x2r; - a[j0 + 1] = x0i + x2i; - a[j0 - 2] = y0r + y2r; - a[j0 - 1] = y0i + y2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i - x2i; - a[j1 - 2] = y0r - y2r; - a[j1 - 1] = y0i - y2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - a[j2] = wk1i * x0r - wk1r * x0i; - a[j2 + 1] = wk1i * x0i + wk1r * x0r; - x0r = y1r - y3i; - x0i = y1i + y3r; - a[j2 - 2] = wd1i * x0r - wd1r * x0i; - a[j2 - 1] = wd1i * x0i + wd1r * x0r; - x0r = x1r + x3i; - x0i = x1i - x3r; - a[j3] = wk3i * x0r + wk3r * x0i; - a[j3 + 1] = wk3i * x0i - wk3r * x0r; - x0r = y1r + y3i; - x0i = y1i - y3r; - a[j3 - 2] = wd3i * x0r + wd3r * x0i; - a[j3 - 1] = wd3i * x0i - wd3r * x0r; - } - wk1r = csc1 * (wd1r + wn4r); - wk1i = csc1 * (wd1i + wn4r); - wk3r = csc3 * (wd3r - wn4r); - wk3i = csc3 * (wd3i - wn4r); - j0 = mh; - j1 = j0 + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j0 - 2] + a[j2 - 2]; - x0i = a[j0 - 1] + a[j2 - 1]; - x1r = a[j0 - 2] - a[j2 - 2]; - x1i = a[j0 - 1] - a[j2 - 1]; - x2r = a[j1 - 2] + a[j3 - 2]; - x2i = a[j1 - 1] + a[j3 - 1]; - x3r = a[j1 - 2] - a[j3 - 2]; - x3i = a[j1 - 1] - a[j3 - 1]; - a[j0 - 2] = x0r + x2r; - a[j0 - 1] = x0i + x2i; - a[j1 - 2] = x0r - x2r; - a[j1 - 1] = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - a[j2 - 2] = wk1r * x0r - wk1i * x0i; - a[j2 - 1] = wk1r * x0i + wk1i * x0r; - x0r = x1r + x3i; - x0i = x1i - x3r; - a[j3 - 2] = wk3r * x0r + wk3i * x0i; - a[j3 - 
1] = wk3r * x0i - wk3i * x0r; - x0r = a[j0] + a[j2]; - x0i = a[j0 + 1] + a[j2 + 1]; - x1r = a[j0] - a[j2]; - x1i = a[j0 + 1] - a[j2 + 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - a[j0] = x0r + x2r; - a[j0 + 1] = x0i + x2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - a[j2] = wn4r * (x0r - x0i); - a[j2 + 1] = wn4r * (x0i + x0r); - x0r = x1r + x3i; - x0i = x1i - x3r; - a[j3] = -wn4r * (x0r + x0i); - a[j3 + 1] = -wn4r * (x0i - x0r); - x0r = a[j0 + 2] + a[j2 + 2]; - x0i = a[j0 + 3] + a[j2 + 3]; - x1r = a[j0 + 2] - a[j2 + 2]; - x1i = a[j0 + 3] - a[j2 + 3]; - x2r = a[j1 + 2] + a[j3 + 2]; - x2i = a[j1 + 3] + a[j3 + 3]; - x3r = a[j1 + 2] - a[j3 + 2]; - x3i = a[j1 + 3] - a[j3 + 3]; - a[j0 + 2] = x0r + x2r; - a[j0 + 3] = x0i + x2i; - a[j1 + 2] = x0r - x2r; - a[j1 + 3] = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - a[j2 + 2] = wk1i * x0r - wk1r * x0i; - a[j2 + 3] = wk1i * x0i + wk1r * x0r; - x0r = x1r + x3i; - x0i = x1i - x3r; - a[j3 + 2] = wk3i * x0r + wk3r * x0i; - a[j3 + 3] = wk3i * x0i - wk3r * x0r; -} - - -void cftb1st(int n, double *a, double *w) { - int j, j0, j1, j2, j3, k, m, mh; - double wn4r, csc1, csc3, wk1r, wk1i, wk3r, wk3i, wd1r, wd1i, wd3r, wd3i; - double x0r, x0i, x1r, x1i, x2r, x2i, x3r, x3i, y0r, y0i, y1r, y1i, y2r, y2i, - y3r, y3i; - - mh = n >> 3; - m = 2 * mh; - j1 = m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[0] + a[j2]; - x0i = -a[1] - a[j2 + 1]; - x1r = a[0] - a[j2]; - x1i = -a[1] + a[j2 + 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - a[0] = x0r + x2r; - a[1] = x0i - x2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i + x2i; - a[j2] = x1r + x3i; - a[j2 + 1] = x1i + x3r; - a[j3] = x1r - x3i; - a[j3 + 1] = x1i - x3r; - wn4r = w[1]; - csc1 = w[2]; - csc3 = w[3]; - wd1r = 1; - wd1i = 0; - wd3r = 1; - wd3i = 0; - k = 0; - for (j = 2; j < mh - 2; j += 4) { - k += 4; - wk1r = csc1 * (wd1r + w[k]); - wk1i = csc1 * (wd1i + w[k + 1]); - wk3r = csc3 * (wd3r + w[k + 2]); - wk3i = csc3 * (wd3i + w[k + 3]); - wd1r = w[k]; - wd1i = w[k + 1]; - wd3r = w[k + 2]; - wd3i = w[k + 3]; - j1 = j + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j] + a[j2]; - x0i = -a[j + 1] - a[j2 + 1]; - x1r = a[j] - a[j2]; - x1i = -a[j + 1] + a[j2 + 1]; - y0r = a[j + 2] + a[j2 + 2]; - y0i = -a[j + 3] - a[j2 + 3]; - y1r = a[j + 2] - a[j2 + 2]; - y1i = -a[j + 3] + a[j2 + 3]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - y2r = a[j1 + 2] + a[j3 + 2]; - y2i = a[j1 + 3] + a[j3 + 3]; - y3r = a[j1 + 2] - a[j3 + 2]; - y3i = a[j1 + 3] - a[j3 + 3]; - a[j] = x0r + x2r; - a[j + 1] = x0i - x2i; - a[j + 2] = y0r + y2r; - a[j + 3] = y0i - y2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i + x2i; - a[j1 + 2] = y0r - y2r; - a[j1 + 3] = y0i + y2i; - x0r = x1r + x3i; - x0i = x1i + x3r; - a[j2] = wk1r * x0r - wk1i * x0i; - a[j2 + 1] = wk1r * x0i + wk1i * x0r; - x0r = y1r + y3i; - x0i = y1i + y3r; - a[j2 + 2] = wd1r * x0r - wd1i * x0i; - a[j2 + 3] = wd1r * x0i + wd1i * x0r; - x0r = x1r - x3i; - x0i = x1i - x3r; - a[j3] = wk3r * x0r + wk3i * x0i; - a[j3 + 1] = wk3r * x0i - wk3i * x0r; - x0r = y1r - y3i; - x0i = y1i - y3r; - a[j3 + 2] = wd3r * x0r + wd3i * x0i; - a[j3 + 3] = wd3r * x0i - wd3i * x0r; - j0 = m - j; - j1 = j0 + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j0] + a[j2]; - x0i = -a[j0 + 1] - a[j2 + 1]; - x1r = a[j0] - a[j2]; - x1i = -a[j0 + 1] + a[j2 + 1]; - y0r = a[j0 - 2] + a[j2 - 2]; - y0i = -a[j0 - 1] 
- a[j2 - 1]; - y1r = a[j0 - 2] - a[j2 - 2]; - y1i = -a[j0 - 1] + a[j2 - 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - y2r = a[j1 - 2] + a[j3 - 2]; - y2i = a[j1 - 1] + a[j3 - 1]; - y3r = a[j1 - 2] - a[j3 - 2]; - y3i = a[j1 - 1] - a[j3 - 1]; - a[j0] = x0r + x2r; - a[j0 + 1] = x0i - x2i; - a[j0 - 2] = y0r + y2r; - a[j0 - 1] = y0i - y2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i + x2i; - a[j1 - 2] = y0r - y2r; - a[j1 - 1] = y0i + y2i; - x0r = x1r + x3i; - x0i = x1i + x3r; - a[j2] = wk1i * x0r - wk1r * x0i; - a[j2 + 1] = wk1i * x0i + wk1r * x0r; - x0r = y1r + y3i; - x0i = y1i + y3r; - a[j2 - 2] = wd1i * x0r - wd1r * x0i; - a[j2 - 1] = wd1i * x0i + wd1r * x0r; - x0r = x1r - x3i; - x0i = x1i - x3r; - a[j3] = wk3i * x0r + wk3r * x0i; - a[j3 + 1] = wk3i * x0i - wk3r * x0r; - x0r = y1r - y3i; - x0i = y1i - y3r; - a[j3 - 2] = wd3i * x0r + wd3r * x0i; - a[j3 - 1] = wd3i * x0i - wd3r * x0r; - } - wk1r = csc1 * (wd1r + wn4r); - wk1i = csc1 * (wd1i + wn4r); - wk3r = csc3 * (wd3r - wn4r); - wk3i = csc3 * (wd3i - wn4r); - j0 = mh; - j1 = j0 + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j0 - 2] + a[j2 - 2]; - x0i = -a[j0 - 1] - a[j2 - 1]; - x1r = a[j0 - 2] - a[j2 - 2]; - x1i = -a[j0 - 1] + a[j2 - 1]; - x2r = a[j1 - 2] + a[j3 - 2]; - x2i = a[j1 - 1] + a[j3 - 1]; - x3r = a[j1 - 2] - a[j3 - 2]; - x3i = a[j1 - 1] - a[j3 - 1]; - a[j0 - 2] = x0r + x2r; - a[j0 - 1] = x0i - x2i; - a[j1 - 2] = x0r - x2r; - a[j1 - 1] = x0i + x2i; - x0r = x1r + x3i; - x0i = x1i + x3r; - a[j2 - 2] = wk1r * x0r - wk1i * x0i; - a[j2 - 1] = wk1r * x0i + wk1i * x0r; - x0r = x1r - x3i; - x0i = x1i - x3r; - a[j3 - 2] = wk3r * x0r + wk3i * x0i; - a[j3 - 1] = wk3r * x0i - wk3i * x0r; - x0r = a[j0] + a[j2]; - x0i = -a[j0 + 1] - a[j2 + 1]; - x1r = a[j0] - a[j2]; - x1i = -a[j0 + 1] + a[j2 + 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - a[j0] = x0r + x2r; - a[j0 + 1] = x0i - x2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i + x2i; - x0r = x1r + x3i; - x0i = x1i + x3r; - a[j2] = wn4r * (x0r - x0i); - a[j2 + 1] = wn4r * (x0i + x0r); - x0r = x1r - x3i; - x0i = x1i - x3r; - a[j3] = -wn4r * (x0r + x0i); - a[j3 + 1] = -wn4r * (x0i - x0r); - x0r = a[j0 + 2] + a[j2 + 2]; - x0i = -a[j0 + 3] - a[j2 + 3]; - x1r = a[j0 + 2] - a[j2 + 2]; - x1i = -a[j0 + 3] + a[j2 + 3]; - x2r = a[j1 + 2] + a[j3 + 2]; - x2i = a[j1 + 3] + a[j3 + 3]; - x3r = a[j1 + 2] - a[j3 + 2]; - x3i = a[j1 + 3] - a[j3 + 3]; - a[j0 + 2] = x0r + x2r; - a[j0 + 3] = x0i - x2i; - a[j1 + 2] = x0r - x2r; - a[j1 + 3] = x0i + x2i; - x0r = x1r + x3i; - x0i = x1i + x3r; - a[j2 + 2] = wk1i * x0r - wk1r * x0i; - a[j2 + 3] = wk1i * x0i + wk1r * x0r; - x0r = x1r - x3i; - x0i = x1i - x3r; - a[j3 + 2] = wk3i * x0r + wk3r * x0i; - a[j3 + 3] = wk3i * x0i - wk3r * x0r; -} - - -#ifdef USE_CDFT_THREADS -struct cdft_arg_st { - int n0; - int n; - double *a; - int nw; - double *w; -}; -typedef struct cdft_arg_st cdft_arg_t; - - -void cftrec4_th(int n, double *a, int nw, double *w) { - void *cftrec1_th(void *p); - void *cftrec2_th(void *p); - int i, idiv4, m, nthread; - cdft_thread_t th[4]; - cdft_arg_t ag[4]; - - nthread = 2; - idiv4 = 0; - m = n >> 1; - if (n > CDFT_4THREADS_BEGIN_N) { - nthread = 4; - idiv4 = 1; - m >>= 1; - } - for (i = 0; i < nthread; i++) { - ag[i].n0 = n; - ag[i].n = m; - ag[i].a = &a[i * m]; - ag[i].nw = nw; - ag[i].w = w; - if (i != idiv4) { - cdft_thread_create(&th[i], cftrec1_th, &ag[i]); - } else { - cdft_thread_create(&th[i], cftrec2_th, &ag[i]); - } - } - for (i = 
0; i < nthread; i++) { - cdft_thread_wait(th[i]); - } -} - - -void *cftrec1_th(void *p) { - int cfttree(int n, int j, int k, double *a, int nw, double *w); - void cftleaf(int n, int isplt, double *a, int nw, double *w); - void cftmdl1(int n, double *a, double *w); - int isplt, j, k, m, n, n0, nw; - double *a, *w; - - n0 = ((cdft_arg_t *)p)->n0; - n = ((cdft_arg_t *)p)->n; - a = ((cdft_arg_t *)p)->a; - nw = ((cdft_arg_t *)p)->nw; - w = ((cdft_arg_t *)p)->w; - m = n0; - while (m > 512) { - m >>= 2; - cftmdl1(m, &a[n - m], &w[nw - (m >> 1)]); - } - cftleaf(m, 1, &a[n - m], nw, w); - k = 0; - for (j = n - m; j > 0; j -= m) { - k++; - isplt = cfttree(m, j, k, a, nw, w); - cftleaf(m, isplt, &a[j - m], nw, w); - } - return (void *)0; -} - - -void *cftrec2_th(void *p) { - int cfttree(int n, int j, int k, double *a, int nw, double *w); - void cftleaf(int n, int isplt, double *a, int nw, double *w); - void cftmdl2(int n, double *a, double *w); - int isplt, j, k, m, n, n0, nw; - double *a, *w; - - n0 = ((cdft_arg_t *)p)->n0; - n = ((cdft_arg_t *)p)->n; - a = ((cdft_arg_t *)p)->a; - nw = ((cdft_arg_t *)p)->nw; - w = ((cdft_arg_t *)p)->w; - k = 1; - m = n0; - while (m > 512) { - m >>= 2; - k <<= 2; - cftmdl2(m, &a[n - m], &w[nw - m]); - } - cftleaf(m, 0, &a[n - m], nw, w); - k >>= 1; - for (j = n - m; j > 0; j -= m) { - k++; - isplt = cfttree(m, j, k, a, nw, w); - cftleaf(m, isplt, &a[j - m], nw, w); - } - return (void *)0; -} -#endif /* USE_CDFT_THREADS */ - - -void cftrec4(int n, double *a, int nw, double *w) { - int cfttree(int n, int j, int k, double *a, int nw, double *w); - void cftleaf(int n, int isplt, double *a, int nw, double *w); - void cftmdl1(int n, double *a, double *w); - int isplt, j, k, m; - - m = n; - while (m > 512) { - m >>= 2; - cftmdl1(m, &a[n - m], &w[nw - (m >> 1)]); - } - cftleaf(m, 1, &a[n - m], nw, w); - k = 0; - for (j = n - m; j > 0; j -= m) { - k++; - isplt = cfttree(m, j, k, a, nw, w); - cftleaf(m, isplt, &a[j - m], nw, w); - } -} - - -int cfttree(int n, int j, int k, double *a, int nw, double *w) { - void cftmdl1(int n, double *a, double *w); - void cftmdl2(int n, double *a, double *w); - int i, isplt, m; - - if ((k & 3) != 0) { - isplt = k & 1; - if (isplt != 0) { - cftmdl1(n, &a[j - n], &w[nw - (n >> 1)]); - } else { - cftmdl2(n, &a[j - n], &w[nw - n]); - } - } else { - m = n; - for (i = k; (i & 3) == 0; i >>= 2) { - m <<= 2; - } - isplt = i & 1; - if (isplt != 0) { - while (m > 128) { - cftmdl1(m, &a[j - m], &w[nw - (m >> 1)]); - m >>= 2; - } - } else { - while (m > 128) { - cftmdl2(m, &a[j - m], &w[nw - m]); - m >>= 2; - } - } - } - return isplt; -} - - -void cftleaf(int n, int isplt, double *a, int nw, double *w) { - void cftmdl1(int n, double *a, double *w); - void cftmdl2(int n, double *a, double *w); - void cftf161(double *a, double *w); - void cftf162(double *a, double *w); - void cftf081(double *a, double *w); - void cftf082(double *a, double *w); - - if (n == 512) { - cftmdl1(128, a, &w[nw - 64]); - cftf161(a, &w[nw - 8]); - cftf162(&a[32], &w[nw - 32]); - cftf161(&a[64], &w[nw - 8]); - cftf161(&a[96], &w[nw - 8]); - cftmdl2(128, &a[128], &w[nw - 128]); - cftf161(&a[128], &w[nw - 8]); - cftf162(&a[160], &w[nw - 32]); - cftf161(&a[192], &w[nw - 8]); - cftf162(&a[224], &w[nw - 32]); - cftmdl1(128, &a[256], &w[nw - 64]); - cftf161(&a[256], &w[nw - 8]); - cftf162(&a[288], &w[nw - 32]); - cftf161(&a[320], &w[nw - 8]); - cftf161(&a[352], &w[nw - 8]); - if (isplt != 0) { - cftmdl1(128, &a[384], &w[nw - 64]); - cftf161(&a[480], &w[nw - 8]); - } else { - cftmdl2(128, 
&a[384], &w[nw - 128]); - cftf162(&a[480], &w[nw - 32]); - } - cftf161(&a[384], &w[nw - 8]); - cftf162(&a[416], &w[nw - 32]); - cftf161(&a[448], &w[nw - 8]); - } else { - cftmdl1(64, a, &w[nw - 32]); - cftf081(a, &w[nw - 8]); - cftf082(&a[16], &w[nw - 8]); - cftf081(&a[32], &w[nw - 8]); - cftf081(&a[48], &w[nw - 8]); - cftmdl2(64, &a[64], &w[nw - 64]); - cftf081(&a[64], &w[nw - 8]); - cftf082(&a[80], &w[nw - 8]); - cftf081(&a[96], &w[nw - 8]); - cftf082(&a[112], &w[nw - 8]); - cftmdl1(64, &a[128], &w[nw - 32]); - cftf081(&a[128], &w[nw - 8]); - cftf082(&a[144], &w[nw - 8]); - cftf081(&a[160], &w[nw - 8]); - cftf081(&a[176], &w[nw - 8]); - if (isplt != 0) { - cftmdl1(64, &a[192], &w[nw - 32]); - cftf081(&a[240], &w[nw - 8]); - } else { - cftmdl2(64, &a[192], &w[nw - 64]); - cftf082(&a[240], &w[nw - 8]); - } - cftf081(&a[192], &w[nw - 8]); - cftf082(&a[208], &w[nw - 8]); - cftf081(&a[224], &w[nw - 8]); - } -} - - -void cftmdl1(int n, double *a, double *w) { - int j, j0, j1, j2, j3, k, m, mh; - double wn4r, wk1r, wk1i, wk3r, wk3i; - double x0r, x0i, x1r, x1i, x2r, x2i, x3r, x3i; - - mh = n >> 3; - m = 2 * mh; - j1 = m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[0] + a[j2]; - x0i = a[1] + a[j2 + 1]; - x1r = a[0] - a[j2]; - x1i = a[1] - a[j2 + 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - a[0] = x0r + x2r; - a[1] = x0i + x2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i - x2i; - a[j2] = x1r - x3i; - a[j2 + 1] = x1i + x3r; - a[j3] = x1r + x3i; - a[j3 + 1] = x1i - x3r; - wn4r = w[1]; - k = 0; - for (j = 2; j < mh; j += 2) { - k += 4; - wk1r = w[k]; - wk1i = w[k + 1]; - wk3r = w[k + 2]; - wk3i = w[k + 3]; - j1 = j + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j] + a[j2]; - x0i = a[j + 1] + a[j2 + 1]; - x1r = a[j] - a[j2]; - x1i = a[j + 1] - a[j2 + 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - a[j] = x0r + x2r; - a[j + 1] = x0i + x2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - a[j2] = wk1r * x0r - wk1i * x0i; - a[j2 + 1] = wk1r * x0i + wk1i * x0r; - x0r = x1r + x3i; - x0i = x1i - x3r; - a[j3] = wk3r * x0r + wk3i * x0i; - a[j3 + 1] = wk3r * x0i - wk3i * x0r; - j0 = m - j; - j1 = j0 + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j0] + a[j2]; - x0i = a[j0 + 1] + a[j2 + 1]; - x1r = a[j0] - a[j2]; - x1i = a[j0 + 1] - a[j2 + 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - a[j0] = x0r + x2r; - a[j0 + 1] = x0i + x2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - a[j2] = wk1i * x0r - wk1r * x0i; - a[j2 + 1] = wk1i * x0i + wk1r * x0r; - x0r = x1r + x3i; - x0i = x1i - x3r; - a[j3] = wk3i * x0r + wk3r * x0i; - a[j3 + 1] = wk3i * x0i - wk3r * x0r; - } - j0 = mh; - j1 = j0 + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j0] + a[j2]; - x0i = a[j0 + 1] + a[j2 + 1]; - x1r = a[j0] - a[j2]; - x1i = a[j0 + 1] - a[j2 + 1]; - x2r = a[j1] + a[j3]; - x2i = a[j1 + 1] + a[j3 + 1]; - x3r = a[j1] - a[j3]; - x3i = a[j1 + 1] - a[j3 + 1]; - a[j0] = x0r + x2r; - a[j0 + 1] = x0i + x2i; - a[j1] = x0r - x2r; - a[j1 + 1] = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - a[j2] = wn4r * (x0r - x0i); - a[j2 + 1] = wn4r * (x0i + x0r); - x0r = x1r + x3i; - x0i = x1i - x3r; - a[j3] = -wn4r * (x0r + x0i); - a[j3 + 1] = -wn4r * (x0i - x0r); -} - - -void cftmdl2(int n, double *a, double *w) { - int j, j0, j1, j2, j3, k, kr, m, mh; - double wn4r, wk1r, wk1i, wk3r, wk3i, wd1r, 
wd1i, wd3r, wd3i; - double x0r, x0i, x1r, x1i, x2r, x2i, x3r, x3i, y0r, y0i, y2r, y2i; - - mh = n >> 3; - m = 2 * mh; - wn4r = w[1]; - j1 = m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[0] - a[j2 + 1]; - x0i = a[1] + a[j2]; - x1r = a[0] + a[j2 + 1]; - x1i = a[1] - a[j2]; - x2r = a[j1] - a[j3 + 1]; - x2i = a[j1 + 1] + a[j3]; - x3r = a[j1] + a[j3 + 1]; - x3i = a[j1 + 1] - a[j3]; - y0r = wn4r * (x2r - x2i); - y0i = wn4r * (x2i + x2r); - a[0] = x0r + y0r; - a[1] = x0i + y0i; - a[j1] = x0r - y0r; - a[j1 + 1] = x0i - y0i; - y0r = wn4r * (x3r - x3i); - y0i = wn4r * (x3i + x3r); - a[j2] = x1r - y0i; - a[j2 + 1] = x1i + y0r; - a[j3] = x1r + y0i; - a[j3 + 1] = x1i - y0r; - k = 0; - kr = 2 * m; - for (j = 2; j < mh; j += 2) { - k += 4; - wk1r = w[k]; - wk1i = w[k + 1]; - wk3r = w[k + 2]; - wk3i = w[k + 3]; - kr -= 4; - wd1i = w[kr]; - wd1r = w[kr + 1]; - wd3i = w[kr + 2]; - wd3r = w[kr + 3]; - j1 = j + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j] - a[j2 + 1]; - x0i = a[j + 1] + a[j2]; - x1r = a[j] + a[j2 + 1]; - x1i = a[j + 1] - a[j2]; - x2r = a[j1] - a[j3 + 1]; - x2i = a[j1 + 1] + a[j3]; - x3r = a[j1] + a[j3 + 1]; - x3i = a[j1 + 1] - a[j3]; - y0r = wk1r * x0r - wk1i * x0i; - y0i = wk1r * x0i + wk1i * x0r; - y2r = wd1r * x2r - wd1i * x2i; - y2i = wd1r * x2i + wd1i * x2r; - a[j] = y0r + y2r; - a[j + 1] = y0i + y2i; - a[j1] = y0r - y2r; - a[j1 + 1] = y0i - y2i; - y0r = wk3r * x1r + wk3i * x1i; - y0i = wk3r * x1i - wk3i * x1r; - y2r = wd3r * x3r + wd3i * x3i; - y2i = wd3r * x3i - wd3i * x3r; - a[j2] = y0r + y2r; - a[j2 + 1] = y0i + y2i; - a[j3] = y0r - y2r; - a[j3 + 1] = y0i - y2i; - j0 = m - j; - j1 = j0 + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j0] - a[j2 + 1]; - x0i = a[j0 + 1] + a[j2]; - x1r = a[j0] + a[j2 + 1]; - x1i = a[j0 + 1] - a[j2]; - x2r = a[j1] - a[j3 + 1]; - x2i = a[j1 + 1] + a[j3]; - x3r = a[j1] + a[j3 + 1]; - x3i = a[j1 + 1] - a[j3]; - y0r = wd1i * x0r - wd1r * x0i; - y0i = wd1i * x0i + wd1r * x0r; - y2r = wk1i * x2r - wk1r * x2i; - y2i = wk1i * x2i + wk1r * x2r; - a[j0] = y0r + y2r; - a[j0 + 1] = y0i + y2i; - a[j1] = y0r - y2r; - a[j1 + 1] = y0i - y2i; - y0r = wd3i * x1r + wd3r * x1i; - y0i = wd3i * x1i - wd3r * x1r; - y2r = wk3i * x3r + wk3r * x3i; - y2i = wk3i * x3i - wk3r * x3r; - a[j2] = y0r + y2r; - a[j2 + 1] = y0i + y2i; - a[j3] = y0r - y2r; - a[j3 + 1] = y0i - y2i; - } - wk1r = w[m]; - wk1i = w[m + 1]; - j0 = mh; - j1 = j0 + m; - j2 = j1 + m; - j3 = j2 + m; - x0r = a[j0] - a[j2 + 1]; - x0i = a[j0 + 1] + a[j2]; - x1r = a[j0] + a[j2 + 1]; - x1i = a[j0 + 1] - a[j2]; - x2r = a[j1] - a[j3 + 1]; - x2i = a[j1 + 1] + a[j3]; - x3r = a[j1] + a[j3 + 1]; - x3i = a[j1 + 1] - a[j3]; - y0r = wk1r * x0r - wk1i * x0i; - y0i = wk1r * x0i + wk1i * x0r; - y2r = wk1i * x2r - wk1r * x2i; - y2i = wk1i * x2i + wk1r * x2r; - a[j0] = y0r + y2r; - a[j0 + 1] = y0i + y2i; - a[j1] = y0r - y2r; - a[j1 + 1] = y0i - y2i; - y0r = wk1i * x1r - wk1r * x1i; - y0i = wk1i * x1i + wk1r * x1r; - y2r = wk1r * x3r - wk1i * x3i; - y2i = wk1r * x3i + wk1i * x3r; - a[j2] = y0r - y2r; - a[j2 + 1] = y0i - y2i; - a[j3] = y0r + y2r; - a[j3 + 1] = y0i + y2i; -} - - -void cftfx41(int n, double *a, int nw, double *w) { - void cftf161(double *a, double *w); - void cftf162(double *a, double *w); - void cftf081(double *a, double *w); - void cftf082(double *a, double *w); - - if (n == 128) { - cftf161(a, &w[nw - 8]); - cftf162(&a[32], &w[nw - 32]); - cftf161(&a[64], &w[nw - 8]); - cftf161(&a[96], &w[nw - 8]); - } else { - cftf081(a, &w[nw - 8]); - cftf082(&a[16], &w[nw - 8]); - cftf081(&a[32], &w[nw - 8]); - cftf081(&a[48], &w[nw - 
8]); - } -} - - -void cftf161(double *a, double *w) { - double wn4r, wk1r, wk1i, x0r, x0i, x1r, x1i, x2r, x2i, x3r, x3i, y0r, y0i, - y1r, y1i, y2r, y2i, y3r, y3i, y4r, y4i, y5r, y5i, y6r, y6i, y7r, y7i, - y8r, y8i, y9r, y9i, y10r, y10i, y11r, y11i, y12r, y12i, y13r, y13i, - y14r, y14i, y15r, y15i; - - wn4r = w[1]; - wk1r = w[2]; - wk1i = w[3]; - x0r = a[0] + a[16]; - x0i = a[1] + a[17]; - x1r = a[0] - a[16]; - x1i = a[1] - a[17]; - x2r = a[8] + a[24]; - x2i = a[9] + a[25]; - x3r = a[8] - a[24]; - x3i = a[9] - a[25]; - y0r = x0r + x2r; - y0i = x0i + x2i; - y4r = x0r - x2r; - y4i = x0i - x2i; - y8r = x1r - x3i; - y8i = x1i + x3r; - y12r = x1r + x3i; - y12i = x1i - x3r; - x0r = a[2] + a[18]; - x0i = a[3] + a[19]; - x1r = a[2] - a[18]; - x1i = a[3] - a[19]; - x2r = a[10] + a[26]; - x2i = a[11] + a[27]; - x3r = a[10] - a[26]; - x3i = a[11] - a[27]; - y1r = x0r + x2r; - y1i = x0i + x2i; - y5r = x0r - x2r; - y5i = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - y9r = wk1r * x0r - wk1i * x0i; - y9i = wk1r * x0i + wk1i * x0r; - x0r = x1r + x3i; - x0i = x1i - x3r; - y13r = wk1i * x0r - wk1r * x0i; - y13i = wk1i * x0i + wk1r * x0r; - x0r = a[4] + a[20]; - x0i = a[5] + a[21]; - x1r = a[4] - a[20]; - x1i = a[5] - a[21]; - x2r = a[12] + a[28]; - x2i = a[13] + a[29]; - x3r = a[12] - a[28]; - x3i = a[13] - a[29]; - y2r = x0r + x2r; - y2i = x0i + x2i; - y6r = x0r - x2r; - y6i = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - y10r = wn4r * (x0r - x0i); - y10i = wn4r * (x0i + x0r); - x0r = x1r + x3i; - x0i = x1i - x3r; - y14r = wn4r * (x0r + x0i); - y14i = wn4r * (x0i - x0r); - x0r = a[6] + a[22]; - x0i = a[7] + a[23]; - x1r = a[6] - a[22]; - x1i = a[7] - a[23]; - x2r = a[14] + a[30]; - x2i = a[15] + a[31]; - x3r = a[14] - a[30]; - x3i = a[15] - a[31]; - y3r = x0r + x2r; - y3i = x0i + x2i; - y7r = x0r - x2r; - y7i = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - y11r = wk1i * x0r - wk1r * x0i; - y11i = wk1i * x0i + wk1r * x0r; - x0r = x1r + x3i; - x0i = x1i - x3r; - y15r = wk1r * x0r - wk1i * x0i; - y15i = wk1r * x0i + wk1i * x0r; - x0r = y12r - y14r; - x0i = y12i - y14i; - x1r = y12r + y14r; - x1i = y12i + y14i; - x2r = y13r - y15r; - x2i = y13i - y15i; - x3r = y13r + y15r; - x3i = y13i + y15i; - a[24] = x0r + x2r; - a[25] = x0i + x2i; - a[26] = x0r - x2r; - a[27] = x0i - x2i; - a[28] = x1r - x3i; - a[29] = x1i + x3r; - a[30] = x1r + x3i; - a[31] = x1i - x3r; - x0r = y8r + y10r; - x0i = y8i + y10i; - x1r = y8r - y10r; - x1i = y8i - y10i; - x2r = y9r + y11r; - x2i = y9i + y11i; - x3r = y9r - y11r; - x3i = y9i - y11i; - a[16] = x0r + x2r; - a[17] = x0i + x2i; - a[18] = x0r - x2r; - a[19] = x0i - x2i; - a[20] = x1r - x3i; - a[21] = x1i + x3r; - a[22] = x1r + x3i; - a[23] = x1i - x3r; - x0r = y5r - y7i; - x0i = y5i + y7r; - x2r = wn4r * (x0r - x0i); - x2i = wn4r * (x0i + x0r); - x0r = y5r + y7i; - x0i = y5i - y7r; - x3r = wn4r * (x0r - x0i); - x3i = wn4r * (x0i + x0r); - x0r = y4r - y6i; - x0i = y4i + y6r; - x1r = y4r + y6i; - x1i = y4i - y6r; - a[8] = x0r + x2r; - a[9] = x0i + x2i; - a[10] = x0r - x2r; - a[11] = x0i - x2i; - a[12] = x1r - x3i; - a[13] = x1i + x3r; - a[14] = x1r + x3i; - a[15] = x1i - x3r; - x0r = y0r + y2r; - x0i = y0i + y2i; - x1r = y0r - y2r; - x1i = y0i - y2i; - x2r = y1r + y3r; - x2i = y1i + y3i; - x3r = y1r - y3r; - x3i = y1i - y3i; - a[0] = x0r + x2r; - a[1] = x0i + x2i; - a[2] = x0r - x2r; - a[3] = x0i - x2i; - a[4] = x1r - x3i; - a[5] = x1i + x3r; - a[6] = x1r + x3i; - a[7] = x1i - x3r; -} - - -void cftf162(double *a, double *w) { - double wn4r, wk1r, wk1i, wk2r, wk2i, wk3r, 
wk3i, x0r, x0i, x1r, x1i, x2r, - x2i, y0r, y0i, y1r, y1i, y2r, y2i, y3r, y3i, y4r, y4i, y5r, y5i, y6r, - y6i, y7r, y7i, y8r, y8i, y9r, y9i, y10r, y10i, y11r, y11i, y12r, y12i, - y13r, y13i, y14r, y14i, y15r, y15i; - - wn4r = w[1]; - wk1r = w[4]; - wk1i = w[5]; - wk3r = w[6]; - wk3i = -w[7]; - wk2r = w[8]; - wk2i = w[9]; - x1r = a[0] - a[17]; - x1i = a[1] + a[16]; - x0r = a[8] - a[25]; - x0i = a[9] + a[24]; - x2r = wn4r * (x0r - x0i); - x2i = wn4r * (x0i + x0r); - y0r = x1r + x2r; - y0i = x1i + x2i; - y4r = x1r - x2r; - y4i = x1i - x2i; - x1r = a[0] + a[17]; - x1i = a[1] - a[16]; - x0r = a[8] + a[25]; - x0i = a[9] - a[24]; - x2r = wn4r * (x0r - x0i); - x2i = wn4r * (x0i + x0r); - y8r = x1r - x2i; - y8i = x1i + x2r; - y12r = x1r + x2i; - y12i = x1i - x2r; - x0r = a[2] - a[19]; - x0i = a[3] + a[18]; - x1r = wk1r * x0r - wk1i * x0i; - x1i = wk1r * x0i + wk1i * x0r; - x0r = a[10] - a[27]; - x0i = a[11] + a[26]; - x2r = wk3i * x0r - wk3r * x0i; - x2i = wk3i * x0i + wk3r * x0r; - y1r = x1r + x2r; - y1i = x1i + x2i; - y5r = x1r - x2r; - y5i = x1i - x2i; - x0r = a[2] + a[19]; - x0i = a[3] - a[18]; - x1r = wk3r * x0r - wk3i * x0i; - x1i = wk3r * x0i + wk3i * x0r; - x0r = a[10] + a[27]; - x0i = a[11] - a[26]; - x2r = wk1r * x0r + wk1i * x0i; - x2i = wk1r * x0i - wk1i * x0r; - y9r = x1r - x2r; - y9i = x1i - x2i; - y13r = x1r + x2r; - y13i = x1i + x2i; - x0r = a[4] - a[21]; - x0i = a[5] + a[20]; - x1r = wk2r * x0r - wk2i * x0i; - x1i = wk2r * x0i + wk2i * x0r; - x0r = a[12] - a[29]; - x0i = a[13] + a[28]; - x2r = wk2i * x0r - wk2r * x0i; - x2i = wk2i * x0i + wk2r * x0r; - y2r = x1r + x2r; - y2i = x1i + x2i; - y6r = x1r - x2r; - y6i = x1i - x2i; - x0r = a[4] + a[21]; - x0i = a[5] - a[20]; - x1r = wk2i * x0r - wk2r * x0i; - x1i = wk2i * x0i + wk2r * x0r; - x0r = a[12] + a[29]; - x0i = a[13] - a[28]; - x2r = wk2r * x0r - wk2i * x0i; - x2i = wk2r * x0i + wk2i * x0r; - y10r = x1r - x2r; - y10i = x1i - x2i; - y14r = x1r + x2r; - y14i = x1i + x2i; - x0r = a[6] - a[23]; - x0i = a[7] + a[22]; - x1r = wk3r * x0r - wk3i * x0i; - x1i = wk3r * x0i + wk3i * x0r; - x0r = a[14] - a[31]; - x0i = a[15] + a[30]; - x2r = wk1i * x0r - wk1r * x0i; - x2i = wk1i * x0i + wk1r * x0r; - y3r = x1r + x2r; - y3i = x1i + x2i; - y7r = x1r - x2r; - y7i = x1i - x2i; - x0r = a[6] + a[23]; - x0i = a[7] - a[22]; - x1r = wk1i * x0r + wk1r * x0i; - x1i = wk1i * x0i - wk1r * x0r; - x0r = a[14] + a[31]; - x0i = a[15] - a[30]; - x2r = wk3i * x0r - wk3r * x0i; - x2i = wk3i * x0i + wk3r * x0r; - y11r = x1r + x2r; - y11i = x1i + x2i; - y15r = x1r - x2r; - y15i = x1i - x2i; - x1r = y0r + y2r; - x1i = y0i + y2i; - x2r = y1r + y3r; - x2i = y1i + y3i; - a[0] = x1r + x2r; - a[1] = x1i + x2i; - a[2] = x1r - x2r; - a[3] = x1i - x2i; - x1r = y0r - y2r; - x1i = y0i - y2i; - x2r = y1r - y3r; - x2i = y1i - y3i; - a[4] = x1r - x2i; - a[5] = x1i + x2r; - a[6] = x1r + x2i; - a[7] = x1i - x2r; - x1r = y4r - y6i; - x1i = y4i + y6r; - x0r = y5r - y7i; - x0i = y5i + y7r; - x2r = wn4r * (x0r - x0i); - x2i = wn4r * (x0i + x0r); - a[8] = x1r + x2r; - a[9] = x1i + x2i; - a[10] = x1r - x2r; - a[11] = x1i - x2i; - x1r = y4r + y6i; - x1i = y4i - y6r; - x0r = y5r + y7i; - x0i = y5i - y7r; - x2r = wn4r * (x0r - x0i); - x2i = wn4r * (x0i + x0r); - a[12] = x1r - x2i; - a[13] = x1i + x2r; - a[14] = x1r + x2i; - a[15] = x1i - x2r; - x1r = y8r + y10r; - x1i = y8i + y10i; - x2r = y9r - y11r; - x2i = y9i - y11i; - a[16] = x1r + x2r; - a[17] = x1i + x2i; - a[18] = x1r - x2r; - a[19] = x1i - x2i; - x1r = y8r - y10r; - x1i = y8i - y10i; - x2r = y9r + y11r; - x2i = y9i + y11i; - 
a[20] = x1r - x2i; - a[21] = x1i + x2r; - a[22] = x1r + x2i; - a[23] = x1i - x2r; - x1r = y12r - y14i; - x1i = y12i + y14r; - x0r = y13r + y15i; - x0i = y13i - y15r; - x2r = wn4r * (x0r - x0i); - x2i = wn4r * (x0i + x0r); - a[24] = x1r + x2r; - a[25] = x1i + x2i; - a[26] = x1r - x2r; - a[27] = x1i - x2i; - x1r = y12r + y14i; - x1i = y12i - y14r; - x0r = y13r - y15i; - x0i = y13i + y15r; - x2r = wn4r * (x0r - x0i); - x2i = wn4r * (x0i + x0r); - a[28] = x1r - x2i; - a[29] = x1i + x2r; - a[30] = x1r + x2i; - a[31] = x1i - x2r; -} - - -void cftf081(double *a, double *w) { - double wn4r, x0r, x0i, x1r, x1i, x2r, x2i, x3r, x3i, y0r, y0i, y1r, y1i, - y2r, y2i, y3r, y3i, y4r, y4i, y5r, y5i, y6r, y6i, y7r, y7i; - - wn4r = w[1]; - x0r = a[0] + a[8]; - x0i = a[1] + a[9]; - x1r = a[0] - a[8]; - x1i = a[1] - a[9]; - x2r = a[4] + a[12]; - x2i = a[5] + a[13]; - x3r = a[4] - a[12]; - x3i = a[5] - a[13]; - y0r = x0r + x2r; - y0i = x0i + x2i; - y2r = x0r - x2r; - y2i = x0i - x2i; - y1r = x1r - x3i; - y1i = x1i + x3r; - y3r = x1r + x3i; - y3i = x1i - x3r; - x0r = a[2] + a[10]; - x0i = a[3] + a[11]; - x1r = a[2] - a[10]; - x1i = a[3] - a[11]; - x2r = a[6] + a[14]; - x2i = a[7] + a[15]; - x3r = a[6] - a[14]; - x3i = a[7] - a[15]; - y4r = x0r + x2r; - y4i = x0i + x2i; - y6r = x0r - x2r; - y6i = x0i - x2i; - x0r = x1r - x3i; - x0i = x1i + x3r; - x2r = x1r + x3i; - x2i = x1i - x3r; - y5r = wn4r * (x0r - x0i); - y5i = wn4r * (x0r + x0i); - y7r = wn4r * (x2r - x2i); - y7i = wn4r * (x2r + x2i); - a[8] = y1r + y5r; - a[9] = y1i + y5i; - a[10] = y1r - y5r; - a[11] = y1i - y5i; - a[12] = y3r - y7i; - a[13] = y3i + y7r; - a[14] = y3r + y7i; - a[15] = y3i - y7r; - a[0] = y0r + y4r; - a[1] = y0i + y4i; - a[2] = y0r - y4r; - a[3] = y0i - y4i; - a[4] = y2r - y6i; - a[5] = y2i + y6r; - a[6] = y2r + y6i; - a[7] = y2i - y6r; -} - - -void cftf082(double *a, double *w) { - double wn4r, wk1r, wk1i, x0r, x0i, x1r, x1i, y0r, y0i, y1r, y1i, y2r, y2i, - y3r, y3i, y4r, y4i, y5r, y5i, y6r, y6i, y7r, y7i; - - wn4r = w[1]; - wk1r = w[2]; - wk1i = w[3]; - y0r = a[0] - a[9]; - y0i = a[1] + a[8]; - y1r = a[0] + a[9]; - y1i = a[1] - a[8]; - x0r = a[4] - a[13]; - x0i = a[5] + a[12]; - y2r = wn4r * (x0r - x0i); - y2i = wn4r * (x0i + x0r); - x0r = a[4] + a[13]; - x0i = a[5] - a[12]; - y3r = wn4r * (x0r - x0i); - y3i = wn4r * (x0i + x0r); - x0r = a[2] - a[11]; - x0i = a[3] + a[10]; - y4r = wk1r * x0r - wk1i * x0i; - y4i = wk1r * x0i + wk1i * x0r; - x0r = a[2] + a[11]; - x0i = a[3] - a[10]; - y5r = wk1i * x0r - wk1r * x0i; - y5i = wk1i * x0i + wk1r * x0r; - x0r = a[6] - a[15]; - x0i = a[7] + a[14]; - y6r = wk1i * x0r - wk1r * x0i; - y6i = wk1i * x0i + wk1r * x0r; - x0r = a[6] + a[15]; - x0i = a[7] - a[14]; - y7r = wk1r * x0r - wk1i * x0i; - y7i = wk1r * x0i + wk1i * x0r; - x0r = y0r + y2r; - x0i = y0i + y2i; - x1r = y4r + y6r; - x1i = y4i + y6i; - a[0] = x0r + x1r; - a[1] = x0i + x1i; - a[2] = x0r - x1r; - a[3] = x0i - x1i; - x0r = y0r - y2r; - x0i = y0i - y2i; - x1r = y4r - y6r; - x1i = y4i - y6i; - a[4] = x0r - x1i; - a[5] = x0i + x1r; - a[6] = x0r + x1i; - a[7] = x0i - x1r; - x0r = y1r - y3i; - x0i = y1i + y3r; - x1r = y5r - y7r; - x1i = y5i - y7i; - a[8] = x0r + x1r; - a[9] = x0i + x1i; - a[10] = x0r - x1r; - a[11] = x0i - x1i; - x0r = y1r + y3i; - x0i = y1i - y3r; - x1r = y5r + y7r; - x1i = y5i + y7i; - a[12] = x0r - x1i; - a[13] = x0i + x1r; - a[14] = x0r + x1i; - a[15] = x0i - x1r; -} - - -void cftf040(double *a) { - double x0r, x0i, x1r, x1i, x2r, x2i, x3r, x3i; - - x0r = a[0] + a[4]; - x0i = a[1] + a[5]; - x1r = a[0] - a[4]; - x1i = a[1] 
- a[5]; - x2r = a[2] + a[6]; - x2i = a[3] + a[7]; - x3r = a[2] - a[6]; - x3i = a[3] - a[7]; - a[0] = x0r + x2r; - a[1] = x0i + x2i; - a[2] = x1r - x3i; - a[3] = x1i + x3r; - a[4] = x0r - x2r; - a[5] = x0i - x2i; - a[6] = x1r + x3i; - a[7] = x1i - x3r; -} - - -void cftb040(double *a) { - double x0r, x0i, x1r, x1i, x2r, x2i, x3r, x3i; - - x0r = a[0] + a[4]; - x0i = a[1] + a[5]; - x1r = a[0] - a[4]; - x1i = a[1] - a[5]; - x2r = a[2] + a[6]; - x2i = a[3] + a[7]; - x3r = a[2] - a[6]; - x3i = a[3] - a[7]; - a[0] = x0r + x2r; - a[1] = x0i + x2i; - a[2] = x1r + x3i; - a[3] = x1i - x3r; - a[4] = x0r - x2r; - a[5] = x0i - x2i; - a[6] = x1r - x3i; - a[7] = x1i + x3r; -} - - -void cftx020(double *a) { - double x0r, x0i; - - x0r = a[0] - a[2]; - x0i = a[1] - a[3]; - a[0] += a[2]; - a[1] += a[3]; - a[2] = x0r; - a[3] = x0i; -} - - -void rftfsub(int n, double *a, int nc, double *c) { - int j, k, kk, ks, m; - double wkr, wki, xr, xi, yr, yi; - - m = n >> 1; - ks = 2 * nc / m; - kk = 0; - for (j = 2; j < m; j += 2) { - k = n - j; - kk += ks; - wkr = 0.5 - c[nc - kk]; - wki = c[kk]; - xr = a[j] - a[k]; - xi = a[j + 1] + a[k + 1]; - yr = wkr * xr - wki * xi; - yi = wkr * xi + wki * xr; - a[j] -= yr; - a[j + 1] -= yi; - a[k] += yr; - a[k + 1] -= yi; - } -} - - -void rftbsub(int n, double *a, int nc, double *c) { - int j, k, kk, ks, m; - double wkr, wki, xr, xi, yr, yi; - - m = n >> 1; - ks = 2 * nc / m; - kk = 0; - for (j = 2; j < m; j += 2) { - k = n - j; - kk += ks; - wkr = 0.5 - c[nc - kk]; - wki = c[kk]; - xr = a[j] - a[k]; - xi = a[j + 1] + a[k + 1]; - yr = wkr * xr + wki * xi; - yi = wkr * xi - wki * xr; - a[j] -= yr; - a[j + 1] -= yi; - a[k] += yr; - a[k + 1] -= yi; - } -} - - -void dctsub(int n, double *a, int nc, double *c) { - int j, k, kk, ks, m; - double wkr, wki, xr; - - m = n >> 1; - ks = nc / n; - kk = 0; - for (j = 1; j < m; j++) { - k = n - j; - kk += ks; - wkr = c[kk] - c[nc - kk]; - wki = c[kk] + c[nc - kk]; - xr = wki * a[j] - wkr * a[k]; - a[j] = wkr * a[j] + wki * a[k]; - a[k] = xr; - } - a[m] *= c[0]; -} - - -void dstsub(int n, double *a, int nc, double *c) { - int j, k, kk, ks, m; - double wkr, wki, xr; - - m = n >> 1; - ks = nc / n; - kk = 0; - for (j = 1; j < m; j++) { - k = n - j; - kk += ks; - wkr = c[kk] - c[nc - kk]; - wki = c[kk] + c[nc - kk]; - xr = wki * a[k] - wkr * a[j]; - a[k] = wkr * a[k] + wki * a[j]; - a[j] = xr; - } - a[m] *= c[0]; -} diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/log.cc b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/log.cc deleted file mode 100644 index 6922808ab..000000000 --- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/log.cc +++ /dev/null @@ -1,143 +0,0 @@ -/** - * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang) - * - * See LICENSE for clarification regarding multiple authors - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -/* - * Stack trace related stuff is from kaldi. 
- * Refer to - * https://github.com/kaldi-asr/kaldi/blob/master/src/base/kaldi-error.cc - */ - -#include "kaldi-native-fbank/csrc/log.h" - -#ifdef KNF_HAVE_EXECINFO_H -#include <execinfo.h> // To get stack trace in error messages. -#ifdef KNF_HAVE_CXXABI_H -#include <cxxabi.h> // For name demangling. -// Useful to decode the stack trace, but only used if we have execinfo.h -#endif // KNF_HAVE_CXXABI_H -#endif // KNF_HAVE_EXECINFO_H - -#include <stdlib.h> - -#include <ctime> -#include <iomanip> -#include <sstream> - -namespace knf { - -std::string GetDateTimeStr() { - std::ostringstream os; - std::time_t t = std::time(nullptr); - std::tm tm = *std::localtime(&t); - os << std::put_time(&tm, "%F %T"); // yyyy-mm-dd hh:mm:ss - return os.str(); -} - -static bool LocateSymbolRange(const std::string &trace_name, std::size_t *begin, - std::size_t *end) { - // Find the first '_' with leading ' ' or '('. - *begin = std::string::npos; - for (std::size_t i = 1; i < trace_name.size(); ++i) { - if (trace_name[i] != '_') { - continue; - } - if (trace_name[i - 1] == ' ' || trace_name[i - 1] == '(') { - *begin = i; - break; - } - } - if (*begin == std::string::npos) { - return false; - } - *end = trace_name.find_first_of(" +", *begin); - return *end != std::string::npos; -} - -#ifdef KNF_HAVE_EXECINFO_H -static std::string Demangle(const std::string &trace_name) { -#ifndef KNF_HAVE_CXXABI_H - return trace_name; -#else // KNF_HAVE_CXXABI_H - // Try to demangle the symbol. We are trying to support the following formats - // produced by different platforms: - // - // Linux: - // ./kaldi-error-test(_ZN5kaldi13UnitTestErrorEv+0xb) [0x804965d] - // - // Mac: - // 0 server 0x000000010f67614d _ZNK5kaldi13MessageLogger10LogMessageEv + 813 - // - // We want to extract the name e.g., '_ZN5kaldi13UnitTestErrorEv' and - // demangle it into a readable name like kaldi::UnitTestError. - std::size_t begin, end; - if (!LocateSymbolRange(trace_name, &begin, &end)) { - return trace_name; - } - std::string symbol = trace_name.substr(begin, end - begin); - int status; - char *demangled_name = abi::__cxa_demangle(symbol.c_str(), 0, 0, &status); - if (status == 0 && demangled_name != nullptr) { - symbol = demangled_name; - free(demangled_name); - } - return trace_name.substr(0, begin) + symbol + - trace_name.substr(end, std::string::npos); -#endif // KNF_HAVE_CXXABI_H -} -#endif // KNF_HAVE_EXECINFO_H - -std::string GetStackTrace() { - std::string ans; -#ifdef KNF_HAVE_EXECINFO_H - constexpr const std::size_t kMaxTraceSize = 50; - constexpr const std::size_t kMaxTracePrint = 50; // Must be even. - // Buffer for the trace. - void *trace[kMaxTraceSize]; - // Get the trace. - std::size_t size = backtrace(trace, kMaxTraceSize); - // Get the trace symbols. - char **trace_symbol = backtrace_symbols(trace, size); - if (trace_symbol == nullptr) - return ans; - - // Compose a human-readable backtrace string. - ans += "[ Stack-Trace: ]\n"; - if (size <= kMaxTracePrint) { - for (std::size_t i = 0; i < size; ++i) { - ans += Demangle(trace_symbol[i]) + "\n"; - } - } else { // Print out first+last (e.g.) 5. - for (std::size_t i = 0; i < kMaxTracePrint / 2; ++i) { - ans += Demangle(trace_symbol[i]) + "\n"; - } - ans += ".\n.\n.\n"; - for (std::size_t i = size - kMaxTracePrint / 2; i < size; ++i) { - ans += Demangle(trace_symbol[i]) + "\n"; - } - if (size == kMaxTraceSize) - ans += ".\n.\n.\n"; // Stack was too long, probably a bug. - } - - // We must free the array of pointers allocated by backtrace_symbols(), - // but not the strings themselves. - free(trace_symbol); -#endif // KNF_HAVE_EXECINFO_H - return ans; -} - -} // namespace knf
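Editor's sketch for the log.cc just removed above: its `Demangle()` helper extracts the mangled symbol from one `backtrace_symbols()` line and rewrites it with `abi::__cxa_demangle`. A minimal, self-contained illustration of just that step (the sample symbol is the one quoted in the file's own comment; everything else here is illustrative, not code from the repository):

```cpp
#include <cxxabi.h>  // GCC/Clang ABI header, same one used by log.cc above
#include <cstdio>
#include <cstdlib>

int main() {
  const char *symbol = "_ZN5kaldi13UnitTestErrorEv";
  int status = 0;
  // Same call as in Demangle(): a null buffer asks for a malloc'd result.
  char *name = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
  if (status == 0 && name != nullptr) {
    std::printf("%s -> %s\n", symbol, name);  // kaldi::UnitTestError()
    std::free(name);
  }
  return 0;
}
```

Passing `nullptr` for the output buffer makes `__cxa_demangle` allocate the result, which is why log.cc frees `demangled_name` after copying it into `symbol`.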
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/log.h b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/log.h deleted file mode 100644 index feb38db1f..000000000 --- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/log.h +++ /dev/null @@ -1,347 +0,0 @@ -/** - * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang) - * - * See LICENSE for clarification regarding multiple authors - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -// The content in this file is copied/modified from -// https://github.com/k2-fsa/k2/blob/master/k2/csrc/log.h -#ifndef KALDI_NATIVE_FBANK_CSRC_LOG_H_ -#define KALDI_NATIVE_FBANK_CSRC_LOG_H_ - -#include <stdio.h> - -#include <mutex> // NOLINT -#include <sstream> -#include <string> - -namespace knf { - -#if defined(NDEBUG) -constexpr bool kDisableDebug = true; -#else -constexpr bool kDisableDebug = false; -#endif - -enum class LogLevel { - kTrace = 0, - kDebug = 1, - kInfo = 2, - kWarning = 3, - kError = 4, - kFatal = 5, // print message and abort the program -}; - -// They are used in KNF_LOG(xxx), so their names -// do not follow the google c++ code style -// -// You can use them in the following way: -// -// KNF_LOG(TRACE) << "some message"; -// KNF_LOG(DEBUG) << "some message"; -#ifndef _MSC_VER -constexpr LogLevel TRACE = LogLevel::kTrace; -constexpr LogLevel DEBUG = LogLevel::kDebug; -constexpr LogLevel INFO = LogLevel::kInfo; -constexpr LogLevel WARNING = LogLevel::kWarning; -constexpr LogLevel ERROR = LogLevel::kError; -constexpr LogLevel FATAL = LogLevel::kFatal; -#else -#define TRACE LogLevel::kTrace -#define DEBUG LogLevel::kDebug -#define INFO LogLevel::kInfo -#define WARNING LogLevel::kWarning -#define ERROR LogLevel::kError -#define FATAL LogLevel::kFatal -#endif - -std::string GetStackTrace(); - -/* Return the current log level. - - - If the current log level is TRACE, then all logged messages are printed out. - - If the current log level is DEBUG, log messages with "TRACE" level are not - shown and all other levels are printed out. - - Similarly, if the current log level is INFO, log messages with "TRACE" and - "DEBUG" are not shown and all other levels are printed out. - - If it is FATAL, then only FATAL messages are shown. 
- */ -inline LogLevel GetCurrentLogLevel() { - static LogLevel log_level = INFO; - static std::once_flag init_flag; - std::call_once(init_flag, []() { - const char *env_log_level = std::getenv("KNF_LOG_LEVEL"); - if (env_log_level == nullptr) return; - - std::string s = env_log_level; - if (s == "TRACE") - log_level = TRACE; - else if (s == "DEBUG") - log_level = DEBUG; - else if (s == "INFO") - log_level = INFO; - else if (s == "WARNING") - log_level = WARNING; - else if (s == "ERROR") - log_level = ERROR; - else if (s == "FATAL") - log_level = FATAL; - else - fprintf(stderr, - "Unknown KNF_LOG_LEVEL: %s" - "\nSupported values are: " - "TRACE, DEBUG, INFO, WARNING, ERROR, FATAL", - s.c_str()); - }); - return log_level; -} - -inline bool EnableAbort() { - static std::once_flag init_flag; - static bool enable_abort = false; - std::call_once(init_flag, []() { - enable_abort = (std::getenv("KNF_ABORT") != nullptr); - }); - return enable_abort; -} - -class Logger { - public: - Logger(const char *filename, const char *func_name, uint32_t line_num, - LogLevel level) - : filename_(filename), - func_name_(func_name), - line_num_(line_num), - level_(level) { - cur_level_ = GetCurrentLogLevel(); - fprintf(stderr, "here\n"); - switch (level) { - case TRACE: - if (cur_level_ <= TRACE) fprintf(stderr, "[T] "); - break; - case DEBUG: - if (cur_level_ <= DEBUG) fprintf(stderr, "[D] "); - break; - case INFO: - if (cur_level_ <= INFO) fprintf(stderr, "[I] "); - break; - case WARNING: - if (cur_level_ <= WARNING) fprintf(stderr, "[W] "); - break; - case ERROR: - if (cur_level_ <= ERROR) fprintf(stderr, "[E] "); - break; - case FATAL: - if (cur_level_ <= FATAL) fprintf(stderr, "[F] "); - break; - } - - if (cur_level_ <= level_) { - fprintf(stderr, "%s:%u:%s ", filename, line_num, func_name); - } - } - - ~Logger() noexcept(false) { - static constexpr const char *kErrMsg = R"( - Some bad things happened. Please read the above error messages and stack - trace. If you are using Python, the following command may be helpful: - - gdb --args python /path/to/your/code.py - - (You can use `gdb` to debug the code. Please consider compiling - a debug version of KNF.). - - If you are unable to fix it, please open an issue at: - - https://github.com/csukuangfj/kaldi-native-fbank/issues/new - )"; - fprintf(stderr, "\n"); - if (level_ == FATAL) { - std::string stack_trace = GetStackTrace(); - if (!stack_trace.empty()) { - fprintf(stderr, "\n\n%s\n", stack_trace.c_str()); - } - - fflush(nullptr); - -#ifndef __ANDROID_API__ - if (EnableAbort()) { - // NOTE: abort() will terminate the program immediately without - // printing the Python stack backtrace. - abort(); - } - - throw std::runtime_error(kErrMsg); -#else - abort(); -#endif - } - } - - const Logger &operator<<(bool b) const { - if (cur_level_ <= level_) { - fprintf(stderr, b ? 
"true" : "false"); - } - return *this; - } - - const Logger &operator<<(int8_t i) const { - if (cur_level_ <= level_) fprintf(stderr, "%d", i); - return *this; - } - - const Logger &operator<<(const char *s) const { - if (cur_level_ <= level_) fprintf(stderr, "%s", s); - return *this; - } - - const Logger &operator<<(int32_t i) const { - if (cur_level_ <= level_) fprintf(stderr, "%d", i); - return *this; - } - - const Logger &operator<<(uint32_t i) const { - if (cur_level_ <= level_) fprintf(stderr, "%u", i); - return *this; - } - - const Logger &operator<<(uint64_t i) const { - if (cur_level_ <= level_) - fprintf(stderr, "%llu", (long long unsigned int)i); // NOLINT - return *this; - } - - const Logger &operator<<(int64_t i) const { - if (cur_level_ <= level_) - fprintf(stderr, "%lli", (long long int)i); // NOLINT - return *this; - } - - const Logger &operator<<(float f) const { - if (cur_level_ <= level_) fprintf(stderr, "%f", f); - return *this; - } - - const Logger &operator<<(double d) const { - if (cur_level_ <= level_) fprintf(stderr, "%f", d); - return *this; - } - - template - const Logger &operator<<(const T &t) const { - // require T overloads operator<< - std::ostringstream os; - os << t; - return *this << os.str().c_str(); - } - - // specialization to fix compile error: `stringstream << nullptr` is ambiguous - const Logger &operator<<(const std::nullptr_t &null) const { - if (cur_level_ <= level_) *this << "(null)"; - return *this; - } - - private: - const char *filename_; - const char *func_name_; - uint32_t line_num_; - LogLevel level_; - LogLevel cur_level_; -}; - -class Voidifier { - public: - void operator&(const Logger &)const {} -}; - -} // namespace knf - -#if defined(__clang__) || defined(__GNUC__) || defined(__GNUG__) || \ - defined(__PRETTY_FUNCTION__) -// for clang and GCC -#define KNF_FUNC __PRETTY_FUNCTION__ -#else -// for other compilers -#define KNF_FUNC __func__ -#endif - -#define KNF_STATIC_ASSERT(x) static_assert(x, "") - -#define KNF_CHECK(x) \ - (x) ? (void)0 \ - : ::knf::Voidifier() & \ - ::knf::Logger(__FILE__, KNF_FUNC, __LINE__, ::knf::FATAL) \ - << "Check failed: " << #x << " " - -// WARNING: x and y may be evaluated multiple times, but this happens only -// when the check fails. Since the program aborts if it fails, we don't think -// the extra evaluation of x and y matters. -// -// CAUTION: we recommend the following use case: -// -// auto x = Foo(); -// auto y = Bar(); -// KNF_CHECK_EQ(x, y) << "Some message"; -// -// And please avoid -// -// KNF_CHECK_EQ(Foo(), Bar()); -// -// if `Foo()` or `Bar()` causes some side effects, e.g., changing some -// local static variables or global variables. -#define _KNF_CHECK_OP(x, y, op) \ - ((x)op(y)) ? (void)0 \ - : ::knf::Voidifier() & \ - ::knf::Logger(__FILE__, KNF_FUNC, __LINE__, ::knf::FATAL) \ - << "Check failed: " << #x << " " << #op << " " << #y \ - << " (" << (x) << " vs. 
" << (y) << ") " - -#define KNF_CHECK_EQ(x, y) _KNF_CHECK_OP(x, y, ==) -#define KNF_CHECK_NE(x, y) _KNF_CHECK_OP(x, y, !=) -#define KNF_CHECK_LT(x, y) _KNF_CHECK_OP(x, y, <) -#define KNF_CHECK_LE(x, y) _KNF_CHECK_OP(x, y, <=) -#define KNF_CHECK_GT(x, y) _KNF_CHECK_OP(x, y, >) -#define KNF_CHECK_GE(x, y) _KNF_CHECK_OP(x, y, >=) - -#define KNF_LOG(x) ::knf::Logger(__FILE__, KNF_FUNC, __LINE__, ::knf::x) - -// ------------------------------------------------------------ -// For debug check -// ------------------------------------------------------------ -// If you define the macro "-D NDEBUG" while compiling kaldi-native-fbank, -// the following macros are in fact empty and does nothing. - -#define KNF_DCHECK(x) ::knf::kDisableDebug ? (void)0 : KNF_CHECK(x) - -#define KNF_DCHECK_EQ(x, y) ::knf::kDisableDebug ? (void)0 : KNF_CHECK_EQ(x, y) - -#define KNF_DCHECK_NE(x, y) ::knf::kDisableDebug ? (void)0 : KNF_CHECK_NE(x, y) - -#define KNF_DCHECK_LT(x, y) ::knf::kDisableDebug ? (void)0 : KNF_CHECK_LT(x, y) - -#define KNF_DCHECK_LE(x, y) ::knf::kDisableDebug ? (void)0 : KNF_CHECK_LE(x, y) - -#define KNF_DCHECK_GT(x, y) ::knf::kDisableDebug ? (void)0 : KNF_CHECK_GT(x, y) - -#define KNF_DCHECK_GE(x, y) ::knf::kDisableDebug ? (void)0 : KNF_CHECK_GE(x, y) - -#define KNF_DLOG(x) \ - ::knf::kDisableDebug ? (void)0 : ::knf::Voidifier() & KNF_LOG(x) - -#endif // KALDI_NATIVE_FBANK_CSRC_LOG_H_ diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/mel-computations.cc b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/mel-computations.cc deleted file mode 100644 index dade576b0..000000000 --- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/mel-computations.cc +++ /dev/null @@ -1,256 +0,0 @@ -/** - * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang) - * - * See LICENSE for clarification regarding multiple authors - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -// This file is copied/modified from kaldi/src/feat/mel-computations.cc - -#include "kaldi-native-fbank/csrc/mel-computations.h" - -#include -#include - -#include "kaldi-native-fbank/csrc/feature-window.h" - -namespace knf { - -std::ostream &operator<<(std::ostream &os, const MelBanksOptions &opts) { - os << opts.ToString(); - return os; -} - -float MelBanks::VtlnWarpFreq( - float vtln_low_cutoff, // upper+lower frequency cutoffs for VTLN. - float vtln_high_cutoff, - float low_freq, // upper+lower frequency cutoffs in mel computation - float high_freq, float vtln_warp_factor, float freq) { - /// This computes a VTLN warping function that is not the same as HTK's one, - /// but has similar inputs (this function has the advantage of never producing - /// empty bins). - - /// This function computes a warp function F(freq), defined between low_freq - /// and high_freq inclusive, with the following properties: - /// F(low_freq) == low_freq - /// F(high_freq) == high_freq - /// The function is continuous and piecewise linear with two inflection - /// points. 
- /// The lower inflection point (measured in terms of the unwarped - /// frequency) is at frequency l, determined as described below. - /// The higher inflection point is at a frequency h, determined as - /// described below. - /// If l <= f <= h, then F(f) = f/vtln_warp_factor. - /// If the higher inflection point (measured in terms of the unwarped - /// frequency) is at h, then max(h, F(h)) == vtln_high_cutoff. - /// Since (by the last point) F(h) == h/vtln_warp_factor, then - /// max(h, h/vtln_warp_factor) == vtln_high_cutoff, so - /// h = vtln_high_cutoff / max(1, 1/vtln_warp_factor). - /// = vtln_high_cutoff * min(1, vtln_warp_factor). - /// If the lower inflection point (measured in terms of the unwarped - /// frequency) is at l, then min(l, F(l)) == vtln_low_cutoff - /// This implies that l = vtln_low_cutoff / min(1, 1/vtln_warp_factor) - /// = vtln_low_cutoff * max(1, vtln_warp_factor) - - if (freq < low_freq || freq > high_freq) - return freq; // in case this gets called - // for out-of-range frequencies, just return the freq. - - KNF_CHECK_GT(vtln_low_cutoff, low_freq); - KNF_CHECK_LT(vtln_high_cutoff, high_freq); - - float one = 1.0f; - float l = vtln_low_cutoff * std::max(one, vtln_warp_factor); - float h = vtln_high_cutoff * std::min(one, vtln_warp_factor); - float scale = 1.0f / vtln_warp_factor; - float Fl = scale * l; // F(l); - float Fh = scale * h; // F(h); - KNF_CHECK(l > low_freq && h < high_freq); - // slope of left part of the 3-piece linear function - float scale_left = (Fl - low_freq) / (l - low_freq); - // [slope of center part is just "scale"] - - // slope of right part of the 3-piece linear function - float scale_right = (high_freq - Fh) / (high_freq - h); - - if (freq < l) { - return low_freq + scale_left * (freq - low_freq); - } else if (freq < h) { - return scale * freq; - } else { // freq >= h - return high_freq + scale_right * (freq - high_freq); - } -} - -float MelBanks::VtlnWarpMelFreq( - float vtln_low_cutoff, // upper+lower frequency cutoffs for VTLN. - float vtln_high_cutoff, - float low_freq, // upper+lower frequency cutoffs in mel computation - float high_freq, float vtln_warp_factor, float mel_freq) { - return MelScale(VtlnWarpFreq(vtln_low_cutoff, vtln_high_cutoff, low_freq, - high_freq, vtln_warp_factor, - InverseMelScale(mel_freq))); -} - -MelBanks::MelBanks(const MelBanksOptions &opts, - const FrameExtractionOptions &frame_opts, - float vtln_warp_factor) - : htk_mode_(opts.htk_mode) { - int32_t num_bins = opts.num_bins; - if (num_bins < 3) KNF_LOG(FATAL) << "Must have at least 3 mel bins"; - - float sample_freq = frame_opts.samp_freq; - int32_t window_length_padded = frame_opts.PaddedWindowSize(); - KNF_CHECK_EQ(window_length_padded % 2, 0); - - int32_t num_fft_bins = window_length_padded / 2; - float nyquist = 0.5f * sample_freq; - - float low_freq = opts.low_freq, high_freq; - if (opts.high_freq > 0.0f) - high_freq = opts.high_freq; - else - high_freq = nyquist + opts.high_freq; - - if (low_freq < 0.0f || low_freq >= nyquist || high_freq <= 0.0f || - high_freq > nyquist || high_freq <= low_freq) { - KNF_LOG(FATAL) << "Bad values in options: low-freq " << low_freq - << " and high-freq " << high_freq << " vs. 
nyquist " - << nyquist; - } - - float fft_bin_width = sample_freq / window_length_padded; - // fft-bin width [think of it as Nyquist-freq / half-window-length] - - float mel_low_freq = MelScale(low_freq); - float mel_high_freq = MelScale(high_freq); - - debug_ = opts.debug_mel; - - // divide by num_bins+1 in next line because of end-effects where the bins - // spread out to the sides. - float mel_freq_delta = (mel_high_freq - mel_low_freq) / (num_bins + 1); - - float vtln_low = opts.vtln_low, vtln_high = opts.vtln_high; - if (vtln_high < 0.0f) { - vtln_high += nyquist; - } - - if (vtln_warp_factor != 1.0f && - (vtln_low < 0.0f || vtln_low <= low_freq || vtln_low >= high_freq || - vtln_high <= 0.0f || vtln_high >= high_freq || vtln_high <= vtln_low)) { - KNF_LOG(FATAL) << "Bad values in options: vtln-low " << vtln_low - << " and vtln-high " << vtln_high << ", versus " - << "low-freq " << low_freq << " and high-freq " << high_freq; - } - - bins_.resize(num_bins); - center_freqs_.resize(num_bins); - - for (int32_t bin = 0; bin < num_bins; ++bin) { - float left_mel = mel_low_freq + bin * mel_freq_delta, - center_mel = mel_low_freq + (bin + 1) * mel_freq_delta, - right_mel = mel_low_freq + (bin + 2) * mel_freq_delta; - - if (vtln_warp_factor != 1.0f) { - left_mel = VtlnWarpMelFreq(vtln_low, vtln_high, low_freq, high_freq, - vtln_warp_factor, left_mel); - center_mel = VtlnWarpMelFreq(vtln_low, vtln_high, low_freq, high_freq, - vtln_warp_factor, center_mel); - right_mel = VtlnWarpMelFreq(vtln_low, vtln_high, low_freq, high_freq, - vtln_warp_factor, right_mel); - } - center_freqs_[bin] = InverseMelScale(center_mel); - - // this_bin will be a vector of coefficients that is only - // nonzero where this mel bin is active. - std::vector this_bin(num_fft_bins); - - int32_t first_index = -1, last_index = -1; - for (int32_t i = 0; i < num_fft_bins; ++i) { - float freq = (fft_bin_width * i); // Center frequency of this fft - // bin. - float mel = MelScale(freq); - if (mel > left_mel && mel < right_mel) { - float weight; - if (mel <= center_mel) - weight = (mel - left_mel) / (center_mel - left_mel); - else - weight = (right_mel - mel) / (right_mel - center_mel); - this_bin[i] = weight; - if (first_index == -1) first_index = i; - last_index = i; - } - } - KNF_CHECK(first_index != -1 && last_index >= first_index && - "You may have set num_mel_bins too large."); - - bins_[bin].first = first_index; - int32_t size = last_index + 1 - first_index; - bins_[bin].second.insert(bins_[bin].second.end(), - this_bin.begin() + first_index, - this_bin.begin() + first_index + size); - - // Replicate a bug in HTK, for testing purposes. - if (opts.htk_mode && bin == 0 && mel_low_freq != 0.0f) { - bins_[bin].second[0] = 0.0; - } - } // for (int32_t bin = 0; bin < num_bins; ++bin) { - - if (debug_) { - std::ostringstream os; - for (size_t i = 0; i < bins_.size(); i++) { - os << "bin " << i << ", offset = " << bins_[i].first << ", vec = "; - for (auto k : bins_[i].second) os << k << ", "; - os << "\n"; - } - KNF_LOG(INFO) << os.str(); - } -} - -// "power_spectrum" contains fft energies. 
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/mel-computations.h b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/mel-computations.h deleted file mode 100644 index e743243a1..000000000 --- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/mel-computations.h +++ /dev/null @@ -1,115 +0,0 @@ -/** - * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang) - * - * See LICENSE for clarification regarding multiple authors - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -// This file is copied/modified from kaldi/src/feat/mel-computations.h -#ifndef KALDI_NATIVE_FBANK_CSRC_MEL_COMPUTATIONS_H_ -#define KALDI_NATIVE_FBANK_CSRC_MEL_COMPUTATIONS_H_ - -#include <cmath> -#include <string> - -#include "kaldi-native-fbank/csrc/feature-window.h" - -namespace knf { - -struct MelBanksOptions { - int32_t num_bins = 25; // e.g. 25; number of triangular bins - float low_freq = 20; // e.g. 20; lower frequency cutoff - - // an upper frequency cutoff; 0 -> no cutoff, negative - // -> added to the Nyquist frequency to get the cutoff. - float high_freq = 0; - - float vtln_low = 100; // vtln lower cutoff of warping function. - - // vtln upper cutoff of warping function: if negative, added - // to the Nyquist frequency to get the cutoff. - float vtln_high = -500; - - bool debug_mel = false; - // htk_mode is a "hidden" config, it does not show up on command line. - // Enables more exact compatibility with HTK, for testing purposes. Affects - // mel-energy flooring and reproduces a bug in HTK. 
- bool htk_mode = false; - - std::string ToString() const { - std::ostringstream os; - os << "num_bins: " << num_bins << "\n"; - os << "low_freq: " << low_freq << "\n"; - os << "high_freq: " << high_freq << "\n"; - os << "vtln_low: " << vtln_low << "\n"; - os << "vtln_high: " << vtln_high << "\n"; - os << "debug_mel: " << debug_mel << "\n"; - os << "htk_mode: " << htk_mode << "\n"; - return os.str(); - } -}; - -std::ostream &operator<<(std::ostream &os, const MelBanksOptions &opts); - -class MelBanks { - public: - static inline float InverseMelScale(float mel_freq) { - return 700.0f * (expf(mel_freq / 1127.0f) - 1.0f); - } - - static inline float MelScale(float freq) { - return 1127.0f * logf(1.0f + freq / 700.0f); - } - - static float VtlnWarpFreq( - float vtln_low_cutoff, - float vtln_high_cutoff, // discontinuities in warp func - float low_freq, - float high_freq, // upper+lower frequency cutoffs in - // the mel computation - float vtln_warp_factor, float freq); - - static float VtlnWarpMelFreq(float vtln_low_cutoff, float vtln_high_cutoff, - float low_freq, float high_freq, - float vtln_warp_factor, float mel_freq); - - // TODO(fangjun): Remove vtln_warp_factor - MelBanks(const MelBanksOptions &opts, - const FrameExtractionOptions &frame_opts, float vtln_warp_factor); - - /// Compute Mel energies (note: not log energies). - /// At input, "fft_energies" contains the FFT energies (not log). - /// - /// @param fft_energies 1-D array of size num_fft_bins/2+1 - /// @param mel_energies_out 1-D array of size num_mel_bins - void Compute(const float *fft_energies, float *mel_energies_out) const; - - int32_t NumBins() const { return bins_.size(); } - - private: - // center frequencies of bins, numbered from 0 ... num_bins-1. - // Needed by GetCenterFreqs(). - std::vector<float> center_freqs_; - - // the "bins_" vector is a vector, one for each bin, of a pair: - // (the first nonzero fft-bin), (the vector of weights). - std::vector<std::pair<int32_t, std::vector<float>>> bins_; - - // TODO(fangjun): Remove debug_ and htk_mode_ - bool debug_; - bool htk_mode_; -}; - -} // namespace knf - -#endif // KALDI_NATIVE_FBANK_CSRC_MEL_COMPUTATIONS_H_
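A sketch of the calling convention the header above documents for `MelBanks::Compute()`; the wrapper function and buffer sizes here are hypothetical:

```cpp
#include <vector>

#include "kaldi-native-fbank/csrc/mel-computations.h"

// power_spectrum holds n_fft/2 + 1 FFT energies (not log energies).
std::vector<float> FbankFrame(const knf::MelBanks &mel_banks,
                              const std::vector<float> &power_spectrum) {
  std::vector<float> mel_energies(mel_banks.NumBins());
  // Output is plain mel energies; taking the log is left to the caller.
  mel_banks.Compute(power_spectrum.data(), mel_energies.data());
  return mel_energies;
}
```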
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/rfft.cc b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/rfft.cc
deleted file mode 100644
index 69bfde5f8..000000000
--- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/rfft.cc
+++ /dev/null
@@ -1,66 +0,0 @@
-/**
- * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang)
- *
- * See LICENSE for clarification regarding multiple authors
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#include "kaldi-native-fbank/csrc/rfft.h"
-
-#include <cmath>
-#include <vector>
-
-#include "kaldi-native-fbank/csrc/log.h"
-
-// see fftsg.c
-#ifdef __cplusplus
-extern "C" void rdft(int n, int isgn, double *a, int *ip, double *w);
-#else
-void rdft(int n, int isgn, double *a, int *ip, double *w);
-#endif
-
-namespace knf {
-class Rfft::RfftImpl {
- public:
-  explicit RfftImpl(int32_t n) : n_(n), ip_(2 + std::sqrt(n / 2)), w_(n / 2) {
-    KNF_CHECK_EQ(n & (n - 1), 0);  // n must be a power of 2
-  }
-
-  void Compute(float *in_out) {
-    std::vector<double> d(in_out, in_out + n_);
-
-    Compute(d.data());
-
-    std::copy(d.begin(), d.end(), in_out);
-  }
-
-  void Compute(double *in_out) {
-    // 1 means forward fft
-    rdft(n_, 1, in_out, ip_.data(), w_.data());
-  }
-
- private:
-  int32_t n_;
-  std::vector<int> ip_;
-  std::vector<double> w_;
-};
-
-Rfft::Rfft(int32_t n) : impl_(std::make_unique<RfftImpl>(n)) {}
-
-Rfft::~Rfft() = default;
-
-void Rfft::Compute(float *in_out) { impl_->Compute(in_out); }
-void Rfft::Compute(double *in_out) { impl_->Compute(in_out); }
-
-}  // namespace knf
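> **Editor's note.** `rfft.cc` above wraps Ooura's `rdft` from fftsg.c, which transforms in place into a packed half-spectrum. The sketch below shows how a caller such as the fbank computer can unpack that buffer into the `num_fft_bins/2 + 1` power spectrum that `MelBanks::Compute` expects. The packing convention (`out[0] = R[0]`, `out[1] = R[n/2]`, then interleaved `R[k], -I[k]`) is the documented fftsg.c convention, assumed here rather than spelled out in this patch.

```cpp
// Sketch: turning knf::Rfft's packed in-place output into a power spectrum.
#include <cstdint>
#include <vector>

#include "kaldi-native-fbank/csrc/rfft.h"

std::vector<float> PowerSpectrum(std::vector<float> frame /* size n, n a power of 2 */) {
  const int32_t n = static_cast<int32_t>(frame.size());
  knf::Rfft rfft(n);
  rfft.Compute(frame.data());  // in place: frame now holds the packed spectrum

  std::vector<float> power(n / 2 + 1);
  power[0] = frame[0] * frame[0];      // DC bin; imaginary part is zero
  power[n / 2] = frame[1] * frame[1];  // Nyquist bin, packed into slot 1
  for (int32_t k = 1; k < n / 2; ++k) {
    const float re = frame[2 * k];
    const float im = frame[2 * k + 1];  // stored as -I[k]; the sign
    power[k] = re * re + im * im;       // cancels when squaring
  }
  return power;
}
```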
diff --git a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/rfft.h b/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/rfft.h
deleted file mode 100644
index c8cb9f8c1..000000000
--- a/audio/paddleaudio/third_party/kaldi-native-fbank/csrc/rfft.h
+++ /dev/null
@@ -1,56 +0,0 @@
-/**
- * Copyright (c) 2022 Xiaomi Corporation (authors: Fangjun Kuang)
- *
- * See LICENSE for clarification regarding multiple authors
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#ifndef KALDI_NATIVE_FBANK_CSRC_RFFT_H_
-#define KALDI_NATIVE_FBANK_CSRC_RFFT_H_
-
-#include <memory>
-
-namespace knf {
-
-// n-point Real discrete Fourier transform
-// where n is a power of 2. n >= 2
-//
-// R[k] = sum_j=0^n-1 in[j]*cos(2*pi*j*k/n), 0<=k<=n/2
-// I[k] = sum_j=0^n-1 in[j]*sin(2*pi*j*k/n), 0<k<n/2
-class Rfft {
- public:
-  // @param n Number of FFT points. It must be a power of 2.
-  explicit Rfft(int32_t n);
-  ~Rfft();
-
-  // Compute the FFT in place. in_out is a 1-D array of size n.
-  void Compute(float *in_out);
-  void Compute(double *in_out);
-
- private:
-  class RfftImpl;
-  std::unique_ptr<RfftImpl> impl_;
-};
-
-}  // namespace knf
-
-#endif  // KALDI_NATIVE_FBANK_CSRC_RFFT_H_
diff --git a/audio/paddleaudio/third_party/kaldi/CMakeLists.txt b/audio/paddleaudio/third_party/kaldi/CMakeLists.txt
new file mode 100644
index 000000000..e63fb5788
--- /dev/null
+++ b/audio/paddleaudio/third_party/kaldi/CMakeLists.txt
@@ -0,0 +1,111 @@
+# checkout the thirdparty/kaldi/base/kaldi-types.h
+# compile kaldi without openfst
+add_definitions("-DCOMPILE_WITHOUT_OPENFST")
+
+if ((NOT EXISTS ${CMAKE_CURRENT_LIST_DIR}/base))
+    file(COPY ../../../../speechx/speechx/kaldi/base DESTINATION ${CMAKE_CURRENT_LIST_DIR})
+    file(COPY ../../../../speechx/speechx/kaldi/feat DESTINATION ${CMAKE_CURRENT_LIST_DIR})
+    file(COPY ../../../../speechx/speechx/kaldi/matrix DESTINATION ${CMAKE_CURRENT_LIST_DIR})
+    file(COPY ../../../../speechx/speechx/kaldi/util DESTINATION ${CMAKE_CURRENT_LIST_DIR})
+endif()
+
+# kaldi-base
+add_library(kaldi-base STATIC
+  base/io-funcs.cc
+  base/kaldi-error.cc
+  base/kaldi-math.cc
+  base/kaldi-utils.cc
+  base/timer.cc
+)
+target_include_directories(kaldi-base PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
+
+# kaldi-matrix
+add_library(kaldi-matrix STATIC
+  matrix/compressed-matrix.cc
+  matrix/matrix-functions.cc
+  matrix/kaldi-matrix.cc
+  matrix/kaldi-vector.cc
+  matrix/optimization.cc
+  matrix/packed-matrix.cc
+  matrix/qr.cc
+  matrix/sparse-matrix.cc
+  matrix/sp-matrix.cc
+  matrix/srfft.cc
+  matrix/tp-matrix.cc
+)
+target_include_directories(kaldi-matrix PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
+
+if (NOT MSVC)
+    target_link_libraries(kaldi-matrix PUBLIC kaldi-base libopenblas)
+else()
+    target_link_libraries(kaldi-matrix PUBLIC kaldi-base openblas)
+endif()
+
+# kaldi-util
+add_library(kaldi-util STATIC
+  util/kaldi-holder.cc
+  util/kaldi-io.cc
+  util/kaldi-semaphore.cc
+  util/kaldi-table.cc
+  util/kaldi-thread.cc
+  util/parse-options.cc
+  util/simple-io-funcs.cc
+  util/simple-options.cc
+  util/text-utils.cc
+)
+target_include_directories(kaldi-util PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
+target_link_libraries(kaldi-util PUBLIC kaldi-base kaldi-matrix)
+
+# kaldi-feat-common
+add_library(kaldi-feat-common STATIC
+  feat/cmvn.cc
+  feat/feature-functions.cc
+  feat/feature-window.cc
+  feat/mel-computations.cc
+  feat/pitch-functions.cc
+  feat/resample.cc
+  feat/signal.cc
+  feat/wave-reader.cc
+)
+target_include_directories(kaldi-feat-common PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
+target_link_libraries(kaldi-feat-common PUBLIC kaldi-base kaldi-matrix kaldi-util)
+
+
+# kaldi-mfcc
+add_library(kaldi-mfcc STATIC
+  feat/feature-mfcc.cc
+)
+target_include_directories(kaldi-mfcc PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
+target_link_libraries(kaldi-mfcc PUBLIC kaldi-feat-common)
+
+
+# kaldi-fbank
+add_library(kaldi-fbank STATIC
+  feat/feature-fbank.cc
+)
+target_include_directories(kaldi-fbank PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
+target_link_libraries(kaldi-fbank PUBLIC kaldi-feat-common)
+
+
+set(KALDI_LIBRARIES
+  ${CMAKE_CURRENT_BINARY_DIR}/libkaldi-base.a
+  ${CMAKE_CURRENT_BINARY_DIR}/libkaldi-matrix.a
+  ${CMAKE_CURRENT_BINARY_DIR}/libkaldi-util.a
+  ${CMAKE_CURRENT_BINARY_DIR}/libkaldi-feat-common.a
+  ${CMAKE_CURRENT_BINARY_DIR}/libkaldi-mfcc.a
+  ${CMAKE_CURRENT_BINARY_DIR}/libkaldi-fbank.a
+)
+
+add_library(libkaldi INTERFACE)
+add_dependencies(libkaldi kaldi-base kaldi-matrix kaldi-util kaldi-feat-common kaldi-mfcc kaldi-fbank)
+target_include_directories(libkaldi INTERFACE ${CMAKE_CURRENT_SOURCE_DIR})
+
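> **Editor's note.** The targets above package upstream Kaldi's feat library behind a single `libkaldi` interface target. For context, a consumer linked against it would compute fbank features roughly as in the sketch below; this is written against the upstream `kaldi::Fbank` API from `feat/feature-fbank.h` (which these sources provide) and is illustrative, not code contained in this patch. The 80-bin setting is an assumed example value.

```cpp
// Sketch: consuming the kaldi-fbank target built by this CMakeLists.txt.
#include "feat/feature-fbank.h"
#include "matrix/kaldi-matrix.h"

kaldi::Matrix<kaldi::BaseFloat> ComputeFbank(
    const kaldi::VectorBase<kaldi::BaseFloat> &waveform, float sample_rate) {
  kaldi::FbankOptions opts;
  opts.frame_opts.samp_freq = sample_rate;
  opts.mel_opts.num_bins = 80;  // 80-dim filterbank, a common choice

  kaldi::Fbank fbank(opts);
  kaldi::Matrix<kaldi::BaseFloat> features;
  // vtln_warp = 1.0 disables vocal tract length normalization
  fbank.ComputeFeatures(waveform, sample_rate, /*vtln_warp=*/1.0, &features);
  return features;
}
```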
+if (APPLE) + target_link_libraries(libkaldi INTERFACE ${KALDI_LIBRARIES} libopenblas ${GFORTRAN_LIBRARIES_DIR}/libgfortran.a ${GFORTRAN_LIBRARIES_DIR}/libquadmath.a ${GFORTRAN_LIBRARIES_DIR}/libgcc_s.1.1.dylib) +elseif (MSVC) + target_link_libraries(libkaldi INTERFACE kaldi-base kaldi-matrix kaldi-util kaldi-feat-common kaldi-mfcc kaldi-fbank openblas) +else() + target_link_libraries(libkaldi INTERFACE -Wl,--start-group -Wl,--whole-archive ${KALDI_LIBRARIES} libopenblas.a gfortran -Wl,--no-whole-archive -Wl,--end-group) +endif() + +target_compile_definitions(libkaldi INTERFACE "-DCOMPILE_WITHOUT_OPENFST") diff --git a/audio/setup.py b/audio/setup.py index 823e5dfad..d7208a431 100644 --- a/audio/setup.py +++ b/audio/setup.py @@ -40,13 +40,19 @@ COMMITID = 'none' base = [ "kaldiio", "librosa==0.8.1", - "pathos", + "scipy>=1.0.0", + "soundfile~=0.10", + "colorlog", + "pathos == 0.2.8", "pybind11", "parameterized", + "tqdm", + "scikit-learn" ] requirements = { - "install": base, + "install": + base, "develop": [ "sox", "soxbindings", @@ -54,7 +60,6 @@ requirements = { ], } - def check_call(cmd: str, shell=False, executable=None): try: sp.check_call( @@ -87,7 +92,6 @@ def check_output(cmd: Union[str, List[str], Tuple[str]], shell=False): file=sys.stderr) return out_bytes.strip().decode('utf8') - def _run_cmd(cmd): try: return subprocess.check_output( @@ -96,7 +100,6 @@ def _run_cmd(cmd): except Exception: return None - @contextlib.contextmanager def pushd(new_dir): old_dir = os.getcwd() @@ -106,26 +109,22 @@ def pushd(new_dir): os.chdir(old_dir) print(old_dir) - def read(*names, **kwargs): with io.open( os.path.join(os.path.dirname(__file__), *names), encoding=kwargs.get("encoding", "utf8")) as fp: return fp.read() - def _remove(files: str): for f in files: f.unlink() - ################################# Install ################################## def _post_install(install_lib_dir): pass - class DevelopCommand(develop): def run(self): develop.run(self) @@ -143,7 +142,7 @@ class TestCommand(test): # Run nose ensuring that argv simulates running nosetests directly import nose nose.run_exit(argv=['nosetests', '-w', 'tests']) - + def run_benchmark(self): for benchmark_item in glob.glob('tests/benchmark/*py'): os.system(f'pytest {benchmark_item}') @@ -189,7 +188,6 @@ def _make_version_file(version, sha): with open(version_path, "a") as f: f.write(f"__version__ = '{version}'\n") - def _rm_version(): file_ = ROOT_DIR / "paddleaudio" / "__init__.py" with open(file_, "r") as f: @@ -237,8 +235,8 @@ def main(): if platform.system() != 'Windows' and platform.system() != 'Linux': lib_package_data = {'paddleaudio': ['lib/libgcc_s.1.1.dylib']} - #if platform.system() == 'Linux': - # lib_package_data = {'paddleaudio': ['lib/lib*']} + if platform.system() == 'Linux': + lib_package_data = {'paddleaudio': ['lib/lib*']} setup_info = dict( # Metadata @@ -256,7 +254,8 @@ def main(): python_requires='>=3.7', install_requires=requirements["install"], extras_require={ - 'develop': requirements["develop"], + 'develop': + requirements["develop"], #'test': ["nose", "torchaudio==0.10.2", "pytest-benchmark", "librosa=0.8.1", "parameterized", "paddlepaddle"], }, cmdclass={ @@ -268,7 +267,7 @@ def main(): }, # Package info - packages=find_packages(include=['paddleaudio*']), + packages=find_packages(include=('paddleaudio*')), package_data=lib_package_data, ext_modules=setup_helpers.get_ext_modules(), zip_safe=True, @@ -285,11 +284,11 @@ def main(): 'Programming Language :: Python :: 3.8', 'Programming Language :: Python :: 
3.9', 'Programming Language :: Python :: 3.10', - ], ) + ], + ) setup(**setup_info) _rm_version() - if __name__ == '__main__': main() diff --git a/paddlespeech/dataset/aidatatang_200zh/README.md b/dataset/aidatatang_200zh/README.md similarity index 100% rename from paddlespeech/dataset/aidatatang_200zh/README.md rename to dataset/aidatatang_200zh/README.md diff --git a/dataset/aidatatang_200zh/aidatatang_200zh.py b/dataset/aidatatang_200zh/aidatatang_200zh.py index 3b706c492..85f478c20 100644 --- a/dataset/aidatatang_200zh/aidatatang_200zh.py +++ b/dataset/aidatatang_200zh/aidatatang_200zh.py @@ -18,7 +18,139 @@ Manifest file is a json-format file with each line containing the meta data (i.e. audio filepath, transcript and audio duration) of each audio file in the data set. """ -from paddlespeech.dataset.aidatatang_200zh import aidatatang_200zh_main +import argparse +import codecs +import json +import os +from pathlib import Path + +import soundfile + +from utils.utility import download +from utils.utility import unpack + +DATA_HOME = os.path.expanduser('~/.cache/paddle/dataset/speech') + +URL_ROOT = 'http://www.openslr.org/resources/62' +# URL_ROOT = 'https://openslr.magicdatatech.com/resources/62' +DATA_URL = URL_ROOT + '/aidatatang_200zh.tgz' +MD5_DATA = '6e0f4f39cd5f667a7ee53c397c8d0949' + +parser = argparse.ArgumentParser(description=__doc__) +parser.add_argument( + "--target_dir", + default=DATA_HOME + "/aidatatang_200zh", + type=str, + help="Directory to save the dataset. (default: %(default)s)") +parser.add_argument( + "--manifest_prefix", + default="manifest", + type=str, + help="Filepath prefix for output manifests. (default: %(default)s)") +args = parser.parse_args() + + +def create_manifest(data_dir, manifest_path_prefix): + print("Creating manifest %s ..." % manifest_path_prefix) + json_lines = [] + transcript_path = os.path.join(data_dir, 'transcript', + 'aidatatang_200_zh_transcript.txt') + transcript_dict = {} + for line in codecs.open(transcript_path, 'r', 'utf-8'): + line = line.strip() + if line == '': + continue + audio_id, text = line.split(' ', 1) + # remove withespace, charactor text + text = ''.join(text.split()) + transcript_dict[audio_id] = text + + data_types = ['train', 'dev', 'test'] + for dtype in data_types: + del json_lines[:] + total_sec = 0.0 + total_text = 0.0 + total_num = 0 + + audio_dir = os.path.join(data_dir, 'corpus/', dtype) + for subfolder, _, filelist in sorted(os.walk(audio_dir)): + for fname in filelist: + if not fname.endswith('.wav'): + continue + + audio_path = os.path.abspath(os.path.join(subfolder, fname)) + audio_id = os.path.basename(fname)[:-4] + utt2spk = Path(audio_path).parent.name + + audio_data, samplerate = soundfile.read(audio_path) + duration = float(len(audio_data) / samplerate) + text = transcript_dict[audio_id] + json_lines.append( + json.dumps( + { + 'utt': audio_id, + 'utt2spk': str(utt2spk), + 'feat': audio_path, + 'feat_shape': (duration, ), # second + 'text': text, + }, + ensure_ascii=False)) + + total_sec += duration + total_text += len(text) + total_num += 1 + + manifest_path = manifest_path_prefix + '.' 
+ dtype + with codecs.open(manifest_path, 'w', 'utf-8') as fout: + for line in json_lines: + fout.write(line + '\n') + + manifest_dir = os.path.dirname(manifest_path_prefix) + meta_path = os.path.join(manifest_dir, dtype) + '.meta' + with open(meta_path, 'w') as f: + print(f"{dtype}:", file=f) + print(f"{total_num} utts", file=f) + print(f"{total_sec / (60*60)} h", file=f) + print(f"{total_text} text", file=f) + print(f"{total_text / total_sec} text/sec", file=f) + print(f"{total_sec / total_num} sec/utt", file=f) + + +def prepare_dataset(url, md5sum, target_dir, manifest_path, subset): + """Download, unpack and create manifest file.""" + data_dir = os.path.join(target_dir, subset) + if not os.path.exists(data_dir): + filepath = download(url, md5sum, target_dir) + unpack(filepath, target_dir) + # unpack all audio tar files + audio_dir = os.path.join(data_dir, 'corpus') + for subfolder, dirlist, filelist in sorted(os.walk(audio_dir)): + for sub in dirlist: + print(f"unpack dir {sub}...") + for folder, _, filelist in sorted( + os.walk(os.path.join(subfolder, sub))): + for ftar in filelist: + unpack(os.path.join(folder, ftar), folder, True) + else: + print("Skip downloading and unpacking. Data already exists in %s." % + target_dir) + + create_manifest(data_dir, manifest_path) + + +def main(): + if args.target_dir.startswith('~'): + args.target_dir = os.path.expanduser(args.target_dir) + + prepare_dataset( + url=DATA_URL, + md5sum=MD5_DATA, + target_dir=args.target_dir, + manifest_path=args.manifest_prefix, + subset='aidatatang_200zh') + + print("Data download and manifest prepare done!") + if __name__ == '__main__': - aidatatang_200zh_main() + main() diff --git a/dataset/aishell/README.md b/dataset/aishell/README.md new file mode 100644 index 000000000..a7dd0cf32 --- /dev/null +++ b/dataset/aishell/README.md @@ -0,0 +1,3 @@ +# [Aishell1](http://openslr.elda.org/33/) + +This Open Source Mandarin Speech Corpus, AISHELL-ASR0009-OS1, is 178 hours long. It is a part of AISHELL-ASR0009, of which utterance contains 11 domains, including smart home, autonomous driving, and industrial production. The whole recording was put in quiet indoor environment, using 3 different devices at the same time: high fidelity microphone (44.1kHz, 16-bit,); Android-system mobile phone (16kHz, 16-bit), iOS-system mobile phone (16kHz, 16-bit). Audios in high fidelity were re-sampled to 16kHz to build AISHELL- ASR0009-OS1. 400 speakers from different accent areas in China were invited to participate in the recording. The manual transcription accuracy rate is above 95%, through professional speech annotation and strict quality inspection. The corpus is divided into training, development and testing sets. ( This database is free for academic research, not in the commerce, if without permission. ) diff --git a/dataset/aishell/aishell.py b/dataset/aishell/aishell.py index b32887574..ec43104db 100644 --- a/dataset/aishell/aishell.py +++ b/dataset/aishell/aishell.py @@ -18,7 +18,143 @@ Manifest file is a json-format file with each line containing the meta data (i.e. audio filepath, transcript and audio duration) of each audio file in the data set. 
""" -from paddlespeech.dataset.aishell import aishell_main +import argparse +import codecs +import json +import os +from pathlib import Path + +import soundfile + +from utils.utility import download +from utils.utility import unpack + +DATA_HOME = os.path.expanduser('~/.cache/paddle/dataset/speech') + +URL_ROOT = 'http://openslr.elda.org/resources/33' +# URL_ROOT = 'https://openslr.magicdatatech.com/resources/33' +DATA_URL = URL_ROOT + '/data_aishell.tgz' +MD5_DATA = '2f494334227864a8a8fec932999db9d8' +RESOURCE_URL = URL_ROOT + '/resource_aishell.tgz' +MD5_RESOURCE = '957d480a0fcac85fc18e550756f624e5' + +parser = argparse.ArgumentParser(description=__doc__) +parser.add_argument( + "--target_dir", + default=DATA_HOME + "/Aishell", + type=str, + help="Directory to save the dataset. (default: %(default)s)") +parser.add_argument( + "--manifest_prefix", + default="manifest", + type=str, + help="Filepath prefix for output manifests. (default: %(default)s)") +args = parser.parse_args() + + +def create_manifest(data_dir, manifest_path_prefix): + print("Creating manifest %s ..." % manifest_path_prefix) + json_lines = [] + transcript_path = os.path.join(data_dir, 'transcript', + 'aishell_transcript_v0.8.txt') + transcript_dict = {} + for line in codecs.open(transcript_path, 'r', 'utf-8'): + line = line.strip() + if line == '': + continue + audio_id, text = line.split(' ', 1) + # remove withespace, charactor text + text = ''.join(text.split()) + transcript_dict[audio_id] = text + + data_types = ['train', 'dev', 'test'] + for dtype in data_types: + del json_lines[:] + total_sec = 0.0 + total_text = 0.0 + total_num = 0 + + audio_dir = os.path.join(data_dir, 'wav', dtype) + for subfolder, _, filelist in sorted(os.walk(audio_dir)): + for fname in filelist: + audio_path = os.path.abspath(os.path.join(subfolder, fname)) + audio_id = os.path.basename(fname)[:-4] + # if no transcription for audio then skipped + if audio_id not in transcript_dict: + continue + + utt2spk = Path(audio_path).parent.name + audio_data, samplerate = soundfile.read(audio_path) + duration = float(len(audio_data) / samplerate) + text = transcript_dict[audio_id] + json_lines.append( + json.dumps( + { + 'utt': audio_id, + 'utt2spk': str(utt2spk), + 'feat': audio_path, + 'feat_shape': (duration, ), # second + 'text': text + }, + ensure_ascii=False)) + + total_sec += duration + total_text += len(text) + total_num += 1 + + manifest_path = manifest_path_prefix + '.' + dtype + with codecs.open(manifest_path, 'w', 'utf-8') as fout: + for line in json_lines: + fout.write(line + '\n') + + manifest_dir = os.path.dirname(manifest_path_prefix) + meta_path = os.path.join(manifest_dir, dtype) + '.meta' + with open(meta_path, 'w') as f: + print(f"{dtype}:", file=f) + print(f"{total_num} utts", file=f) + print(f"{total_sec / (60*60)} h", file=f) + print(f"{total_text} text", file=f) + print(f"{total_text / total_sec} text/sec", file=f) + print(f"{total_sec / total_num} sec/utt", file=f) + + +def prepare_dataset(url, md5sum, target_dir, manifest_path=None): + """Download, unpack and create manifest file.""" + data_dir = os.path.join(target_dir, 'data_aishell') + if not os.path.exists(data_dir): + filepath = download(url, md5sum, target_dir) + unpack(filepath, target_dir) + # unpack all audio tar files + audio_dir = os.path.join(data_dir, 'wav') + for subfolder, _, filelist in sorted(os.walk(audio_dir)): + for ftar in filelist: + unpack(os.path.join(subfolder, ftar), subfolder, True) + else: + print("Skip downloading and unpacking. 
Data already exists in %s." % + target_dir) + + if manifest_path: + create_manifest(data_dir, manifest_path) + + +def main(): + if args.target_dir.startswith('~'): + args.target_dir = os.path.expanduser(args.target_dir) + + prepare_dataset( + url=DATA_URL, + md5sum=MD5_DATA, + target_dir=args.target_dir, + manifest_path=args.manifest_prefix) + + prepare_dataset( + url=RESOURCE_URL, + md5sum=MD5_RESOURCE, + target_dir=args.target_dir, + manifest_path=None) + + print("Data download and manifest prepare done!") + if __name__ == '__main__': - aishell_main() + main() diff --git a/dataset/librispeech/librispeech.py b/dataset/librispeech/librispeech.py index 44567b0cf..2d6f1763d 100644 --- a/dataset/librispeech/librispeech.py +++ b/dataset/librispeech/librispeech.py @@ -28,8 +28,8 @@ from multiprocessing.pool import Pool import distutils.util import soundfile -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unpack +from utils.utility import download +from utils.utility import unpack URL_ROOT = "http://openslr.elda.org/resources/12" #URL_ROOT = "https://openslr.magicdatatech.com/resources/12" diff --git a/dataset/mini_librispeech/mini_librispeech.py b/dataset/mini_librispeech/mini_librispeech.py index 24bd98d8c..0eb80bf8f 100644 --- a/dataset/mini_librispeech/mini_librispeech.py +++ b/dataset/mini_librispeech/mini_librispeech.py @@ -27,8 +27,8 @@ from multiprocessing.pool import Pool import soundfile -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unpack +from utils.utility import download +from utils.utility import unpack URL_ROOT = "http://openslr.elda.org/resources/31" URL_TRAIN_CLEAN = URL_ROOT + "/train-clean-5.tar.gz" diff --git a/dataset/musan/musan.py b/dataset/musan/musan.py index 85d986e85..ae3430b2a 100644 --- a/dataset/musan/musan.py +++ b/dataset/musan/musan.py @@ -29,8 +29,8 @@ import os import soundfile -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unpack +from utils.utility import download +from utils.utility import unpack DATA_HOME = os.path.expanduser('~/.cache/paddle/dataset/speech') diff --git a/dataset/rir_noise/rir_noise.py b/dataset/rir_noise/rir_noise.py index b98dff722..b1d475584 100644 --- a/dataset/rir_noise/rir_noise.py +++ b/dataset/rir_noise/rir_noise.py @@ -29,8 +29,8 @@ import os import soundfile -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unzip +from utils.utility import download +from utils.utility import unzip DATA_HOME = os.path.expanduser('~/.cache/paddle/dataset/speech') diff --git a/dataset/tal_cs/README.md b/dataset/tal_cs/README.md deleted file mode 100644 index 633056360..000000000 --- a/dataset/tal_cs/README.md +++ /dev/null @@ -1,13 +0,0 @@ -# [TAL_CSASR](https://ai.100tal.com/dataset/) - -This data set is TAL English class audio, including mixed Chinese and English speech. Each audio has only one speaker, and this data set has more than 100 speakers. (File 63.36G) This data contains the sample of intra sentence and inter sentence mixing. The ratio between Chinese characters and English words in the data is 13:1. 
- -- Total data: 587H (train_set: 555.9H, dev_set: 8H, test_set: 23.6H) -- Sample rate: 16000 -- Sample bit: 16 -- Recording device: microphone -- Speaker number: 200+ -- Recording time: 2019 -- Data format: audio: .wav; test: .txt -- Audio duration: 1-60s -- Data type: audio of English teachers' teaching diff --git a/dataset/tal_cs/tal_cs.py b/dataset/tal_cs/tal_cs.py deleted file mode 100644 index 2024b21e3..000000000 --- a/dataset/tal_cs/tal_cs.py +++ /dev/null @@ -1,116 +0,0 @@ -# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -"""Prepare TALCS ASR datasets. - -create manifest files. -Manifest file is a json-format file with each line containing the -meta data (i.e. audio filepath, transcript and audio duration) -of each audio file in the data set. -""" -import argparse -import codecs -import io -import json -import os - -import soundfile - -parser = argparse.ArgumentParser(description=__doc__) -parser.add_argument( - "--target_dir", - type=str, - help="Directory to save the dataset. (default: %(default)s)") -parser.add_argument( - "--manifest_prefix", - type=str, - help="Filepath prefix for output manifests. (default: %(default)s)") -args = parser.parse_args() - -TRAIN_SET = os.path.join(args.target_dir, "train_set") -DEV_SET = os.path.join(args.target_dir, "dev_set") -TEST_SET = os.path.join(args.target_dir, "test_set") - -manifest_train_path = os.path.join(args.manifest_prefix, "manifest.train.raw") -manifest_dev_path = os.path.join(args.manifest_prefix, "manifest.dev.raw") -manifest_test_path = os.path.join(args.manifest_prefix, "manifest.test.raw") - - -def create_manifest(data_dir, manifest_path): - """Create a manifest json file summarizing the data set, with each line - containing the meta data (i.e. audio filepath, transcription text, audio - duration) of each audio file within the data set. - """ - print("Creating manifest %s ..." 
% manifest_path) - json_lines = [] - total_sec = 0.0 - total_char = 0.0 - total_num = 0 - wav_dir = os.path.join(data_dir, 'wav') - text_filepath = os.path.join(data_dir, 'label.txt') - for subfolder, _, filelist in sorted(os.walk(wav_dir)): - for line in io.open(text_filepath, encoding="utf8"): - segments = line.strip().split() - nchars = len(segments[1:]) - text = ' '.join(segments[1:]).lower() - - audio_filepath = os.path.abspath( - os.path.join(subfolder, segments[0] + '.wav')) - audio_data, samplerate = soundfile.read(audio_filepath) - duration = float(len(audio_data)) / samplerate - - utt = os.path.splitext(os.path.basename(audio_filepath))[0] - utt2spk = '-'.join(utt.split('-')[:2]) - - json_lines.append( - json.dumps({ - 'utt': utt, - 'utt2spk': utt2spk, - 'feat': audio_filepath, - 'feat_shape': (duration, ), # second - 'text': text, - })) - - total_sec += duration - total_char += nchars - total_num += 1 - - with codecs.open(manifest_path, 'w', 'utf-8') as out_file: - for line in json_lines: - out_file.write(line + '\n') - - subset = os.path.splitext(manifest_path)[1][1:] - manifest_dir = os.path.dirname(manifest_path) - data_dir_name = os.path.split(data_dir)[-1] - meta_path = os.path.join(manifest_dir, data_dir_name) + '.meta' - with open(meta_path, 'w') as f: - print(f"{subset}:", file=f) - print(f"{total_num} utts", file=f) - print(f"{total_sec / (60*60)} h", file=f) - print(f"{total_char} char", file=f) - print(f"{total_char / total_sec} char/sec", file=f) - print(f"{total_sec / total_num} sec/utt", file=f) - - -def main(): - if args.target_dir.startswith('~'): - args.target_dir = os.path.expanduser(args.target_dir) - - create_manifest(TRAIN_SET, manifest_train_path) - create_manifest(DEV_SET, manifest_dev_path) - create_manifest(TEST_SET, manifest_test_path) - print("Data download and manifest prepare done!") - - -if __name__ == '__main__': - main() diff --git a/dataset/thchs30/thchs30.py b/dataset/thchs30/thchs30.py index c5c3eb7a8..d41c0e175 100644 --- a/dataset/thchs30/thchs30.py +++ b/dataset/thchs30/thchs30.py @@ -27,8 +27,8 @@ from pathlib import Path import soundfile -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unpack +from utils.utility import download +from utils.utility import unpack DATA_HOME = os.path.expanduser('~/.cache/paddle/dataset/speech') diff --git a/dataset/timit/timit.py b/dataset/timit/timit.py index f3889d176..c4a9f0663 100644 --- a/dataset/timit/timit.py +++ b/dataset/timit/timit.py @@ -28,7 +28,7 @@ from pathlib import Path import soundfile -from paddlespeech.dataset.download import unzip +from utils.utility import unzip URL_ROOT = "" MD5_DATA = "45c68037c7fdfe063a43c851f181fb2d" diff --git a/dataset/voxceleb/voxceleb1.py b/dataset/voxceleb/voxceleb1.py index 8d4100678..95827f708 100644 --- a/dataset/voxceleb/voxceleb1.py +++ b/dataset/voxceleb/voxceleb1.py @@ -31,9 +31,9 @@ from pathlib import Path import soundfile -from paddlespeech.dataset.download import check_md5sum -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unzip +from utils.utility import check_md5sum +from utils.utility import download +from utils.utility import unzip # all the data will be download in the current data/voxceleb directory default DATA_HOME = os.path.expanduser('.') diff --git a/dataset/voxceleb/voxceleb2.py b/dataset/voxceleb/voxceleb2.py index 6df6d1f38..fe9e8b9c8 100644 --- a/dataset/voxceleb/voxceleb2.py +++ b/dataset/voxceleb/voxceleb2.py @@ -27,9 +27,9 @@ from pathlib 
import Path import soundfile -from paddlespeech.dataset.download import check_md5sum -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unzip +from utils.utility import check_md5sum +from utils.utility import download +from utils.utility import unzip # all the data will be download in the current data/voxceleb directory default DATA_HOME = os.path.expanduser('.') diff --git a/dataset/voxforge/voxforge.py b/dataset/voxforge/voxforge.py index 327d200bf..373791bff 100644 --- a/dataset/voxforge/voxforge.py +++ b/dataset/voxforge/voxforge.py @@ -28,9 +28,9 @@ import subprocess import soundfile -from paddlespeech.dataset.download import download_multi -from paddlespeech.dataset.download import getfile_insensitive -from paddlespeech.dataset.download import unpack +from utils.utility import download_multi +from utils.utility import getfile_insensitive +from utils.utility import unpack DATA_HOME = os.path.expanduser('~/.cache/paddle/dataset/speech') diff --git a/demos/TTSAndroid/README.md b/demos/TTSAndroid/README.md index 36848cbe3..d60135620 100644 --- a/demos/TTSAndroid/README.md +++ b/demos/TTSAndroid/README.md @@ -1,6 +1,6 @@ # 语音合成 Java API Demo 使用指南 -在 Android 上实现语音合成功能,此 Demo 有很好的易用性和开放性,如在 Demo 中跑自己训练好的模型等。 +在 Android 上实现语音合成功能,此 Demo 有很好的的易用性和开放性,如在 Demo 中跑自己训练好的模型等。 本文主要介绍语音合成 Demo 运行方法。 @@ -157,11 +157,8 @@ Android 示例基于 Java API 开发,调用 Paddle Lite `Java API` 包括以 ### 更新输入 -**本 Demo 不包含文本前端模块**,通过下拉框选择预先设置好的文本,在代码中映射成对应的 phone_id,**如需文本前端模块请自行处理**,可参考: -- C++ 中文前端 [lym0302/paddlespeech_tts_cpp](https://github.com/lym0302/paddlespeech_tts_cpp) -- C++ 英文 g2p [yazone/g2pE_mobile](https://github.com/yazone/g2pE_mobile) - -`phone_id_map.txt` 请参考 [fastspeech2_cnndecoder_csmsc_pdlite_1.3.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_cnndecoder_csmsc_pdlite_1.3.0.zip)。 +**本 Demo 不包含文本前端模块**,通过下拉框选择预先设置好的文本,在代码中映射成对应的 phone_id,**如需文本前端模块请自行处理**,`phone_id_map.txt` +请参考 [fastspeech2_cnndecoder_csmsc_pdlite_1.3.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_cnndecoder_csmsc_pdlite_1.3.0.zip)。 ## 通过 setting 界面更新语音合成的相关参数 diff --git a/demos/TTSArmLinux/.gitignore b/demos/TTSArmLinux/.gitignore deleted file mode 100644 index f18480d7a..000000000 --- a/demos/TTSArmLinux/.gitignore +++ /dev/null @@ -1,8 +0,0 @@ -# 目录 -build/ -output/ -libs/ -models/ - -# 符号连接 -dict diff --git a/demos/TTSArmLinux/README.md b/demos/TTSArmLinux/README.md deleted file mode 100644 index a4ccba6c8..000000000 --- a/demos/TTSArmLinux/README.md +++ /dev/null @@ -1,91 +0,0 @@ -# TTS ARM Linux C++ Demo - -修改自 [demos/TTSAndroid](../TTSAndroid),模型也来自该安卓 Demo。 - -### 配置编译选项 - -打开 [config.sh](config.sh) 按需修改配置。 - -默认编译 64 位版本,如果要编译 32 位版本,把 `ARM_ABI=armv8` 改成 `ARM_ABI=armv7hf` 。 - -### 安装依赖 - -```bash -# Ubuntu -sudo apt install build-essential cmake pkg-config wget tar unzip - -# CentOS -sudo yum groupinstall "Development Tools" -sudo yum install cmake wget tar unzip -``` - -### 下载 Paddle Lite 库文件和模型文件 - -预编译的二进制使用与安卓 Demo 版本相同的 Paddle Lite 推理库([Paddle-Lite:68b66fd35](https://github.com/PaddlePaddle/Paddle-Lite/tree/68b66fd356c875c92167d311ad458e6093078449))和模型([fs2cnn_mbmelgan_cpu_v1.3.0](https://paddlespeech.bj.bcebos.com/demos/TTSAndroid/fs2cnn_mbmelgan_cpu_v1.3.0.tar.gz))。 - -可用以下命令下载: - -```bash -./download.sh -``` - -### 编译 Demo - -```bash -./build.sh -``` - -预编译的二进制兼容 Ubuntu 16.04 到 20.04。 - -如果编译或链接失败,说明发行版与预编译库不兼容,请尝试手动编译 Paddle Lite 库,具体步骤在最下面。 - -### 运行 - -你可以修改 `./front.conf` 中 
`--phone2id_path` 参数为你自己的声学模型的 `phone_id_map.txt` 。 - -```bash -./run.sh -./run.sh --sentence "语音合成测试" -./run.sh --sentence "输出到指定的音频文件" --output_wav ./output/test.wav -./run.sh --help -``` - -目前只支持中文合成,出现任何英文都会导致程序崩溃。 - -如果未指定`--wav_file`,默认输出到`./output/tts.wav`。 - -## 手动编译 Paddle Lite 库 - -预编译的二进制兼容 Ubuntu 16.04 到 20.04,如果你的发行版与其不兼容,可以自行从源代码编译。 - -注意,我们只能保证 [Paddle-Lite:68b66fd35](https://github.com/PaddlePaddle/Paddle-Lite/tree/68b66fd356c875c92167d311ad458e6093078449) 与通过 `download.sh` 下载的模型兼容。 -如果使用其他版本的 Paddle Lite 库,可能需要用对应版本的 opt 工具重新导出模型。 - -此外,[Paddle-Lite 2.12](https://github.com/PaddlePaddle/Paddle-Lite/releases/tag/v2.12) 与 TTS 不兼容,无法导出或运行 TTS 模型,需要使用更新的版本(比如 `develop` 分支中的代码)。 -但 `develop` 分支中的代码可能与通过 `download.sh` 下载的模型不兼容,Demo 运行起来可能会崩溃。 - -### 安装 Paddle Lite 的编译依赖 - -```bash -# Ubuntu -sudo apt install build-essential cmake git python - -# CentOS -sudo yum groupinstall "Development Tools" -sudo yum install cmake git python -``` - -### 编译 Paddle Lite 68b66fd35 - -``` -git clone https://github.com/PaddlePaddle/Paddle-Lite.git -cd Paddle-Lite -git checkout 68b66fd356c875c92167d311ad458e6093078449 -./lite/tools/build_linux.sh --with_extra=ON -``` - -编译完成后,打开 Demo 的 [config.sh](config.sh),把 `PADDLE_LITE_DIR` 改成以下值即可(注意替换 `/path/to/` 为实际目录): - -``` -PADDLE_LITE_DIR="/path/to/Paddle-Lite/build.lite.linux.${ARM_ABI}.gcc/inference_lite_lib.armlinux.${ARM_ABI}/cxx" -``` diff --git a/demos/TTSArmLinux/build-depends.sh b/demos/TTSArmLinux/build-depends.sh deleted file mode 120000 index fd3aec9c8..000000000 --- a/demos/TTSArmLinux/build-depends.sh +++ /dev/null @@ -1 +0,0 @@ -src/TTSCppFrontend/build-depends.sh \ No newline at end of file diff --git a/demos/TTSArmLinux/build.sh b/demos/TTSArmLinux/build.sh deleted file mode 100755 index 5d31173ef..000000000 --- a/demos/TTSArmLinux/build.sh +++ /dev/null @@ -1,29 +0,0 @@ -#!/bin/bash -set -e -set -x - -cd "$(dirname "$(realpath "$0")")" - -BASE_DIR="$PWD" - -# load configure -. ./config.sh - -# build -echo "ARM_ABI is ${ARM_ABI}" -echo "PADDLE_LITE_DIR is ${PADDLE_LITE_DIR}" - -echo "Build depends..." -./build-depends.sh "$@" - -mkdir -p "$BASE_DIR/build" -cd "$BASE_DIR/build" -cmake -DPADDLE_LITE_DIR="${PADDLE_LITE_DIR}" -DARM_ABI="${ARM_ABI}" ../src - -if [ "$*" = "" ]; then - make -j$(nproc) -else - make "$@" -fi - -echo "make successful!" diff --git a/demos/TTSArmLinux/clean.sh b/demos/TTSArmLinux/clean.sh deleted file mode 100755 index 2743801c3..000000000 --- a/demos/TTSArmLinux/clean.sh +++ /dev/null @@ -1,23 +0,0 @@ -#!/bin/bash -set -e -set -x - -cd "$(dirname "$(realpath "$0")")" - -BASE_DIR="$PWD" - -# load configure -. 
./config.sh - -# remove dirs -set -x - -rm -rf "$OUTPUT_DIR" -rm -rf "$LIBS_DIR" -rm -rf "$MODELS_DIR" -rm -rf "$BASE_DIR/build" - -"$BASE_DIR/src/TTSCppFrontend/clean.sh" - -# 符号连接 -rm "$BASE_DIR/dict" diff --git a/demos/TTSArmLinux/config.sh b/demos/TTSArmLinux/config.sh deleted file mode 100644 index bf38d7d6d..000000000 --- a/demos/TTSArmLinux/config.sh +++ /dev/null @@ -1,15 +0,0 @@ -# configuration - -ARM_ABI=armv8 -#ARM_ABI=armv7hf - -MODELS_DIR="${PWD}/models" -LIBS_DIR="${PWD}/libs" -OUTPUT_DIR="${PWD}/output" - -PADDLE_LITE_DIR="${LIBS_DIR}/inference_lite_lib.armlinux.${ARM_ABI}.gcc.with_extra.with_cv/cxx" -#PADDLE_LITE_DIR="/path/to/Paddle-Lite/build.lite.linux.${ARM_ABI}.gcc/inference_lite_lib.armlinux.${ARM_ABI}/cxx" - -ACOUSTIC_MODEL_PATH="${MODELS_DIR}/cpu/fastspeech2_csmsc_arm.nb" -VOCODER_PATH="${MODELS_DIR}/cpu/mb_melgan_csmsc_arm.nb" -FRONT_CONF="${PWD}/front.conf" diff --git a/demos/TTSArmLinux/download.sh b/demos/TTSArmLinux/download.sh deleted file mode 100755 index 7eaa836a5..000000000 --- a/demos/TTSArmLinux/download.sh +++ /dev/null @@ -1,70 +0,0 @@ -#!/bin/bash -set -e - -cd "$(dirname "$(realpath "$0")")" - -BASE_DIR="$PWD" - -# load configure -. ./config.sh - -mkdir -p "$LIBS_DIR" "$MODELS_DIR" - -download() { - file="$1" - url="$2" - md5="$3" - dir="$4" - - cd "$dir" - - if [ -f "$file" ] && [ "$(md5sum "$file" | awk '{ print $1 }')" = "$md5" ]; then - echo "File $file (MD5: $md5) has been downloaded." - else - echo "Downloading $file..." - wget -O "$file" "$url" - - # MD5 verify - fileMd5="$(md5sum "$file" | awk '{ print $1 }')" - if [ "$fileMd5" == "$md5" ]; then - echo "File $file (MD5: $md5) has been downloaded." - else - echo "MD5 mismatch, file may be corrupt" - echo "$file MD5: $fileMd5, it should be $md5" - fi - fi - - echo "Extracting $file..." - echo '-----------------------' - tar -vxf "$file" - echo '=======================' -} - -######################################## - -echo "Download models..." - -download 'inference_lite_lib.armlinux.armv8.gcc.with_extra.with_cv.tar.gz' \ - 'https://paddlespeech.bj.bcebos.com/demos/TTSArmLinux/inference_lite_lib.armlinux.armv8.gcc.with_extra.with_cv.tar.gz' \ - '39e0c6604f97c70f5d13c573d7e709b9' \ - "$LIBS_DIR" - -download 'inference_lite_lib.armlinux.armv7hf.gcc.with_extra.with_cv.tar.gz' \ - 'https://paddlespeech.bj.bcebos.com/demos/TTSArmLinux/inference_lite_lib.armlinux.armv7hf.gcc.with_extra.with_cv.tar.gz' \ - 'f5ceb509f0b610dafb8379889c5f36f8' \ - "$LIBS_DIR" - -download 'fs2cnn_mbmelgan_cpu_v1.3.0.tar.gz' \ - 'https://paddlespeech.bj.bcebos.com/demos/TTSAndroid/fs2cnn_mbmelgan_cpu_v1.3.0.tar.gz' \ - '93ef17d44b498aff3bea93e2c5c09a1e' \ - "$MODELS_DIR" - -echo "Done." - -######################################## - -echo "Download dictionary files..." 
- -ln -s src/TTSCppFrontend/front_demo/dict "$BASE_DIR/" - -"$BASE_DIR/src/TTSCppFrontend/download.sh" diff --git a/demos/TTSArmLinux/front.conf b/demos/TTSArmLinux/front.conf deleted file mode 100644 index 5960b32a9..000000000 --- a/demos/TTSArmLinux/front.conf +++ /dev/null @@ -1,21 +0,0 @@ -# jieba conf ---jieba_dict_path=./dict/jieba/jieba.dict.utf8 ---jieba_hmm_path=./dict/jieba/hmm_model.utf8 ---jieba_user_dict_path=./dict/jieba/user.dict.utf8 ---jieba_idf_path=./dict/jieba/idf.utf8 ---jieba_stop_word_path=./dict/jieba/stop_words.utf8 - -# dict conf fastspeech2_0.4 ---separate_tone=false ---word2phone_path=./dict/fastspeech2_nosil_baker_ckpt_0.4/word2phone_fs2.dict ---phone2id_path=./dict/fastspeech2_nosil_baker_ckpt_0.4/phone_id_map.txt ---tone2id_path=./dict/fastspeech2_nosil_baker_ckpt_0.4/word2phone_fs2.dict - -# dict conf speedyspeech_0.5 -#--separate_tone=true -#--word2phone_path=./dict/speedyspeech_nosil_baker_ckpt_0.5/word2phone.dict -#--phone2id_path=./dict/speedyspeech_nosil_baker_ckpt_0.5/phone_id_map.txt -#--tone2id_path=./dict/speedyspeech_nosil_baker_ckpt_0.5/tone_id_map.txt - -# dict of tranditional_to_simplified ---trand2simpd_path=./dict/tranditional_to_simplified/trand2simp.txt diff --git a/demos/TTSArmLinux/run.sh b/demos/TTSArmLinux/run.sh deleted file mode 100755 index d0860f044..000000000 --- a/demos/TTSArmLinux/run.sh +++ /dev/null @@ -1,19 +0,0 @@ -#!/bin/bash -set -e - -cd "$(dirname "$(realpath "$0")")" - -# load configure -. ./config.sh - -# create dir -mkdir -p "$OUTPUT_DIR" - -# run -set -x -./build/paddlespeech_tts_demo \ - --front_conf "$FRONT_CONF" \ - --acoustic_model "$ACOUSTIC_MODEL_PATH" \ - --vocoder "$VOCODER_PATH" \ - "$@" -# end diff --git a/demos/TTSArmLinux/src/CMakeLists.txt b/demos/TTSArmLinux/src/CMakeLists.txt deleted file mode 100644 index f8240d0ce..000000000 --- a/demos/TTSArmLinux/src/CMakeLists.txt +++ /dev/null @@ -1,80 +0,0 @@ -cmake_minimum_required(VERSION 3.10) -project(paddlespeech_tts_demo) - - -########## Global Options ########## - -option(WITH_FRONT_DEMO "Build front demo" OFF) - -set(CMAKE_CXX_STANDARD 17) -set(CMAKE_POSITION_INDEPENDENT_CODE ON) -set(ABSL_PROPAGATE_CXX_STD ON) - - -########## ARM Options ########## - -set(CMAKE_SYSTEM_NAME Linux) -if(ARM_ABI STREQUAL "armv8") - set(CMAKE_SYSTEM_PROCESSOR aarch64) - #set(CMAKE_C_COMPILER "aarch64-linux-gnu-gcc") - #set(CMAKE_CXX_COMPILER "aarch64-linux-gnu-g++") -elseif(ARM_ABI STREQUAL "armv7hf") - set(CMAKE_SYSTEM_PROCESSOR arm) - #set(CMAKE_C_COMPILER "arm-linux-gnueabihf-gcc") - #set(CMAKE_CXX_COMPILER "arm-linux-gnueabihf-g++") -else() - message(FATAL_ERROR "Unknown arch abi ${ARM_ABI}, only support armv8 and armv7hf.") - return() -endif() - - -########## Paddle Lite Options ########## - -message(STATUS "TARGET ARCH ABI: ${ARM_ABI}") -message(STATUS "PADDLE LITE DIR: ${PADDLE_LITE_DIR}") - -include_directories(${PADDLE_LITE_DIR}/include) -link_directories(${PADDLE_LITE_DIR}/libs/${ARM_ABI}) -link_directories(${PADDLE_LITE_DIR}/lib) - -if(ARM_ABI STREQUAL "armv8") - set(CMAKE_CXX_FLAGS "-march=armv8-a ${CMAKE_CXX_FLAGS}") - set(CMAKE_C_FLAGS "-march=armv8-a ${CMAKE_C_FLAGS}") -elseif(ARM_ABI STREQUAL "armv7hf") - set(CMAKE_CXX_FLAGS "-march=armv7-a -mfloat-abi=hard -mfpu=neon-vfpv4 ${CMAKE_CXX_FLAGS}") - set(CMAKE_C_FLAGS "-march=armv7-a -mfloat-abi=hard -mfpu=neon-vfpv4 ${CMAKE_C_FLAGS}" ) -endif() - - -########## Dependencies ########## - -find_package(OpenMP REQUIRED) -if(OpenMP_FOUND OR OpenMP_CXX_FOUND) - set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}") - 
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}") - message(STATUS "Found OpenMP ${OpenMP_VERSION} ${OpenMP_CXX_VERSION}") - message(STATUS "OpenMP C flags: ${OpenMP_C_FLAGS}") - message(STATUS "OpenMP CXX flags: ${OpenMP_CXX_FLAGS}") - message(STATUS "OpenMP OpenMP_CXX_LIB_NAMES: ${OpenMP_CXX_LIB_NAMES}") - message(STATUS "OpenMP OpenMP_CXX_LIBRARIES: ${OpenMP_CXX_LIBRARIES}") -else() - message(FATAL_ERROR "Could not found OpenMP!") - return() -endif() - - -############### tts cpp frontend ############### - -add_subdirectory(TTSCppFrontend) - -include_directories( - TTSCppFrontend/src - third-party/build/src/cppjieba/include - third-party/build/src/limonp/include -) - - -############### paddlespeech_tts_demo ############### - -add_executable(paddlespeech_tts_demo main.cc) -target_link_libraries(paddlespeech_tts_demo paddle_light_api_shared paddlespeech_tts_front) diff --git a/demos/TTSArmLinux/src/Predictor.hpp b/demos/TTSArmLinux/src/Predictor.hpp deleted file mode 100644 index f173abb5c..000000000 --- a/demos/TTSArmLinux/src/Predictor.hpp +++ /dev/null @@ -1,320 +0,0 @@ -// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved. -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. -#include -#include -#include -#include -#include -#include -#include -#include "paddle_api.h" - -using namespace paddle::lite_api; - -class PredictorInterface { - public: - virtual ~PredictorInterface() = 0; - virtual bool Init(const std::string &AcousticModelPath, - const std::string &VocoderPath, - PowerMode cpuPowerMode, - int cpuThreadNum, - // WAV采样率(必须与模型输出匹配) - // 如果播放速度和音调异常,请修改采样率 - // 常见采样率:16000, 24000, 32000, 44100, 48000, 96000 - uint32_t wavSampleRate) = 0; - virtual std::shared_ptr LoadModel( - const std::string &modelPath, - int cpuThreadNum, - PowerMode cpuPowerMode) = 0; - virtual void ReleaseModel() = 0; - virtual bool RunModel(const std::vector &phones) = 0; - virtual std::unique_ptr GetAcousticModelOutput( - const std::vector &phones) = 0; - virtual std::unique_ptr GetVocoderOutput( - std::unique_ptr &&amOutput) = 0; - virtual void VocoderOutputToWav( - std::unique_ptr &&vocOutput) = 0; - virtual void SaveFloatWav(float *floatWav, int64_t size) = 0; - virtual bool IsLoaded() = 0; - virtual float GetInferenceTime() = 0; - virtual int GetWavSize() = 0; - // 获取WAV持续时间(单位:毫秒) - virtual float GetWavDuration() = 0; - // 获取RTF(合成时间 / 音频时长) - virtual float GetRTF() = 0; - virtual void ReleaseWav() = 0; - virtual bool WriteWavToFile(const std::string &wavPath) = 0; -}; - -PredictorInterface::~PredictorInterface() {} - -// WavDataType: WAV数据类型 -// 可在 int16_t 和 float 之间切换, -// 用于生成 16-bit PCM 或 32-bit IEEE float 格式的 WAV -template -class Predictor : public PredictorInterface { - public: - bool Init(const std::string &AcousticModelPath, - const std::string &VocoderPath, - PowerMode cpuPowerMode, - int cpuThreadNum, - // WAV采样率(必须与模型输出匹配) - // 如果播放速度和音调异常,请修改采样率 - // 常见采样率:16000, 24000, 32000, 44100, 48000, 96000 - uint32_t wavSampleRate) override { - // Release model if exists - 
ReleaseModel(); - - acoustic_model_predictor_ = - LoadModel(AcousticModelPath, cpuThreadNum, cpuPowerMode); - if (acoustic_model_predictor_ == nullptr) { - return false; - } - vocoder_predictor_ = LoadModel(VocoderPath, cpuThreadNum, cpuPowerMode); - if (vocoder_predictor_ == nullptr) { - return false; - } - - wav_sample_rate_ = wavSampleRate; - - return true; - } - - virtual ~Predictor() { - ReleaseModel(); - ReleaseWav(); - } - - std::shared_ptr LoadModel( - const std::string &modelPath, - int cpuThreadNum, - PowerMode cpuPowerMode) override { - if (modelPath.empty()) { - return nullptr; - } - - // 设置MobileConfig - MobileConfig config; - config.set_model_from_file(modelPath); - config.set_threads(cpuThreadNum); - config.set_power_mode(cpuPowerMode); - - return CreatePaddlePredictor(config); - } - - void ReleaseModel() override { - acoustic_model_predictor_ = nullptr; - vocoder_predictor_ = nullptr; - } - - bool RunModel(const std::vector &phones) override { - if (!IsLoaded()) { - return false; - } - - // 计时开始 - auto start = std::chrono::system_clock::now(); - - // 执行推理 - VocoderOutputToWav(GetVocoderOutput(GetAcousticModelOutput(phones))); - - // 计时结束 - auto end = std::chrono::system_clock::now(); - - // 计算用时 - std::chrono::duration duration = end - start; - inference_time_ = duration.count() * 1000; // 单位:毫秒 - - return true; - } - - std::unique_ptr GetAcousticModelOutput( - const std::vector &phones) override { - auto phones_handle = acoustic_model_predictor_->GetInput(0); - phones_handle->Resize({static_cast(phones.size())}); - phones_handle->CopyFromCpu(phones.data()); - acoustic_model_predictor_->Run(); - - // 获取输出Tensor - auto am_output_handle = acoustic_model_predictor_->GetOutput(0); - // 打印输出Tensor的shape - std::cout << "Acoustic Model Output shape: "; - auto shape = am_output_handle->shape(); - for (auto s : shape) { - std::cout << s << ", "; - } - std::cout << std::endl; - - return am_output_handle; - } - - std::unique_ptr GetVocoderOutput( - std::unique_ptr &&amOutput) override { - auto mel_handle = vocoder_predictor_->GetInput(0); - // [?, 80] - auto dims = amOutput->shape(); - mel_handle->Resize(dims); - auto am_output_data = amOutput->mutable_data(); - mel_handle->CopyFromCpu(am_output_data); - vocoder_predictor_->Run(); - - // 获取输出Tensor - auto voc_output_handle = vocoder_predictor_->GetOutput(0); - // 打印输出Tensor的shape - std::cout << "Vocoder Output shape: "; - auto shape = voc_output_handle->shape(); - for (auto s : shape) { - std::cout << s << ", "; - } - std::cout << std::endl; - - return voc_output_handle; - } - - void VocoderOutputToWav( - std::unique_ptr &&vocOutput) override { - // 获取输出Tensor的数据 - int64_t output_size = 1; - for (auto dim : vocOutput->shape()) { - output_size *= dim; - } - auto output_data = vocOutput->mutable_data(); - - SaveFloatWav(output_data, output_size); - } - - void SaveFloatWav(float *floatWav, int64_t size) override; - - bool IsLoaded() override { - return acoustic_model_predictor_ != nullptr && - vocoder_predictor_ != nullptr; - } - - float GetInferenceTime() override { return inference_time_; } - - const std::vector &GetWav() { return wav_; } - - int GetWavSize() override { return wav_.size() * sizeof(WavDataType); } - - // 获取WAV持续时间(单位:毫秒) - float GetWavDuration() override { - return static_cast(GetWavSize()) / sizeof(WavDataType) / - static_cast(wav_sample_rate_) * 1000; - } - - // 获取RTF(合成时间 / 音频时长) - float GetRTF() override { return GetInferenceTime() / GetWavDuration(); } - - void ReleaseWav() override { wav_.clear(); } - - bool 
WriteWavToFile(const std::string &wavPath) override { - std::ofstream fout(wavPath, std::ios::binary); - if (!fout.is_open()) { - return false; - } - - // 写入头信息 - WavHeader header; - header.audio_format = GetWavAudioFormat(); - header.data_size = GetWavSize(); - header.size = sizeof(header) - 8 + header.data_size; - header.sample_rate = wav_sample_rate_; - header.byte_rate = header.sample_rate * header.num_channels * - header.bits_per_sample / 8; - header.block_align = header.num_channels * header.bits_per_sample / 8; - fout.write(reinterpret_cast(&header), sizeof(header)); - - // 写入wav数据 - fout.write(reinterpret_cast(wav_.data()), - header.data_size); - - fout.close(); - return true; - } - - protected: - struct WavHeader { - // RIFF 头 - char riff[4] = {'R', 'I', 'F', 'F'}; - uint32_t size = 0; - char wave[4] = {'W', 'A', 'V', 'E'}; - - // FMT 头 - char fmt[4] = {'f', 'm', 't', ' '}; - uint32_t fmt_size = 16; - uint16_t audio_format = 0; - uint16_t num_channels = 1; - uint32_t sample_rate = 0; - uint32_t byte_rate = 0; - uint16_t block_align = 0; - uint16_t bits_per_sample = sizeof(WavDataType) * 8; - - // DATA 头 - char data[4] = {'d', 'a', 't', 'a'}; - uint32_t data_size = 0; - }; - - enum WavAudioFormat { - WAV_FORMAT_16BIT_PCM = 1, // 16-bit PCM 格式 - WAV_FORMAT_32BIT_FLOAT = 3 // 32-bit IEEE float 格式 - }; - - protected: - // 返回值通过模板特化由 WavDataType 决定 - inline uint16_t GetWavAudioFormat(); - - inline float Abs(float number) { return (number < 0) ? -number : number; } - - protected: - float inference_time_ = 0; - uint32_t wav_sample_rate_ = 0; - std::vector wav_; - std::shared_ptr acoustic_model_predictor_ = nullptr; - std::shared_ptr vocoder_predictor_ = nullptr; -}; - -template <> -uint16_t Predictor::GetWavAudioFormat() { - return Predictor::WAV_FORMAT_16BIT_PCM; -} - -template <> -uint16_t Predictor::GetWavAudioFormat() { - return Predictor::WAV_FORMAT_32BIT_FLOAT; -} - -// 保存 16-bit PCM 格式 WAV -template <> -void Predictor::SaveFloatWav(float *floatWav, int64_t size) { - wav_.resize(size); - float maxSample = 0.01; - // 寻找最大采样值 - for (int64_t i = 0; i < size; i++) { - float sample = Abs(floatWav[i]); - if (sample > maxSample) { - maxSample = sample; - } - } - // 把采样值缩放到 int_16 范围 - for (int64_t i = 0; i < size; i++) { - wav_[i] = floatWav[i] * 32767.0f / maxSample; - } -} - -// 保存 32-bit IEEE float 格式 WAV -template <> -void Predictor::SaveFloatWav(float *floatWav, int64_t size) { - wav_.resize(size); - std::copy_n(floatWav, size, wav_.data()); -} diff --git a/demos/TTSArmLinux/src/TTSCppFrontend b/demos/TTSArmLinux/src/TTSCppFrontend deleted file mode 120000 index 25953976d..000000000 --- a/demos/TTSArmLinux/src/TTSCppFrontend +++ /dev/null @@ -1 +0,0 @@ -../../TTSCppFrontend/ \ No newline at end of file diff --git a/demos/TTSArmLinux/src/main.cc b/demos/TTSArmLinux/src/main.cc deleted file mode 100644 index 0b8e26bc4..000000000 --- a/demos/TTSArmLinux/src/main.cc +++ /dev/null @@ -1,162 +0,0 @@ -// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved. -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-// See the License for the specific language governing permissions and -// limitations under the License. - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include "Predictor.hpp" - -using namespace paddle::lite_api; - -DEFINE_string( - sentence, - "你好,欢迎使用语音合成服务", - "Text to be synthesized (Chinese only. English will crash the program.)"); -DEFINE_string(front_conf, "./front.conf", "Front configuration file"); -DEFINE_string(acoustic_model, - "./models/cpu/fastspeech2_csmsc_arm.nb", - "Acoustic model .nb file"); -DEFINE_string(vocoder, - "./models/cpu/fastspeech2_csmsc_arm.nb", - "vocoder .nb file"); -DEFINE_string(output_wav, "./output/tts.wav", "Output WAV file"); -DEFINE_string(wav_bit_depth, - "16", - "WAV bit depth, 16 (16-bit PCM) or 32 (32-bit IEEE float)"); -DEFINE_string(wav_sample_rate, - "24000", - "WAV sample rate, should match the output of the vocoder"); -DEFINE_string(cpu_thread, "1", "CPU thread numbers"); - -int main(int argc, char *argv[]) { - gflags::ParseCommandLineFlags(&argc, &argv, true); - - PredictorInterface *predictor; - - if (FLAGS_wav_bit_depth == "16") { - predictor = new Predictor(); - } else if (FLAGS_wav_bit_depth == "32") { - predictor = new Predictor(); - } else { - LOG(ERROR) << "Unsupported WAV bit depth: " << FLAGS_wav_bit_depth; - return -1; - } - - - /////////////////////////// 前端:文本转音素 /////////////////////////// - - // 实例化文本前端引擎 - ppspeech::FrontEngineInterface *front_inst = nullptr; - front_inst = new ppspeech::FrontEngineInterface(FLAGS_front_conf); - if ((!front_inst) || (front_inst->init())) { - LOG(ERROR) << "Creater tts engine failed!"; - if (front_inst != nullptr) { - delete front_inst; - } - front_inst = nullptr; - return -1; - } - - std::wstring ws_sentence = ppspeech::utf8string2wstring(FLAGS_sentence); - - // 繁体转简体 - std::wstring sentence_simp; - front_inst->Trand2Simp(ws_sentence, &sentence_simp); - ws_sentence = sentence_simp; - - std::string s_sentence; - std::vector sentence_part; - std::vector phoneids = {}; - std::vector toneids = {}; - - // 根据标点进行分句 - LOG(INFO) << "Start to segment sentences by punctuation"; - front_inst->SplitByPunc(ws_sentence, &sentence_part); - LOG(INFO) << "Segment sentences through punctuation successfully"; - - // 分句后获取音素id - LOG(INFO) - << "Start to get the phoneme and tone id sequence of each sentence"; - for (int i = 0; i < sentence_part.size(); i++) { - LOG(INFO) << "Raw sentence is: " - << ppspeech::wstring2utf8string(sentence_part[i]); - front_inst->SentenceNormalize(&sentence_part[i]); - s_sentence = ppspeech::wstring2utf8string(sentence_part[i]); - LOG(INFO) << "After normalization sentence is: " << s_sentence; - - if (0 != front_inst->GetSentenceIds(s_sentence, &phoneids, &toneids)) { - LOG(ERROR) << "TTS inst get sentence phoneids and toneids failed"; - return -1; - } - } - LOG(INFO) << "The phoneids of the sentence is: " - << limonp::Join(phoneids.begin(), phoneids.end(), " "); - LOG(INFO) << "The toneids of the sentence is: " - << limonp::Join(toneids.begin(), toneids.end(), " "); - LOG(INFO) << "Get the phoneme id sequence of each sentence successfully"; - - - /////////////////////////// 后端:音素转音频 /////////////////////////// - - // WAV采样率(必须与模型输出匹配) - // 如果播放速度和音调异常,请修改采样率 - // 常见采样率:16000, 24000, 32000, 44100, 48000, 96000 - const uint32_t wavSampleRate = std::stoul(FLAGS_wav_sample_rate); - - // CPU线程数 - const int cpuThreadNum = std::stol(FLAGS_cpu_thread); - - // CPU电源模式 - const PowerMode cpuPowerMode = PowerMode::LITE_POWER_HIGH; - - if 
(!predictor->Init(FLAGS_acoustic_model, - FLAGS_vocoder, - cpuPowerMode, - cpuThreadNum, - wavSampleRate)) { - LOG(ERROR) << "predictor init failed" << std::endl; - return -1; - } - - std::vector phones(phoneids.size()); - std::transform(phoneids.begin(), phoneids.end(), phones.begin(), [](int x) { - return static_cast(x); - }); - - if (!predictor->RunModel(phones)) { - LOG(ERROR) << "predictor run model failed" << std::endl; - return -1; - } - - LOG(INFO) << "Inference time: " << predictor->GetInferenceTime() << " ms, " - << "WAV size (without header): " << predictor->GetWavSize() - << " bytes, " - << "WAV duration: " << predictor->GetWavDuration() << " ms, " - << "RTF: " << predictor->GetRTF() << std::endl; - - if (!predictor->WriteWavToFile(FLAGS_output_wav)) { - LOG(ERROR) << "write wav file failed" << std::endl; - return -1; - } - - delete predictor; - - return 0; -} diff --git a/demos/TTSArmLinux/src/third-party b/demos/TTSArmLinux/src/third-party deleted file mode 120000 index 851b2c1ec..000000000 --- a/demos/TTSArmLinux/src/third-party +++ /dev/null @@ -1 +0,0 @@ -TTSCppFrontend/third-party \ No newline at end of file diff --git a/demos/TTSCppFrontend/.gitignore b/demos/TTSCppFrontend/.gitignore deleted file mode 100644 index 0075a9011..000000000 --- a/demos/TTSCppFrontend/.gitignore +++ /dev/null @@ -1,2 +0,0 @@ -build/ -dict/ diff --git a/demos/TTSCppFrontend/CMakeLists.txt b/demos/TTSCppFrontend/CMakeLists.txt deleted file mode 100644 index 14245372b..000000000 --- a/demos/TTSCppFrontend/CMakeLists.txt +++ /dev/null @@ -1,63 +0,0 @@ -cmake_minimum_required(VERSION 3.10) -project(paddlespeech_tts_cpp) - - -########## Global Options ########## - -option(WITH_FRONT_DEMO "Build front demo" ON) - -set(CMAKE_CXX_STANDARD 17) -set(CMAKE_POSITION_INDEPENDENT_CODE ON) -set(ABSL_PROPAGATE_CXX_STD ON) - - -########## Dependencies ########## - -set(ENV{PKG_CONFIG_PATH} "${CMAKE_SOURCE_DIR}/third-party/build/lib/pkgconfig:${CMAKE_SOURCE_DIR}/third-party/build/lib64/pkgconfig") -find_package(PkgConfig REQUIRED) - -# It is hard to load xxx-config.cmake in a custom location, so use pkgconfig instead. -pkg_check_modules(ABSL REQUIRED absl_strings IMPORTED_TARGET) -pkg_check_modules(GFLAGS REQUIRED gflags IMPORTED_TARGET) -pkg_check_modules(GLOG REQUIRED libglog IMPORTED_TARGET) - -# load header-only libraries -include_directories( - ${CMAKE_SOURCE_DIR}/third-party/build/src/cppjieba/include - ${CMAKE_SOURCE_DIR}/third-party/build/src/limonp/include -) - -find_package(Threads REQUIRED) - - -########## paddlespeech_tts_front ########## - -include_directories(src) - -file(GLOB FRONT_SOURCES - ./src/base/*.cpp - ./src/front/*.cpp -) -add_library(paddlespeech_tts_front STATIC ${FRONT_SOURCES}) - -target_link_libraries( - paddlespeech_tts_front - PUBLIC - PkgConfig::GFLAGS - PkgConfig::GLOG - PkgConfig::ABSL - Threads::Threads -) - - -########## tts_front_demo ########## - -if (WITH_FRONT_DEMO) - - file(GLOB FRONT_DEMO_SOURCES front_demo/*.cpp) - add_executable(tts_front_demo ${FRONT_DEMO_SOURCES}) - - target_include_directories(tts_front_demo PRIVATE ./front_demo) - target_link_libraries(tts_front_demo PRIVATE paddlespeech_tts_front) - -endif (WITH_FRONT_DEMO) diff --git a/demos/TTSCppFrontend/README.md b/demos/TTSCppFrontend/README.md deleted file mode 100644 index c179fdd04..000000000 --- a/demos/TTSCppFrontend/README.md +++ /dev/null @@ -1,56 +0,0 @@ -# PaddleSpeech TTS CPP Frontend - -A TTS frontend that implements text-to-phoneme conversion. 
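> **Editor's note.** The deleted README above summarizes what the demo's `main.cc` (earlier in this patch) does with the front end. Condensed into one function, the text-to-phoneme flow looks roughly like the sketch below. The `ppspeech::FrontEngineInterface` calls mirror `main.cc`; the `front/front_interface.h` include path, the `std::vector<int>` id element type, and the `std::vector<std::wstring>` sentence type are assumptions recovered from context.

```cpp
// Minimal text-to-phoneme sketch condensed from demos/TTSArmLinux/src/main.cc.
#include <string>
#include <vector>

#include "front/front_interface.h"  // assumed include path

// Returns 0 on success; phone/tone ids are appended to the output vectors.
int TextToPhoneIds(const std::string &front_conf,
                   const std::string &utf8_text,
                   std::vector<int> *phoneids,
                   std::vector<int> *toneids) {
  ppspeech::FrontEngineInterface front(front_conf);
  if (front.init() != 0) {  // init() returns non-zero on failure in main.cc
    return -1;
  }

  std::wstring ws = ppspeech::utf8string2wstring(utf8_text);

  // Traditional -> simplified, then split on punctuation, as in main.cc.
  std::wstring simplified;
  front.Trand2Simp(ws, &simplified);

  std::vector<std::wstring> parts;
  front.SplitByPunc(simplified, &parts);

  for (auto &part : parts) {
    front.SentenceNormalize(&part);
    const std::string sentence = ppspeech::wstring2utf8string(part);
    if (front.GetSentenceIds(sentence, phoneids, toneids) != 0) {
      return -1;
    }
  }
  return 0;
}
```

The resulting phone ids are what the Paddle Lite acoustic model consumes, as shown in `main.cc` just above.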
-
-Currently it only supports Chinese; any English input will crash the demo.
-
-## Install Build Tools
-
-```bash
-# Ubuntu
-sudo apt install build-essential cmake pkg-config
-
-# CentOS
-sudo yum groupinstall "Development Tools"
-sudo yum install cmake
-```
-
-If your CMake version is too old, you can download a precompiled newer version from: https://cmake.org/download/
-
-## Build
-
-```bash
-# Build with all CPU cores
-./build.sh
-
-# Build with 1 core
-./build.sh -j1
-```
-
-Dependent libraries will be automatically downloaded to the `third-party/build` folder.
-
-If the download speed is too slow, you can open [third-party/CMakeLists.txt](third-party/CMakeLists.txt) and modify the `GIT_REPOSITORY` URLs.
-
-## Download Dictionary Files
-
-```bash
-./download.sh
-```
-
-## Run
-You can change `--phone2id_path` in `./front_demo/front.conf` to the `phone_id_map.txt` of your own acoustic model.
-
-```bash
-./run_front_demo.sh
-./run_front_demo.sh --help
-./run_front_demo.sh --sentence "这是语音合成服务的文本前端,用于将文本转换为音素序号数组。"
-./run_front_demo.sh --front_conf ./front_demo/front.conf --sentence "你还需要一个语音合成后端才能将其转换为实际的声音。"
-```
-
-## Clean
-
-```bash
-./clean.sh
-```
-
-The folders `front_demo/dict`, `build` and `third-party/build` will be deleted.
diff --git a/demos/TTSCppFrontend/build-depends.sh b/demos/TTSCppFrontend/build-depends.sh
deleted file mode 100755
index c5f2ca125..000000000
--- a/demos/TTSCppFrontend/build-depends.sh
+++ /dev/null
@@ -1,20 +0,0 @@
-#!/bin/bash
-set -e
-set -x
-
-cd "$(dirname "$(realpath "$0")")"
-
-cd ./third-party
-
-mkdir -p build
-cd build
-
-cmake ..
-
-if [ "$*" = "" ]; then
-    make -j$(nproc)
-else
-    make "$@"
-fi
-
-echo "Done."
diff --git a/demos/TTSCppFrontend/build.sh b/demos/TTSCppFrontend/build.sh
deleted file mode 100755
index a136cb936..000000000
--- a/demos/TTSCppFrontend/build.sh
+++ /dev/null
@@ -1,21 +0,0 @@
-#!/bin/bash
-set -e
-set -x
-
-cd "$(dirname "$(realpath "$0")")"
-
-echo "************* Download & Build Dependencies *************"
-./build-depends.sh "$@"
-
-echo "************* Build Front Lib and Demo *************"
-mkdir -p ./build
-cd ./build
-cmake ..
-
-if [ "$*" = "" ]; then
-    make -j$(nproc)
-else
-    make "$@"
-fi
-
-echo "Done."
diff --git a/demos/TTSCppFrontend/clean.sh b/demos/TTSCppFrontend/clean.sh
deleted file mode 100755
index efbb28871..000000000
--- a/demos/TTSCppFrontend/clean.sh
+++ /dev/null
@@ -1,10 +0,0 @@
-#!/bin/bash
-set -e
-set -x
-
-cd "$(dirname "$(realpath "$0")")"
-rm -rf "./front_demo/dict"
-rm -rf "./build"
-rm -rf "./third-party/build"
-
-echo "Done."
diff --git a/demos/TTSCppFrontend/download.sh b/demos/TTSCppFrontend/download.sh
deleted file mode 100755
index 0953e3a59..000000000
--- a/demos/TTSCppFrontend/download.sh
+++ /dev/null
@@ -1,62 +0,0 @@
-#!/bin/bash
-set -e
-
-cd "$(dirname "$(realpath "$0")")"
-
-download() {
-    file="$1"
-    url="$2"
-    md5="$3"
-    dir="$4"
-
-    cd "$dir"
-
-    if [ -f "$file" ] && [ "$(md5sum "$file" | awk '{ print $1 }')" = "$md5" ]; then
-        echo "File $file (MD5: $md5) has been downloaded."
-    else
-        echo "Downloading $file..."
-        wget -O "$file" "$url"
-
-        # MD5 verify
-        fileMd5="$(md5sum "$file" | awk '{ print $1 }')"
-        if [ "$fileMd5" == "$md5" ]; then
-            echo "File $file (MD5: $md5) has been downloaded."
-        else
-            echo "MD5 mismatch, file may be corrupt"
-            echo "$file MD5: $fileMd5, it should be $md5"
-            exit 1
-        fi
-    fi
-
-    echo "Extracting $file..."
- echo '-----------------------' - tar -vxf "$file" - echo '=======================' -} - -######################################## - -DIST_DIR="$PWD/front_demo/dict" - -mkdir -p "$DIST_DIR" - -download 'fastspeech2_nosil_baker_ckpt_0.4.tar.gz' \ - 'https://paddlespeech.bj.bcebos.com/t2s/text_frontend/fastspeech2_nosil_baker_ckpt_0.4.tar.gz' \ - '7bf1bab1737375fa123c413eb429c573' \ - "$DIST_DIR" - -download 'speedyspeech_nosil_baker_ckpt_0.5.tar.gz' \ - 'https://paddlespeech.bj.bcebos.com/t2s/text_frontend/speedyspeech_nosil_baker_ckpt_0.5.tar.gz' \ - '0b7754b21f324789aef469c61f4d5b8f' \ - "$DIST_DIR" - -download 'jieba.tar.gz' \ - 'https://paddlespeech.bj.bcebos.com/t2s/text_frontend/jieba.tar.gz' \ - '6d30f426bd8c0025110a483f051315ca' \ - "$DIST_DIR" - -download 'tranditional_to_simplified.tar.gz' \ - 'https://paddlespeech.bj.bcebos.com/t2s/text_frontend/tranditional_to_simplified.tar.gz' \ - '258f5b59d5ebfe96d02007ca1d274a7f' \ - "$DIST_DIR" - -echo "Done." diff --git a/demos/TTSCppFrontend/front_demo/front.conf b/demos/TTSCppFrontend/front_demo/front.conf deleted file mode 100644 index abff44470..000000000 --- a/demos/TTSCppFrontend/front_demo/front.conf +++ /dev/null @@ -1,21 +0,0 @@ -# jieba conf ---jieba_dict_path=./front_demo/dict/jieba/jieba.dict.utf8 ---jieba_hmm_path=./front_demo/dict/jieba/hmm_model.utf8 ---jieba_user_dict_path=./front_demo/dict/jieba/user.dict.utf8 ---jieba_idf_path=./front_demo/dict/jieba/idf.utf8 ---jieba_stop_word_path=./front_demo/dict/jieba/stop_words.utf8 - -# dict conf fastspeech2_0.4 ---separate_tone=false ---word2phone_path=./front_demo/dict/fastspeech2_nosil_baker_ckpt_0.4/word2phone_fs2.dict ---phone2id_path=./front_demo/dict/fastspeech2_nosil_baker_ckpt_0.4/phone_id_map.txt ---tone2id_path=./front_demo/dict/fastspeech2_nosil_baker_ckpt_0.4/word2phone_fs2.dict - -# dict conf speedyspeech_0.5 -#--separate_tone=true -#--word2phone_path=./front_demo/dict/speedyspeech_nosil_baker_ckpt_0.5/word2phone.dict -#--phone2id_path=./front_demo/dict/speedyspeech_nosil_baker_ckpt_0.5/phone_id_map.txt -#--tone2id_path=./front_demo/dict/speedyspeech_nosil_baker_ckpt_0.5/tone_id_map.txt - -# dict of tranditional_to_simplified ---trand2simpd_path=./front_demo/dict/tranditional_to_simplified/trand2simp.txt diff --git a/demos/TTSCppFrontend/front_demo/front_demo.cpp b/demos/TTSCppFrontend/front_demo/front_demo.cpp deleted file mode 100644 index 77f3fc725..000000000 --- a/demos/TTSCppFrontend/front_demo/front_demo.cpp +++ /dev/null @@ -1,79 +0,0 @@ -// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved. -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. 
-
-#include <cstdlib>
-#include <gflags/gflags.h>
-#include <glog/logging.h>
-#include <string>
-#include "front/front_interface.h"
-
-DEFINE_string(sentence, "你好,欢迎使用语音合成服务", "Text to be synthesized");
-DEFINE_string(front_conf, "./front_demo/front.conf", "Front configuration file");
-// DEFINE_string(separate_tone, "true", "If true, also output tone IDs");
-
-
-int main(int argc, char** argv) {
-    gflags::ParseCommandLineFlags(&argc, &argv, true);
-    // Instantiate the text frontend engine
-    ppspeech::FrontEngineInterface* front_inst = nullptr;
-    front_inst = new ppspeech::FrontEngineInterface(FLAGS_front_conf);
-    if ((!front_inst) || (front_inst->init())) {
-        LOG(ERROR) << "Create TTS engine failed!";
-        if (front_inst != nullptr) {
-            delete front_inst;
-        }
-        front_inst = nullptr;
-        return -1;
-    }
-
-    std::wstring ws_sentence = ppspeech::utf8string2wstring(FLAGS_sentence);
-
-    // Convert traditional Chinese to simplified Chinese
-    std::wstring sentence_simp;
-    front_inst->Trand2Simp(ws_sentence, &sentence_simp);
-    ws_sentence = sentence_simp;
-
-    std::string s_sentence;
-    std::vector<std::wstring> sentence_part;
-    std::vector<int> phoneids = {};
-    std::vector<int> toneids = {};
-
-    // Split the sentence by punctuation
-    LOG(INFO) << "Start to segment sentences by punctuation";
-    front_inst->SplitByPunc(ws_sentence, &sentence_part);
-    LOG(INFO) << "Segment sentences through punctuation successfully";
-
-    // Get the phoneme and tone ID sequence of each sub-sentence
-    LOG(INFO)
-        << "Start to get the phoneme and tone id sequence of each sentence";
-    for (int i = 0; i < sentence_part.size(); i++) {
-        LOG(INFO) << "Raw sentence is: "
-                  << ppspeech::wstring2utf8string(sentence_part[i]);
-        front_inst->SentenceNormalize(&sentence_part[i]);
-        s_sentence = ppspeech::wstring2utf8string(sentence_part[i]);
-        LOG(INFO) << "After normalization sentence is: " << s_sentence;
-
-        if (0 != front_inst->GetSentenceIds(s_sentence, &phoneids, &toneids)) {
-            LOG(ERROR) << "TTS inst get sentence phoneids and toneids failed";
-            return -1;
-        }
-    }
-    LOG(INFO) << "The phoneids of the sentence is: "
-              << limonp::Join(phoneids.begin(), phoneids.end(), " ");
-    LOG(INFO) << "The toneids of the sentence is: "
-              << limonp::Join(toneids.begin(), toneids.end(), " ");
-    LOG(INFO) << "Get the phoneme id sequence of each sentence successfully";
-
-    return EXIT_SUCCESS;
-}
diff --git a/demos/TTSCppFrontend/front_demo/gentools/gen_dict_paddlespeech.py b/demos/TTSCppFrontend/front_demo/gentools/gen_dict_paddlespeech.py
deleted file mode 100644
index 5aaa6e345..000000000
--- a/demos/TTSCppFrontend/front_demo/gentools/gen_dict_paddlespeech.py
+++ /dev/null
@@ -1,111 +0,0 @@
-# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
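The script that follows generates the word-to-phoneme dictionary that the C++ engine later loads via `GenDict`: one `word phone phone ...` entry per line, tones folded into the finals. An illustrative entry (hypothetical, using the standard pinyin-derived phones for 你好 = ni3 hao3):

```
你好 n i3 h ao3
```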
-import argparse -import configparser - -from paddlespeech.t2s.frontend.zh_frontend import Frontend - - -def get_phone(frontend, - word, - merge_sentences=True, - print_info=False, - robot=False, - get_tone_ids=False): - phonemes = frontend.get_phonemes(word, merge_sentences, print_info, robot) - # Some optimizations - phones, tones = frontend._get_phone_tone(phonemes[0], get_tone_ids) - #print(type(phones), phones) - #print(type(tones), tones) - return phones, tones - - -def gen_word2phone_dict(frontend, - jieba_words_dict, - word2phone_dict, - get_tone=False): - with open(jieba_words_dict, "r") as f1, open(word2phone_dict, "w+") as f2: - for line in f1.readlines(): - word = line.split(" ")[0] - phone, tone = get_phone(frontend, word, get_tone_ids=get_tone) - phone_str = "" - - if tone: - assert (len(phone) == len(tone)) - for i in range(len(tone)): - phone_tone = phone[i] + tone[i] - phone_str += (" " + phone_tone) - phone_str = phone_str.strip("sp0").strip(" ") - else: - for x in phone: - phone_str += (" " + x) - phone_str = phone_str.strip("sp").strip(" ") - print(phone_str) - f2.write(word + " " + phone_str + "\n") - print("Generate word2phone dict successfully.") - - -def main(): - parser = argparse.ArgumentParser(description="Generate dictionary") - parser.add_argument( - "--config", type=str, default="./config.ini", help="config file.") - parser.add_argument( - "--am_type", - type=str, - default="fastspeech2", - help="fastspeech2 or speedyspeech") - args = parser.parse_args() - - # Read config - cf = configparser.ConfigParser() - cf.read(args.config) - jieba_words_dict_file = cf.get("jieba", - "jieba_words_dict") # get words dict - - am_type = args.am_type - if (am_type == "fastspeech2"): - phone2id_dict_file = cf.get(am_type, "phone2id_dict") - word2phone_dict_file = cf.get(am_type, "word2phone_dict") - - frontend = Frontend(phone_vocab_path=phone2id_dict_file) - print("frontend done!") - - gen_word2phone_dict( - frontend, - jieba_words_dict_file, - word2phone_dict_file, - get_tone=False) - - elif (am_type == "speedyspeech"): - phone2id_dict_file = cf.get(am_type, "phone2id_dict") - tone2id_dict_file = cf.get(am_type, "tone2id_dict") - word2phone_dict_file = cf.get(am_type, "word2phone_dict") - - frontend = Frontend( - phone_vocab_path=phone2id_dict_file, - tone_vocab_path=tone2id_dict_file) - print("frontend done!") - - gen_word2phone_dict( - frontend, - jieba_words_dict_file, - word2phone_dict_file, - get_tone=True) - - else: - print("Please set correct am type, fastspeech2 or speedyspeech.") - - -if __name__ == "__main__": - main() diff --git a/demos/TTSCppFrontend/front_demo/gentools/genid.py b/demos/TTSCppFrontend/front_demo/gentools/genid.py deleted file mode 100644 index cf83623f0..000000000 --- a/demos/TTSCppFrontend/front_demo/gentools/genid.py +++ /dev/null @@ -1,35 +0,0 @@ -# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
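The helper below assigns consecutive integer IDs to each phone/tone, reserving IDs 0 and 1 for the padding and unknown tokens. The two leading `f2.write` calls lost their angle-bracketed names to formatting in this diff; they are reconstructed as `<pad>` and `<unk>` below, an assumption based on PaddleSpeech's usual phone-map layout. The generated `phonesid.dict` would then look like (hypothetical phone names):

```
<pad> 0
<unk> 1
a1 2
a2 3
```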
-
-PHONESFILE = "./dict/phones.txt"
-PHONES_ID_FILE = "./dict/phonesid.dict"
-TONESFILE = "./dict/tones.txt"
-TONES_ID_FILE = "./dict/tonesid.dict"
-
-
-def GenIdFile(file, idfile):
-    id = 2
-    with open(file, 'r') as f1, open(idfile, "w+") as f2:
-        f2.write("<pad> 0\n")
-        f2.write("<unk> 1\n")
-        for line in f1.readlines():
-            phone = line.strip()
-            print(phone + " " + str(id))
-            f2.write(phone + " " + str(id) + "\n")
-            id += 1
-
-
-if __name__ == "__main__":
-    GenIdFile(PHONESFILE, PHONES_ID_FILE)
-    GenIdFile(TONESFILE, TONES_ID_FILE)
diff --git a/demos/TTSCppFrontend/front_demo/gentools/word2phones.py b/demos/TTSCppFrontend/front_demo/gentools/word2phones.py
deleted file mode 100644
index d9baeea9c..000000000
--- a/demos/TTSCppFrontend/front_demo/gentools/word2phones.py
+++ /dev/null
@@ -1,55 +0,0 @@
-# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-import re
-
-from pypinyin import lazy_pinyin
-from pypinyin import Style
-
-worddict = "./dict/jieba_part.dict.utf8"
-newdict = "./dict/word_phones.dict"
-
-
-def GenPhones(initials, finals, separate=True):
-
-    phones = []
-    for c, v in zip(initials, finals):
-        if re.match(r'i\d', v):
-            if c in ['z', 'c', 's']:
-                v = re.sub('i', 'ii', v)
-            elif c in ['zh', 'ch', 'sh', 'r']:
-                v = re.sub('i', 'iii', v)
-        if c:
-            if separate is True:
-                phones.append(c + '0')
-            elif separate is False:
-                phones.append(c)
-            else:
-                print("Not sure whether phone and tone need to be separated")
-        if v:
-            phones.append(v)
-    return phones
-
-
-with open(worddict, "r") as f1, open(newdict, "w+") as f2:
-    for line in f1.readlines():
-        word = line.split(" ")[0]
-        initials = lazy_pinyin(
-            word, neutral_tone_with_five=True, style=Style.INITIALS)
-        finals = lazy_pinyin(
-            word, neutral_tone_with_five=True, style=Style.FINALS_TONE3)
-
-        phones = GenPhones(initials, finals, True)
-
-        temp = " ".join(phones)
-        f2.write(word + " " + temp + "\n")
diff --git a/demos/TTSCppFrontend/run_front_demo.sh b/demos/TTSCppFrontend/run_front_demo.sh
deleted file mode 100755
index 4dcded5c1..000000000
--- a/demos/TTSCppFrontend/run_front_demo.sh
+++ /dev/null
@@ -1,7 +0,0 @@
-#!/bin/bash
-set -e
-set -x
-
-cd "$(dirname "$(realpath "$0")")"
-
-./build/tts_front_demo "$@"
diff --git a/demos/TTSCppFrontend/src/base/type_conv.cpp b/demos/TTSCppFrontend/src/base/type_conv.cpp
deleted file mode 100644
index b7ff63642..000000000
--- a/demos/TTSCppFrontend/src/base/type_conv.cpp
+++ /dev/null
@@ -1,28 +0,0 @@
-// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-//     http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-#include "base/type_conv.h"
-
-namespace ppspeech {
-// wstring to string
-std::string wstring2utf8string(const std::wstring& str) {
-    static std::wstring_convert<std::codecvt_utf8<wchar_t>> strCnv;
-    return strCnv.to_bytes(str);
-}
-
-// string to wstring
-std::wstring utf8string2wstring(const std::string& str) {
-    static std::wstring_convert<std::codecvt_utf8<wchar_t>> strCnv;
-    return strCnv.from_bytes(str);
-}
-}  // namespace ppspeech
diff --git a/demos/TTSCppFrontend/src/base/type_conv.h b/demos/TTSCppFrontend/src/base/type_conv.h
deleted file mode 100644
index 6aecfc438..000000000
--- a/demos/TTSCppFrontend/src/base/type_conv.h
+++ /dev/null
@@ -1,31 +0,0 @@
-// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-//     http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-
-#ifndef BASE_TYPE_CONVC_H
-#define BASE_TYPE_CONVC_H
-
-#include <codecvt>
-#include <locale>
-#include <string>
-
-
-namespace ppspeech {
-// wstring to string
-std::string wstring2utf8string(const std::wstring& str);
-
-// string to wstring
-std::wstring utf8string2wstring(const std::string& str);
-}
-
-#endif  // BASE_TYPE_CONVC_H
\ No newline at end of file
diff --git a/demos/TTSCppFrontend/src/front/front_interface.cpp b/demos/TTSCppFrontend/src/front/front_interface.cpp
deleted file mode 100644
index e7b08c798..000000000
--- a/demos/TTSCppFrontend/src/front/front_interface.cpp
+++ /dev/null
@@ -1,1130 +0,0 @@
-// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-//     http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
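The engine implementation follows: config parsing, jieba segmentation, dictionary lookups, and Mandarin tone sandhi. Throughout the sandhi code, a final such as `u4` is modified by rewriting its trailing tone digit (e.g. 不 becomes `bu2` before a fourth tone). As orientation, a hedged sketch of that one recurring idiom (hypothetical helper, not part of the source, which spells it out inline with `std::string::replace`):

```cpp
#include <cctype>
#include <string>

// Set the trailing tone digit of a final, e.g. ("u4", '2') -> "u2".
// Mirrors the repeated `final.replace(final.length() - 1, 1, "2")`
// calls in the sandhi functions below.
static void SetToneDigit(std::string *final_syllable, char tone) {
    if (!final_syllable->empty() &&
        std::isdigit(static_cast<unsigned char>(final_syllable->back()))) {
        final_syllable->back() = tone;
    }
}
```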
-#include "front/front_interface.h" - -namespace ppspeech { - -int FrontEngineInterface::init() { - if (_initialed) { - return 0; - } - if (0 != ReadConfFile()) { - LOG(ERROR) << "Read front conf file failed"; - return -1; - } - - _jieba = new cppjieba::Jieba(_jieba_dict_path, - _jieba_hmm_path, - _jieba_user_dict_path, - _jieba_idf_path, - _jieba_stop_word_path); - - _punc = {",", - "。", - "、", - "?", - ":", - ";", - "~", - "!", - ",", - ".", - "?", - "!", - ":", - ";", - "/", - "\\"}; - _punc_omit = {"“", "”", "\"", "\""}; - - // 需要儿化音处理的词语 - must_erhua = { - "小院儿", "胡同儿", "范儿", "老汉儿", "撒欢儿", "寻老礼儿", "妥妥儿"}; - not_erhua = {"虐儿", "为儿", "护儿", "瞒儿", "救儿", "替儿", - "有儿", "一儿", "我儿", "俺儿", "妻儿", "拐儿", - "聋儿", "乞儿", "患儿", "幼儿", "孤儿", "婴儿", - "婴幼儿", "连体儿", "脑瘫儿", "流浪儿", "体弱儿", "混血儿", - "蜜雪儿", "舫儿", "祖儿", "美儿", "应采儿", "可儿", - "侄儿", "孙儿", "侄孙儿", "女儿", "男儿", "红孩儿", - "花儿", "虫儿", "马儿", "鸟儿", "猪儿", "猫儿", - "狗儿"}; - - must_not_neural_tone_words = { - "男子", "女子", "分子", "原子", "量子", "莲子", "石子", "瓜子", "电子"}; - // 需要轻声处理的词语 - must_neural_tone_words = { - "麻烦", "麻利", "鸳鸯", "高粱", "骨头", "骆驼", "马虎", "首饰", "馒头", - "馄饨", "风筝", "难为", "队伍", "阔气", "闺女", "门道", "锄头", "铺盖", - "铃铛", "铁匠", "钥匙", "里脊", "里头", "部分", "那么", "道士", "造化", - "迷糊", "连累", "这么", "这个", "运气", "过去", "软和", "转悠", "踏实", - "跳蚤", "跟头", "趔趄", "财主", "豆腐", "讲究", "记性", "记号", "认识", - "规矩", "见识", "裁缝", "补丁", "衣裳", "衣服", "衙门", "街坊", "行李", - "行当", "蛤蟆", "蘑菇", "薄荷", "葫芦", "葡萄", "萝卜", "荸荠", "苗条", - "苗头", "苍蝇", "芝麻", "舒服", "舒坦", "舌头", "自在", "膏药", "脾气", - "脑袋", "脊梁", "能耐", "胳膊", "胭脂", "胡萝", "胡琴", "胡同", "聪明", - "耽误", "耽搁", "耷拉", "耳朵", "老爷", "老实", "老婆", "老头", "老太", - "翻腾", "罗嗦", "罐头", "编辑", "结实", "红火", "累赘", "糨糊", "糊涂", - "精神", "粮食", "簸箕", "篱笆", "算计", "算盘", "答应", "笤帚", "笑语", - "笑话", "窟窿", "窝囊", "窗户", "稳当", "稀罕", "称呼", "秧歌", "秀气", - "秀才", "福气", "祖宗", "砚台", "码头", "石榴", "石头", "石匠", "知识", - "眼睛", "眯缝", "眨巴", "眉毛", "相声", "盘算", "白净", "痢疾", "痛快", - "疟疾", "疙瘩", "疏忽", "畜生", "生意", "甘蔗", "琵琶", "琢磨", "琉璃", - "玻璃", "玫瑰", "玄乎", "狐狸", "状元", "特务", "牲口", "牙碜", "牌楼", - "爽快", "爱人", "热闹", "烧饼", "烟筒", "烂糊", "点心", "炊帚", "灯笼", - "火候", "漂亮", "滑溜", "溜达", "温和", "清楚", "消息", "浪头", "活泼", - "比方", "正经", "欺负", "模糊", "槟榔", "棺材", "棒槌", "棉花", "核桃", - "栅栏", "柴火", "架势", "枕头", "枇杷", "机灵", "本事", "木头", "木匠", - "朋友", "月饼", "月亮", "暖和", "明白", "时候", "新鲜", "故事", "收拾", - "收成", "提防", "挖苦", "挑剔", "指甲", "指头", "拾掇", "拳头", "拨弄", - "招牌", "招呼", "抬举", "护士", "折腾", "扫帚", "打量", "打算", "打点", - "打扮", "打听", "打发", "扎实", "扁担", "戒指", "懒得", "意识", "意思", - "情形", "悟性", "怪物", "思量", "怎么", "念头", "念叨", "快活", "忙活", - "志气", "心思", "得罪", "张罗", "弟兄", "开通", "应酬", "庄稼", "干事", - "帮手", "帐篷", "希罕", "师父", "师傅", "巴结", "巴掌", "差事", "工夫", - "岁数", "屁股", "尾巴", "少爷", "小气", "小伙", "将就", "对头", "对付", - "寡妇", "家伙", "客气", "实在", "官司", "学问", "学生", "字号", "嫁妆", - "媳妇", "媒人", "婆家", "娘家", "委屈", "姑娘", "姐夫", "妯娌", "妥当", - "妖精", "奴才", "女婿", "头发", "太阳", "大爷", "大方", "大意", "大夫", - "多少", "多么", "外甥", "壮实", "地道", "地方", "在乎", "困难", "嘴巴", - "嘱咐", "嘟囔", "嘀咕", "喜欢", "喇嘛", "喇叭", "商量", "唾沫", "哑巴", - "哈欠", "哆嗦", "咳嗽", "和尚", "告诉", "告示", "含糊", "吓唬", "后头", - "名字", "名堂", "合同", "吆喝", "叫唤", "口袋", "厚道", "厉害", "千斤", - "包袱", "包涵", "匀称", "勤快", "动静", "动弹", "功夫", "力气", "前头", - "刺猬", "刺激", "别扭", "利落", "利索", "利害", "分析", "出息", "凑合", - "凉快", "冷战", "冤枉", "冒失", "养活", "关系", "先生", "兄弟", "便宜", - "使唤", "佩服", "作坊", "体面", "位置", "似的", "伙计", "休息", "什么", - "人家", "亲戚", "亲家", "交情", "云彩", "事情", "买卖", "主意", "丫头", - "丧气", "两口", "东西", "东家", "世故", "不由", "不在", "下水", "下巴", - "上头", "上司", "丈夫", "丈人", "一辈", "那个", "菩萨", "父亲", "母亲", - "咕噜", "邋遢", "费用", "冤家", "甜头", "介绍", "荒唐", "大人", "泥鳅", - "幸福", "熟悉", "计划", "扑腾", "蜡烛", "姥爷", 
"照顾", "喉咙", "吉他", - "弄堂", "蚂蚱", "凤凰", "拖沓", "寒碜", "糟蹋", "倒腾", "报复", "逻辑", - "盘缠", "喽啰", "牢骚", "咖喱", "扫把", "惦记"}; - - - // 生成词典(词到音素的映射) - if (0 != GenDict(_word2phone_path, &word_phone_map)) { - LOG(ERROR) << "Genarate word2phone dict failed"; - return -1; - } - - // 生成音素字典(音素到音素id的映射) - if (0 != GenDict(_phone2id_path, &phone_id_map)) { - LOG(ERROR) << "Genarate phone2id dict failed"; - return -1; - } - - // 生成音调字典(音调到音调id的映射) - if (_separate_tone == "true") { - if (0 != GenDict(_tone2id_path, &tone_id_map)) { - LOG(ERROR) << "Genarate tone2id dict failed"; - return -1; - } - } - - // 生成繁简字典(繁体到简体id的映射) - if (0 != GenDict(_trand2simp_path, &trand_simp_map)) { - LOG(ERROR) << "Genarate trand2simp dict failed"; - return -1; - } - - _initialed = true; - return 0; -} - -int FrontEngineInterface::ReadConfFile() { - std::ifstream is(_conf_file.c_str(), std::ifstream::in); - if (!is.good()) { - LOG(ERROR) << "Cannot open config file: " << _conf_file; - return -1; - } - std::string line, key, value; - while (std::getline(is, line)) { - if (line.substr(0, 2) == "--") { - size_t pos = line.find_first_of("=", 0); - std::string key = line.substr(2, pos - 2); - std::string value = line.substr(pos + 1); - conf_map[key] = value; - LOG(INFO) << "Key: " << key << "; Value: " << value; - } - } - - // jieba conf path - _jieba_dict_path = conf_map["jieba_dict_path"]; - _jieba_hmm_path = conf_map["jieba_hmm_path"]; - _jieba_user_dict_path = conf_map["jieba_user_dict_path"]; - _jieba_idf_path = conf_map["jieba_idf_path"]; - _jieba_stop_word_path = conf_map["jieba_stop_word_path"]; - - // dict path - _separate_tone = conf_map["separate_tone"]; - _word2phone_path = conf_map["word2phone_path"]; - _phone2id_path = conf_map["phone2id_path"]; - _tone2id_path = conf_map["tone2id_path"]; - _trand2simp_path = conf_map["trand2simpd_path"]; - - return 0; -} - -int FrontEngineInterface::Trand2Simp(const std::wstring &sentence, - std::wstring *sentence_simp) { - // sentence_simp = sentence; - for (int i = 0; i < sentence.length(); i++) { - std::wstring temp(1, sentence[i]); - std::string sigle_word = ppspeech::wstring2utf8string(temp); - // 单个字是否在繁转简的字典里 - if (trand_simp_map.find(sigle_word) == trand_simp_map.end()) { - sentence_simp->append(temp); - } else { - sentence_simp->append( - (ppspeech::utf8string2wstring(trand_simp_map[sigle_word]))); - } - } - - return 0; -} - -int FrontEngineInterface::GenDict(const std::string &dict_file, - std::map *map) { - std::ifstream is(dict_file.c_str(), std::ifstream::in); - if (!is.good()) { - LOG(ERROR) << "Cannot open dict file: " << dict_file; - return -1; - } - std::string line, key, value; - while (std::getline(is, line)) { - size_t pos = line.find_first_of(" ", 0); - key = line.substr(0, pos); - value = line.substr(pos + 1); - (*map)[key] = value; - } - return 0; -} - -int FrontEngineInterface::GetSegResult( - std::vector> *seg, - std::vector *seg_words) { - std::vector>::iterator iter; - for (iter = seg->begin(); iter != seg->end(); iter++) { - seg_words->push_back((*iter).first); - } - return 0; -} - -int FrontEngineInterface::GetSentenceIds(const std::string &sentence, - std::vector *phoneids, - std::vector *toneids) { - std::vector> - cut_result; //分词结果包含词和词性 - if (0 != Cut(sentence, &cut_result)) { - LOG(ERROR) << "Cut sentence: \"" << sentence << "\" failed"; - return -1; - } - - if (0 != GetWordsIds(cut_result, phoneids, toneids)) { - LOG(ERROR) << "Get words phoneids failed"; - return -1; - } - return 0; -} - -int FrontEngineInterface::GetWordsIds( - const std::vector> 
&cut_result, - std::vector *phoneids, - std::vector *toneids) { - std::string word; - std::string pos; - std::vector word_initials; - std::vector word_finals; - std::string phone; - for (int i = 0; i < cut_result.size(); i++) { - word = cut_result[i].first; - pos = cut_result[i].second; - if (std::find(_punc_omit.begin(), _punc_omit.end(), word) == - _punc_omit.end()) { // 非可忽略的标点 - word_initials = {}; - word_finals = {}; - phone = ""; - // 判断是否在标点符号集合中 - if (std::find(_punc.begin(), _punc.end(), word) == - _punc.end()) { // 文字 - // 获取字词的声母韵母列表 - if (0 != - GetInitialsFinals(word, &word_initials, &word_finals)) { - LOG(ERROR) - << "Genarate the word_initials and word_finals of " - << word << " failed"; - return -1; - } - - // 对读音进行修改 - if (0 != ModifyTone(word, pos, &word_finals)) { - LOG(ERROR) << "Failed to modify tone."; - } - - // 对儿化音进行修改 - std::vector> new_initals_finals = - MergeErhua(word_initials, word_finals, word, pos); - word_initials = new_initals_finals[0]; - word_finals = new_initals_finals[1]; - - // 将声母和韵母合并成音素 - assert(word_initials.size() == word_finals.size()); - std::string temp_phone; - for (int j = 0; j < word_initials.size(); j++) { - if (word_initials[j] != "") { - temp_phone = word_initials[j] + " " + word_finals[j]; - } else { - temp_phone = word_finals[j]; - } - if (j == 0) { - phone += temp_phone; - } else { - phone += (" " + temp_phone); - } - } - } else { // 标点符号 - if (_separate_tone == "true") { - phone = "sp0"; // speedyspeech - } else { - phone = "sp"; // fastspeech2 - } - } - - // 音素到音素id - if (0 != Phone2Phoneid(phone, phoneids, toneids)) { - LOG(ERROR) << "Genarate the phone id of " << word << " failed"; - return -1; - } - } - } - return 0; -} - -int FrontEngineInterface::Cut( - const std::string &sentence, - std::vector> *cut_result) { - std::vector> cut_result_jieba; - - // 结巴分词 - _jieba->Tag(sentence, cut_result_jieba); - - // 对分词后结果进行整合 - if (0 != MergeforModify(&cut_result_jieba, cut_result)) { - LOG(ERROR) << "Failed to modify for word segmentation result."; - return -1; - } - - return 0; -} - -int FrontEngineInterface::GetPhone(const std::string &word, - std::string *phone) { - // 判断 word 在不在 词典里,如果不在,进行CutAll分词 - if (word_phone_map.find(word) == word_phone_map.end()) { - std::vector wordcut; - _jieba->CutAll(word, wordcut); - phone->assign(word_phone_map[wordcut[0]]); - for (int i = 1; i < wordcut.size(); i++) { - phone->assign((*phone) + (" " + word_phone_map[wordcut[i]])); - } - } else { - phone->assign(word_phone_map[word]); - } - - return 0; -} - -int FrontEngineInterface::Phone2Phoneid(const std::string &phone, - std::vector *phoneid, - std::vector *toneid) { - std::vector phone_vec; - phone_vec = absl::StrSplit(phone, " "); - std::string temp_phone; - for (int i = 0; i < phone_vec.size(); i++) { - temp_phone = phone_vec[i]; - if (_separate_tone == "true") { - phoneid->push_back(atoi( - (phone_id_map[temp_phone.substr(0, temp_phone.length() - 1)]) - .c_str())); - toneid->push_back( - atoi((tone_id_map[temp_phone.substr(temp_phone.length() - 1, - temp_phone.length())]) - .c_str())); - } else { - phoneid->push_back(atoi((phone_id_map[temp_phone]).c_str())); - } - } - return 0; -} - - -// 根据韵母判断该词中每个字的读音都为第三声。true表示词中每个字都是第三声 -bool FrontEngineInterface::AllToneThree( - const std::vector &finals) { - bool flags = true; - for (int i = 0; i < finals.size(); i++) { - if (static_cast(finals[i].back()) != 51) { //如果读音不为第三声 - flags = false; - } - } - return flags; -} - -// 判断词是否是叠词 -bool FrontEngineInterface::IsReduplication(const std::string &word) 
{ - bool flags = false; - std::wstring word_wstr = ppspeech::utf8string2wstring(word); - int len = word_wstr.length(); - if (len == 2 && word_wstr[0] == word_wstr[1]) { - flags = true; - } - return flags; -} - -// 获取每个字词的声母和韵母列表, word_initials 为声母列表,word_finals -// 为韵母列表 -int FrontEngineInterface::GetInitialsFinals( - const std::string &word, - std::vector *word_initials, - std::vector *word_finals) { - std::string phone; - GetPhone(word, &phone); //获取字词对应的音素 - std::vector phone_vec = absl::StrSplit(phone, " "); - //获取韵母,每个字的音素有1或者2个,start为单个字音素的起始位置。 - int start = 0; - while (start < phone_vec.size()) { - if (phone_vec[start] == "sp" || phone_vec[start] == "sp0") { - start += 1; - } else if (isdigit(phone_vec[start].back()) == 0 || - static_cast(phone_vec[start].back()) == 48) { - word_initials->push_back(phone_vec[start]); - word_finals->push_back(phone_vec[start + 1]); - start += 2; - } else { - word_initials->push_back(""); - word_finals->push_back(phone_vec[start]); - start += 1; - } - } - - assert(word_finals->size() == ppspeech::utf8string2wstring(word).length() && - word_finals->size() == word_initials->size()); - - return 0; -} - -// 获取每个字词的韵母列表 -int FrontEngineInterface::GetFinals(const std::string &word, - std::vector *word_finals) { - std::vector word_initials; - if (0 != GetInitialsFinals(word, &word_initials, word_finals)) { - LOG(ERROR) << "Failed to get word finals"; - return -1; - } - - return 0; -} - -int FrontEngineInterface::Word2WordVec(const std::string &word, - std::vector *wordvec) { - std::wstring word_wstr = ppspeech::utf8string2wstring(word); - for (int i = 0; i < word_wstr.length(); i++) { - std::wstring word_sigle(1, word_wstr[i]); - wordvec->push_back(word_sigle); - } - return 0; -} - -// yuantian01解释:把一个词再进行分词找到。例子:小雨伞 --> 小 雨伞 或者 小雨 伞 -int FrontEngineInterface::SplitWord(const std::string &word, - std::vector *new_word_vec) { - std::vector word_vec; - std::string second_subword; - _jieba->CutForSearch(word, word_vec); - // 升序 - std::sort(word_vec.begin(), - word_vec.end(), - [](std::string a, std::string b) { return a.size() > b.size(); }); - std::string first_subword = word_vec[0]; // 提取长度最短的字符串 - int first_begin_idx = word.find_first_of(first_subword); - if (first_begin_idx == 0) { - second_subword = word.substr(first_subword.length()); - new_word_vec->push_back(first_subword); - new_word_vec->push_back(second_subword); - } else { - second_subword = word.substr(0, word.length() - first_subword.length()); - new_word_vec->push_back(second_subword); - new_word_vec->push_back(first_subword); - } - - return 0; -} - - -// example: 不 一起 --> 不一起 -std::vector> FrontEngineInterface::MergeBu( - std::vector> *seg_result) { - std::vector> result; - std::string word; - std::string pos; - std::string last_word = ""; - - for (int i = 0; i < seg_result->size(); i++) { - word = std::get<0>((*seg_result)[i]); - pos = std::get<1>((*seg_result)[i]); - if (last_word == "不") { - word = last_word + word; - } - if (word != "不") { - result.push_back(make_pair(word, pos)); - } - last_word = word; - } - - if (last_word == "不") { - result.push_back(make_pair(last_word, "d")); - last_word = ""; - } - - return result; -} - -std::vector> FrontEngineInterface::Mergeyi( - std::vector> *seg_result) { - std::vector> *result_temp = - new std::vector>(); - std::string word; - std::string pos; - // function 1 example: 听 一 听 --> 听一听 - for (int i = 0; i < seg_result->size(); i++) { - word = std::get<0>((*seg_result)[i]); - pos = std::get<1>((*seg_result)[i]); - - if ((i - 1 >= 0) && (word == "一") 
&& (i + 1 < seg_result->size()) && - (std::get<0>((*seg_result)[i - 1]) == - std::get<0>((*seg_result)[i + 1])) && - std::get<1>((*seg_result)[i - 1]) == "v") { - std::get<0>((*result_temp)[i - 1]) = - std::get<0>((*result_temp)[i - 1]) + "一" + - std::get<0>((*result_temp)[i - 1]); - } else { - if ((i - 2 >= 0) && (std::get<0>((*seg_result)[i - 1]) == "一") && - (std::get<0>((*seg_result)[i - 2]) == word) && (pos == "v")) { - continue; - } else { - result_temp->push_back(make_pair(word, pos)); - } - } - } - - // function 2 example: 一 你 --> 一你 - std::vector> result = {}; - for (int j = 0; j < result_temp->size(); j++) { - word = std::get<0>((*result_temp)[j]); - pos = std::get<1>((*result_temp)[j]); - if ((result.size() != 0) && (result.back().first == "一")) { - result.back().first = result.back().first + word; - } else { - result.push_back(make_pair(word, pos)); - } - } - - return result; -} - -// example: 你 你 --> 你你 -std::vector> -FrontEngineInterface::MergeReduplication( - std::vector> *seg_result) { - std::vector> result; - std::string word; - std::string pos; - - for (int i = 0; i < seg_result->size(); i++) { - word = std::get<0>((*seg_result)[i]); - pos = std::get<1>((*seg_result)[i]); - if ((result.size() != 0) && (word == result.back().first)) { - result.back().first = - result.back().first + std::get<0>((*seg_result)[i]); - } else { - result.push_back(make_pair(word, pos)); - } - } - - return result; -} - -// the first and the second words are all_tone_three -std::vector> -FrontEngineInterface::MergeThreeTones( - std::vector> *seg_result) { - std::vector> result; - std::string word; - std::string pos; - std::vector> finals; //韵母数组 - std::vector word_final; - std::vector merge_last(seg_result->size(), false); - - // 判断最后一个分词结果是不是标点,不看标点的声母韵母 - int word_num = seg_result->size() - 1; - - // seg_result[word_num].first - if (std::find( - _punc.begin(), _punc.end(), std::get<0>((*seg_result)[word_num])) == - _punc.end()) { // 最后一个分词结果不是标点 - word_num += 1; - } - - // 获取韵母数组 - for (int i = 0; i < word_num; i++) { - word_final = {}; - word = std::get<0>((*seg_result)[i]); - pos = std::get<1>((*seg_result)[i]); - if (std::find(_punc_omit.begin(), _punc_omit.end(), word) == - _punc_omit.end()) { // 非可忽略的标点,即文字 - if (0 != GetFinals(word, &word_final)) { - LOG(ERROR) << "Failed to get the final of word."; - } - } - - finals.push_back(word_final); - } - assert(word_num == finals.size()); - - // 对第三声读音的字词分词结果进行处理 - for (int i = 0; i < word_num; i++) { - word = std::get<0>((*seg_result)[i]); - pos = std::get<1>((*seg_result)[i]); - if (i - 1 >= 0 && AllToneThree(finals[i - 1]) && - AllToneThree(finals[i]) && !merge_last[i - 1]) { - // if the last word is reduplication, not merge, because - // reduplication need to be _neural_sandhi - // seg_result[i - 1].first - if (!IsReduplication(std::get<0>((*seg_result)[i - 1])) && - (ppspeech::utf8string2wstring( - std::get<0>((*seg_result)[i - 1]))) - .length() + - (ppspeech::utf8string2wstring(word)).length() <= - 3) { - result.back().first = - result.back().first + std::get<0>((*seg_result)[i]); - merge_last[i] = true; - } else { - result.push_back(make_pair(word, pos)); - } - } else { - result.push_back(make_pair(word, pos)); - } - } - - //把标点的分词结果补上 - if (word_num < seg_result->size()) { - result.push_back( - // seg_result[word_num].first seg_result[word_num].second - // std::get<0>((*seg_result)[word_num]) - make_pair(std::get<0>((*seg_result)[word_num]), - std::get<1>((*seg_result)[word_num]))); - } - - return result; -} - -// the last char of first 
word and the first char of second word is tone_three -std::vector> -FrontEngineInterface::MergeThreeTones2( - std::vector> *seg_result) { - std::vector> result; - std::string word; - std::string pos; - std::vector> finals; //韵母数组 - std::vector word_final; - std::vector merge_last(seg_result->size(), false); - - // 判断最后一个分词结果是不是标点 - int word_num = seg_result->size() - 1; - if (std::find( - _punc.begin(), _punc.end(), std::get<0>((*seg_result)[word_num])) == - _punc.end()) { // 最后一个分词结果不是标点 - word_num += 1; - } - - // 获取韵母数组 - for (int i = 0; i < word_num; i++) { - word_final = {}; - word = std::get<0>((*seg_result)[i]); - pos = std::get<1>((*seg_result)[i]); - // 如果是文字,则获取韵母,如果是可忽略的标点,例如引号,则跳过 - if (std::find(_punc_omit.begin(), _punc_omit.end(), word) == - _punc_omit.end()) { - if (0 != GetFinals(word, &word_final)) { - LOG(ERROR) << "Failed to get the final of word."; - } - } - - finals.push_back(word_final); - } - assert(word_num == finals.size()); - - // 对第三声读音的字词分词结果进行处理 - for (int i = 0; i < word_num; i++) { - word = std::get<0>((*seg_result)[i]); - pos = std::get<1>((*seg_result)[i]); - if (i - 1 >= 0 && !finals[i - 1].empty() && - absl::EndsWith(finals[i - 1].back(), "3") == true && - !finals[i].empty() && - absl::EndsWith(finals[i].front(), "3") == true && - !merge_last[i - 1]) { - // if the last word is reduplication, not merge, because - // reduplication need to be _neural_sandhi - // seg_result[i - 1].first - if (!IsReduplication(std::get<0>((*seg_result)[i - 1])) && - (ppspeech::utf8string2wstring( - std::get<0>((*seg_result)[i - 1]))) - .length() + - ppspeech::utf8string2wstring(word).length() <= - 3) { - result.back().first = - result.back().first + std::get<0>((*seg_result)[i]); - merge_last[i] = true; - } else { - result.push_back(make_pair(word, pos)); - } - } else { - result.push_back(make_pair(word, pos)); - } - } - - //把标点的分词结果补上 - if (word_num < seg_result->size()) { - result.push_back(make_pair(std::get<0>((*seg_result)[word_num]), - std::get<1>((*seg_result)[word_num]))); - } - - return result; -} - -// example: 吃饭 儿 --> 吃饭儿 -std::vector> FrontEngineInterface::MergeEr( - std::vector> *seg_result) { - std::vector> result; - std::string word; - std::string pos; - - for (int i = 0; i < seg_result->size(); i++) { - word = std::get<0>((*seg_result)[i]); - pos = std::get<1>((*seg_result)[i]); - if ((i - 1 >= 0) && (word == "儿")) { - result.back().first = - result.back().first + std::get<0>((*seg_result)[i]); - } else { - result.push_back(make_pair(word, pos)); - } - } - - return result; -} - -int FrontEngineInterface::MergeforModify( - std::vector> *seg_word_type, - std::vector> *modify_seg_word_type) { - std::vector seg_result; - GetSegResult(seg_word_type, &seg_result); - LOG(INFO) << "Before merge, seg result is: " - << limonp::Join(seg_result.begin(), seg_result.end(), "/"); - std::vector> tmp; - tmp = MergeBu(seg_word_type); - *modify_seg_word_type = tmp; - tmp = Mergeyi(modify_seg_word_type); - *modify_seg_word_type = tmp; - tmp = MergeReduplication(modify_seg_word_type); - *modify_seg_word_type = tmp; - tmp = MergeThreeTones(modify_seg_word_type); - *modify_seg_word_type = tmp; - tmp = MergeThreeTones2(modify_seg_word_type); - *modify_seg_word_type = tmp; - tmp = MergeEr(modify_seg_word_type); - *modify_seg_word_type = tmp; - seg_result = {}; - - GetSegResult(modify_seg_word_type, &seg_result); - LOG(INFO) << "After merge, seg result is: " - << limonp::Join(seg_result.begin(), seg_result.end(), "/"); - - return 0; -} - - -int FrontEngineInterface::BuSandi(const 
std::string &word, - std::vector *finals) { - std::wstring bu = L"不"; - std::vector wordvec; - // 一个词转成向量形式 - if (0 != Word2WordVec(word, &wordvec)) { - LOG(ERROR) << "Failed to get word vector"; - return -1; - } - - // e.g. 看不懂 b u4 --> b u5, 将韵母的最后一位替换成 5 - if (wordvec.size() == 3 && wordvec[1] == bu) { - (*finals)[1] = (*finals)[1].replace((*finals)[1].length() - 1, 1, "5"); - } else { - // e.g. 不怕 b u4 --> b u2, 将韵母的最后一位替换成 2 - for (int i = 0; i < wordvec.size(); i++) { - if (wordvec[i] == bu && i + 1 < wordvec.size() && - absl::EndsWith((*finals)[i + 1], "4") == true) { - (*finals)[i] = - (*finals)[i].replace((*finals)[i].length() - 1, 1, "2"); - } - } - } - - return 0; -} - - -int FrontEngineInterface::YiSandhi(const std::string &word, - std::vector *finals) { - std::wstring yi = L"一"; - std::vector wordvec; - // 一个词转成向量形式 - if (0 != Word2WordVec(word, &wordvec)) { - LOG(ERROR) << "Failed to get word vector"; - return -1; - } - - //情况1:"一" in number sequences, e.g. 一零零, 二一零 - std::wstring num_wstr = L"零一二三四六七八九"; - std::wstring word_wstr = ppspeech::utf8string2wstring(word); - if (word_wstr.find(yi) != word_wstr.npos && wordvec.back() != yi) { - int flags = 0; - for (int j = 0; j < wordvec.size(); j++) { - if (num_wstr.find(wordvec[j]) == num_wstr.npos) { - flags = -1; - break; - } - } - if (flags == 0) { - return 0; - } - } else if (wordvec.size() == 3 && wordvec[1] == yi && - wordvec[0] == wordvec[2]) { - // "一" between reduplication words shold be yi5, e.g. 看一看 - (*finals)[1] = (*finals)[1].replace((*finals)[1].length() - 1, 1, "5"); - } else if (wordvec[0] == L"第" && wordvec[1] == yi) { //以第一位开始 - (*finals)[1] = (*finals)[1].replace((*finals)[1].length() - 1, 1, "1"); - } else { - for (int i = 0; i < wordvec.size(); i++) { - if (wordvec[i] == yi && i + 1 < wordvec.size()) { - if (absl::EndsWith((*finals)[i + 1], "4") == true) { - // "一" before tone4 should be yi2, e.g. 一段 - (*finals)[i] = - (*finals)[i].replace((*finals)[i].length() - 1, 1, "2"); - } else { - // "一" before non-tone4 should be yi4, e.g. 一天 - (*finals)[i] = - (*finals)[i].replace((*finals)[i].length() - 1, 1, "4"); - } - } - } - } - - return 0; -} - -int FrontEngineInterface::NeuralSandhi(const std::string &word, - const std::string &pos, - std::vector *finals) { - std::wstring word_wstr = ppspeech::utf8string2wstring(word); - std::vector wordvec; - // 一个词转成向量形式 - if (0 != Word2WordVec(word, &wordvec)) { - LOG(ERROR) << "Failed to get word vector"; - return -1; - } - int word_num = wordvec.size(); - assert(word_num == word_wstr.length()); - - // 情况1:reduplication words for n. and v. e.g. 
奶奶, 试试, 旺旺 - for (int j = 0; j < wordvec.size(); j++) { - std::string inits = "nva"; - if (j - 1 >= 0 && wordvec[j] == wordvec[j - 1] && - inits.find(pos[0]) != inits.npos) { - (*finals)[j] = - (*finals)[j].replace((*finals)[j].length() - 1, 1, "5"); - } - } - - // 情况2:对下述词的处理 - std::wstring yuqici = L"吧呢哈啊呐噻嘛吖嗨呐哦哒额滴哩哟喽啰耶喔诶"; - std::wstring de = L"的地得"; - std::wstring le = L"了着过"; - std::vector le_pos = {"ul", "uz", "ug"}; - std::wstring men = L"们子"; - std::vector men_pos = {"r", "n"}; - std::wstring weizhi = L"上下里"; - std::vector weizhi_pos = {"s", "l", "f"}; - std::wstring dong = L"来去"; - std::wstring fangxiang = L"上下进出回过起开"; - std::wstring ge = L"个"; - std::wstring xiushi = L"几有两半多各整每做是零一二三四六七八九"; - auto ge_idx = word_wstr.find_first_of(ge); // 出现“个”的第一个位置 - - if (word_num >= 1 && yuqici.find(wordvec.back()) != yuqici.npos) { - (*finals).back() = - (*finals).back().replace((*finals).back().length() - 1, 1, "5"); - } else if (word_num >= 1 && de.find(wordvec.back()) != de.npos) { - (*finals).back() = - (*finals).back().replace((*finals).back().length() - 1, 1, "5"); - } else if (word_num == 1 && le.find(wordvec[0]) != le.npos && - find(le_pos.begin(), le_pos.end(), pos) != le_pos.end()) { - (*finals).back() = - (*finals).back().replace((*finals).back().length() - 1, 1, "5"); - } else if (word_num > 1 && men.find(wordvec.back()) != men.npos && - find(men_pos.begin(), men_pos.end(), pos) != men_pos.end() && - find(must_not_neural_tone_words.begin(), - must_not_neural_tone_words.end(), - word) != must_not_neural_tone_words.end()) { - (*finals).back() = - (*finals).back().replace((*finals).back().length() - 1, 1, "5"); - } else if (word_num > 1 && weizhi.find(wordvec.back()) != weizhi.npos && - find(weizhi_pos.begin(), weizhi_pos.end(), pos) != - weizhi_pos.end()) { - (*finals).back() = - (*finals).back().replace((*finals).back().length() - 1, 1, "5"); - } else if (word_num > 1 && dong.find(wordvec.back()) != dong.npos && - fangxiang.find(wordvec[word_num - 2]) != fangxiang.npos) { - (*finals).back() = - (*finals).back().replace((*finals).back().length() - 1, 1, "5"); - } else if ((ge_idx != word_wstr.npos && ge_idx >= 1 && - xiushi.find(wordvec[ge_idx - 1]) != xiushi.npos) || - word_wstr == ge) { - (*finals).back() = - (*finals).back().replace((*finals).back().length() - 1, 1, "5"); - } else { - if (find(must_neural_tone_words.begin(), - must_neural_tone_words.end(), - word) != must_neural_tone_words.end() || - (word_num >= 2 && - find(must_neural_tone_words.begin(), - must_neural_tone_words.end(), - ppspeech::wstring2utf8string(word_wstr.substr( - word_num - 2))) != must_neural_tone_words.end())) { - (*finals).back() = - (*finals).back().replace((*finals).back().length() - 1, 1, "5"); - } - } - - // 进行进一步分词,把长词切分更短些 - std::vector word_list; - if (0 != SplitWord(word, &word_list)) { - LOG(ERROR) << "Failed to split word."; - return -1; - } - // 创建对应的 韵母列表 - std::vector> finals_list; - std::vector finals_temp; - finals_temp.assign((*finals).begin(), - (*finals).begin() + - ppspeech::utf8string2wstring(word_list[0]).length()); - finals_list.push_back(finals_temp); - finals_temp.assign( - (*finals).begin() + ppspeech::utf8string2wstring(word_list[0]).length(), - (*finals).end()); - finals_list.push_back(finals_temp); - - finals = new std::vector(); - for (int i = 0; i < word_list.size(); i++) { - std::wstring temp_wstr = ppspeech::utf8string2wstring(word_list[i]); - if ((find(must_neural_tone_words.begin(), - must_neural_tone_words.end(), - word_list[i]) != must_neural_tone_words.end()) || - 
(temp_wstr.length() >= 2 && - find(must_neural_tone_words.begin(), - must_neural_tone_words.end(), - ppspeech::wstring2utf8string( - temp_wstr.substr(temp_wstr.length() - 2))) != - must_neural_tone_words.end())) { - finals_list[i].back() = finals_list[i].back().replace( - finals_list[i].back().length() - 1, 1, "5"); - } - (*finals).insert( - (*finals).end(), finals_list[i].begin(), finals_list[i].end()); - } - - return 0; -} - -int FrontEngineInterface::ThreeSandhi(const std::string &word, - std::vector *finals) { - std::wstring word_wstr = ppspeech::utf8string2wstring(word); - std::vector> finals_list; - std::vector finals_temp; - std::vector wordvec; - // 一个词转成向量形式 - if (0 != Word2WordVec(word, &wordvec)) { - LOG(ERROR) << "Failed to get word vector"; - return -1; - } - int word_num = wordvec.size(); - assert(word_num == word_wstr.length()); - - if (word_num == 2 && AllToneThree((*finals))) { - (*finals)[0] = (*finals)[0].replace((*finals)[0].length() - 1, 1, "2"); - } else if (word_num == 3) { - // 进行进一步分词,把长词切分更短些 - std::vector word_list; - if (0 != SplitWord(word, &word_list)) { - LOG(ERROR) << "Failed to split word."; - return -1; - } - if (AllToneThree((*finals))) { - std::wstring temp_wstr = ppspeech::utf8string2wstring(word_list[0]); - // disyllabic + monosyllabic, e.g. 蒙古/包 - if (temp_wstr.length() == 2) { - (*finals)[0] = - (*finals)[0].replace((*finals)[0].length() - 1, 1, "2"); - (*finals)[1] = - (*finals)[1].replace((*finals)[1].length() - 1, 1, "2"); - } else if (temp_wstr.length() == - 1) { // monosyllabic + disyllabic, e.g. 纸/老虎 - (*finals)[1] = - (*finals)[1].replace((*finals)[1].length() - 1, 1, "2"); - } - } else { - // 创建对应的 韵母列表 - finals_temp = {}; - finals_list = {}; - finals_temp.assign( - (*finals).begin(), - (*finals).begin() + - ppspeech::utf8string2wstring(word_list[0]).length()); - finals_list.push_back(finals_temp); - finals_temp.assign( - (*finals).begin() + - ppspeech::utf8string2wstring(word_list[0]).length(), - (*finals).end()); - finals_list.push_back(finals_temp); - - finals = new std::vector(); - for (int i = 0; i < finals_list.size(); i++) { - // e.g. 
所有/人 - if (AllToneThree(finals_list[i]) && - finals_list[i].size() == 2) { - finals_list[i][0] = finals_list[i][0].replace( - finals_list[i][0].length() - 1, 1, "2"); - } else if (i == 1 && !(AllToneThree(finals_list[i])) && - absl::EndsWith(finals_list[i][0], "3") == true && - absl::EndsWith(finals_list[0].back(), "3") == true) { - finals_list[0].back() = finals_list[0].back().replace( - finals_list[0].back().length() - 1, 1, "2"); - } - } - (*finals).insert( - (*finals).end(), finals_list[0].begin(), finals_list[0].end()); - (*finals).insert( - (*finals).end(), finals_list[1].begin(), finals_list[1].end()); - } - - } else if (word_num == 4) { //将成语拆分为两个长度为 2 的单词 - // 创建对应的 韵母列表 - finals_temp = {}; - finals_list = {}; - finals_temp.assign((*finals).begin(), (*finals).begin() + 2); - finals_list.push_back(finals_temp); - finals_temp.assign((*finals).begin() + 2, (*finals).end()); - finals_list.push_back(finals_temp); - - finals = new std::vector(); - for (int j = 0; j < finals_list.size(); j++) { - if (AllToneThree(finals_list[j])) { - finals_list[j][0] = finals_list[j][0].replace( - finals_list[j][0].length() - 1, 1, "2"); - } - (*finals).insert( - (*finals).end(), finals_list[j].begin(), finals_list[j].end()); - } - } - - return 0; -} - -int FrontEngineInterface::ModifyTone(const std::string &word, - const std::string &pos, - std::vector *finals) { - if ((0 != BuSandi(word, finals)) || (0 != YiSandhi(word, finals)) || - (0 != NeuralSandhi(word, pos, finals)) || - (0 != ThreeSandhi(word, finals))) { - LOG(ERROR) << "Failed to modify tone of the word: " << word; - return -1; - } - - return 0; -} - -std::vector> FrontEngineInterface::MergeErhua( - const std::vector &initials, - const std::vector &finals, - const std::string &word, - const std::string &pos) { - std::vector new_initials = {}; - std::vector new_finals = {}; - std::vector> new_initials_finals; - std::vector specified_pos = {"a", "j", "nr"}; - std::wstring word_wstr = ppspeech::utf8string2wstring(word); - std::vector wordvec; - // 一个词转成向量形式 - if (0 != Word2WordVec(word, &wordvec)) { - LOG(ERROR) << "Failed to get word vector"; - } - int word_num = wordvec.size(); - - if ((find(must_erhua.begin(), must_erhua.end(), word) == - must_erhua.end()) && - ((find(not_erhua.begin(), not_erhua.end(), word) != not_erhua.end()) || - (find(specified_pos.begin(), specified_pos.end(), pos) != - specified_pos.end()))) { - new_initials_finals.push_back(initials); - new_initials_finals.push_back(finals); - return new_initials_finals; - } - if (finals.size() != word_num) { - new_initials_finals.push_back(initials); - new_initials_finals.push_back(finals); - return new_initials_finals; - } - - assert(finals.size() == word_num); - for (int i = 0; i < finals.size(); i++) { - if (i == finals.size() - 1 && wordvec[i] == L"儿" && - (finals[i] == "er2" || finals[i] == "er5") && word_num >= 2 && - find(not_erhua.begin(), - not_erhua.end(), - ppspeech::wstring2utf8string(word_wstr.substr( - word_wstr.length() - 2))) == not_erhua.end() && - !new_finals.empty()) { - new_finals.back() = - new_finals.back().substr(0, new_finals.back().length() - 1) + - "r" + new_finals.back().substr(new_finals.back().length() - 1); - } else { - new_initials.push_back(initials[i]); - new_finals.push_back(finals[i]); - } - } - new_initials_finals.push_back(new_initials); - new_initials_finals.push_back(new_finals); - - return new_initials_finals; -} -} // namespace ppspeech diff --git a/demos/TTSCppFrontend/src/front/front_interface.h 
b/demos/TTSCppFrontend/src/front/front_interface.h
deleted file mode 100644
index 8c16859cf..000000000
--- a/demos/TTSCppFrontend/src/front/front_interface.h
+++ /dev/null
@@ -1,198 +0,0 @@
-// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-//     http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-#ifndef PADDLE_TTS_SERVING_FRONT_FRONT_INTERFACE_H
-#define PADDLE_TTS_SERVING_FRONT_FRONT_INTERFACE_H
-
-#include <fstream>
-#include <map>
-#include <memory>
-#include <string>
-#include <vector>
-//#include "utils/dir_utils.h"
-#include <cppjieba/Jieba.hpp>
-#include "absl/strings/str_split.h"
-#include "front/text_normalize.h"
-
-
-namespace ppspeech {
-
-class FrontEngineInterface : public TextNormalizer {
-  public:
-    explicit FrontEngineInterface(std::string conf) : _conf_file(conf) {
-        _jieba = nullptr;
-        _initialed = false;
-        init();
-    }
-
-    int init();
-    ~FrontEngineInterface() {}
-
-    // Read the configuration file
-    int ReadConfFile();
-
-    // Convert traditional Chinese to simplified Chinese
-    int Trand2Simp(const std::wstring &sentence, std::wstring *sentence_simp);
-
-    // Load a dictionary file into a map
-    int GenDict(const std::string &file,
-                std::map<std::string, std::string> *map);
-
-    // Reduce a (word, POS) segmentation result to the words only
-    int GetSegResult(std::vector<std::pair<std::string, std::string>> *seg,
-                     std::vector<std::string> *seg_words);
-
-    // Generate the phone and tone ID sequences of a sentence. If phones and
-    // tones are not separated (fastspeech2), toneids stays empty; otherwise
-    // (speedyspeech) it is filled as well.
-    int GetSentenceIds(const std::string &sentence,
-                       std::vector<int> *phoneids,
-                       std::vector<int> *toneids);
-
-    // Get the phone and tone IDs of each segmented word, applying the
-    // pronunciation fixes (ModifyTone). If phones and tones are not
-    // separated (fastspeech2), toneids stays empty; otherwise
-    // (speedyspeech) it is filled as well.
-    int GetWordsIds(
-        const std::vector<std::pair<std::string, std::string>> &cut_result,
-        std::vector<int> *phoneids,
-        std::vector<int> *toneids);
-
-    // Run jieba segmentation into (word, POS) pairs, then post-process the
-    // result (MergeforModify)
-    int Cut(const std::string &sentence,
-            std::vector<std::pair<std::string, std::string>> *cut_result);
-
-    // Map a word to its phones by dictionary lookup
-    int GetPhone(const std::string &word, std::string *phone);
-
-    // Map phones to phone IDs
-    int Phone2Phoneid(const std::string &phone,
-                      std::vector<int> *phoneid,
-                      std::vector<int> *toneids);
-
-
-    // Judge by the finals whether every character of the word is read in the
-    // third tone; true means all of them are
-    bool AllToneThree(const std::vector<std::string> &finals);
-
-    // Whether the word is a reduplication (e.g. 奶奶)
-    bool IsReduplication(const std::string &word);
-
-    // Get the lists of initials and finals of each character in the word
-    int GetInitialsFinals(const std::string &word,
-                          std::vector<std::string> *word_initials,
-                          std::vector<std::string> *word_finals);
-
-    // Get the list of finals of each character in the word
-    int GetFinals(const std::string &word,
-                  std::vector<std::string> *word_finals);
-
-    // Convert a word into a vector whose elements are single characters
-    int Word2WordVec(const std::string &word,
-                     std::vector<std::wstring> *wordvec);
-
-    // Re-segment the word with a full cut so that each piece is in the
-    // dictionary
-    int SplitWord(const std::string &word,
-                  std::vector<std::string> *fullcut_word);
-
-    // Post-process segmentation: merge words around 不
-    std::vector<std::pair<std::string, std::string>> MergeBu(
-        std::vector<std::pair<std::string, std::string>> *seg_result);
-
-    // Post-process segmentation: merge words around 一
-    std::vector<std::pair<std::string, std::string>> Mergeyi(
-        std::vector<std::pair<std::string, std::string>> *seg_result);
-
-    // Post-process segmentation: merge two identical adjacent words
-    std::vector<std::pair<std::string, std::string>> MergeReduplication(
-        std::vector<std::pair<std::string, std::string>> *seg_result);
-
-    // Merge two adjacent words that are both read entirely in the third tone
-    std::vector<std::pair<std::string, std::string>> MergeThreeTones(
-        std::vector<std::pair<std::string, std::string>> *seg_result);
-
-    // Merge two adjacent words where the last syllable of the first and the
-    // first syllable of the second are both third tone
-    std::vector<std::pair<std::string, std::string>> MergeThreeTones2(
-        std::vector<std::pair<std::string, std::string>> *seg_result);
-
-    // Post-process segmentation: merge 儿 into the preceding word
-    std::vector<std::pair<std::string, std::string>> MergeEr(
-        std::vector<std::pair<std::string, std::string>> *seg_result);
-
-    // Post-process and fix the segmentation result
-    int MergeforModify(
-        std::vector<std::pair<std::string, std::string>> *seg_result,
-        std::vector<std::pair<std::string, std::string>> *merge_seg_result);
-
-
-    // Tone sandhi for words containing 不
-    int BuSandi(const std::string &word, std::vector<std::string> *finals);
-
-    // Tone sandhi for words containing 一
-    int YiSandhi(const std::string &word, std::vector<std::string> *finals);
-
-    // Neutral-tone sandhi for special words (measure words, particles, etc.)
-    int NeuralSandhi(const std::string &word,
-                     const std::string &pos,
-                     std::vector<std::string> *finals);
-
-    // Third-tone sandhi
-    int ThreeSandhi(const std::string &word,
-                    std::vector<std::string> *finals);
-
-    // Apply all tone-sandhi rules to a word
-    int ModifyTone(const std::string &word,
-                   const std::string &pos,
-                   std::vector<std::string> *finals);
-
-
-    // Handle erhua (儿化音)
-    std::vector<std::vector<std::string>> MergeErhua(
-        const std::vector<std::string> &initials,
-        const std::vector<std::string> &finals,
-        const std::string &word,
-        const std::string &pos);
-
-
-  private:
-    bool _initialed;
-    cppjieba::Jieba *_jieba;
-    std::vector<std::string> _punc;
-    std::vector<std::string> _punc_omit;
-
-    std::string _conf_file;
-    std::map<std::string, std::string> conf_map;
-    std::map<std::string, std::string> word_phone_map;
-    std::map<std::string, std::string> phone_id_map;
-    std::map<std::string, std::string> tone_id_map;
-    std::map<std::string, std::string> trand_simp_map;
-
-
-    std::string _jieba_dict_path;
-    std::string _jieba_hmm_path;
-    std::string _jieba_user_dict_path;
-    std::string _jieba_idf_path;
-    std::string _jieba_stop_word_path;
-
-    std::string _separate_tone;
-    std::string _word2phone_path;
-    std::string _phone2id_path;
-    std::string _tone2id_path;
-    std::string _trand2simp_path;
-
-    std::vector<std::string> must_erhua;
-    std::vector<std::string> not_erhua;
-
-    std::vector<std::string> must_not_neural_tone_words;
-    std::vector<std::string> must_neural_tone_words;
-};
-}  // namespace ppspeech
-#endif
\ No newline at end of file
diff --git a/demos/TTSCppFrontend/src/front/text_normalize.cpp b/demos/TTSCppFrontend/src/front/text_normalize.cpp
deleted file mode 100644
index 8420e8407..000000000
--- a/demos/TTSCppFrontend/src/front/text_normalize.cpp
+++ /dev/null
@@ -1,542 +0,0 @@
-// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-//     http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
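The normalizer that follows converts digit strings recursively by the largest place-value unit, e.g. 10200 becomes 一万零二百. A hedged sketch of that recursion (simplified from `CreateTextValue` below: `use_zero` is assumed always on, and the digit/unit maps are hypothetical stand-ins for `TextNormalizer::InitMap`):

```cpp
#include <map>
#include <string>

// Digit string -> Chinese reading, splitting at the largest unit
// (十/百/千/万/亿), e.g. "10200" -> 一万零二百.
std::string NumToZh(const std::string &num,
                    const std::map<char, std::string> &digits,
                    const std::map<int, std::string> &units) {
    std::string stripped = num;
    stripped.erase(0, stripped.find_first_not_of('0'));
    size_t len = stripped.length();
    if (len == 0) return "";  // all zeros contribute nothing
    if (len == 1)             // prefix 零 when leading zeros were stripped
        return (len < num.length() ? digits.at('0') : std::string()) +
               digits.at(stripped[0]);
    int unit = (len <= 2) ? 1 : (len <= 3) ? 2 : (len <= 4) ? 3
             : (len <= 8) ? 4 : 8;
    return NumToZh(num.substr(0, num.length() - unit), digits, units) +
           units.at(unit) +
           NumToZh(num.substr(num.length() - unit), digits, units);
}
```

Tracing "10200": the stripped length is 5, so the split unit is 4 (万), giving 一 + 万 + NumToZh("0200"); the tail recursion yields 零二百, matching the comment in the source.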
diff --git a/demos/TTSCppFrontend/src/front/text_normalize.cpp b/demos/TTSCppFrontend/src/front/text_normalize.cpp deleted file mode 100644 index 8420e8407..000000000 --- a/demos/TTSCppFrontend/src/front/text_normalize.cpp +++ /dev/null @@ -1,542 +0,0 @@
-// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-#include "front/text_normalize.h"
-
-namespace ppspeech {
-
-// Initialize digits_map and units_map
-int TextNormalizer::InitMap() {
-    digits_map["0"] = "零";
-    digits_map["1"] = "一";
-    digits_map["2"] = "二";
-    digits_map["3"] = "三";
-    digits_map["4"] = "四";
-    digits_map["5"] = "五";
-    digits_map["6"] = "六";
-    digits_map["7"] = "七";
-    digits_map["8"] = "八";
-    digits_map["9"] = "九";
-
-    units_map[1] = "十";
-    units_map[2] = "百";
-    units_map[3] = "千";
-    units_map[4] = "万";
-    units_map[8] = "亿";
-
-    return 0;
-}
-
-// Replace a span of the sentence with a new string
-int TextNormalizer::Replace(std::wstring *sentence,
-                            const int &pos,
-                            const int &len,
-                            const std::wstring &repstr) {
-    // delete the original span
-    sentence->erase(pos, len);
-    // insert the replacement
-    sentence->insert(pos, repstr);
-    return 0;
-}
-
-// Split the sentence at punctuation marks
-int TextNormalizer::SplitByPunc(const std::wstring &sentence,
-                                std::vector<std::wstring> *sentence_part) {
-    std::wstring temp = sentence;
-    std::wregex reg(L"[:,;。?!,;?!]");
-    std::wsmatch match;
-
-    while (std::regex_search(temp, match, reg)) {
-        sentence_part->push_back(
-            temp.substr(0, match.position(0) + match.length(0)));
-        Replace(&temp, 0, match.position(0) + match.length(0), L"");
-    }
-    // keep any trailing text without a final punctuation mark
-    if (temp != L"") {
-        sentence_part->push_back(temp);
-    }
-    return 0;
-}
-
-// Convert a number to text, e.g. 10200 --> 一万零二百
-std::string TextNormalizer::CreateTextValue(const std::string &num_str,
-                                            bool use_zero) {
-    std::string num_lstrip =
-        std::string(absl::StripPrefix(num_str, "0")).data();
-    int len = num_lstrip.length();
-
-    if (len == 0) {
-        return "";
-    } else if (len == 1) {
-        if (use_zero && (len < num_str.length())) {
-            return digits_map["0"] + digits_map[num_lstrip];
-        } else {
-            return digits_map[num_lstrip];
-        }
-    } else {
-        int largest_unit = 0;  // largest named unit
-        std::string first_part;
-        std::string second_part;
-
-        if (len > 1 && len <= 2) {
-            largest_unit = 1;
-        } else if (len > 2 && len <= 3) {
-            largest_unit = 2;
-        } else if (len > 3 && len <= 4) {
-            largest_unit = 3;
-        } else if (len > 4 && len <= 8) {
-            largest_unit = 4;
-        } else if (len > 8) {
-            largest_unit = 8;
-        }
-
-        first_part = num_str.substr(0, num_str.length() - largest_unit);
-        second_part = num_str.substr(num_str.length() - largest_unit);
-
-        return CreateTextValue(first_part, use_zero) + units_map[largest_unit] +
-               CreateTextValue(second_part, use_zero);
-    }
-}
-
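`CreateTextValue` is the recursive core of the number reader: strip leading zeros, pick the largest named unit that fits (十/百/千/万/亿), split the digit string around that unit, and recurse on both halves. A simplified, self-contained mirror of that recursion (it strips all leading zeros in one step; `ToText` and the local tables are stand-ins, not names from the deleted file) makes the 10200 --> 一万零二百 example above easy to trace:

```cpp
// Trace of the recursive number-to-Chinese conversion described above.
#include <iostream>
#include <map>
#include <string>

static std::map<std::string, std::string> digits = {
    {"0", "零"}, {"1", "一"}, {"2", "二"}, {"3", "三"}, {"4", "四"},
    {"5", "五"}, {"6", "六"}, {"7", "七"}, {"8", "八"}, {"9", "九"}};
static std::map<int, std::string> units = {
    {1, "十"}, {2, "百"}, {3, "千"}, {4, "万"}, {8, "亿"}};

std::string ToText(const std::string &num, bool use_zero = true) {
    size_t nz = num.find_first_not_of('0');
    std::string stripped = (nz == std::string::npos) ? "" : num.substr(nz);
    size_t len = stripped.size();
    if (len == 0) return "";  // nothing but zeros
    if (len == 1) {
        // A digit that lost leading zeros is read with one spoken 零.
        if (use_zero && len < num.size()) return digits["0"] + digits[stripped];
        return digits[stripped];
    }
    int unit;  // exponent of the largest named unit that fits
    if (len <= 2) unit = 1;
    else if (len <= 3) unit = 2;
    else if (len <= 4) unit = 3;
    else if (len <= 8) unit = 4;
    else unit = 8;
    // Split around that unit and recurse on both halves:
    // 10200 -> ToText("1") + 万 + ToText("0200") -> 一万 + 零二百
    return ToText(num.substr(0, num.size() - unit), use_zero) + units[unit] +
           ToText(num.substr(num.size() - unit), use_zero);
}

int main() { std::cout << ToText("10200") << std::endl; }  // 一万零二百
```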
-// Map the digits one at a time; usable directly for years and phone numbers
-std::string TextNormalizer::SingleDigit2Text(const std::string &num_str,
-                                             bool alt_one) {
-    std::string text = "";
-    if (alt_one) {
-        digits_map["1"] = "幺";
-    } else {
-        digits_map["1"] = "一";
-    }
-
-    for (size_t i = 0; i < num_str.size(); i++) {
-        std::string num_int(1, num_str[i]);
-        if (digits_map.find(num_int) == digits_map.end()) {
-            LOG(ERROR) << "digits_map doesn't have key: " << num_int;
-        }
-        text += digits_map[num_int];
-    }
-
-    return text;
-}
-
-std::string TextNormalizer::SingleDigit2Text(const std::wstring &num,
-                                             bool alt_one) {
-    std::string num_str = wstring2utf8string(num);
-    return SingleDigit2Text(num_str, alt_one);
-}
-
-// Map the number as a whole; usable directly for months, days, and the
-// integer part of a value
-std::string TextNormalizer::MultiDigit2Text(const std::string &num_str,
-                                            bool alt_one,
-                                            bool use_zero) {
-    if (alt_one) {
-        digits_map["1"] = "幺";
-    } else {
-        digits_map["1"] = "一";
-    }
-
-    std::wstring result =
-        utf8string2wstring(CreateTextValue(num_str, use_zero));
-    std::wstring result_0(1, result[0]);
-    std::wstring result_1(1, result[1]);
-    // 一十八 --> 十八 (drop the leading 一 before 十)
-    if ((result_0 == utf8string2wstring(digits_map["1"])) &&
-        (result_1 == utf8string2wstring(units_map[1]))) {
-        return wstring2utf8string(result.substr(1, result.length()));
-    } else {
-        return wstring2utf8string(result);
-    }
-}
-
-std::string TextNormalizer::MultiDigit2Text(const std::wstring &num,
-                                            bool alt_one,
-                                            bool use_zero) {
-    std::string num_str = wstring2utf8string(num);
-    return MultiDigit2Text(num_str, alt_one, use_zero);
-}
-
-// Convert a number, integer or decimal, to text
-std::string TextNormalizer::Digits2Text(const std::string &num_str) {
-    std::string text;
-    std::vector<std::string> integer_decimal;
-    integer_decimal = absl::StrSplit(num_str, ".");
-
-    if (integer_decimal.size() == 1) {  // integer
-        text = MultiDigit2Text(integer_decimal[0]);
-    } else if (integer_decimal.size() == 2) {  // decimal
-        if (integer_decimal[0] == "") {  // decimal without an integer part, e.g. .22
-            text = "点" +
-                   SingleDigit2Text(
-                       std::string(absl::StripSuffix(integer_decimal[1], "0"))
-                           .data());
-        } else {  // regular decimal, e.g. 12.34
-            text = MultiDigit2Text(integer_decimal[0]) + "点" +
-                   SingleDigit2Text(
-                       std::string(absl::StripSuffix(integer_decimal[1], "0"))
-                           .data());
-        }
-    } else {
-        return "The value does not conform to the numeric format";
-    }
-
-    return text;
-}
-
-std::string TextNormalizer::Digits2Text(const std::wstring &num) {
-    std::string num_str = wstring2utf8string(num);
-    return Digits2Text(num_str);
-}
-
-// Date, e.g. 2021年8月18日 --> 二零二一年八月十八日
-int TextNormalizer::ReData(std::wstring *sentence) {
-    std::wregex reg(
-        L"(\\d{4}|\\d{2})年((0?[1-9]|1[0-2])月)?(((0?[1-9])|((1|2)[0-9])|30|31)"
-        L"([日号]))?");
-    std::wsmatch match;
-    std::string rep;
-
-    while (std::regex_search(*sentence, match, reg)) {
-        rep = "";
-        rep += SingleDigit2Text(match[1]) + "年";
-        if (match[3] != L"") {
-            rep += MultiDigit2Text(match[3], false, false) + "月";
-        }
-        if (match[5] != L"") {
-            rep += MultiDigit2Text(match[5], false, false) +
-                   wstring2utf8string(match[9]);
-        }
-
-        Replace(sentence,
-                match.position(0),
-                match.length(0),
-                utf8string2wstring(rep));
-    }
-
-    return 0;
-}
-
-
-// XX-XX-XX or XX/XX/XX dates, e.g. 2021/08/18 --> 二零二一年八月十八日
-int TextNormalizer::ReData2(std::wstring *sentence) {
-    std::wregex reg(
-        L"(\\d{4})([- /.])(0[1-9]|1[012])\\2(0[1-9]|[12][0-9]|3[01])");
-    std::wsmatch match;
-    std::string rep;
-
-    while (std::regex_search(*sentence, match, reg)) {
-        rep = "";
-        rep += (SingleDigit2Text(match[1]) + "年");
-        rep += (MultiDigit2Text(match[3], false, false) + "月");
-        rep += (MultiDigit2Text(match[4], false, false) + "日");
-        Replace(sentence,
-                match.position(0),
-                match.length(0),
-                utf8string2wstring(rep));
-    }
-
-    return 0;
-}
-
-// Time XX:XX:XX, e.g. 09:09:02 --> 九点零九分零二秒
-int TextNormalizer::ReTime(std::wstring *sentence) {
-    std::wregex reg(L"([0-1]?[0-9]|2[0-3]):([0-5][0-9])(:([0-5][0-9]))?");
-    std::wsmatch match;
-    std::string rep;
-
-    while (std::regex_search(*sentence, match, reg)) {
-        rep = "";
-        rep += (MultiDigit2Text(match[1], false, false) + "点");
-        if (absl::StartsWith(wstring2utf8string(match[2]), "0")) {
-            rep += "零";
-        }
-        rep += (MultiDigit2Text(match[2]) + "分");
-        if (absl::StartsWith(wstring2utf8string(match[4]), "0")) {
-            rep += "零";
-        }
-        rep += (MultiDigit2Text(match[4]) + "秒");
-
-        Replace(sentence,
-                match.position(0),
-                match.length(0),
-                utf8string2wstring(rep));
-    }
-
-    return 0;
-}
-
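The three rules above run regex-replace loops over the whole sentence, so their relative order matters: `ReData` must see `2021年8月18日` before any bare-number rule does, and `ReData2` requires the same separator on both sides of the month (the `\2` back-reference). A sketch of applying just these rules, in the order `SentenceNormalize` uses, assuming the deleted headers are still on the include path (`NormalizeDateTime` is an illustrative helper, not part of the removed sources):

```cpp
#include <string>
#include "front/text_normalize.h"  // deleted above; also brings in the
                                   // utf8string2wstring / wstring2utf8string
                                   // helpers from base/type_conv.h

namespace ppspeech {
// Apply only the date and time rules to a UTF-8 sentence.
std::string NormalizeDateTime(TextNormalizer *tn, const std::string &utf8) {
    std::wstring s = utf8string2wstring(utf8);
    tn->ReData(&s);   // 2021年8月18日 -> 二零二一年八月十八日
    tn->ReData2(&s);  // 2021/08/18   -> 二零二一年八月十八日
    tn->ReTime(&s);   // 09:09:02     -> 九点零九分零二秒
    return wstring2utf8string(s);
}
}  // namespace ppspeech
```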
unit = "摄氏度" : unit = "度"; - rep = sign + Digits2Text(match[2]) + unit; - - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - - return 0; -} - -// 分数,例如: 1/3 --> 三分之一 -int TextNormalizer::ReFrac(std::wstring *sentence) { - std::wregex reg(L"(-?)(\\d+)/(\\d+)"); - std::wsmatch match; - std::string sign; - std::string rep; - while (std::regex_search(*sentence, match, reg)) { - match[1] == L"-" ? sign = "负" : sign = ""; - rep = sign + MultiDigit2Text(match[3]) + "分之" + - MultiDigit2Text(match[2]); - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - - return 0; -} - -// 百分数,例如:45.5% --> 百分之四十五点五 -int TextNormalizer::RePercentage(std::wstring *sentence) { - std::wregex reg(L"(-?)(\\d+(\\.\\d+)?)%"); - std::wsmatch match; - std::string sign; - std::string rep; - std::vector integer_decimal; - - while (std::regex_search(*sentence, match, reg)) { - match[1] == L"-" ? sign = "负" : sign = ""; - rep = sign + "百分之" + Digits2Text(match[2]); - - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - - return 0; -} - -// 手机号码,例如:+86 18883862235 --> 八六幺八八八三八六二二三五 -int TextNormalizer::ReMobilePhone(std::wstring *sentence) { - std::wregex reg( - L"(\\d)?((\\+?86 ?)?1([38]\\d|5[0-35-9]|7[678]|9[89])\\d{8})(\\d)?"); - std::wsmatch match; - std::string rep; - std::vector country_phonenum; - - while (std::regex_search(*sentence, match, reg)) { - country_phonenum = absl::StrSplit(wstring2utf8string(match[0]), "+"); - rep = ""; - for (int i = 0; i < country_phonenum.size(); i++) { - LOG(INFO) << country_phonenum[i]; - rep += SingleDigit2Text(country_phonenum[i], true); - } - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - - return 0; -} - -// 座机号码,例如:010-51093154 --> 零幺零五幺零九三幺五四 -int TextNormalizer::RePhone(std::wstring *sentence) { - std::wregex reg( - L"(\\d)?((0(10|2[1-3]|[3-9]\\d{2})-?)?[1-9]\\d{6,7})(\\d)?"); - std::wsmatch match; - std::vector zone_phonenum; - std::string rep; - - while (std::regex_search(*sentence, match, reg)) { - rep = ""; - zone_phonenum = absl::StrSplit(wstring2utf8string(match[0]), "-"); - for (int i = 0; i < zone_phonenum.size(); i++) { - rep += SingleDigit2Text(zone_phonenum[i], true); - } - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - - return 0; -} - -// 范围,例如:60~90 --> 六十到九十 -int TextNormalizer::ReRange(std::wstring *sentence) { - std::wregex reg( - L"((-?)((\\d+)(\\.\\d+)?)|(\\.(\\d+)))[-~]((-?)((\\d+)(\\.\\d+)?)|(\\.(" - L"\\d+)))"); - std::wsmatch match; - std::string rep; - std::string sign1; - std::string sign2; - - while (std::regex_search(*sentence, match, reg)) { - rep = ""; - match[2] == L"-" ? sign1 = "负" : sign1 = ""; - if (match[6] != L"") { - rep += sign1 + Digits2Text(match[6]) + "到"; - } else { - rep += sign1 + Digits2Text(match[3]) + "到"; - } - match[9] == L"-" ? 
sign2 = "负" : sign2 = ""; - if (match[13] != L"") { - rep += sign2 + Digits2Text(match[13]); - } else { - rep += sign2 + Digits2Text(match[10]); - } - - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - - return 0; -} - -// 带负号的整数,例如:-10 --> 负十 -int TextNormalizer::ReInterger(std::wstring *sentence) { - std::wregex reg(L"(-)(\\d+)"); - std::wsmatch match; - std::string rep; - while (std::regex_search(*sentence, match, reg)) { - rep = "负" + MultiDigit2Text(match[2]); - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - - return 0; -} - -// 纯小数 -int TextNormalizer::ReDecimalNum(std::wstring *sentence) { - std::wregex reg(L"(-?)((\\d+)(\\.\\d+))|(\\.(\\d+))"); - std::wsmatch match; - std::string sign; - std::string rep; - // std::vector integer_decimal; - while (std::regex_search(*sentence, match, reg)) { - match[1] == L"-" ? sign = "负" : sign = ""; - if (match[5] != L"") { - rep = sign + Digits2Text(match[5]); - } else { - rep = sign + Digits2Text(match[2]); - } - - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - - return 0; -} - -// 正整数 + 量词 -int TextNormalizer::RePositiveQuantifiers(std::wstring *sentence) { - std::wstring common_quantifiers = - L"(朵|匹|张|座|回|场|尾|条|个|首|阙|阵|网|炮|顶|丘|棵|只|支|袭|辆|挑|" - L"担|颗|壳|窠|曲|墙|群|腔|砣|座|客|贯|扎|捆|刀|令|打|手|罗|坡|山|岭|江|" - L"溪|钟|队|单|双|对|出|口|头|脚|板|跳|枝|件|贴|针|线|管|名|位|身|堂|课|" - L"本|页|家|户|层|丝|毫|厘|分|钱|两|斤|担|铢|石|钧|锱|忽|(千|毫|微)克|" - L"毫|厘|(公)分|分|寸|尺|丈|里|寻|常|铺|程|(千|分|厘|毫|微)米|米|撮|勺|" - L"合|升|斗|石|盘|碗|碟|叠|桶|笼|盆|盒|杯|钟|斛|锅|簋|篮|盘|桶|罐|瓶|壶|" - L"卮|盏|箩|箱|煲|啖|袋|钵|年|月|日|季|刻|时|周|天|秒|分|旬|纪|岁|世|更|" - L"夜|春|夏|秋|冬|代|伏|辈|丸|泡|粒|颗|幢|堆|条|根|支|道|面|片|张|颗|块|" - L"元|(亿|千万|百万|万|千|百)|(亿|千万|百万|万|千|百|美|)元|(亿|千万|" - L"百万|万|千|百|)块|角|毛|分)"; - std::wregex reg(L"(\\d+)([多余几])?" + common_quantifiers); - std::wsmatch match; - std::string rep; - while (std::regex_search(*sentence, match, reg)) { - rep = MultiDigit2Text(match[1]); - Replace(sentence, - match.position(1), - match.length(1), - utf8string2wstring(rep)); - } - - return 0; -} - -// 编号类数字,例如: 89757 --> 八九七五七 -int TextNormalizer::ReDefalutNum(std::wstring *sentence) { - std::wregex reg(L"\\d{3}\\d*"); - std::wsmatch match; - while (std::regex_search(*sentence, match, reg)) { - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(SingleDigit2Text(match[0]))); - } - - return 0; -} - -int TextNormalizer::ReNumber(std::wstring *sentence) { - std::wregex reg(L"(-?)((\\d+)(\\.\\d+)?)|(\\.(\\d+))"); - std::wsmatch match; - std::string sign; - std::string rep; - while (std::regex_search(*sentence, match, reg)) { - match[1] == L"-" ? 
sign = "负" : sign = ""; - if (match[5] != L"") { - rep = sign + Digits2Text(match[5]); - } else { - rep = sign + Digits2Text(match[2]); - } - - Replace(sentence, - match.position(0), - match.length(0), - utf8string2wstring(rep)); - } - return 0; -} - -// 整体正则,按顺序 -int TextNormalizer::SentenceNormalize(std::wstring *sentence) { - ReData(sentence); - ReData2(sentence); - ReTime(sentence); - ReTemperature(sentence); - ReFrac(sentence); - RePercentage(sentence); - ReMobilePhone(sentence); - RePhone(sentence); - ReRange(sentence); - ReInterger(sentence); - ReDecimalNum(sentence); - RePositiveQuantifiers(sentence); - ReDefalutNum(sentence); - ReNumber(sentence); - return 0; -} -} // namespace ppspeech \ No newline at end of file diff --git a/demos/TTSCppFrontend/src/front/text_normalize.h b/demos/TTSCppFrontend/src/front/text_normalize.h deleted file mode 100644 index 4383fa1b4..000000000 --- a/demos/TTSCppFrontend/src/front/text_normalize.h +++ /dev/null @@ -1,77 +0,0 @@ -// Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved. -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. -#ifndef PADDLE_TTS_SERVING_FRONT_TEXT_NORMALIZE_H -#define PADDLE_TTS_SERVING_FRONT_TEXT_NORMALIZE_H - -#include -#include -#include -#include -#include -#include "absl/strings/str_split.h" -#include "absl/strings/strip.h" -#include "base/type_conv.h" - -namespace ppspeech { - -class TextNormalizer { - public: - TextNormalizer() { InitMap(); } - ~TextNormalizer() {} - - int InitMap(); - int Replace(std::wstring *sentence, - const int &pos, - const int &len, - const std::wstring &repstr); - int SplitByPunc(const std::wstring &sentence, - std::vector *sentence_part); - - std::string CreateTextValue(const std::string &num, bool use_zero = true); - std::string SingleDigit2Text(const std::string &num_str, - bool alt_one = false); - std::string SingleDigit2Text(const std::wstring &num, bool alt_one = false); - std::string MultiDigit2Text(const std::string &num_str, - bool alt_one = false, - bool use_zero = true); - std::string MultiDigit2Text(const std::wstring &num, - bool alt_one = false, - bool use_zero = true); - std::string Digits2Text(const std::string &num_str); - std::string Digits2Text(const std::wstring &num); - - int ReData(std::wstring *sentence); - int ReData2(std::wstring *sentence); - int ReTime(std::wstring *sentence); - int ReTemperature(std::wstring *sentence); - int ReFrac(std::wstring *sentence); - int RePercentage(std::wstring *sentence); - int ReMobilePhone(std::wstring *sentence); - int RePhone(std::wstring *sentence); - int ReRange(std::wstring *sentence); - int ReInterger(std::wstring *sentence); - int ReDecimalNum(std::wstring *sentence); - int RePositiveQuantifiers(std::wstring *sentence); - int ReDefalutNum(std::wstring *sentence); - int ReNumber(std::wstring *sentence); - int SentenceNormalize(std::wstring *sentence); - - - private: - std::map digits_map; - std::map units_map; -}; -} // namespace ppspeech - -#endif \ No newline at end of file diff --git 
a/demos/TTSCppFrontend/third-party/CMakeLists.txt b/demos/TTSCppFrontend/third-party/CMakeLists.txt deleted file mode 100644 index 0579b8f24..000000000 --- a/demos/TTSCppFrontend/third-party/CMakeLists.txt +++ /dev/null @@ -1,64 +0,0 @@ -cmake_minimum_required(VERSION 3.10) -project(tts_third_party_libs) - -include(ExternalProject) - -# gflags -ExternalProject_Add(gflags - GIT_REPOSITORY https://github.com/gflags/gflags.git - GIT_TAG v2.2.2 - PREFIX ${CMAKE_CURRENT_BINARY_DIR} - INSTALL_DIR ${CMAKE_CURRENT_BINARY_DIR} - CMAKE_ARGS -DCMAKE_INSTALL_PREFIX= - -DCMAKE_POSITION_INDEPENDENT_CODE=ON - -DBUILD_STATIC_LIBS=OFF - -DBUILD_SHARED_LIBS=ON -) - -# glog -ExternalProject_Add( - glog - GIT_REPOSITORY https://github.com/google/glog.git - GIT_TAG v0.6.0 - PREFIX ${CMAKE_CURRENT_BINARY_DIR} - INSTALL_DIR ${CMAKE_CURRENT_BINARY_DIR} - CMAKE_ARGS -DCMAKE_INSTALL_PREFIX= - -DCMAKE_POSITION_INDEPENDENT_CODE=ON - DEPENDS gflags -) - -# abseil -ExternalProject_Add( - abseil - GIT_REPOSITORY https://github.com/abseil/abseil-cpp.git - GIT_TAG 20230125.1 - PREFIX ${CMAKE_CURRENT_BINARY_DIR} - INSTALL_DIR ${CMAKE_CURRENT_BINARY_DIR} - CMAKE_ARGS -DCMAKE_INSTALL_PREFIX= - -DCMAKE_POSITION_INDEPENDENT_CODE=ON - -DABSL_PROPAGATE_CXX_STD=ON -) - -# cppjieba (header-only) -ExternalProject_Add( - cppjieba - GIT_REPOSITORY https://github.com/yanyiwu/cppjieba.git - GIT_TAG v5.0.3 - PREFIX ${CMAKE_CURRENT_BINARY_DIR} - CONFIGURE_COMMAND "" - BUILD_COMMAND "" - INSTALL_COMMAND "" - TEST_COMMAND "" -) - -# limonp (header-only) -ExternalProject_Add( - limonp - GIT_REPOSITORY https://github.com/yanyiwu/limonp.git - GIT_TAG v0.6.6 - PREFIX ${CMAKE_CURRENT_BINARY_DIR} - CONFIGURE_COMMAND "" - BUILD_COMMAND "" - INSTALL_COMMAND "" - TEST_COMMAND "" -) diff --git a/demos/audio_searching/src/test_audio_search.py b/demos/audio_searching/src/test_audio_search.py index f9ea2929e..cb91e1562 100644 --- a/demos/audio_searching/src/test_audio_search.py +++ b/demos/audio_searching/src/test_audio_search.py @@ -14,8 +14,8 @@ from audio_search import app from fastapi.testclient import TestClient -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unpack +from utils.utility import download +from utils.utility import unpack client = TestClient(app) diff --git a/demos/audio_searching/src/test_vpr_search.py b/demos/audio_searching/src/test_vpr_search.py index cc795564e..298e12eba 100644 --- a/demos/audio_searching/src/test_vpr_search.py +++ b/demos/audio_searching/src/test_vpr_search.py @@ -14,8 +14,8 @@ from fastapi.testclient import TestClient from vpr_search import app -from paddlespeech.dataset.download import download -from paddlespeech.dataset.download import unpack +from utils.utility import download +from utils.utility import unpack client = TestClient(app) diff --git a/demos/speech_recognition/README.md b/demos/speech_recognition/README.md index ee2acd6fd..c815a88af 100644 --- a/demos/speech_recognition/README.md +++ b/demos/speech_recognition/README.md @@ -17,7 +17,7 @@ The input of this demo should be a WAV file(`.wav`), and the sample rate must be Here are sample files for this demo that can be downloaded: ```bash -wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/ch_zh_mix.wav +wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav ``` ### 3. 
Usage @@ -27,8 +27,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee paddlespeech asr --input ./zh.wav -v # English paddlespeech asr --model transformer_librispeech --lang en --input ./en.wav -v - # Code-Switch - paddlespeech asr --model conformer_talcs --lang zh_en --codeswitch True --input ./ch_zh_mix.wav -v # Chinese ASR + Punctuation Restoration paddlespeech asr --input ./zh.wav -v | paddlespeech text --task punc -v ``` @@ -42,7 +40,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee - `input`(required): Audio file to recognize. - `model`: Model type of asr task. Default: `conformer_wenetspeech`. - `lang`: Model language. Default: `zh`. - - `codeswitch`: Code Swith Model. Default: `False` - `sample_rate`: Sample rate of the model. Default: `16000`. - `config`: Config of asr task. Use pretrained model when it is None. Default: `None`. - `ckpt_path`: Model checkpoint. Use pretrained model when it is None. Default: `None`. @@ -86,15 +83,14 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee Here is a list of pretrained models released by PaddleSpeech that can be used by command and python API: -| Model | Code Switch | Language | Sample Rate -| :--- | :---: | :---: | :---: | -| conformer_wenetspeech | False | zh | 16k -| conformer_online_multicn | False | zh | 16k -| conformer_aishell | False | zh | 16k -| conformer_online_aishell | False | zh | 16k -| transformer_librispeech | False | en | 16k -| deepspeech2online_wenetspeech | False | zh | 16k -| deepspeech2offline_aishell | False | zh| 16k -| deepspeech2online_aishell | False | zh | 16k -| deepspeech2offline_librispeech | False | en | 16k -| conformer_talcs | True | zh_en | 16k +| Model | Language | Sample Rate +| :--- | :---: | :---: | +| conformer_wenetspeech | zh | 16k +| conformer_online_multicn | zh | 16k +| conformer_aishell | zh | 16k +| conformer_online_aishell | zh | 16k +| transformer_librispeech | en | 16k +| deepspeech2online_wenetspeech | zh | 16k +| deepspeech2offline_aishell| zh| 16k +| deepspeech2online_aishell | zh | 16k +| deepspeech2offline_librispeech | en | 16k diff --git a/demos/speech_recognition/README_cn.md b/demos/speech_recognition/README_cn.md index 62dce3bc9..13aa9f277 100644 --- a/demos/speech_recognition/README_cn.md +++ b/demos/speech_recognition/README_cn.md @@ -1,5 +1,4 @@ (简体中文|[English](./README.md)) - (简体中文|[English](./README.md)) # 语音识别 ## 介绍 @@ -17,7 +16,7 @@ 可以下载此 demo 的示例音频: ```bash -wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/ch_zh_mix.wav +wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav ``` ### 3. 
使用方法 - 命令行 (推荐使用) @@ -26,8 +25,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee paddlespeech asr --input ./zh.wav -v # 英文 paddlespeech asr --model transformer_librispeech --lang en --input ./en.wav -v - #中英混合 - paddlespeech asr --model conformer_talcs --lang zh_en --codeswitch True --input ./ch_zh_mix.wav -v # 中文 + 标点恢复 paddlespeech asr --input ./zh.wav -v | paddlespeech text --task punc -v ``` @@ -41,7 +38,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee - `input`(必须输入):用于识别的音频文件。 - `model`:ASR 任务的模型,默认值:`conformer_wenetspeech`。 - `lang`:模型语言,默认值:`zh`。 - - `codeswitch`: 是否使用语言转换,默认值:`False`。 - `sample_rate`:音频采样率,默认值:`16000`。 - `config`:ASR 任务的参数文件,若不设置则使用预训练模型中的默认配置,默认值:`None`。 - `ckpt_path`:模型参数文件,若不设置则下载预训练模型使用,默认值:`None`。 @@ -84,15 +80,14 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav https://paddlespee ### 4.预训练模型 以下是 PaddleSpeech 提供的可以被命令行和 python API 使用的预训练模型列表: -| 模型 | 语言转换 | 语言 | 采样率 -| :--- | :---: | :---: | :---: | -| conformer_wenetspeech | False | zh | 16k -| conformer_online_multicn | False | zh | 16k -| conformer_aishell | False | zh | 16k -| conformer_online_aishell | False | zh | 16k -| transformer_librispeech | False | en | 16k -| deepspeech2online_wenetspeech | False | zh | 16k -| deepspeech2offline_aishell | False | zh | 16k -| deepspeech2online_aishell | False | zh | 16k -| deepspeech2offline_librispeech | False | en | 16k -| conformer_talcs | True | zh_en | 16k +| 模型 | 语言 | 采样率 +| :--- | :---: | :---: | +| conformer_wenetspeech | zh | 16k +| conformer_online_multicn | zh | 16k +| conformer_aishell | zh | 16k +| conformer_online_aishell | zh | 16k +| transformer_librispeech | en | 16k +| deepspeech2online_wenetspeech | zh | 16k +| deepspeech2offline_aishell | zh | 16k +| deepspeech2online_aishell | zh | 16k +| deepspeech2offline_librispeech | en | 16k diff --git a/demos/speech_recognition/run.sh b/demos/speech_recognition/run.sh index 8ba6e4c3e..e48ff3e96 100755 --- a/demos/speech_recognition/run.sh +++ b/demos/speech_recognition/run.sh @@ -2,7 +2,6 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav -wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/ch_zh_mix.wav # asr paddlespeech asr --input ./zh.wav @@ -19,11 +18,6 @@ paddlespeech asr --help # english asr paddlespeech asr --lang en --model transformer_librispeech --input ./en.wav - -# code-switch asr -paddlespeech asr --lang zh_en --codeswitch True --model conformer_talcs --input ./ch_zh_mix.wav - - # model stats paddlespeech stats --task asr diff --git a/demos/speech_web/README.md b/demos/speech_web/README.md index fc1fe7105..572781ab6 100644 --- a/demos/speech_web/README.md +++ b/demos/speech_web/README.md @@ -23,7 +23,7 @@ Paddle Speech Demo 是一个以 PaddleSpeech 的语音交互功能为主体开 + ERNIE-SAT:语言-语音跨模态大模型 ERNIE-SAT 可视化展示示例,支持个性化合成,跨语言语音合成(音频为中文则输入英文文本进行合成),语音编辑(修改音频文字中间的结果)功能。 ERNIE-SAT 更多实现细节,可以参考: + [【ERNIE-SAT with AISHELL-3 dataset】](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/ernie_sat) - + [【ERNIE-SAT with AISHELL3 and VCTK datasets】](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3_vctk/ernie_sat) + + [【ERNIE-SAT with AISHELL3 and VCTK datasets】](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3_vctk/ernie_sat) + [【ERNIE-SAT with VCTK dataset】](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/vctk/ernie_sat) 运行效果: diff --git
a/demos/speech_web/speech_server/main.py b/demos/speech_web/speech_server/main.py index f4678628f..03e7e5996 100644 --- a/demos/speech_web/speech_server/main.py +++ b/demos/speech_web/speech_server/main.py @@ -260,7 +260,7 @@ async def websocket_endpoint_online(websocket: WebSocket): # and we break the loop if message['signal'] == 'start': resp = {"status": "ok", "signal": "server_ready"} - # do something at beginning here + # do something at beginning here # create the instance to process the audio # connection_handler = chatbot.asr.connection_handler connection_handler = PaddleASRConnectionHanddler(engine) diff --git a/demos/speech_web/speech_server/requirements.txt b/demos/speech_web/speech_server/requirements.txt index 8425a1fee..cdc654656 100644 --- a/demos/speech_web/speech_server/requirements.txt +++ b/demos/speech_web/speech_server/requirements.txt @@ -1,6 +1,8 @@ aiofiles faiss-cpu -praatio>=5.0.0 +praatio==5.0.0 pydantic python-multipart +scikit_learn starlette +uvicorn diff --git a/demos/streaming_asr_server/README.md b/demos/streaming_asr_server/README.md index 31256d151..1d33b694b 100644 --- a/demos/streaming_asr_server/README.md +++ b/demos/streaming_asr_server/README.md @@ -9,7 +9,7 @@ This demo is an implementation of starting the streaming speech service and acce The streaming ASR server only supports the `websocket` protocol; it does not support the `http` protocol. -For service interface definitions, please refer to: +For service interface definitions, please refer to: - [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API) ## Usage @@ -23,7 +23,7 @@ You can choose one way from easy, medium and hard to install paddlespeech. **If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to ### 2. Prepare config File -The configuration file can be found in `conf/ws_application.yaml` and `conf/ws_conformer_wenetspeech_application.yaml`. At present, the speech tasks integrated by the model include: DeepSpeech2 and conformer. @@ -87,7 +87,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav server_executor = ServerExecutor() server_executor( - config_file="./conf/ws_conformer_wenetspeech_application_faster.yaml", + config_file="./conf/ws_conformer_wenetspeech_application.yaml", log_file="./log/paddlespeech.log") ``` @@ -579,354 +579,3 @@ bash server.sh [2022-05-07 11:11:18,915] [ INFO] - audio duration: 4.9968125, elapsed time: 15.928460597991943, RTF=3.187724293835709 [2022-05-07 11:11:18,916] [ INFO] - asr websocket client finished : 我认为跑步最重要的就是给我带来了身体健康 ``` - -## Generate the corresponding subtitle (.srt format) from an audio file (.wav or .mp3 format) - -By default, each server runs on the CPU; speech recognition and punctuation prediction can be placed on different GPUs by modifying the `device` parameter in each service's configuration file. - -We use `streaming_asr_server.py` and `punc_server.py` to launch the streaming speech recognition and punctuation prediction services respectively. The `websocket_client_srt.py` script calls both services at the same time and generates the corresponding subtitle file (.srt format). - -**You need to install ffmpeg before running this script.** - -**You should be in the `.../demos/streaming_asr_server/` directory.** - -### 1.
Start two servers - -```bash -# Note: streaming speech recognition and punctuation prediction are assigned to different GPUs through their configuration files -paddlespeech_server start --config_file ./conf/ws_conformer_wenetspeech_application.yaml -``` - -Open another terminal and run the following command: -```bash -paddlespeech_server start --config_file conf/punc_application.yaml -``` - -### 2. Call the client - - ```bash - python3 local/websocket_client_srt.py --server_ip 127.0.0.1 --port 8090 --punc.server_ip 127.0.0.1 --punc.port 8190 --wavfile ../../data/认知.mp3 - ``` - Output: - ```text - [2023-03-30 23:26:13,991] [ INFO] - Start to do streaming asr client -[2023-03-30 23:26:13,994] [ INFO] - asr websocket client start -[2023-03-30 23:26:13,994] [ INFO] - endpoint: http://127.0.0.1:8190/paddlespeech/text -[2023-03-30 23:26:13,994] [ INFO] - endpoint: ws://127.0.0.1:8090/paddlespeech/asr/streaming -[2023-03-30 23:26:14,475] [ INFO] - /home/fxb/PaddleSpeech-develop/data/认知.mp3 converted to /home/fxb/PaddleSpeech-develop/data/认知.wav -[2023-03-30 23:26:14,476] [ INFO] - start to process the wavscp: /home/fxb/PaddleSpeech-develop/data/认知.wav -[2023-03-30 23:26:14,515] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"} -[2023-03-30 23:26:14,533] [ INFO] - client receive msg={'result': ''} -[2023-03-30 23:26:14,545] [ INFO] - client receive msg={'result': ''} -[2023-03-30 23:26:14,556] [ INFO] - client receive msg={'result': ''} -[2023-03-30 23:26:14,572] [ INFO] - client receive msg={'result': ''} -[2023-03-30 23:26:14,588] [ INFO] - client receive msg={'result': ''} -[2023-03-30 23:26:14,600] [ INFO] - client receive msg={'result': ''} -[2023-03-30 23:26:14,613] [ INFO] - client receive msg={'result': ''} -[2023-03-30 23:26:14,626] [ INFO] - client receive msg={'result': ''} -[2023-03-30 23:26:15,122] [ INFO] - client receive msg={'result': '第一部'} -[2023-03-30 23:26:15,135] [ INFO] - client receive msg={'result': '第一部'} -[2023-03-30 23:26:15,154] [ INFO] - client receive msg={'result': '第一部'} -[2023-03-30 23:26:15,163] [ INFO] - client receive msg={'result': '第一部'} -[2023-03-30 23:26:15,175] [ INFO] - client receive msg={'result': '第一部'} -[2023-03-30 23:26:15,185] [ INFO] - client receive msg={'result': '第一部'} -[2023-03-30 23:26:15,196] [ INFO] - client receive msg={'result': '第一部'} -[2023-03-30 23:26:15,637] [ INFO] - client receive msg={'result': '第一部分是认'} -[2023-03-30 23:26:15,648] [ INFO] - client receive msg={'result': '第一部分是认'} -[2023-03-30 23:26:15,657] [ INFO] - client receive msg={'result': '第一部分是认'} -[2023-03-30 23:26:15,666] [ INFO] - client receive msg={'result': '第一部分是认'} -[2023-03-30 23:26:15,676] [ INFO] - client receive msg={'result': '第一部分是认'} -[2023-03-30 23:26:15,683] [ INFO] - client receive msg={'result': '第一部分是认'} -[2023-03-30 23:26:15,691] [ INFO] - client receive msg={'result': '第一部分是认'} -[2023-03-30 23:26:15,703] [ INFO] - client receive msg={'result': '第一部分是认'} -[2023-03-30 23:26:16,146] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,159] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,167] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,177] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,187] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,197] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,210] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,694] [ INFO]
- client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,704] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,713] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,725] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,737] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,749] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,759] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:16,770] [ INFO] - client receive msg={'result': '第一部分是认知部分'} -[2023-03-30 23:26:17,279] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通'} -[2023-03-30 23:26:17,302] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通'} -[2023-03-30 23:26:17,316] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通'} -[2023-03-30 23:26:17,332] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通'} -[2023-03-30 23:26:17,343] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通'} -[2023-03-30 23:26:17,358] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通'} -[2023-03-30 23:26:17,373] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通'} -[2023-03-30 23:26:17,958] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'} -[2023-03-30 23:26:17,971] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'} -[2023-03-30 23:26:17,987] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'} -[2023-03-30 23:26:18,000] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'} -[2023-03-30 23:26:18,017] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'} -[2023-03-30 23:26:18,028] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'} -[2023-03-30 23:26:18,038] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'} -[2023-03-30 23:26:18,049] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'} -[2023-03-30 23:26:18,653] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本'} -[2023-03-30 23:26:18,689] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本'} -[2023-03-30 23:26:18,701] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本'} -[2023-03-30 23:26:18,712] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本'} -[2023-03-30 23:26:18,723] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本'} -[2023-03-30 23:26:18,750] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本'} -[2023-03-30 23:26:18,767] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本'} -[2023-03-30 23:26:19,295] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'} -[2023-03-30 23:26:19,307] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'} -[2023-03-30 23:26:19,323] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'} -[2023-03-30 23:26:19,332] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'} -[2023-03-30 23:26:19,342] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'} -[2023-03-30 23:26:19,349] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'} -[2023-03-30 23:26:19,373] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'} -[2023-03-30 23:26:19,389] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'} -[2023-03-30 23:26:20,046] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生'} -[2023-03-30 23:26:20,055] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生'} -[2023-03-30 23:26:20,067] [ INFO] - client receive msg={'result': 
'第一部分是认知部分该部分通过示意图和文本的形式向学生'} -[2023-03-30 23:26:20,076] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生'} -[2023-03-30 23:26:20,094] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生'} -[2023-03-30 23:26:20,124] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生'} -[2023-03-30 23:26:20,135] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生'} -[2023-03-30 23:26:20,732] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'} -[2023-03-30 23:26:20,742] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'} -[2023-03-30 23:26:20,757] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'} -[2023-03-30 23:26:20,770] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'} -[2023-03-30 23:26:20,782] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'} -[2023-03-30 23:26:20,798] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'} -[2023-03-30 23:26:20,815] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'} -[2023-03-30 23:26:20,834] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'} -[2023-03-30 23:26:21,390] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感'} -[2023-03-30 23:26:21,405] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感'} -[2023-03-30 23:26:21,416] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感'} -[2023-03-30 23:26:21,428] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感'} -[2023-03-30 23:26:21,448] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感'} -[2023-03-30 23:26:21,459] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感'} -[2023-03-30 23:26:21,473] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感'} -[2023-03-30 23:26:22,065] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'} -[2023-03-30 23:26:22,085] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'} -[2023-03-30 23:26:22,110] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'} -[2023-03-30 23:26:22,118] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'} -[2023-03-30 23:26:22,137] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'} -[2023-03-30 23:26:22,144] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'} -[2023-03-30 23:26:22,154] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'} -[2023-03-30 23:26:22,169] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'} -[2023-03-30 23:26:22,698] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'} -[2023-03-30 23:26:22,709] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'} -[2023-03-30 23:26:22,731] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'} -[2023-03-30 23:26:22,743] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'} -[2023-03-30 23:26:22,755] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'} -[2023-03-30 23:26:22,771] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'} -[2023-03-30 23:26:22,782] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'} 
-[2023-03-30 23:26:23,415] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'} -[2023-03-30 23:26:23,430] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'} -[2023-03-30 23:26:23,442] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'} -[2023-03-30 23:26:23,456] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'} -[2023-03-30 23:26:23,470] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'} -[2023-03-30 23:26:23,487] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'} -[2023-03-30 23:26:23,498] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'} -[2023-03-30 23:26:23,524] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'} -[2023-03-30 23:26:24,200] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备'} -[2023-03-30 23:26:24,210] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备'} -[2023-03-30 23:26:24,219] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备'} -[2023-03-30 23:26:24,231] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备'} -[2023-03-30 23:26:24,250] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备'} -[2023-03-30 23:26:24,262] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备'} -[2023-03-30 23:26:24,272] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备'} -[2023-03-30 23:26:24,898] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'} -[2023-03-30 23:26:24,903] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'} -[2023-03-30 23:26:24,907] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'} -[2023-03-30 23:26:24,932] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'} -[2023-03-30 23:26:24,957] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'} -[2023-03-30 23:26:24,979] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'} -[2023-03-30 23:26:24,991] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'} -[2023-03-30 23:26:25,011] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'} -[2023-03-30 23:26:25,616] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:25,625] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:25,648] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:25,658] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:25,669] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:25,681] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:25,690] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:25,707] [ INFO] - client receive 
msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:26,378] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:26,384] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:26,389] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:26,397] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:26,402] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:26,415] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:26,428] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'} -[2023-03-30 23:26:27,008] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'} -[2023-03-30 23:26:27,018] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'} -[2023-03-30 23:26:27,026] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'} -[2023-03-30 23:26:27,037] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'} -[2023-03-30 23:26:27,046] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'} -[2023-03-30 23:26:27,054] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'} -[2023-03-30 23:26:27,062] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'} -[2023-03-30 23:26:27,070] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'} -[2023-03-30 23:26:27,735] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传'} -[2023-03-30 23:26:27,745] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传'} -[2023-03-30 23:26:27,755] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传'} -[2023-03-30 23:26:27,769] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传'} -[2023-03-30 23:26:27,783] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传'} -[2023-03-30 23:26:27,794] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传'} -[2023-03-30 23:26:27,804] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传'} -[2023-03-30 23:26:28,454] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'} -[2023-03-30 23:26:28,472] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'} -[2023-03-30 23:26:28,481] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'} -[2023-03-30 23:26:28,489] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'} -[2023-03-30 23:26:28,499] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'} -[2023-03-30 23:26:28,533] [ INFO] - client receive msg={'result': 
'第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'} -[2023-03-30 23:26:28,543] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'} -[2023-03-30 23:26:28,556] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'} -[2023-03-30 23:26:29,212] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图'} -[2023-03-30 23:26:29,222] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图'} -[2023-03-30 23:26:29,233] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图'} -[2023-03-30 23:26:29,246] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图'} -[2023-03-30 23:26:29,258] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图'} -[2023-03-30 23:26:29,270] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图'} -[2023-03-30 23:26:29,286] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图'} -[2023-03-30 23:26:30,003] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'} -[2023-03-30 23:26:30,013] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'} -[2023-03-30 23:26:30,038] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'} -[2023-03-30 23:26:30,048] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'} -[2023-03-30 23:26:30,062] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'} -[2023-03-30 23:26:30,074] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'} -[2023-03-30 23:26:30,114] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'} -[2023-03-30 23:26:30,125] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'} -[2023-03-30 23:26:30,856] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说'} -[2023-03-30 23:26:30,876] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说'} -[2023-03-30 23:26:30,885] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说'} -[2023-03-30 23:26:30,897] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说'} -[2023-03-30 23:26:30,914] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说'} -[2023-03-30 23:26:30,940] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说'} -[2023-03-30 23:26:30,952] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说'} -[2023-03-30 23:26:31,655] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'} -[2023-03-30 23:26:31,696] [ INFO] - client 
receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'} -[2023-03-30 23:26:31,709] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'} -[2023-03-30 23:26:31,718] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'} -[2023-03-30 23:26:31,727] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'} -[2023-03-30 23:26:31,740] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'} -[2023-03-30 23:26:31,757] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'} -[2023-03-30 23:26:31,768] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'} -[2023-03-30 23:26:32,476] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助'} -[2023-03-30 23:26:32,486] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助'} -[2023-03-30 23:26:32,495] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助'} -[2023-03-30 23:26:32,549] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助'} -[2023-03-30 23:26:32,560] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助'} -[2023-03-30 23:26:32,574] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助'} -[2023-03-30 23:26:32,590] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助'} -[2023-03-30 23:26:33,338] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'} -[2023-03-30 23:26:33,356] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'} -[2023-03-30 23:26:33,368] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'} -[2023-03-30 23:26:33,386] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'} -[2023-03-30 23:26:33,397] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'} -[2023-03-30 23:26:33,409] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'} -[2023-03-30 23:26:33,424] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'} -[2023-03-30 23:26:33,434] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'} -[2023-03-30 23:26:34,352] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感'} -[2023-03-30 23:26:34,364] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感'} -[2023-03-30 23:26:34,377] [ INFO] - client receive msg={'result': 
'第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感'}
-[2023-03-30 23:26:35,373] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有'}
-[2023-03-30 23:26:36,288] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的'}
-[2023-03-30 23:26:37,164] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象'}
-[2023-03-30 23:26:38,084] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后'}
-[2023-03-30 23:26:39,094] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合'}
-[2023-03-30 23:26:40,009] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实'}
-[2023-03-30 23:26:40,952] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用'}
-[2023-03-30 23:26:41,819] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升'}
-[2023-03-30 23:26:42,562] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升学生对'}
-[2023-03-30 23:26:43,380] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升学生对实训的兴'}
-[2023-03-30 23:26:44,346] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升学生对实训的兴趣以'}
-[2023-03-30 23:26:45,226] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升学生对实训的兴趣以及意义感'}
-[2023-03-30 23:26:46,380] [ INFO] - client punctuation restored msg={'result': '第一部分是认知部分,该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理,让学生对设备有大致的认知。随后使用真实传感器的内部构造图,辅以文字说明,进一步帮助学生对传感器有更深刻的印象,最后结合具体的实践应用,提升学生对实训的兴趣以及意义感。'}
-[2023-03-30 23:27:01,059] [ INFO] - client final receive msg={'status': 'ok', 'signal': 'finished', 'result': '第一部分是认知部分,该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理,让学生对设备有大致的认知。随后使用真实传感器的内部构造图,辅以文字说明,进一步帮助学生对传感器有更深刻的印象,最后结合具体的实践应用,提升学生对实训的兴趣以及意义感。', 'times': [{'w': '第', 'bg': 0.0, 'ed': 0.36}, {'w': '一', 'bg': 0.36, 'ed': 0.48}, ..., {'w': '义', 'bg': 25.34, 'ed': 25.46}, {'w': '感', 'bg': 25.46, 'ed': 26.04}]}
-[2023-03-30 23:27:01,060] [ INFO] - audio duration: 26.04, elapsed time: 46.581613540649414, RTF=1.7888484462614982
-sentences: ['第一部分是认知部分', '该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理', '让学生对设备有大致的认知', '随后使用真实传感器的内部构造图', '辅以文字说明', '进一步帮助学生对传感器有更深刻的印象', '最后结合具体的实践应用', '提升学生对实训的兴趣以及意义感']
-relative_times: [[0.0, 2.1], [2.1, 8.06], [8.06, 11.040000000000001], [11.040000000000001, 14.280000000000001], [14.280000000000001, 15.72], [15.72, 19.8], [19.8, 22.44], [22.44, 26.04]]
-[2023-03-30 23:27:01,076] [ INFO] - results saved to /home/fxb/PaddleSpeech-develop/data/认知.srt
- ```
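In the summary above, RTF is elapsed time divided by audio duration (46.58 s / 26.04 s ≈ 1.79, i.e. slower than real time on this CPU run), and the `sentences` / `relative_times` pairs are what the script writes into the `.srt` file: each sentence is matched with a `[begin, end]` span in seconds derived from the word-level `times` of the final message. A minimal sketch of that last conversion step follows; the helper names are ours for illustration, not the actual code from `websocket_client_srt.py`:

```python
# Minimal sketch: write the `sentences` / `relative_times` pairs shown above
# as SRT text. websocket_client_srt.py may structure this step differently.

def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT 'HH:MM:SS,mmm' timestamp."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(sentences, relative_times) -> str:
    blocks = []
    for i, (text, (bg, ed)) in enumerate(zip(sentences, relative_times), start=1):
        blocks.append(f"{i}\n{srt_timestamp(bg)} --> {srt_timestamp(ed)}\n{text}\n")
    return "\n".join(blocks)

print(to_srt(['第一部分是认知部分', '该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'],
             [[0.0, 2.1], [2.1, 8.06]]))
# 1
# 00:00:00,000 --> 00:00:02,100
# 第一部分是认知部分
#
# 2
# 00:00:02,100 --> 00:00:08,060
# 该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理
```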
diff --git a/demos/streaming_asr_server/README_cn.md b/demos/streaming_asr_server/README_cn.md
index bbddd6932..1902a2fa9 100644
--- a/demos/streaming_asr_server/README_cn.md
+++ b/demos/streaming_asr_server/README_cn.md
@@ -90,7 +90,7 @@ wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
 server_executor = ServerExecutor()
 server_executor(
-    config_file="./conf/ws_conformer_wenetspeech_application_faster.yaml",
+    config_file="./conf/ws_conformer_wenetspeech_application",
     log_file="./log/paddlespeech.log")
@@ -578,354 +578,3 @@ bash server.sh
 [2022-05-07 11:11:18,915] [ INFO] - audio duration: 4.9968125, elapsed time: 15.928460597991943, RTF=3.187724293835709
 [2022-05-07 11:11:18,916] [ INFO] - asr websocket client finished : 我认为跑步最重要的就是给我带来了身体健康
 ```
-
-## Generate a subtitle file (.srt) from an audio file (.wav or .mp3)
-
-**Note:** the services are deployed on the `cpu` device by default; speech recognition and punctuation prediction can be deployed on different `gpu`s by changing the `device` parameter in each service configuration file.
-
-Start the streaming speech recognition service and the punctuation prediction service with `streaming_asr_server.py` and `punc_server.py`, respectively. The `websocket_client_srt.py` script then calls both services at once and produces the corresponding subtitle file (.srt format).
-
-**ffmpeg must be installed before running this script.**
-
-**The following commands should be run from the corresponding `.../demos/streaming_asr_server/` directory.**
-
-### 1. Start the server
-
-```bash
-# Note: streaming speech recognition and punctuation prediction are configured on different graphics cards through configuration files
-paddlespeech_server start --config_file ./conf/ws_conformer_wenetspeech_application.yaml
-```
-
-Open another terminal and run:
-```bash
-paddlespeech_server start --config_file conf/punc_application.yaml
-```
-
-### 2. Start the client
-
-```bash
-python3 local/websocket_client_srt.py --server_ip 127.0.0.1 --port 8090 --punc.server_ip 127.0.0.1 --punc.port 8190 --wavfile ../../data/认知.mp3
-```
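Before reading the output below, it helps to see the shape of the exchange: the client converts the mp3 to wav with ffmpeg, opens the websocket, waits for `server_ready`, streams audio chunks while receiving ever-longer partial `result` strings, and finally gets the punctuated text plus word-level timestamps. The following is a rough sketch of that loop, not the actual `websocket_client_srt.py`; the `start`/`end` handshake payloads are our assumption, and only the endpoint and the reply shapes are taken from the log:

```python
# Rough sketch of the streaming exchange; handshake fields ("name", "signal",
# "nbest") are assumptions, not confirmed against the real client.
import asyncio
import json
import wave

import websockets  # pip install websockets

async def stream_wav(path: str,
                     url: str = "ws://127.0.0.1:8090/paddlespeech/asr/streaming"):
    async with websockets.connect(url) as ws:
        await ws.send(json.dumps({"name": path, "signal": "start", "nbest": 1}))
        print(await ws.recv())  # expect {"status": "ok", "signal": "server_ready"}
        with wave.open(path, "rb") as wav:
            chunk = wav.getframerate() // 10  # ~100 ms of frames per send
            while data := wav.readframes(chunk):
                await ws.send(data)     # raw PCM bytes (assuming 16 kHz mono wav)
                print(await ws.recv())  # partial {'result': ...}; the real server
                                        # may push several messages per chunk
        await ws.send(json.dumps({"name": path, "signal": "end", "nbest": 1}))
        print(await ws.recv())          # final msg with 'result' and 'times'

asyncio.run(stream_wav("../../data/认知.wav"))
```

The real script additionally posts the final text to the punctuation endpoint (`http://127.0.0.1:8190/paddlespeech/text` in the log) before cutting the `.srt` segments, which is how the punctuated sentence boundaries are obtained.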
-Output:
-```text
-[2023-03-30 23:26:13,991] [ INFO] - Start to do streaming asr client
-[2023-03-30 23:26:13,994] [ INFO] - asr websocket client start
-[2023-03-30 23:26:13,994] [ INFO] - endpoint: http://127.0.0.1:8190/paddlespeech/text
-[2023-03-30 23:26:13,994] [ INFO] - endpoint: ws://127.0.0.1:8090/paddlespeech/asr/streaming
-[2023-03-30 23:26:14,475] [ INFO] - /home/fxb/PaddleSpeech-develop/data/认知.mp3 converted to /home/fxb/PaddleSpeech-develop/data/认知.wav
-[2023-03-30 23:26:14,476] [ INFO] - start to process the wavscp: /home/fxb/PaddleSpeech-develop/data/认知.wav
-[2023-03-30 23:26:14,515] [ INFO] - client receive msg={"status": "ok", "signal": "server_ready"}
-[2023-03-30 23:26:14,533] [ INFO] - client receive msg={'result': ''}
-[2023-03-30 23:26:15,122] [ INFO] - client receive msg={'result': '第一部'}
-[2023-03-30 23:26:15,637] [ INFO] - client receive msg={'result': '第一部分是认'}
-[2023-03-30 23:26:16,146] [ INFO] - client receive msg={'result': '第一部分是认知部分'}
-[2023-03-30 23:26:17,279] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通'}
-[2023-03-30 23:26:17,958] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图'}
-[2023-03-30 23:26:18,653] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本'}
-[2023-03-30 23:26:19,295] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式'}
-[2023-03-30 23:26:20,046] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生'}
-[2023-03-30 23:26:20,732] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解'}
-[2023-03-30 23:26:21,390] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感'}
-[2023-03-30 23:26:22,065] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作'}
-[2023-03-30 23:26:22,698] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理'}
-[2023-03-30 23:26:23,415] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生'}
-[2023-03-30 23:26:24,200] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备'}
-[2023-03-30 23:26:24,898] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致'}
-[2023-03-30 23:26:25,616] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知'}
-[2023-03-30 23:26:27,008] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使'}
-[2023-03-30 23:26:27,735] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传'}
-[2023-03-30 23:26:28,454] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内'}
-[2023-03-30 23:26:29,212] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图'}
-[2023-03-30 23:26:30,003] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅'}
-[2023-03-30 23:26:30,856] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说'}
-[2023-03-30 23:26:31,655] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明'}
-[2023-03-30 23:26:32,476] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助'}
-[2023-03-30 23:26:33,338] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生'}
-[2023-03-30 23:26:34,352] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感'}
-[2023-03-30 23:26:35,373] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有'}
-[2023-03-30 23:26:36,288] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的'}
-[2023-03-30 23:26:37,164] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象'}
-[2023-03-30 23:26:38,084] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后'}
-[2023-03-30 23:26:39,094] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合'}
-[2023-03-30 23:26:40,009] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实'}
-[2023-03-30 23:26:40,952] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用'}
-[2023-03-30 23:26:41,819] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升'}
-[2023-03-30 23:26:42,562] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升学生对'}
-[2023-03-30 23:26:43,380] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升学生对实训的兴'}
-[2023-03-30 23:26:44,346] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升学生对实训的兴趣以'}
-[2023-03-30 23:26:45,226] [ INFO] - client receive msg={'result': '第一部分是认知部分该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理让学生对设备有大致的认知随后使用真实传感器的内部构造图辅以文字说明进一步帮助学生对传感器有更深刻的印象最后结合具体的实践应用提升学生对实训的兴趣以及意义感'}
-[2023-03-30 23:26:46,380] [ INFO] - client punctuation restored msg={'result': '第一部分是认知部分,该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理,让学生对设备有大致的认知。随后使用真实传感器的内部构造图,辅以文字说明,进一步帮助学生对传感器有更深刻的印象,最后结合具体的实践应用,提升学生对实训的兴趣以及意义感。'}
-[2023-03-30 23:27:01,059] [ INFO] - client final receive msg={'status': 'ok', 'signal': 'finished', 'result': '第一部分是认知部分,该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理,让学生对设备有大致的认知。随后使用真实传感器的内部构造图,辅以文字说明,进一步帮助学生对传感器有更深刻的印象,最后结合具体的实践应用,提升学生对实训的兴趣以及意义感。', 'times': [{'w': '第', 'bg': 0.0, 'ed': 0.36}, {'w': '一', 'bg': 0.36, 'ed': 0.48}, {'w': '部', 'bg': 0.48, 'ed': 0.62}, {'w': '分', 'bg': 0.62, 'ed': 0.8200000000000001}, {'w': '是', 'bg': 0.8200000000000001, 'ed': 1.08}, {'w': '认', 'bg': 1.08, 'ed': 
1.28}, {'w': '知', 'bg': 1.28, 'ed': 1.44}, {'w': '部', 'bg': 1.44, 'ed': 1.58}, {'w': '分', 'bg': 1.58, 'ed': 2.1}, {'w': '该', 'bg': 2.1, 'ed': 2.6}, {'w': '部', 'bg': 2.6, 'ed': 2.72}, {'w': '分', 'bg': 2.72, 'ed': 2.94}, {'w': '通', 'bg': 2.94, 'ed': 3.16}, {'w': '过', 'bg': 3.16, 'ed': 3.36}, {'w': '示', 'bg': 3.36, 'ed': 3.54}, {'w': '意', 'bg': 3.54, 'ed': 3.68}, {'w': '图', 'bg': 3.68, 'ed': 3.9}, {'w': '和', 'bg': 3.9, 'ed': 4.14}, {'w': '文', 'bg': 4.14, 'ed': 4.32}, {'w': '本', 'bg': 4.32, 'ed': 4.46}, {'w': '的', 'bg': 4.46, 'ed': 4.58}, {'w': '形', 'bg': 4.58, 'ed': 4.72}, {'w': '式', 'bg': 4.72, 'ed': 5.0}, {'w': '向', 'bg': 5.0, 'ed': 5.32}, {'w': '学', 'bg': 5.32, 'ed': 5.5}, {'w': '生', 'bg': 5.5, 'ed': 5.66}, {'w': '讲', 'bg': 5.66, 'ed': 5.86}, {'w': '解', 'bg': 5.86, 'ed': 6.18}, {'w': '主', 'bg': 6.18, 'ed': 6.46}, {'w': '要', 'bg': 6.46, 'ed': 6.62}, {'w': '传', 'bg': 6.62, 'ed': 6.8}, {'w': '感', 'bg': 6.8, 'ed': 7.0}, {'w': '器', 'bg': 7.0, 'ed': 7.16}, {'w': '的', 'bg': 7.16, 'ed': 7.28}, {'w': '工', 'bg': 7.28, 'ed': 7.44}, {'w': '作', 'bg': 7.44, 'ed': 7.6000000000000005}, {'w': '原', 'bg': 7.6000000000000005, 'ed': 7.74}, {'w': '理', 'bg': 7.74, 'ed': 8.06}, {'w': '让', 'bg': 8.06, 'ed': 8.44}, {'w': '学', 'bg': 8.44, 'ed': 8.64}, {'w': '生', 'bg': 8.64, 'ed': 8.84}, {'w': '对', 'bg': 8.84, 'ed': 9.06}, {'w': '设', 'bg': 9.06, 'ed': 9.24}, {'w': '备', 'bg': 9.24, 'ed': 9.52}, {'w': '有', 'bg': 9.52, 'ed': 9.86}, {'w': '大', 'bg': 9.86, 'ed': 10.1}, {'w': '致', 'bg': 10.1, 'ed': 10.24}, {'w': '的', 'bg': 10.24, 'ed': 10.36}, {'w': '认', 'bg': 10.36, 'ed': 10.5}, {'w': '知', 'bg': 10.5, 'ed': 11.040000000000001}, {'w': '随', 'bg': 11.040000000000001, 'ed': 11.56}, {'w': '后', 'bg': 11.56, 'ed': 11.82}, {'w': '使', 'bg': 11.82, 'ed': 12.1}, {'w': '用', 'bg': 12.1, 'ed': 12.26}, {'w': '真', 'bg': 12.26, 'ed': 12.44}, {'w': '实', 'bg': 12.44, 'ed': 12.620000000000001}, {'w': '传', 'bg': 12.620000000000001, 'ed': 12.780000000000001}, {'w': '感', 'bg': 12.780000000000001, 'ed': 12.94}, {'w': '器', 'bg': 12.94, 'ed': 13.1}, {'w': '的', 'bg': 13.1, 'ed': 13.26}, {'w': '内', 'bg': 13.26, 'ed': 13.42}, {'w': '部', 'bg': 13.42, 'ed': 13.56}, {'w': '构', 'bg': 13.56, 'ed': 13.700000000000001}, {'w': '造', 'bg': 13.700000000000001, 'ed': 13.86}, {'w': '图', 'bg': 13.86, 'ed': 14.280000000000001}, {'w': '辅', 'bg': 14.280000000000001, 'ed': 14.66}, {'w': '以', 'bg': 14.66, 'ed': 14.82}, {'w': '文', 'bg': 14.82, 'ed': 15.0}, {'w': '字', 'bg': 15.0, 'ed': 15.16}, {'w': '说', 'bg': 15.16, 'ed': 15.32}, {'w': '明', 'bg': 15.32, 'ed': 15.72}, {'w': '进', 'bg': 15.72, 'ed': 16.1}, {'w': '一', 'bg': 16.1, 'ed': 16.2}, {'w': '步', 'bg': 16.2, 'ed': 16.32}, {'w': '帮', 'bg': 16.32, 'ed': 16.48}, {'w': '助', 'bg': 16.48, 'ed': 16.66}, {'w': '学', 'bg': 16.66, 'ed': 16.82}, {'w': '生', 'bg': 16.82, 'ed': 17.12}, {'w': '对', 'bg': 17.12, 'ed': 17.48}, {'w': '传', 'bg': 17.48, 'ed': 17.66}, {'w': '感', 'bg': 17.66, 'ed': 17.84}, {'w': '器', 'bg': 17.84, 'ed': 18.12}, {'w': '有', 'bg': 18.12, 'ed': 18.42}, {'w': '更', 'bg': 18.42, 'ed': 18.66}, {'w': '深', 'bg': 18.66, 'ed': 18.88}, {'w': '刻', 'bg': 18.88, 'ed': 19.04}, {'w': '的', 'bg': 19.04, 'ed': 19.16}, {'w': '印', 'bg': 19.16, 'ed': 19.3}, {'w': '象', 'bg': 19.3, 'ed': 19.8}, {'w': '最', 'bg': 19.8, 'ed': 20.3}, {'w': '后', 'bg': 20.3, 'ed': 20.62}, {'w': '结', 'bg': 20.62, 'ed': 20.96}, {'w': '合', 'bg': 20.96, 'ed': 21.14}, {'w': '具', 'bg': 21.14, 'ed': 21.3}, {'w': '体', 'bg': 21.3, 'ed': 21.42}, {'w': '的', 'bg': 21.42, 'ed': 21.580000000000002}, {'w': '实', 'bg': 21.580000000000002, 'ed': 21.76}, {'w': '践', 'bg': 
21.76, 'ed': 21.92}, {'w': '应', 'bg': 21.92, 'ed': 22.080000000000002}, {'w': '用', 'bg': 22.080000000000002, 'ed': 22.44}, {'w': '提', 'bg': 22.44, 'ed': 22.78}, {'w': '升', 'bg': 22.78, 'ed': 22.94}, {'w': '学', 'bg': 22.94, 'ed': 23.12}, {'w': '生', 'bg': 23.12, 'ed': 23.34}, {'w': '对', 'bg': 23.34, 'ed': 23.62}, {'w': '实', 'bg': 23.62, 'ed': 23.82}, {'w': '训', 'bg': 23.82, 'ed': 23.96}, {'w': '的', 'bg': 23.96, 'ed': 24.12}, {'w': '兴', 'bg': 24.12, 'ed': 24.3}, {'w': '趣', 'bg': 24.3, 'ed': 24.6}, {'w': '以', 'bg': 24.6, 'ed': 24.88}, {'w': '及', 'bg': 24.88, 'ed': 25.12}, {'w': '意', 'bg': 25.12, 'ed': 25.34}, {'w': '义', 'bg': 25.34, 'ed': 25.46}, {'w': '感', 'bg': 25.46, 'ed': 26.04}]} -[2023-03-30 23:27:01,060] [ INFO] - audio duration: 26.04, elapsed time: 46.581613540649414, RTF=1.7888484462614982 -sentences: ['第一部分是认知部分', '该部分通过示意图和文本的形式向学生讲解主要传感器的工作原理', '让学生对设备有大致的认知', '随后使用真实传感器的内部构造图', '辅以文字说明', '进一步帮助学生对传感器有更深刻的印象', '最后结合具体的实践应用', '提升学生对实训的兴趣以及意义感'] -relative_times: [[0.0, 2.1], [2.1, 8.06], [8.06, 11.040000000000001], [11.040000000000001, 14.280000000000001], [14.280000000000001, 15.72], [15.72, 19.8], [19.8, 22.44], [22.44, 26.04]] -[2023-03-30 23:27:01,076] [ INFO] - results saved to /home/fxb/PaddleSpeech-develop/data/认知.srt - ``` diff --git a/demos/streaming_asr_server/local/websocket_client_srt.py b/demos/streaming_asr_server/local/websocket_client_srt.py deleted file mode 100644 index 02fea4842..000000000 --- a/demos/streaming_asr_server/local/websocket_client_srt.py +++ /dev/null @@ -1,162 +0,0 @@ -#!/usr/bin/python -# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# calc avg RTF(NOT Accurate): grep -rn RTF log.txt | awk '{print $NF}' | awk -F "=" '{sum += $NF} END {print "all time",sum, "audio num", NR, "RTF", sum/NR}' -# python3 websocket_client.py --server_ip 127.0.0.1 --port 8290 --punc.server_ip 127.0.0.1 --punc.port 8190 --wavfile ./zh.wav -# python3 websocket_client.py --server_ip 127.0.0.1 --port 8290 --wavfile ./zh.wav -import argparse -import asyncio -import codecs -import os -from pydub import AudioSegment -import re - -from paddlespeech.cli.log import logger -from paddlespeech.server.utils.audio_handler import ASRWsAudioHandler - -def convert_to_wav(input_file): - # Load audio file - audio = AudioSegment.from_file(input_file) - - # Set parameters for audio file - audio = audio.set_channels(1) - audio = audio.set_frame_rate(16000) - - # Create output filename - output_file = os.path.splitext(input_file)[0] + ".wav" - - # Export audio file as WAV - audio.export(output_file, format="wav") - - logger.info(f"{input_file} converted to {output_file}") - -def format_time(sec): - # Convert seconds to SRT format (HH:MM:SS,ms) - hours = int(sec/3600) - minutes = int((sec%3600)/60) - seconds = int(sec%60) - milliseconds = int((sec%1)*1000) - return f'{hours:02d}:{minutes:02d}:{seconds:02d},{milliseconds:03d}' - -def results2srt(results, srt_file): - """convert results from paddlespeech to srt format for subtitle - Args: - results (dict): results from paddlespeech - """ - # times contains start and end time of each word - times = results['times'] - # result contains the whole sentence including punctuation - result = results['result'] - # split result into several sencences by ',' and '。' - sentences = re.split(',|。', result)[:-1] - # print("sentences: ", sentences) - # generate relative time for each sentence in sentences - relative_times = [] - word_i = 0 - for sentence in sentences: - relative_times.append([]) - for word in sentence: - if relative_times[-1] == []: - relative_times[-1].append(times[word_i]['bg']) - if len(relative_times[-1]) == 1: - relative_times[-1].append(times[word_i]['ed']) - else: - relative_times[-1][1] = times[word_i]['ed'] - word_i += 1 - # print("relative_times: ", relative_times) - # generate srt file acoording to relative_times and sentences - with open(srt_file, 'w') as f: - for i in range(len(sentences)): - # Write index number - f.write(str(i+1)+'\n') - - # Write start and end times - start = format_time(relative_times[i][0]) - end = format_time(relative_times[i][1]) - f.write(start + ' --> ' + end + '\n') - - # Write text - f.write(sentences[i]+'\n\n') - logger.info(f"results saved to {srt_file}") - -def main(args): - logger.info("asr websocket client start") - handler = ASRWsAudioHandler( - args.server_ip, - args.port, - endpoint=args.endpoint, - punc_server_ip=args.punc_server_ip, - punc_server_port=args.punc_server_port) - loop = asyncio.get_event_loop() - - # check if the wav file is mp3 format - # if so, convert it to wav format using convert_to_wav function - if args.wavfile and os.path.exists(args.wavfile): - if args.wavfile.endswith(".mp3"): - convert_to_wav(args.wavfile) - args.wavfile = args.wavfile.replace(".mp3", ".wav") - - # support to process single audio file - if args.wavfile and os.path.exists(args.wavfile): - logger.info(f"start to process the wavscp: {args.wavfile}") - result = loop.run_until_complete(handler.run(args.wavfile)) - # result = result["result"] - # logger.info(f"asr websocket client finished : {result}") - results2srt(result, args.wavfile.replace(".wav", ".srt")) - - # support to 
process batch audios from wav.scp - if args.wavscp and os.path.exists(args.wavscp): - logger.info(f"start to process the wavscp: {args.wavscp}") - with codecs.open(args.wavscp, 'r', encoding='utf-8') as f,\ - codecs.open("result.txt", 'w', encoding='utf-8') as w: - for line in f: - utt_name, utt_path = line.strip().split() - result = loop.run_until_complete(handler.run(utt_path)) - result = result["result"] - w.write(f"{utt_name} {result}\n") - - -if __name__ == "__main__": - logger.info("Start to do streaming asr client") - parser = argparse.ArgumentParser() - parser.add_argument( - '--server_ip', type=str, default='127.0.0.1', help='server ip') - parser.add_argument('--port', type=int, default=8090, help='server port') - parser.add_argument( - '--punc.server_ip', - type=str, - default=None, - dest="punc_server_ip", - help='Punctuation server ip') - parser.add_argument( - '--punc.port', - type=int, - default=8091, - dest="punc_server_port", - help='Punctuation server port') - parser.add_argument( - "--endpoint", - type=str, - default="/paddlespeech/asr/streaming", - help="ASR websocket endpoint") - parser.add_argument( - "--wavfile", - action="store", - help="wav file path ", - default="./16_audio.wav") - parser.add_argument( - "--wavscp", type=str, default=None, help="The batch audios dict text") - args = parser.parse_args() - - main(args) diff --git a/demos/text_to_speech/README.md b/demos/text_to_speech/README.md index d7bb8ca1c..41dcf820b 100644 --- a/demos/text_to_speech/README.md +++ b/demos/text_to_speech/README.md @@ -58,18 +58,7 @@ The input of this demo should be a text of the specific language that can be pas paddlespeech tts --am fastspeech2_mix --voc pwgan_csmsc --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --spk_id 175 --output mix_spk175_pwgan.wav paddlespeech tts --am fastspeech2_mix --voc hifigan_csmsc --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --spk_id 175 --output mix_spk175.wav ``` - - Chinese English Mixed, single male spk - ```bash - # male mix tts - # The `lang` must be `mix`! - paddlespeech tts --am fastspeech2_male --voc pwgan_male --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --output male_mix_fs2_pwgan.wav - paddlespeech tts --am fastspeech2_male --voc hifigan_male --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --output male_mix_fs2_hifigan.wav - ``` - - Cantonese - ```bash - paddlespeech tts --am fastspeech2_canton --voc pwgan_aishell3 --input "各个国家有各个国家嘅国歌" --lang canton --spk_id 10 - ``` - - Use ONNXRuntime infer: + - Use ONNXRuntime infer: ```bash paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" --output default.wav --use_onnx True paddlespeech tts --am speedyspeech_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" --output ss.wav --use_onnx True @@ -81,15 +70,7 @@ The input of this demo should be a text of the specific language that can be pas paddlespeech tts --am fastspeech2_ljspeech --voc hifigan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_hifigan.wav --use_onnx True paddlespeech tts --am fastspeech2_vctk --voc pwgan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_pwgan.wav --use_onnx True paddlespeech tts --am fastspeech2_vctk --voc hifigan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." 
--lang en --spk_id 0 --output vctk_fs2_hifigan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc pwgan_male --lang zh --input "你好,欢迎使用百度飞桨深度学习框架!" --output male_zh_fs2_pwgan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc pwgan_male --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output male_en_fs2_pwgan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc pwgan_male --lang mix --input "热烈欢迎您在 Discussions 中提交问题,并在 Issues 中指出发现的 bug。此外,我们非常希望您参与到 Paddle Speech 的开发中!" --output male_fs2_pwgan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc hifigan_male --lang zh --input "你好,欢迎使用百度飞桨深度学习框架!" --output male_zh_fs2_hifigan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc hifigan_male --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output male_en_fs2_hifigan.wav --use_onnx True - paddlespeech tts --am fastspeech2_mix --voc hifigan_male --lang mix --input "热烈欢迎您在 Discussions 中提交问题,并在 Issues 中指出发现的 bug。此外,我们非常希望您参与到 Paddle Speech 的开发中!" --output male_fs2_hifigan.wav --use_onnx True - paddlespeech tts --am fastspeech2_mix --voc pwgan_csmsc --lang mix --spk_id 174 --input "热烈欢迎您在 Discussions 中提交问题,并在 Issues 中指出发现的 bug。此外,我们非常希望您参与到 Paddle Speech 的开发中!" --output mix_fs2_pwgan_csmsc_spk174.wav --use_onnx True - paddlespeech tts --am fastspeech2_canton --voc pwgan_aishell3 --lang canton --spk_id 10 --input "各个国家有各个国家嘅国歌" --output output_canton.wav --use_onnx True - ``` + ``` Usage: @@ -180,10 +161,6 @@ Here is a list of pretrained models released by PaddleSpeech that can be used by | fastspeech2_mix | mix | | tacotron2_csmsc | zh | | tacotron2_ljspeech | en | - | fastspeech2_male | zh | - | fastspeech2_male | en | - | fastspeech2_male | mix | - | fastspeech2_canton | canton | - Vocoder | Model | Language | @@ -199,5 +176,3 @@ Here is a list of pretrained models released by PaddleSpeech that can be used by | hifigan_aishell3 | zh | | hifigan_vctk | en | | wavernn_csmsc | zh | - | pwgan_male | zh | - | hifigan_male | zh | diff --git a/demos/text_to_speech/README_cn.md b/demos/text_to_speech/README_cn.md index d8a2a14cc..4a4132238 100644 --- a/demos/text_to_speech/README_cn.md +++ b/demos/text_to_speech/README_cn.md @@ -58,18 +58,7 @@ paddlespeech tts --am fastspeech2_mix --voc pwgan_csmsc --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --spk_id 175 --output mix_spk175_pwgan.wav paddlespeech tts --am fastspeech2_mix --voc hifigan_csmsc --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --spk_id 175 --output mix_spk175.wav ``` - - 中英文混合,单个男性说话人 - ```bash - # male mix tts - # The `lang` must be `mix`! - paddlespeech tts --am fastspeech2_male --voc pwgan_male --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --output male_mix_fs2_pwgan.wav - paddlespeech tts --am fastspeech2_male --voc hifigan_male --lang mix --input "我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN." --output male_mix_fs2_hifigan.wav - ``` - - 粤语 - ```bash - paddlespeech tts --am fastspeech2_canton --voc pwgan_aishell3 --input "各个国家有各个国家嘅国歌" --lang canton --spk_id 10 - ``` - - 使用 ONNXRuntime 推理: + - 使用 ONNXRuntime 推理: ```bash paddlespeech tts --input "你好,欢迎使用百度飞桨深度学习框架!" --output default.wav --use_onnx True paddlespeech tts --am speedyspeech_csmsc --input "你好,欢迎使用百度飞桨深度学习框架!" 
--output ss.wav --use_onnx True @@ -81,15 +70,7 @@ paddlespeech tts --am fastspeech2_ljspeech --voc hifigan_ljspeech --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output lj_fs2_hifigan.wav --use_onnx True paddlespeech tts --am fastspeech2_vctk --voc pwgan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_pwgan.wav --use_onnx True paddlespeech tts --am fastspeech2_vctk --voc hifigan_vctk --input "Life was like a box of chocolates, you never know what you're gonna get." --lang en --spk_id 0 --output vctk_fs2_hifigan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc pwgan_male --lang zh --input "你好,欢迎使用百度飞桨深度学习框架!" --output male_zh_fs2_pwgan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc pwgan_male --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output male_en_fs2_pwgan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc pwgan_male --lang mix --input "热烈欢迎您在 Discussions 中提交问题,并在 Issues 中指出发现的 bug。此外,我们非常希望您参与到 Paddle Speech 的开发中!" --output male_fs2_pwgan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc hifigan_male --lang zh --input "你好,欢迎使用百度飞桨深度学习框架!" --output male_zh_fs2_hifigan.wav --use_onnx True - paddlespeech tts --am fastspeech2_male --voc hifigan_male --lang en --input "Life was like a box of chocolates, you never know what you're gonna get." --output male_en_fs2_hifigan.wav --use_onnx True - paddlespeech tts --am fastspeech2_mix --voc hifigan_male --lang mix --input "热烈欢迎您在 Discussions 中提交问题,并在 Issues 中指出发现的 bug。此外,我们非常希望您参与到 Paddle Speech 的开发中!" --output male_fs2_hifigan.wav --use_onnx True - paddlespeech tts --am fastspeech2_mix --voc pwgan_csmsc --lang mix --spk_id 174 --input "热烈欢迎您在 Discussions 中提交问题,并在 Issues 中指出发现的 bug。此外,我们非常希望您参与到 Paddle Speech 的开发中!" 
--output mix_fs2_pwgan_csmsc_spk174.wav --use_onnx True - paddlespeech tts --am fastspeech2_canton --voc pwgan_aishell3 --lang canton --spk_id 10 --input "各个国家有各个国家嘅国歌" --output output_canton.wav --use_onnx True - ``` + ``` 使用方法: @@ -180,10 +161,6 @@ | fastspeech2_mix | mix | | tacotron2_csmsc | zh | | tacotron2_ljspeech | en | - | fastspeech2_male | zh | - | fastspeech2_male | en | - | fastspeech2_male | mix | - | fastspeech2_canton | canton | - 声码器 | 模型 | 语言 | @@ -199,5 +176,3 @@ | hifigan_aishell3 | zh | | hifigan_vctk | en | | wavernn_csmsc | zh | - | pwgan_male | zh | - | hifigan_male | zh | diff --git a/docker/ubuntu18-cpu/Dockerfile b/docker/ubuntu18-cpu/Dockerfile index 3ae48cb65..35f45f2e4 100644 --- a/docker/ubuntu18-cpu/Dockerfile +++ b/docker/ubuntu18-cpu/Dockerfile @@ -2,7 +2,7 @@ FROM registry.baidubce.com/paddlepaddle/paddle:2.2.2 LABEL maintainer="paddlesl@baidu.com" RUN apt-get update \ - && apt-get install libsndfile-dev libsndfile1 \ + && apt-get install libsndfile-dev \ && apt-get clean \ && rm -rf /var/lib/apt/lists/* diff --git a/docs/images/note_map.png b/docs/images/note_map.png deleted file mode 100644 index f280d98c4..000000000 Binary files a/docs/images/note_map.png and /dev/null differ diff --git a/docs/requirements.txt b/docs/requirements.txt index 30622230b..bd7f40ec3 100644 --- a/docs/requirements.txt +++ b/docs/requirements.txt @@ -1,9 +1,12 @@ braceexpand +colorlog editdistance +fastapi g2p_en g2pM h5py inflect +jieba jsonlines kaldiio keyboard @@ -13,7 +16,7 @@ matplotlib myst-parser nara_wpe numpydoc -onnxruntime>=1.11.0 +onnxruntime==1.10.0 opencc paddlenlp # use paddlepaddle == 2.3.* according to: https://github.com/PaddlePaddle/Paddle/issues/48243 @@ -21,25 +24,31 @@ paddlepaddle>=2.2.2,<2.4.0 paddlespeech_ctcdecoders paddlespeech_feat pandas +pathos==0.2.8 pattern_singleton -ppdiffusers>=0.9.0 -praatio>=5.0.0, <=5.1.1 +Pillow>=9.0.0 +praatio==5.0.0 prettytable pypinyin-dict pypinyin<=0.44.0 python-dateutil -pyworld>=0.2.12 +pyworld==0.2.12 recommonmark>=0.5.0 -resampy +resampy==0.2.2 sacrebleu +scipy +sentencepiece~=0.1.96 +soundfile~=0.10 sphinx sphinx-autobuild sphinx-markdown-tables sphinx_rtd_theme textgrid timer -ToJyutping==0.2.1 -typeguard==2.13.3 +tqdm +typeguard +uvicorn +visualdl webrtcvad websockets yacs~=0.1.8 diff --git a/docs/source/released_model.md b/docs/source/released_model.md index 9e9221779..87c58b787 100644 --- a/docs/source/released_model.md +++ b/docs/source/released_model.md @@ -3,21 +3,20 @@ ## Speech-to-Text Models ### Speech Recognition Model -Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER | Hours of speech | Example Link | Inference Type | static_model | -:-------------:| :------------:| :-----: | -----: | :-----: |:-----:| :-----: | :-----: | :-----: | :-----: | :-----: | -[Ds2 Online Wenetspeech ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr0/asr0_deepspeech2_online_wenetspeech_ckpt_1.0.4.model.tar.gz) | Wenetspeech Dataset | Char-based | 1.2 GB | 2 Conv + 5 LSTM layers | 0.152 (test\_net, w/o LM)
0.2417 (test\_meeting, w/o LM)
0.053 (aishell, w/ LM) |-| 10000 h | - | onnx/inference/python |-| -[Ds2 Online Aishell ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_online_aishell_fbank161_ckpt_0.2.1.model.tar.gz) | Aishell Dataset | Char-based | 491 MB | 2 Conv + 5 LSTM layers | 0.0666 |-| 151 h | [D2 Online Aishell ASR0](../../examples/aishell/asr0) | onnx/inference/python |-| -[Ds2 Offline Aishell ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_offline_aishell_ckpt_1.0.1.model.tar.gz)| Aishell Dataset | Char-based | 1.4 GB | 2 Conv + 5 bidirectional LSTM layers| 0.0554 |-| 151 h | [Ds2 Offline Aishell ASR0](../../examples/aishell/asr0) | inference/python |-| -[Conformer Online Wenetspeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz) | WenetSpeech Dataset | Char-based | 457 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring| 0.11 (test\_net) 0.1879 (test\_meeting) |-| 10000 h |- | python |-| -[Conformer U2PP Online Wenetspeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.3.0.model.tar.gz) | WenetSpeech Dataset | Char-based | 540 MB | Encoder:Conformer, Decoder:BiTransformer, Decoding method: Attention rescoring| 0.047198 (aishell test\_-1) 0.059212 (aishell test\_16) |-| 10000 h |- | python |[FP32](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.3.0.model.tar.gz)
[INT8](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/static/asr1_chunk_conformer_u2pp_wenetspeech_static_quant_1.3.0.model.tar.gz) | -[Conformer Online Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_chunk_conformer_aishell_ckpt_0.2.0.model.tar.gz) | Aishell Dataset | Char-based | 189 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring| 0.0544 |-| 151 h | [Conformer Online Aishell ASR1](../../examples/aishell/asr1) | python |-| -[Conformer Offline Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_conformer_aishell_ckpt_1.0.1.model.tar.gz) | Aishell Dataset | Char-based | 189 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0460 |-| 151 h | [Conformer Offline Aishell ASR1](../../examples/aishell/asr1) | python |-| -[Transformer Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz) | Aishell Dataset | Char-based | 128 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0523 || 151 h | [Transformer Aishell ASR1](../../examples/aishell/asr1) | python |-| -[Ds2 Offline Librispeech ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr0/asr0_deepspeech2_offline_librispeech_ckpt_1.0.1.model.tar.gz)| Librispeech Dataset | Char-based | 1.3 GB | 2 Conv + 5 bidirectional LSTM layers| - |0.0467| 960 h | [Ds2 Offline Librispeech ASR0](../../examples/librispeech/asr0) | inference/python |-| -[Conformer Librispeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 191 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0338 | 960 h | [Conformer Librispeech ASR1](../../examples/librispeech/asr1) | python |-| -[Transformer Librispeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_transformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 131 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0381 | 960 h | [Transformer Librispeech ASR1](../../examples/librispeech/asr1) | python |-| -[Transformer Librispeech ASR2 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 131 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: JoinCTC w/ LM |-| 0.0240 | 960 h | [Transformer Librispeech ASR2](../../examples/librispeech/asr2) | python |-| -[Conformer TALCS ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/tal_cs/asr1/asr1_conformer_talcs_ckpt_1.4.0.model.tar.gz) | TALCS Dataset | subword-based | 470 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0844 | 587 h | [Conformer TALCS ASR1](../../examples/tal_cs/asr1) | python |-| +Acoustic Model | Training Data | Token-based | Size | Descriptions | CER | WER | Hours of speech | Example Link | Inference Type | +:-------------:| :------------:| :-----: | -----: | :-----: |:-----:| :-----: | :-----: | :-----: | :-----: | +[Ds2 Online Wenetspeech ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr0/asr0_deepspeech2_online_wenetspeech_ckpt_1.0.4.model.tar.gz) | Wenetspeech Dataset | Char-based | 1.2 GB | 2 Conv + 5 LSTM layers | 0.152 (test\_net, w/o LM)
0.2417 (test\_meeting, w/o LM)
0.053 (aishell, w/ LM) |-| 10000 h | - | onnx/inference/python | +[Ds2 Online Aishell ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_online_aishell_fbank161_ckpt_0.2.1.model.tar.gz) | Aishell Dataset | Char-based | 491 MB | 2 Conv + 5 LSTM layers | 0.0666 |-| 151 h | [D2 Online Aishell ASR0](../../examples/aishell/asr0) | onnx/inference/python | +[Ds2 Offline Aishell ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr0/asr0_deepspeech2_offline_aishell_ckpt_1.0.1.model.tar.gz)| Aishell Dataset | Char-based | 1.4 GB | 2 Conv + 5 bidirectional LSTM layers| 0.0554 |-| 151 h | [Ds2 Offline Aishell ASR0](../../examples/aishell/asr0) | inference/python | +[Conformer Online Wenetspeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz) | WenetSpeech Dataset | Char-based | 457 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring| 0.11 (test\_net) 0.1879 (test\_meeting) |-| 10000 h |- | python | +[Conformer U2PP Online Wenetspeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/wenetspeech/asr1/asr1_chunk_conformer_u2pp_wenetspeech_ckpt_1.3.0.model.tar.gz) | WenetSpeech Dataset | Char-based | 476 MB | Encoder:Conformer, Decoder:BiTransformer, Decoding method: Attention rescoring| 0.047198 (aishell test\_-1) 0.059212 (aishell test\_16) |-| 10000 h |- | python | +[Conformer Online Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_chunk_conformer_aishell_ckpt_0.2.0.model.tar.gz) | Aishell Dataset | Char-based | 189 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring| 0.0544 |-| 151 h | [Conformer Online Aishell ASR1](../../examples/aishell/asr1) | python | +[Conformer Offline Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_conformer_aishell_ckpt_1.0.1.model.tar.gz) | Aishell Dataset | Char-based | 189 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0460 |-| 151 h | [Conformer Offline Aishell ASR1](../../examples/aishell/asr1) | python | +[Transformer Aishell ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr1/asr1_transformer_aishell_ckpt_0.1.1.model.tar.gz) | Aishell Dataset | Char-based | 128 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention rescoring | 0.0523 || 151 h | [Transformer Aishell ASR1](../../examples/aishell/asr1) | python | +[Ds2 Offline Librispeech ASR0 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr0/asr0_deepspeech2_offline_librispeech_ckpt_1.0.1.model.tar.gz)| Librispeech Dataset | Char-based | 1.3 GB | 2 Conv + 5 bidirectional LSTM layers| - |0.0467| 960 h | [Ds2 Offline Librispeech ASR0](../../examples/librispeech/asr0) | inference/python | +[Conformer Librispeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_conformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 191 MB | Encoder:Conformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0338 | 960 h | [Conformer Librispeech ASR1](../../examples/librispeech/asr1) | python | +[Transformer Librispeech ASR1 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr1/asr1_transformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 131 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: Attention rescoring |-| 0.0381 | 960 h | [Transformer Librispeech ASR1](../../examples/librispeech/asr1) | python 
| +[Transformer Librispeech ASR2 Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr2/asr2_transformer_librispeech_ckpt_0.1.1.model.tar.gz) | Librispeech Dataset | subword-based | 131 MB | Encoder:Transformer, Decoder:Transformer, Decoding method: JoinCTC w/ LM |-| 0.0240 | 960 h | [Transformer Librispeech ASR2](../../examples/librispeech/asr2) | python | ### Self-Supervised Pre-trained Model Model | Pre-Train Method | Pre-Train Data | Finetune Data | Size | Descriptions | CER | WER | Example Link | @@ -25,12 +24,12 @@ Model | Pre-Train Method | Pre-Train Data | Finetune Data | Size | Descriptions [Wav2vec2-large-960h-lv60-self Model](https://paddlespeech.bj.bcebos.com/wav2vec/wav2vec2-large-960h-lv60-self.pdparams) | wav2vec2 | Librispeech and LV-60k Dataset (5.3w h) | - | 1.18 GB |Pre-trained Wav2vec2.0 Model | - | - | - | [Wav2vec2ASR-large-960h-librispeech Model](https://paddlespeech.bj.bcebos.com/s2t/librispeech/asr3/wav2vec2ASR-large-960h-librispeech_ckpt_1.3.1.model.tar.gz) | wav2vec2 | Librispeech and LV-60k Dataset (5.3w h) | Librispeech (960 h) | 718 MB |Encoder: Wav2vec2.0, Decoder: CTC, Decoding method: Greedy search | - | 0.0189 | [Wav2vecASR Librispeech ASR3](../../examples/librispeech/asr3) | [Wav2vec2-large-wenetspeech-self Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2-large-wenetspeech-self_ckpt_1.3.0.model.tar.gz) | wav2vec2 | Wenetspeech Dataset (1w h) | - | 714 MB |Pre-trained Wav2vec2.0 Model | - | - | - | -[Wav2vec2ASR-large-aishell1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2ASR-large-aishell1_ckpt_1.4.0.model.tar.gz) | wav2vec2 | Wenetspeech Dataset (1w h) | aishell1 (train set) | 1.18 GB |Encoder: Wav2vec2.0, Decoder: CTC, Decoding method: Greedy search | 0.0510 | - | - | +[Wav2vec2ASR-large-aishell1 Model](https://paddlespeech.bj.bcebos.com/s2t/aishell/asr3/wav2vec2ASR-large-aishell1_ckpt_1.3.0.model.tar.gz) | wav2vec2 | Wenetspeech Dataset (1w h) | aishell1 (train set) | 1.17 GB |Encoder: Wav2vec2.0, Decoder: CTC, Decoding method: Greedy search | 0.0453 | - | - | ### Whisper Model Demo Link | Training Data | Size | Descriptions | CER | Model :-----------: | :-----:| :-------: | :-----: | :-----: |:---------:| -[Whisper](../../demos/whisper) | 680kh from internet | large: 5.8G,
medium: 2.9G,
small: 923M,
base: 277M,
tiny: 145M | Encoder:Transformer,
Decoder:Transformer,
Decoding method:
Greedy search | 0.027
(large, Librispeech) | [whisper-large](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-large-model.tar.gz)
[whisper-medium](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-medium-model.tar.gz)
[whisper-medium-English-only](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-medium-en-model.tar.gz)
[whisper-small](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-small-model.tar.gz)
[whisper-small-English-only](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-small-en-model.tar.gz)
[whisper-base](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-base-model.tar.gz)
[whisper-base-English-only](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-base-en-model.tar.gz)
[whisper-tiny](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-tiny-model.tar.gz)
[whisper-tiny-English-only](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-tiny-en-model.tar.gz) +[Whisper](../../demos/whisper) | 680kh from internet | large: 5.8G,
medium: 2.9G,
small: 923M,
base: 277M,
tiny: 145M | Encoder:Transformer,
Decoder:Transformer,
Decoding method:
Greedy search | 2.7
(large, Librispeech) | [whisper-large](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-large-model.tar.gz)
[whisper-medium](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-medium-model.tar.gz)
[whisper-medium-English-only](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-medium-en-model.tar.gz)
[whisper-small](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-small-model.tar.gz)
[whisper-small-English-only](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-small-en-model.tar.gz)
[whisper-base](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-base-model.tar.gz)
[whisper-base-English-only](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-base-en-model.tar.gz)
[whisper-tiny](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-tiny-model.tar.gz)
[whisper-tiny-English-only](https://paddlespeech.bj.bcebos.com/whisper/whisper_model_20221122/whisper-tiny-en-model.tar.gz) ### Language Model based on NGram |Language Model | Training Data | Token-based | Size | Descriptions| @@ -61,10 +60,7 @@ FastSpeech2| AISHELL-3 |[fastspeech2-aishell3](https://github.com/PaddlePaddle/P FastSpeech2| LJSpeech |[fastspeech2-ljspeech](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/ljspeech/tts3)|[fastspeech2_nosil_ljspeech_ckpt_0.5.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_nosil_ljspeech_ckpt_0.5.zip)|[fastspeech2_ljspeech_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_ljspeech_static_1.1.0.zip)
[fastspeech2_ljspeech_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_ljspeech_onnx_1.1.0.zip)
[fastspeech2_ljspeech_pdlite_1.3.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_ljspeech_pdlite_1.3.0.zip)|145MB| FastSpeech2| VCTK |[fastspeech2-vctk](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/vctk/tts3)|[fastspeech2_vctk_ckpt_1.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_ckpt_1.2.0.zip)|[fastspeech2_vctk_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_static_1.1.0.zip)
[fastspeech2_vctk_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_onnx_1.1.0.zip)
[fastspeech2_vctk_pdlite_1.3.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_vctk_pdlite_1.3.0.zip)| 145MB| FastSpeech2| ZH_EN |[fastspeech2-zh_en](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/zh_en_tts/tts3)|[fastspeech2_mix_ckpt_1.2.0.zip](https://paddlespeech.bj.bcebos.com/t2s/chinse_english_mixed/models/fastspeech2_mix_ckpt_1.2.0.zip)|[fastspeech2_mix_static_0.2.0.zip](https://paddlespeech.bj.bcebos.com/t2s/chinse_english_mixed/models/fastspeech2_mix_static_0.2.0.zip)
[fastspeech2_mix_onnx_0.2.0.zip](https://paddlespeech.bj.bcebos.com/t2s/chinse_english_mixed/models/fastspeech2_mix_onnx_0.2.0.zip) | 145MB| -FastSpeech2| male-zh ||[fastspeech2_male_zh_ckpt_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_zh_ckpt_1.4.0.zip)|[fastspeech2_male_zh_static_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_zh_static_1.4.0.zip)
[fastspeech2_male_zh_onnx_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_zh_onnx_1.4.0.zip) |146MB| -FastSpeech2| male-en ||[fastspeech2_male_en_ckpt_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_en_ckpt_1.4.0.zip)|[fastspeech2_male_en_static_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_en_static_1.4.0.zip)
[fastspeech2_male_en_onnx_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_en_onnx_1.4.0.zip) |145MB| -FastSpeech2| male-mix ||[fastspeech2_male_mix_ckpt_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_mix_ckpt_1.4.0.zip)|[fastspeech2_male_mix_static_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_mix_static_1.4.0.zip)
[fastspeech2_male_mix_onnx_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_mix_onnx_1.4.0.zip) |146MB| -FastSpeech2| Cantonese |[fastspeech2-canton](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/canton/tts3)|[fastspeech2_canton_ckpt_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_canton_ckpt_1.4.0.zip)|[fastspeech2_canton_static_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_canton_static_1.4.0.zip)
[fastspeech2_canton_onnx_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_canton_onnx_1.4.0.zip)|146MB| +FastSpeech2| Male ||[fastspeech2_male_ckpt_1.3.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_male_ckpt_1.3.0.zip)| | | ### Vocoders Model Type | Dataset| Example Link | Pretrained Models| Static / ONNX / Paddle-Lite Models|Size (static) @@ -81,8 +77,7 @@ HiFiGAN | LJSpeech |[HiFiGAN-ljspeech](https://github.com/PaddlePaddle/PaddleSpe HiFiGAN | AISHELL-3 |[HiFiGAN-aishell3](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/aishell3/voc5)|[hifigan_aishell3_ckpt_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_aishell3_ckpt_0.2.0.zip)|[hifigan_aishell3_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_aishell3_static_1.1.0.zip)
[hifigan_aishell3_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_aishell3_onnx_1.1.0.zip)
[hifigan_aishell3_pdlite_1.3.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_aishell3_pdlite_1.3.0.zip)|46MB| HiFiGAN | VCTK |[HiFiGAN-vctk](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/vctk/voc5)|[hifigan_vctk_ckpt_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_vctk_ckpt_0.2.0.zip)|[hifigan_vctk_static_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_vctk_static_1.1.0.zip)
[hifigan_vctk_onnx_1.1.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_vctk_onnx_1.1.0.zip)
[hifigan_vctk_pdlite_1.3.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_vctk_pdlite_1.3.0.zip)|46MB| WaveRNN | CSMSC |[WaveRNN-csmsc](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/csmsc/voc6)|[wavernn_csmsc_ckpt_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/wavernn/wavernn_csmsc_ckpt_0.2.0.zip)|[wavernn_csmsc_static_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/wavernn/wavernn_csmsc_static_0.2.0.zip)|18MB| -Parallel WaveGAN| Male ||[pwg_male_ckpt_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwg_male_ckpt_1.4.0.zip)|[pwgan_male_static_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwgan_male_static_1.4.0.zip)
[pwgan_male_onnx_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwgan_male_onnx_1.4.0.zip)|4.8M| -HiFiGAN| Male ||[hifigan_male_ckpt_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_male_ckpt_1.4.0.zip)|[hifigan_male_static_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_male_static_1.4.0.zip)
[hifigan_male_onnx_1.4.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_male_onnx_1.4.0.zip)|46M| +Parallel WaveGAN| Male ||[pwg_male_ckpt_1.3.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/pwgan/pwg_male_ckpt_1.3.0.zip)||| ### Voice Cloning diff --git a/docs/source/tts/quick_start.md b/docs/source/tts/quick_start.md index d2a1b4ec9..d8dbc646c 100644 --- a/docs/source/tts/quick_start.md +++ b/docs/source/tts/quick_start.md @@ -79,8 +79,8 @@ checkpoint_name ├── snapshot_iter_*.pdz ├── speech_stats.npy ├── phone_id_map.txt -├── spk_id_map.txt (optional) -└── tone_id_map.txt (optional) +├── spk_id_map.txt (optional) +└── tone_id_map.txt (optional) ``` **Vocoders:** ```text diff --git a/docs/source/tts/quick_start_cn.md b/docs/source/tts/quick_start_cn.md index ba2596439..c56d9bb45 100644 --- a/docs/source/tts/quick_start_cn.md +++ b/docs/source/tts/quick_start_cn.md @@ -87,8 +87,8 @@ checkpoint_name ├── snapshot_iter_*.pdz ├── speech_stats.npy ├── phone_id_map.txt -├── spk_id_map.txt (optional) -└── tone_id_map.txt (optional) +├── spk_id_map.txt (optional) +└── tone_id_map.txt (optional) ``` **Vocoders:** ```text diff --git a/docs/source/tts/svs_music_score.md b/docs/source/tts/svs_music_score.md deleted file mode 100644 index 9f351c001..000000000 --- a/docs/source/tts/svs_music_score.md +++ /dev/null @@ -1,183 +0,0 @@ -本人非音乐专业人士,如文档中有误欢迎指正。 - -# 一、常见基础 -## 1.1 简谱和音名(note) -

- -

-上图从左往右的黑键音名分别是:C#/Db,D#/Eb,F#/Gb,G#/Ab,A#/Bb -钢琴88键如下图,分为大字一组,大字组,小字组,小字一组,小字二组,小字三组,小字四组。分别对应音名的后缀是 1 2 3 4 5 6 7,例如小字一组(C大调)包含的键分别为: C4,C#4/Db4,D4,D#4/Eb4,E4,F4,F#4/Gb4,G4,G#4/Ab4,A4,A#4/Bb4,B4 -钢琴八度音就是12345671八个音,最后一个音是高1。**遵循:全全半全全全半** 就会得到 1 2 3 4 5 6 7 (高)1 的音 -
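按照"全全半全全全半"的规则,可以从任意主音推出该大调的音阶。下面是一段最小的 Python 示意代码(假设统一采用升号记法,仅作说明,非本文原有内容):

```python
# 八度内按半音排列的 12 个音名(升号记法)
NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
# 全全半全全全半:全音 = 2 个半音,半音 = 1 个半音
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]

def major_scale(tonic: str):
    """从主音出发,返回 1~7 再到高音 1 的音名序列。"""
    idx = NOTES.index(tonic)
    scale = [tonic]
    for step in MAJOR_STEPS:
        idx = (idx + step) % 12
        scale.append(NOTES[idx])
    return scale

print(major_scale('C'))  # ['C', 'D', 'E', 'F', 'G', 'A', 'B', 'C']
print(major_scale('D'))  # ['D', 'E', 'F#', 'G', 'A', 'B', 'C#', 'D']
```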

- -

- -## 1.2 十二大调 -“#”表示升调 - -

- -

- -“b”表示降调 - -

- -

- -什么大调表示Do(简谱1) 这个音从哪个键开始,例如D大调,则用D这个键来表示 Do这个音。 -下图是十二大调下简谱与音名的对应表。 - -
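对应表的规律也可以程序化:给定大调的主音,简谱 1~7 就是该大调音阶的第 1~7 个音。下面是一段示意代码(假设采用升号记法,降E大调按等音记作 D#,仅作说明):

```python
NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
# 大调内 1~7 级相对主音的半音数,由"全全半全全全半"累加得到
MAJOR_OFFSETS = [0, 2, 4, 5, 7, 9, 11]

def jianpu_to_note(key: str, degree: int) -> str:
    """给定大调主音 key,把简谱音级 degree(1~7)映射为音名。"""
    idx = (NOTES.index(key) + MAJOR_OFFSETS[degree - 1]) % 12
    return NOTES[idx]

# D 大调:简谱 1 -> D,简谱 3 -> F#;降E大调(记作 D#)的简谱 1 -> D#
print(jianpu_to_note('D', 1), jianpu_to_note('D', 3), jianpu_to_note('D#', 1))
```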

- -

- -## 1.3 Tempo -Tempo 用于表示速度(Speed of the beat/pulse),一分钟里面有几拍(beats per minute,BPM) -

- -

- -whole note --> 4 beats
-half note --> 2 beats
-quarter note --> 1 beat
-eighth note --> 1/2 beat
-sixteenth note --> 1/4 beat
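每拍的时长可以直接由 BPM 算出:每拍时长 = 60 / BPM(秒),再按上面各音符对应的拍数折算。下面用 Python 按 95 BPM 演示(与后文 60/95 = 0.631578s 的计算一致,仅作示意):

```python
BPM = 95
beat_sec = 60 / BPM  # 每拍时长(秒),约 0.631578

# 四分音符为一拍时,各音符对应的拍数
note_beats = {
    'whole': 4,        # 全音符
    'half': 2,         # 二分音符
    'quarter': 1,      # 四分音符
    'eighth': 0.5,     # 八分音符
    'sixteenth': 0.25, # 十六分音符
}
for name, beats in note_beats.items():
    print(f'{name}: {beats * beat_sec:.6f} s')
# quarter ≈ 0.631579 s,eighth(半拍)≈ 0.315789 s,与后文表中的 note_dur 一致
```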
- - -# 二、应用试验 -## 2.1 从谱中获取 music scores -music scores 包含:note,note_dur,is_slur - -

- -

- -从左上角的谱信息 *bE* 可以得出该谱子是 **降E大调**,可以对应1.2小节十二大调简谱音名对照表根据 简谱获取对应的note -从左上角的谱信息 *quarter note* 可以得出该谱子的速度是 **一分钟95拍(beat)**,一拍的时长 = **60/95 = 0.631578s** -从左上角的谱信息 *4/4* 可以得出该谱子表示四分音符为一拍(分母的4),每小节有4拍(分子的4) - -从该简谱上可以获取 music score 如下: - -|text |phone |简谱(辅助)后面的点表示高八音 |note (从小字组开始算) |几拍(辅助) |note_dur |is_slur| -:-------------:| :------------:| :-----: | -----: | :-----: |:-----:| :-----: | -|小 |x |5 |A#3/Bb3 |半 |0.315789 |0 | -| |iao |5 |A#3/Bb3 |半 |0.315789 |0 | -|酒 |j |1. |D#4/Eb4 |半 |0.315789 |0 | -| |iu |1. |D#4/Eb4 |半 |0.315789 |0 | -|窝 |w |2. |F4 |半 |0.315789 |0 | -| |o |2. |F4 |半 |0.315789 |0 | -|长 |ch |3. |G4 |半 |0.315789 |0 | -| |ang |3. |G4 |半 |0.315789 |0 | -| |ang |1. |D#4/Eb4 |半 |0.315789 |1 | -|睫 |j |1. |D#4/Eb4 |半 |0.315789 |0 | -| |ie |1. |D#4/Eb4 |半 |0.315789 |0 | -| |ie |5 |A#3/Bb3 |半 |0.315789 |1 | -|毛 |m |5 |A#3/Bb3 |一 |0.631578 |0 | -| |ao |5 |A#3/Bb3 |一 |0.631578 |0 | -|是 |sh |5 |A#3/Bb3 |半 |0.315789 |0 | -| |i |5 |A#3/Bb3 |半 |0.315789 |0 | -|你 |n |3. |G4 |半 |0.315789 |0 | -| |i |3. |G4 |半 |0.315789 |0 | -|最 |z |2. |F4 |半 |0.315789 |0 | -| |ui |2. |F4 |半 |0.315789 |0 | -|美 |m |3. |G4 |半 |0.315789 |0 | -| |ei |3. |G4 |半 |0.315789 |0 | -|的 |d |2. |F4 |半 |0.315789 |0 | -| |e |2. |F4 |半 |0.315789 |0 | -|记 |j |7 |D4 |半 |0.315789 |0 | -| |i |7 |D4 |半 |0.315789 |0 | -|号 |h |5 |A#3/Bb3 |半 |0.315789 |0 | -| |ao |5 |A#3/Bb3 |半 |0.315789 |0 | - - -## 2.2 一些实验 - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| 序号 | 说明 | 合成音频(diffsinger_opencpop + pwgan_opencpop) |
| :---: | :------ | :---: |
| 1 | 原始 opencpop 标注的 notes,note_durs,is_slurs,升F大调,起始在小字组(第3组) | (音频略) |
| 2 | 原始 opencpop 标注的 notes 和 is_slurs,note_durs 改变(从谱子获取) | (音频略) |
| 3 | 原始 opencpop 标注的 notes 去掉 rest(毛字一拍),is_slurs 和 note_durs 改变(从谱子获取) | (音频略) |
| 4 | 从谱子获取 notes,note_durs,is_slurs,不含 rest(毛字一拍),起始在小字一组(第3组) | (音频略) |
| 5 | 从谱子获取 notes,note_durs,is_slurs,加上 rest(毛字半拍,rest 半拍),起始在小字一组(第3组) | (音频略) |
| 6 | 从谱子获取 notes,is_slurs,包含 rest,note_durs 从原始标注获取,起始在小字一组(第3组) | (音频略) |
| 7 | 从谱子获取 notes,note_durs,is_slurs,不含 rest(毛字一拍),起始在小字一组(第4组) | (音频略) |
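上述各组实验的输入本质上是三组等长序列。下面的示意代码按 2.1 节表格拼出《小酒窝》片段的前几个 phone 的 music score(其中 half/one 等变量名与字段组织方式仅为示意,实际接口请以 DiffSinger 示例脚本为准):

```python
# 半拍 = 60 / 95 / 2 ≈ 0.315789 秒,一拍 ≈ 0.631578 秒
half, one = 0.315789, 0.631578

score = [
    # (phone, note, note_dur, is_slur)
    ('x',   'A#3/Bb3', half, 0),
    ('iao', 'A#3/Bb3', half, 0),
    ('j',   'D#4/Eb4', half, 0),
    ('iu',  'D#4/Eb4', half, 0),
    ('w',   'F4',      half, 0),
    ('o',   'F4',      half, 0),
    ('ch',  'G4',      half, 0),
    ('ang', 'G4',      half, 0),
    ('ang', 'D#4/Eb4', half, 1),  # 同一字内换音高,标记 is_slur=1
]

phones   = [p for p, _, _, _ in score]
notes    = [n for _, n, _, _ in score]
note_dur = [d for _, _, d, _ in score]
is_slur  = [s for _, _, _, s in score]
print(phones, notes, note_dur, is_slur, sep='\n')
```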
- -上述实验表明通过该方法来提取 music score 是可行的,但是在应用中可以**灵活地在歌词中加"AP"(用来表示吸气声)和"SP"(用来表示停顿声)**,对应的在 **note 上加 rest**,会使得整体的歌声合成更自然。
-除此之外,还要考虑哪一个大调并且以哪一组为起始**得到的 note 在训练数据集中出现过**,如若推理时传入训练数据中没有见过的 note,合成出来的音频可能不是我们期待的音调。
-
-# 三、其他
-## 3.1 读取midi
-
-```python
-import mido
-mid = mido.MidiFile('2093.midi')
-# 逐条查看第一条音轨中的 MIDI 消息(note_on/note_off 等)
-for msg in mid.tracks[0]:
-    print(msg)
-```
diff --git a/docs/tutorial/st/st_tutorial.ipynb b/docs/tutorial/st/st_tutorial.ipynb index e755bebad..2fb850535 100644 --- a/docs/tutorial/st/st_tutorial.ipynb +++ b/docs/tutorial/st/st_tutorial.ipynb @@ -62,7 +62,7 @@ "collapsed": false }, "source": [ - "# 使用Transformer进行端到端语音翻译的基本流程\n", + "# 使用Transformer进行端到端语音翻译的基本流程\n", "## 基础模型\n", "由于 ASR 章节已经介绍了 Transformer 以及语音特征抽取,在此便不做过多介绍,感兴趣的同学可以去相关章节进行了解。\n", "\n", diff --git a/docs/tutorial/tts/tts_tutorial.ipynb b/docs/tutorial/tts/tts_tutorial.ipynb index 0cecb680d..583adb014 100644 --- a/docs/tutorial/tts/tts_tutorial.ipynb +++ b/docs/tutorial/tts/tts_tutorial.ipynb @@ -464,7 +464,7 @@ "
FastSpeech2 网络结构图

\n", "\n", "\n", - "PaddleSpeech TTS 实现的 FastSpeech2 与论文不同的地方在于,我们使用的是 phone 级别的 `pitch` 和 `energy`(与 FastPitch 类似),这样的合成结果可以更加**稳定**。\n", + "PaddleSpeech TTS 实现的 FastSpeech2 与论文不同的地方在于,我们使用的是 phone 级别的 `pitch` 和 `energy`(与 FastPitch 类似),这样的合成结果可以更加**稳定**。\n",
\n", "
FastPitch 网络结构图

\n", "\n", diff --git a/examples/aishell/asr0/local/train.sh b/examples/aishell/asr0/local/train.sh index c0da33257..2b71b7f76 100755 --- a/examples/aishell/asr0/local/train.sh +++ b/examples/aishell/asr0/local/train.sh @@ -1,6 +1,6 @@ #!/bin/bash -if [ $# -lt 2 ] || [ $# -gt 3 ];then +if [ $# -lt 2 ] || [ $# -gt 3 ];then echo "usage: CUDA_VISIBLE_DEVICES=0 ${0} config_path ckpt_name ips(optional)" exit -1 fi diff --git a/examples/aishell/asr1/conf/chunk_squeezeformer.yaml b/examples/aishell/asr1/conf/chunk_squeezeformer.yaml deleted file mode 100644 index 35a90b7d6..000000000 --- a/examples/aishell/asr1/conf/chunk_squeezeformer.yaml +++ /dev/null @@ -1,98 +0,0 @@ -############################################ -# Network
Architecture # -############################################ -cmvn_file: -cmvn_file_type: "json" -# encoder related -encoder: squeezeformer -encoder_conf: - encoder_dim: 256 # dimension of attention - output_size: 256 # dimension of output - attention_heads: 4 - num_blocks: 12 # the number of encoder blocks - reduce_idx: 5 - recover_idx: 11 - feed_forward_expansion_factor: 8 - input_dropout_rate: 0.1 - feed_forward_dropout_rate: 0.1 - attention_dropout_rate: 0.1 - adaptive_scale: true - cnn_module_kernel: 31 - normalize_before: false - activation_type: 'swish' - pos_enc_layer_type: 'rel_pos' - time_reduction_layer_type: 'conv1d' - -# decoder related -decoder: transformer -decoder_conf: - attention_heads: 4 - linear_units: 2048 - num_blocks: 6 - dropout_rate: 0.1 - positional_dropout_rate: 0.1 - self_attention_dropout_rate: 0.0 - src_attention_dropout_rate: 0.0 - -# hybrid CTC/attention -model_conf: - ctc_weight: 0.3 - lsm_weight: 0.1 # label smoothing option - length_normalized_loss: false - init_type: 'kaiming_uniform' # !Warning: need to convergence - -########################################### -# Data # -########################################### -train_manifest: data/manifest.train -dev_manifest: data/manifest.dev -test_manifest: data/manifest.test - -########################################### -# Dataloader # -########################################### -vocab_filepath: data/lang_char/vocab.txt -spm_model_prefix: '' -unit_type: 'char' -preprocess_config: conf/preprocess.yaml -feat_dim: 80 -stride_ms: 10.0 -window_ms: 25.0 -sortagrad: 0 # Feed samples from shortest to longest ; -1: enabled for all epochs, 0: disabled, other: enabled for 'other' epochs -batch_size: 32 -maxlen_in: 512 # if input length > maxlen-in, batchsize is automatically reduced -maxlen_out: 150 # if output length > maxlen-out, batchsize is automatically reduced -minibatches: 0 # for debug -batch_count: auto -batch_bins: 0 -batch_frames_in: 0 -batch_frames_out: 0 -batch_frames_inout: 0 -num_workers: 2 -subsampling_factor: 1 -num_encs: 1 - -########################################### -# Training # -########################################### -n_epoch: 150 -accum_grad: 8 -global_grad_clip: 5.0 -dist_sampler: False -optim: adam -optim_conf: - lr: 0.002 - weight_decay: 1.0e-6 -scheduler: warmuplr -scheduler_conf: - warmup_steps: 25000 - lr_decay: 1.0 -log_interval: 100 -checkpoint: - kbest_n: 50 - latest_n: 5 diff --git a/examples/aishell/asr1/local/test.sh b/examples/aishell/asr1/local/test.sh index 8487e9904..26926b4a9 100755 --- a/examples/aishell/asr1/local/test.sh +++ b/examples/aishell/asr1/local/test.sh @@ -1,21 +1,15 @@ #!/bin/bash -set -e +if [ $# != 3 ];then + echo "usage: ${0} config_path decode_config_path ckpt_path_prefix" + exit -1 +fi stage=0 stop_stage=100 - -source utils/parse_options.sh || exit 1; - ngpu=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}') echo "using $ngpu gpus..." - -if [ $# != 3 ];then - echo "usage: ${0} config_path decode_config_path ckpt_path_prefix" - exit -1 -fi - config_path=$1 decode_config_path=$2 ckpt_prefix=$3 @@ -98,7 +92,6 @@ if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then fi if [ ${stage} -le 101 ] && [ ${stop_stage} -ge 101 ]; then - echo "using sclite to compute cer..." 
# format the reference test file for sclite python utils/format_rsl.py \ --origin_ref data/manifest.test.raw \ diff --git a/examples/aishell/asr1/local/train.sh b/examples/aishell/asr1/local/train.sh index 3d4f052a3..bfa8dd97d 100755 --- a/examples/aishell/asr1/local/train.sh +++ b/examples/aishell/asr1/local/train.sh @@ -17,7 +17,7 @@ if [ ${seed} != 0 ]; then echo "using seed $seed & FLAGS_cudnn_deterministic=True ..." fi -if [ $# -lt 2 ] || [ $# -gt 3 ];then +if [ $# -lt 2 ] || [ $# -gt 3 ];then echo "usage: CUDA_VISIBLE_DEVICES=0 ${0} config_path ckpt_name ips(optional)" exit -1 fi diff --git a/examples/aishell/asr3/README.md b/examples/aishell/asr3/README.md deleted file mode 100644 index 6b587e12f..000000000 --- a/examples/aishell/asr3/README.md +++ /dev/null @@ -1,198 +0,0 @@ -# Wav2vec2ASR with Aishell -This example contains code used to finetune the [wav2vec2.0](https://arxiv.org/pdf/2006.11477.pdf) model with the [Aishell dataset](http://www.openslr.org/resources/33) -## Overview -All the scripts you need are in `run.sh`. There are several stages in `run.sh`, and each stage has its function. -| Stage | Function | -|:---- |:----------------------------------------------------------- | -| 0 | Process data. It includes:
(1) Download the dataset
(2) Calculate the CMVN of the train dataset
(3) Get the vocabulary file
(4) Get the manifest files of the train, development and test dataset
(5) Download the pretrained wav2vec2 model | -| 1 | Train the model | -| 2 | Get the final model by averaging the top-k models; setting k = 1 means choosing the best model | -| 3 | Test the final model performance | -| 4 | Infer the single audio file | - - -You can choose to run a range of stages by setting `stage` and `stop_stage`. - -For example, if you want to execute the code in stage 2 and stage 3, you can run this script: -```bash -bash run.sh --stage 2 --stop_stage 3 -``` -Or you can set `stage` equal to `stop_stage` to only run one stage. -For example, if you only want to run `stage 0`, you can use the script below: -```bash -bash run.sh --stage 0 --stop_stage 0 -``` -The document below will describe the scripts in `run.sh` in detail. -## The Environment Variables -`path.sh` contains the environment variables. -```bash -. ./path.sh -. ./cmd.sh -``` -This script needs to be run first. And another script is also needed: -```bash -source ${MAIN_ROOT}/utils/parse_options.sh -``` -It supports the way of using `--variable value` in the shell scripts. -## The Local Variables -Some local variables are set in `run.sh`. -`gpus` denotes the GPU number you want to use. If you set `gpus=`, it means you only use CPU. -`stage` denotes the number of the stage you want to start from in the experiments. -`stop_stage` denotes the number of the stage you want to end at in the experiments. -`conf_path` denotes the config path of the model. -`avg_num` denotes the number K of top-K models you want to average to get the final model. -`audio_file` denotes the file path of the single file you want to infer in stage 4. -`ckpt` denotes the checkpoint prefix of the model, e.g. "wav2vec2ASR". - -You can set the local variables (except `ckpt`) when you use `run.sh`. - -For example, you can set `gpus` and `avg_num` when you use the command line: -```bash -bash run.sh --gpus 0,1 --avg_num 20 -``` -## Stage 0: Data Processing -To use this example, you need to process the data first, and you can use stage 0 in `run.sh` to do this. The code is shown below: -```bash - if [ ${stage} -le 0 ] && [ ${stop_stage} -ge 0 ]; then - # prepare data - bash ./local/data.sh || exit -1 - fi -``` -Stage 0 is for processing the data. - -If you only want to process the data, you can run: -```bash -bash run.sh --stage 0 --stop_stage 0 -``` -You can also just run these scripts in your command line: -```bash -. ./path.sh -. ./cmd.sh -bash ./local/data.sh -``` -After processing the data, the `data` directory will look like this: -```bash -data/ -|-- dev.meta -|-- lang_char -| `-- vocab.txt -|-- manifest.dev -|-- manifest.dev.raw -|-- manifest.test -|-- manifest.test.raw -|-- manifest.train -|-- manifest.train.raw -|-- mean_std.json -|-- test.meta -|-- train.meta -|-- train.csv -|-- dev.csv -|-- test.csv -``` - -Stage 0 also downloads the Chinese pre-trained [wav2vec2](https://paddlespeech.bj.bcebos.com/wav2vec/chinese-wav2vec2-large.pdparams) model. -```bash -mkdir -p exp/wav2vec2 -wget -P exp/wav2vec2 https://paddlespeech.bj.bcebos.com/wav2vec/chinese-wav2vec2-large.pdparams -``` -## Stage 1: Model Training -If you want to train the model, you can use stage 1 in `run.sh`. The code is shown below.
-You can get the result of the audio demo by running the script below:
-```bash
-CUDA_VISIBLE_DEVICES= ./local/test_wav.sh conf/wav2vec2ASR.yaml conf/tuning/decode.yaml exp/wav2vec2ASR/checkpoints/avg_1 data/demo_01_03.wav
-```
diff --git a/examples/aishell/asr3/RESULT.md b/examples/aishell/asr3/RESULT.md
deleted file mode 100644
index 42edeac11..000000000
--- a/examples/aishell/asr3/RESULT.md
+++ /dev/null
@@ -1,18 +0,0 @@
-# AISHELL
-
-## Version
-
-* paddle version: develop (commit id: daea892c67e85da91906864de40ce9f6f1b893ae)
-* paddlespeech version: develop (commit id: c14b4238b256693281e59605abff7c9435b3e2b2)
-* paddlenlp version: 2.5.2
-
-## Device
-* python: 3.7
-* cuda: 10.2
-* cudnn: 7.6
-
-## Result
-train: Epoch 80, 2 x V100-32G, batch size: 5
-| Model | Params | Config | Augmentation | Test set | Decode method | WER |
-| --- | --- | --- | --- | --- | --- | --- |
-| wav2vec2ASR | 324.49 M | conf/wav2vec2ASR.yaml | spec_aug | test-set | greedy search | 5.1009 |
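-For reference, the WER above is the standard error rate of the recognized token sequence against the reference, with `S` substitutions, `D` deletions, and `I` insertions over `N` reference tokens (typically characters for Chinese):
-```math
-WER = \frac{S + D + I}{N}
-```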
diff --git a/examples/aishell/asr3/cmd.sh b/examples/aishell/asr3/cmd.sh
deleted file mode 100755
index 7b70ef5e0..000000000
--- a/examples/aishell/asr3/cmd.sh
+++ /dev/null
@@ -1,89 +0,0 @@
-# ====== About run.pl, queue.pl, slurm.pl, and ssh.pl ======
-# Usage: <cmd>.pl [options] JOB=1:<njobs> <log> <command...>
-# e.g.
-#  run.pl --mem 4G JOB=1:10 echo.JOB.log echo JOB
-#
-# Options:
-#    --time