
Speech Application based on PaddleSpeech

(简体中文|English)

This directory contains speech application demos covering multiple scenarios.

audio searching - large-scale audio similarity retrieval
audio tagging - multi-label tagging of an audio file
automatic_video_subtitles - generate subtitles from a video
metaverse - 2D AR with TTS
punctuation_restoration - restore punctuation from raw text
speech recognition - transcribe an audio file to text
speech server - server for speech tasks, e.g. ASR, TTS, CLS
streaming asr server - receive an audio stream over WebSocket and transcribe it
streaming tts server - receive text over HTTP or WebSocket and stream back the synthesized audio
speech translation - end-to-end speech translation
story talker - book reader based on OCR and TTS
style_fs2 - multi-style control for the FastSpeech2 model
text_to_speech - convert text into speech
self-supervised pretraining - speech feature extraction and speech recognition based on wav2vec2
Whisper - speech recognition and translation based on the Whisper model