PaddleSpeech/demos
liangym 9ae280a7f3
Merge pull request #2149 from lym0302/updata_readme
2 years ago
audio_content_search remove old vector model info, test=doc 3 years ago
audio_searching Bump numpy from 1.21.0 to 1.22.0 in /demos/audio_searching 3 years ago
audio_tagging Update usage and doc of cli executor. 3 years ago
automatic_video_subtitiles Update usage and doc of cli executor. 3 years ago
custom_streaming_asr more cli for speech demos 2 years ago
keyword_spotting more cli for speech demos 2 years ago
metaverse fix demos, test=tts 3 years ago
punctuation_restoration Update usage and doc of cli executor. 3 years ago
speaker_verification more cli for speech demos 2 years ago
speech_recognition more cli for speech demos 2 years ago
speech_server Merge pull request #2138 from zh794390558/demos 2 years ago
speech_translation Update usage and doc of cli executor. 3 years ago
speech_web Bump moment from 2.29.3 to 2.29.4 in /demos/speech_web/web_client 2 years ago
story_talker fix demos, test=tts 3 years ago
streaming_asr_server Merge pull request #2138 from zh794390558/demos 2 years ago
streaming_tts_server Merge pull request #2138 from zh794390558/demos 2 years ago
style_fs2 update readme, test=doc_fix (#1156) 3 years ago
text_to_speech more cli for speech demos 2 years ago
README.md update demos readme, test=doc 2 years ago
README_cn.md update demos readme, test=doc 2 years ago

README.md

Speech Applications Based on PaddleSpeech

(简体中文|English)

This directory contains demo speech applications for a range of scenarios.

  • audio searching - large-scale audio similarity retrieval
  • audio tagging - multi-label tagging of an audio file
  • automatic_video_subtitles - generate subtitles from a video
  • metaverse - 2D AR with TTS
  • punctuation_restoration - restore punctuation from raw text
  • speech recognition - recognize the text in an audio file
  • speech server - server for speech tasks, e.g. ASR, TTS, CLS
  • streaming asr server - receive an audio stream over WebSocket and transcribe it
  • streaming tts server - receive text over HTTP or WebSocket and stream back the synthesized audio
  • speech translation - end-to-end speech translation
  • story talker - book reader based on OCR and TTS
  • style_fs2 - multi-style control for the FastSpeech2 model
  • text_to_speech - convert text into speech
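
Each demo listed above lives in its own subdirectory. To check which demos are present in a local checkout, a small script along these lines may help. This is a minimal sketch rather than part of PaddleSpeech: the helper name `list_demos` is hypothetical, and it assumes a demo is any immediate subdirectory that contains a `README.md`.

```python
from pathlib import Path


def list_demos(demos_root):
    """Return the sorted names of demo subdirectories under demos_root.

    Assumption (not guaranteed by the repository): a demo is any
    immediate subdirectory that contains a README.md file.
    """
    root = Path(demos_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        # Skip plain files and directories that have no README.md.
        if entry.is_dir() and (entry / "README.md").is_file()
    )
```

Calling `list_demos("demos")` from the repository root (the path is an assumption about where the checkout lives) returns a plain list of directory names, which can be compared against the list above.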