{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"\n",
"\n",
"\n",
"# 语音识别——DeepSpeech2\n",
" \n",
"# 0. 视频理解与字幕"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# 下载demo视频\n",
"!test -f work/source/subtitle_demo1.mp4 || wget https://paddlespeech.cdn.bcebos.com/demos/asr_demos/subtitle_demo1.mp4 -P work/source/"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import IPython.display as dp\n",
"from IPython.display import HTML\n",
"html_str = '''\n",
"\n",
"'''.format(\"work/source/subtitle_demo1.mp4 \")\n",
"dp.display(HTML(html_str))\n",
"print (\"ASR结果为:当我说我可以把三十年的经验变成一个准确的算法他们说不可能当我说我们十个人就能实现对十九个城市变电站七乘二十四小时的实时监管他们说不可能\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
 Demo实现">
"> Demo implementation: [https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/automatic_video_subtitiles/](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/automatic_video_subtitiles/)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"\n",
"# 1. 前言"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"## 1.1 背景知识\n",
"语音识别(Automatic Speech Recognition, ASR) 是一项从一段音频中提取出语言文字内容的任务。\n",
"\n",
"