{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# GAN Vocoders 总览\n",
"\n",
"Loss 函数简称与全称的对应关系\n",
"\n",
"|Short Name|Full Name|\n",
":-----:|:-----|\n",
"|adv|adversial loss|\n",
"|FM|Feature Matching|\n",
"|MSD|Multi-Scale Discriminator|\n",
"|mr-STFT|Multi-resolution STFT loss|\n",
"|fmr-STFT|full band Multi-resolution STFT loss|\n",
"|smr-STFT|sub band Multi-resolution STFT loss|\n",
"|Mel|Mel-Spectrogram Loss|\n",
"|MPD|Multi-Period Discriminator|\n",
"|FB-RAWs|Filter Bank Random Window Discriminators|\n",
"\n",
"
\n",
"csmsc 数据集上 GAN Vocoder 整体对比如下, \n ",
"\n",
"测试机器:1 x Tesla V100-32G 40 core Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz\n ",
"\n",
"测试环境:Python 3.7.0, paddlepaddle 2.2.0\n",
"\n",
"Model|Date|Input|Generator
Loss|Discriminator
Loss|Need
Finetune|Training
Steps|Finetune
Steps|Batch
Size|ips
(gen only)
(gen + dis)|Static Model
Size (gen)|RTF
(GPU)|\n",
":-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|\n",
"Mel GAN|9 Dec 2019|mel|adv
FM |MSD|——|——|——|——|——|——|——|\n",
"Parallel Wave GAN |6 Feb 2020|mel
noise|adv
mr-STFT|adv|No|40W|——|8|18
10|5.1MB|0.01786|\n",
"HiFi GAN|23 Oct 2020|mel|adv
FM
Mel|MSD
MPD|Yes|250W|no need|16|——
31|50MB|0.00825|\n",
"Multi-Band Mel GAN|17 Nov 2020|mel|adv
fmr-STFT
smr-STFT|MSD|Yes|100W|100W
(not good enough,
need to adjust parameters)|64|305
148|8.2MB|0.00457|\n",
"Style Mel GAN|12 Feb 2021|mel
noise|adv
mr-STFT|FB-RAWs|No|150W|——|32|58
24|——|0.01343|\n",
"\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# 网络结构\n",
"## Mel GAN\n",
"