{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# GAN Vocoders 总览\n",
"\n",
"Loss 函数简称与全称的对应关系\n",
"\n",
"|Short Name|Full Name|\n",
":-----:|:-----|\n",
"|adv|adversial loss|\n",
"|FM|Feature Matching|\n",
"|MSD|Multi-Scale Discriminator|\n",
"|mr-STFT|Multi-resolution STFT loss|\n",
"|fmr-STFT|full band Multi-resolution STFT loss|\n",
"|smr-STFT|sub band Multi-resolution STFT loss|\n",
"|Mel|Mel-Spectrogram Loss|\n",
"|MPD|Multi-Period Discriminator|\n",
"|FB-RAWs|Filter Bank Random Window Discriminators|\n",
"\n",
"<br></br>\n",
"csmsc 数据集上 GAN Vocoder 整体对比如下, \n ",
"\n",
"测试机器:1 x Tesla V100-32G 40 core Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz\n ",
"Mel GAN|9 Dec 2019|mel|adv<br>FM |MSD|——|——|——|——|——|——|——|\n",
"Parallel Wave GAN |6 Feb 2020|mel<br>noise|adv<br>mr-STFT|adv|No|40W|——|8|18<br>10|5.1MB|0.01786|\n",
"HiFi GAN|23 Oct 2020|mel|adv<br>FM<br>Mel|MSD<br>MPD|Yes|250W|no need|16|——<br>31|50MB|0.00825|\n",
"Multi-Band Mel GAN|17 Nov 2020|mel|adv<br>fmr-STFT<br>smr-STFT|MSD|Yes|100W|100W<br><font size=1>(not good enough,<br>need to adjust parameters)</font>|64|305<br>148|8.2MB|0.00457|\n",
"Style Mel GAN|12 Feb 2021|mel<br>noise|adv<br>mr-STFT|FB-RAWs|No|150W|——|32|58<br>24|——|0.01343|\n",