From 394e635958bcd04c2ab487f8bdc445390cbdc3af Mon Sep 17 00:00:00 2001
From: longrookie
Date: Thu, 23 Mar 2023 19:45:43 +0800
Subject: [PATCH] modify the format of iSTFTNet.md

---
 examples/csmsc/voc5/iSTFTNet.md | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/examples/csmsc/voc5/iSTFTNet.md b/examples/csmsc/voc5/iSTFTNet.md
index 9fb429712..4fb0e4622 100644
--- a/examples/csmsc/voc5/iSTFTNet.md
+++ b/examples/csmsc/voc5/iSTFTNet.md
@@ -1,6 +1,6 @@
 # iSTFTNet with CSMSC
 
-This example contains code used to train a [iSTFTNet]([[2203.02395] iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform (arxiv.org)](https://arxiv.org/abs/2203.02395)) model with [Chinese Standard Mandarin Speech Copus](https://www.data-baker.com/open_source.html),
+This example contains code used to train a [iSTFTNet](https://arxiv.org/abs/2203.02395) model with [Chinese Standard Mandarin Speech Copus](https://www.data-baker.com/open_source.html).
 
 ## Dataset
 ### Download and Extract
@@ -121,21 +121,15 @@ The pretrained model can be downloaded here:
 
 iSTFTNet checkpoint contains files listed below.
 
-
 ```text
-
 iSTFTNet_csmsc_ckpt
-
 ├── iSTFT.yaml                  # config used to train iSTFTNet
-
 ├── feats_stats.npy               # statistics used to normalize spectrogram when training hifigan
-
 └── snapshot_iter_50000.pdz     # generator parameters of hifigan
-
 ```
 
 A Comparison between iSTFTNet and Hifigan
-| Model | Step | eval/generator_loss | eval/mel_loss | eval/feature_matching_loss | rtf |
+| Model | Step | eval/generator_loss | eval/mel_loss | eval/feature_matching_loss | rtf |
 |:--------:|:--------------:|:-------------------:|:-------------:|:--------------------------:| :---: |
 | hifigan | 1(gpu) x 50000 | 13.989 | 0.14683 | 1.3484 | 0.01517 |
 | istftNet | 1(gpu) x 50000 | 12.815 | 0.14293 | 0.98067 | 0.00747 |