Merge branch 'PaddlePaddle:develop' into develop

pull/4057/head
Echo-Nie 5 months ago committed by GitHub
commit 8f29aefef0

@ -173,7 +173,7 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision
- 🏆 **Streaming ASR and TTS System**: we provide production ready streaming asr and streaming tts system. - 🏆 **Streaming ASR and TTS System**: we provide production ready streaming asr and streaming tts system.
- 💯 **Rule-based Chinese frontend**: our frontend contains Text Normalization and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt Chinese context. - 💯 **Rule-based Chinese frontend**: our frontend contains Text Normalization and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt Chinese context.
- 📦 **Varieties of Functions that Vitalize both Industrial and Academia**: - 📦 **Varieties of Functions that Vitalize both Industrial and Academia**:
- 🛎️ *Implementation of critical audio tasks*: this toolkit contains audio functions like Automatic Speech Recognition, Text-to-Speech Synthesis, Speaker Verfication, KeyWord Spotting, Audio Classification, and Speech Translation, etc. - 🛎️ *Implementation of critical audio tasks*: this toolkit contains audio functions like Automatic Speech Recognition, Text-to-Speech Synthesis, Speaker Verification, KeyWord Spotting, Audio Classification, and Speech Translation, etc.
- 🔬 *Integration of mainstream models and datasets*: the toolkit implements modules that participate in the whole pipeline of the speech tasks, and uses mainstream datasets like LibriSpeech, LJSpeech, AIShell, CSMSC, etc. See also [model list](#model-list) for more details. - 🔬 *Integration of mainstream models and datasets*: the toolkit implements modules that participate in the whole pipeline of the speech tasks, and uses mainstream datasets like LibriSpeech, LJSpeech, AIShell, CSMSC, etc. See also [model list](#model-list) for more details.
- 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV). - 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural language processing (NLP) and Computer Vision (CV).
@ -1025,7 +1025,7 @@ You are warmly welcome to submit questions in [discussions](https://github.com/P
- Many thanks to [vpegasus](https://github.com/vpegasus)/[xuesebot](https://github.com/vpegasus/xuesebot) for developing a rasa chatbot,which is able to speak and listen thanks to PaddleSpeech. - Many thanks to [vpegasus](https://github.com/vpegasus)/[xuesebot](https://github.com/vpegasus/xuesebot) for developing a rasa chatbot,which is able to speak and listen thanks to PaddleSpeech.
- Many thanks to [chenkui164](https://github.com/chenkui164)/[FastASR](https://github.com/chenkui164/FastASR) for the C++ inference implementation of PaddleSpeech ASR. - Many thanks to [chenkui164](https://github.com/chenkui164)/[FastASR](https://github.com/chenkui164/FastASR) for the C++ inference implementation of PaddleSpeech ASR.
- Many thanks to [heyudage](https://github.com/heyudage)/[VoiceTyping](https://github.com/heyudage/VoiceTyping) for the real-time voice typing tool implementation of PaddleSpeech ASR streaming services. - Many thanks to [heyudage](https://github.com/heyudage)/[VoiceTyping](https://github.com/heyudage/VoiceTyping) for the real-time voice typing tool implementation of PaddleSpeech ASR streaming services.
- Many thanks to [EscaticZheng](https://github.com/EscaticZheng)/[ps3.9wheel-install](https://github.com/EscaticZheng/ps3.9wheel-install) for the python3.9 prebuilt wheel for PaddleSpeech installation in Windows without Viusal Studio. - Many thanks to [EscaticZheng](https://github.com/EscaticZheng)/[ps3.9wheel-install](https://github.com/EscaticZheng/ps3.9wheel-install) for the python3.9 prebuilt wheel for PaddleSpeech installation in Windows without Visual Studio.
Besides, PaddleSpeech depends on a lot of open source repositories. See [references](./docs/source/reference.md) for more information. Besides, PaddleSpeech depends on a lot of open source repositories. See [references](./docs/source/reference.md) for more information.
- Many thanks to [chinobing](https://github.com/chinobing)/[FastAPI-PaddleSpeech-Audio-To-Text](https://github.com/chinobing/FastAPI-PaddleSpeech-Audio-To-Text) for converting audio to text based on FastAPI and PaddleSpeech. - Many thanks to [chinobing](https://github.com/chinobing)/[FastAPI-PaddleSpeech-Audio-To-Text](https://github.com/chinobing/FastAPI-PaddleSpeech-Audio-To-Text) for converting audio to text based on FastAPI and PaddleSpeech.
- Many thanks to [MistEO](https://github.com/MistEO)/[Pallas-Bot](https://github.com/MistEO/Pallas-Bot) for QQ bot based on PaddleSpeech TTS. - Many thanks to [MistEO](https://github.com/MistEO)/[Pallas-Bot](https://github.com/MistEO/Pallas-Bot) for QQ bot based on PaddleSpeech TTS.

@ -233,7 +233,7 @@ def spectrogram(waveform: Tensor,
round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input
to FFT. Defaults to True. to FFT. Defaults to True.
sr (int, optional): Sample rate of input waveform. Defaults to 16000. sr (int, optional): Sample rate of input waveform. Defaults to 16000.
snip_edges (bool, optional): Drop samples in the end of waveform that cann't fit a signal frame when it snip_edges (bool, optional): Drop samples in the end of waveform that can't fit a signal frame when it
is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True. is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True.
subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False. subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False.
window_type (str, optional): Choose type of window for FFT computation. Defaults to "povey". window_type (str, optional): Choose type of window for FFT computation. Defaults to "povey".
@ -443,7 +443,7 @@ def fbank(waveform: Tensor,
round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input
to FFT. Defaults to True. to FFT. Defaults to True.
sr (int, optional): Sample rate of input waveform. Defaults to 16000. sr (int, optional): Sample rate of input waveform. Defaults to 16000.
snip_edges (bool, optional): Drop samples in the end of waveform that cann't fit a signal frame when it snip_edges (bool, optional): Drop samples in the end of waveform that can't fit a signal frame when it
is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True. is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True.
subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False. subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False.
use_energy (bool, optional): Add an dimension with energy of spectrogram to the output. Defaults to False. use_energy (bool, optional): Add an dimension with energy of spectrogram to the output. Defaults to False.
@ -566,7 +566,7 @@ def mfcc(waveform: Tensor,
round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input
to FFT. Defaults to True. to FFT. Defaults to True.
sr (int, optional): Sample rate of input waveform. Defaults to 16000. sr (int, optional): Sample rate of input waveform. Defaults to 16000.
snip_edges (bool, optional): Drop samples in the end of waveform that cann't fit a signal frame when it snip_edges (bool, optional): Drop samples in the end of waveform that can't fit a signal frame when it
is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True. is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True.
subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False. subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False.
use_energy (bool, optional): Add an dimension with energy of spectrogram to the output. Defaults to False. use_energy (bool, optional): Add an dimension with energy of spectrogram to the output. Defaults to False.
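
The `snip_edges` description repeated in the three docstrings above determines how many frames a waveform yields. A minimal sketch of that frame count, assuming the Kaldi convention that these functions appear to follow (the function bodies are not part of this diff). `num_frames`, `window_size`, and `window_shift` are illustrative names, all measured in samples:

```python
def num_frames(num_samples: int,
               window_size: int,
               window_shift: int,
               snip_edges: bool = True) -> int:
    """Frame count under the Kaldi convention (assumed, not read from this diff)."""
    if snip_edges:
        # Only frames that fit entirely inside the waveform are kept;
        # trailing samples that cannot fill a frame are dropped.
        if num_samples < window_size:
            return 0
        return 1 + (num_samples - window_size) // window_shift
    # With snip_edges=False the waveform is padded by reflection, so the
    # count depends only on the shift (rounded to the nearest frame).
    return (num_samples + window_shift // 2) // window_shift


if __name__ == "__main__":
    # One second of 16 kHz audio, 25 ms window (400 samples), 10 ms shift (160 samples).
    print(num_frames(16000, 400, 160, snip_edges=True))
    print(num_frames(16000, 400, 160, snip_edges=False))
```

With these numbers the snipped variant yields two fewer frames than the padded one, which is the practical effect of the flag.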

@ -527,7 +527,7 @@ def melspectrogram(x: np.ndarray,
if fmax is None: if fmax is None:
fmax = sr // 2 fmax = sr // 2
if fmin < 0 or fmin >= fmax: if fmin < 0 or fmin >= fmax:
raise ParameterError('fmin and fmax must statisfy 0<fmin<fmax') raise ParameterError('fmin and fmax must satisfy 0<fmin<fmax')
s = stft( s = stft(
x, x,
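
The range check in the hunk above can be exercised on its own. A minimal sketch, assuming a stand-in `ParameterError` class and a hypothetical helper `resolve_band` that wraps the same three lines. Note that `fmin = 0` passes the condition even though the message reads `0<fmin<fmax`:

```python
class ParameterError(Exception):
    """Stand-in for the library's ParameterError (defined here only for the sketch)."""


def resolve_band(sr, fmin=0.0, fmax=None):
    """Hypothetical helper reproducing the check shown in the diff."""
    if fmax is None:
        fmax = sr // 2  # default to the Nyquist frequency
    if fmin < 0 or fmin >= fmax:
        raise ParameterError('fmin and fmax must satisfy 0<fmin<fmax')
    return fmin, fmax


if __name__ == "__main__":
    print(resolve_band(16000))            # fmin = 0 is accepted
    try:
        resolve_band(16000, fmin=8000)    # fmin == fmax is rejected
    except ParameterError as err:
        print("rejected:", err)
```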

@ -43,7 +43,7 @@ class AudioClassificationDataset(paddle.io.Dataset):
sample_rate: int=None, sample_rate: int=None,
**kwargs): **kwargs):
""" """
Ags: Args:
files (:obj:`List[str]`): A list of absolute path of audio files. files (:obj:`List[str]`): A list of absolute path of audio files.
labels (:obj:`List[int]`): Labels of audio files. labels (:obj:`List[int]`): Labels of audio files.
feat_type (:obj:`str`, `optional`, defaults to `raw`): feat_type (:obj:`str`, `optional`, defaults to `raw`):

@ -35,7 +35,7 @@ class ESC50(AudioClassificationDataset):
http://dx.doi.org/10.1145/2733373.2806390 http://dx.doi.org/10.1145/2733373.2806390
""" """
archieves = [ archives = [
{ {
'url': 'url':
'https://paddleaudio.bj.bcebos.com/datasets/ESC-50-master.zip', 'https://paddleaudio.bj.bcebos.com/datasets/ESC-50-master.zip',
@ -111,7 +111,7 @@ class ESC50(AudioClassificationDataset):
feat_type: str='raw', feat_type: str='raw',
**kwargs): **kwargs):
""" """
Ags: Args:
mode (:obj:`str`, `optional`, defaults to `train`): mode (:obj:`str`, `optional`, defaults to `train`):
It identifies the dataset mode (train or dev). It identifies the dataset mode (train or dev).
split (:obj:`int`, `optional`, defaults to 1): split (:obj:`int`, `optional`, defaults to 1):
@ -133,7 +133,7 @@ class ESC50(AudioClassificationDataset):
def _get_data(self, mode: str, split: int) -> Tuple[List[str], List[int]]: def _get_data(self, mode: str, split: int) -> Tuple[List[str], List[int]]:
if not os.path.isdir(os.path.join(DATA_HOME, self.audio_path)) or \ if not os.path.isdir(os.path.join(DATA_HOME, self.audio_path)) or \
not os.path.isfile(os.path.join(DATA_HOME, self.meta)): not os.path.isfile(os.path.join(DATA_HOME, self.meta)):
download_and_decompress(self.archieves, DATA_HOME) download_and_decompress(self.archives, DATA_HOME)
meta_info = self._get_meta_info() meta_info = self._get_meta_info()

@ -35,7 +35,7 @@ class GTZAN(AudioClassificationDataset):
https://ieeexplore.ieee.org/document/1021072/ https://ieeexplore.ieee.org/document/1021072/
""" """
archieves = [ archives = [
{ {
'url': 'http://opihi.cs.uvic.ca/sound/genres.tar.gz', 'url': 'http://opihi.cs.uvic.ca/sound/genres.tar.gz',
'md5': '5b3d6dddb579ab49814ab86dba69e7c7', 'md5': '5b3d6dddb579ab49814ab86dba69e7c7',
@ -57,7 +57,7 @@ class GTZAN(AudioClassificationDataset):
feat_type='raw', feat_type='raw',
**kwargs): **kwargs):
""" """
Ags: Args:
mode (:obj:`str`, `optional`, defaults to `train`): mode (:obj:`str`, `optional`, defaults to `train`):
It identifies the dataset mode (train or dev). It identifies the dataset mode (train or dev).
seed (:obj:`int`, `optional`, defaults to 0): seed (:obj:`int`, `optional`, defaults to 0):
@ -85,7 +85,7 @@ class GTZAN(AudioClassificationDataset):
split) -> Tuple[List[str], List[int]]: split) -> Tuple[List[str], List[int]]:
if not os.path.isdir(os.path.join(DATA_HOME, self.audio_path)) or \ if not os.path.isdir(os.path.join(DATA_HOME, self.audio_path)) or \
not os.path.isfile(os.path.join(DATA_HOME, self.meta)): not os.path.isfile(os.path.join(DATA_HOME, self.meta)):
download_and_decompress(self.archieves, DATA_HOME) download_and_decompress(self.archives, DATA_HOME)
meta_info = self._get_meta_info() meta_info = self._get_meta_info()
random.seed(seed) # shuffle samples to split data random.seed(seed) # shuffle samples to split data

@ -30,7 +30,7 @@ __all__ = ['OpenRIRNoise']
class OpenRIRNoise(Dataset): class OpenRIRNoise(Dataset):
archieves = [ archives = [
{ {
'url': 'http://www.openslr.org/resources/28/rirs_noises.zip', 'url': 'http://www.openslr.org/resources/28/rirs_noises.zip',
'md5': 'e6f48e257286e05de56413b4779d8ffb', 'md5': 'e6f48e257286e05de56413b4779d8ffb',
@ -76,7 +76,7 @@ class OpenRIRNoise(Dataset):
print(f"rirs noises base path: {self.base_path}") print(f"rirs noises base path: {self.base_path}")
if not os.path.isdir(self.base_path): if not os.path.isdir(self.base_path):
download_and_decompress( download_and_decompress(
self.archieves, self.base_path, decompress=True) self.archives, self.base_path, decompress=True)
else: else:
print( print(
f"{self.base_path} already exists, we will not download and decompress again" f"{self.base_path} already exists, we will not download and decompress again"
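
Each `archives` entry in the dataset classes above pairs a `url` with an `md5` checksum. How `download_and_decompress` uses that checksum is not shown in this diff. The sketch below only illustrates one common way to verify a downloaded file against such an entry; `file_md5` and `matches_archive` are hypothetical helpers, not functions from the repository:

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_archive(path: str, archive: dict) -> bool:
    """Check a local file against an entry shaped like {'url': ..., 'md5': ...}."""
    return file_md5(path) == archive["md5"]
```

Reading in chunks keeps memory use flat for multi-gigabyte archives.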

@ -37,7 +37,7 @@ class TESS(AudioClassificationDataset):
https://doi.org/10.5683/SP2/E8H2MF https://doi.org/10.5683/SP2/E8H2MF
""" """
archieves = [ archives = [
{ {
'url': 'url':
'https://bj.bcebos.com/paddleaudio/datasets/TESS_Toronto_emotional_speech_set.zip', 'https://bj.bcebos.com/paddleaudio/datasets/TESS_Toronto_emotional_speech_set.zip',
@ -66,7 +66,7 @@ class TESS(AudioClassificationDataset):
feat_type='raw', feat_type='raw',
**kwargs): **kwargs):
""" """
Ags: Args:
mode (:obj:`str`, `optional`, defaults to `train`): mode (:obj:`str`, `optional`, defaults to `train`):
It identifies the dataset mode (train or dev). It identifies the dataset mode (train or dev).
seed (:obj:`int`, `optional`, defaults to 0): seed (:obj:`int`, `optional`, defaults to 0):
@ -93,7 +93,7 @@ class TESS(AudioClassificationDataset):
def _get_data(self, mode, seed, n_folds, def _get_data(self, mode, seed, n_folds,
split) -> Tuple[List[str], List[int]]: split) -> Tuple[List[str], List[int]]:
if not os.path.isdir(os.path.join(DATA_HOME, self.audio_path)): if not os.path.isdir(os.path.join(DATA_HOME, self.audio_path)):
download_and_decompress(self.archieves, DATA_HOME) download_and_decompress(self.archives, DATA_HOME)
wav_files = [] wav_files = []
for root, _, files in os.walk(os.path.join(DATA_HOME, self.audio_path)): for root, _, files in os.walk(os.path.join(DATA_HOME, self.audio_path)):

@ -35,7 +35,7 @@ class UrbanSound8K(AudioClassificationDataset):
https://dl.acm.org/doi/10.1145/2647868.2655045 https://dl.acm.org/doi/10.1145/2647868.2655045
""" """
archieves = [ archives = [
{ {
'url': 'url':
'https://zenodo.org/record/1203745/files/UrbanSound8K.tar.gz', 'https://zenodo.org/record/1203745/files/UrbanSound8K.tar.gz',
@ -62,7 +62,7 @@ class UrbanSound8K(AudioClassificationDataset):
super(UrbanSound8K, self).__init__( super(UrbanSound8K, self).__init__(
files=files, labels=labels, feat_type=feat_type, **kwargs) files=files, labels=labels, feat_type=feat_type, **kwargs)
""" """
Ags: Args:
mode (:obj:`str`, `optional`, defaults to `train`): mode (:obj:`str`, `optional`, defaults to `train`):
It identifies the dataset mode (train or dev). It identifies the dataset mode (train or dev).
split (:obj:`int`, `optional`, defaults to 1): split (:obj:`int`, `optional`, defaults to 1):
@ -81,7 +81,7 @@ class UrbanSound8K(AudioClassificationDataset):
def _get_data(self, mode: str, split: int) -> Tuple[List[str], List[int]]: def _get_data(self, mode: str, split: int) -> Tuple[List[str], List[int]]:
if not os.path.isdir(os.path.join(DATA_HOME, self.audio_path)) or \ if not os.path.isdir(os.path.join(DATA_HOME, self.audio_path)) or \
not os.path.isfile(os.path.join(DATA_HOME, self.meta)): not os.path.isfile(os.path.join(DATA_HOME, self.meta)):
download_and_decompress(self.archieves, DATA_HOME) download_and_decompress(self.archives, DATA_HOME)
meta_info = self._get_meta_info() meta_info = self._get_meta_info()

@ -34,7 +34,7 @@ __all__ = ['VoxCeleb']
class VoxCeleb(Dataset): class VoxCeleb(Dataset):
source_url = 'https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/' source_url = 'https://thor.robots.ox.ac.uk/~vgg/data/voxceleb/vox1a/'
archieves_audio_dev = [ archives_audio_dev = [
{ {
'url': source_url + 'vox1_dev_wav_partaa', 'url': source_url + 'vox1_dev_wav_partaa',
'md5': 'e395d020928bc15670b570a21695ed96', 'md5': 'e395d020928bc15670b570a21695ed96',
@ -52,13 +52,13 @@ class VoxCeleb(Dataset):
'md5': '7bb1e9f70fddc7a678fa998ea8b3ba19', 'md5': '7bb1e9f70fddc7a678fa998ea8b3ba19',
}, },
] ]
archieves_audio_test = [ archives_audio_test = [
{ {
'url': source_url + 'vox1_test_wav.zip', 'url': source_url + 'vox1_test_wav.zip',
'md5': '185fdc63c3c739954633d50379a3d102', 'md5': '185fdc63c3c739954633d50379a3d102',
}, },
] ]
archieves_meta = [ archives_meta = [
{ {
'url': 'url':
'https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/veri_test2.txt', 'https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/veri_test2.txt',
@ -135,11 +135,11 @@ class VoxCeleb(Dataset):
if not os.path.isdir(self.wav_path): if not os.path.isdir(self.wav_path):
print("start to download the voxceleb1 dataset") print("start to download the voxceleb1 dataset")
download_and_decompress( # multi-zip parts concatenate to vox1_dev_wav.zip download_and_decompress( # multi-zip parts concatenate to vox1_dev_wav.zip
self.archieves_audio_dev, self.archives_audio_dev,
self.base_path, self.base_path,
decompress=False) decompress=False)
download_and_decompress( # download the vox1_test_wav.zip and unzip download_and_decompress( # download the vox1_test_wav.zip and unzip
self.archieves_audio_test, self.archives_audio_test,
self.base_path, self.base_path,
decompress=True) decompress=True)
@ -157,7 +157,7 @@ class VoxCeleb(Dataset):
if not os.path.isdir(self.meta_path): if not os.path.isdir(self.meta_path):
print("prepare the meta data") print("prepare the meta data")
download_and_decompress( download_and_decompress(
self.archieves_meta, self.meta_path, decompress=False) self.archives_meta, self.meta_path, decompress=False)
# Data preparation. # Data preparation.
if not os.path.isdir(self.csv_path): if not os.path.isdir(self.csv_path):
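
The comment in the hunk above says the dev audio arrives as several parts that are concatenated into `vox1_dev_wav.zip`. The repository's own concatenation step is not shown in this diff. A minimal sketch of the idea, with `concatenate_parts` as a hypothetical helper and the caller responsible for passing the parts in order:

```python
import shutil
from pathlib import Path
from typing import Iterable


def concatenate_parts(part_paths: Iterable[str], target_path: str) -> Path:
    """Join split archive parts byte-for-byte into a single file."""
    with open(target_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                # Stream each part so large files never sit fully in memory.
                shutil.copyfileobj(src, out)
    return Path(target_path)
```

Sorting the part names alphabetically (`partaa`, `partab`, ...) before calling the helper would reproduce the order implied by their suffixes.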

@ -41,7 +41,7 @@ class TestSaveBase(TempDirMixin):
test_mode: str="path", ): test_mode: str="path", ):
"""`save` function produces file that is comparable with `sox` command """`save` function produces file that is comparable with `sox` command
To compare that the file produced by `save` function agains the file produced by To compare that the file produced by `save` function against the file produced by
the equivalent `sox` command, we need to load both files. the equivalent `sox` command, we need to load both files.
But there are many formats that cannot be opened with common Python modules (like But there are many formats that cannot be opened with common Python modules (like
SciPy). SciPy).

@ -109,7 +109,7 @@ def create_manifest(data_dir, manifest_path):
def prepare_chime3(url, md5sum, target_dir, manifest_path): def prepare_chime3(url, md5sum, target_dir, manifest_path):
"""Download, unpack and create summmary manifest file.""" """Download, unpack and create summary manifest file."""
if not os.path.exists(os.path.join(target_dir, "CHiME3")): if not os.path.exists(os.path.join(target_dir, "CHiME3")):
# download # download
filepath = download(url, md5sum, target_dir, filepath = download(url, md5sum, target_dir,

@ -210,7 +210,7 @@ def create_manifest(data_dir, manifest_path_prefix):
def prepare_dataset(url, md5sum, target_dir, manifest_path): def prepare_dataset(url, md5sum, target_dir, manifest_path):
"""Download, unpack and create summmary manifest file. """Download, unpack and create summary manifest file.
""" """
filepath = os.path.join(target_dir, "TIMIT.zip") filepath = os.path.join(target_dir, "TIMIT.zip")
if not os.path.exists(filepath): if not os.path.exists(filepath):

@ -8,7 +8,7 @@
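### Environment Setup ### Environment Setup
1. Install Android Studio in your local environment. For detailed installation instructions, see the [Android Stuido official website](https://developer.android.com/studio). 1. Install Android Studio in your local environment. For detailed installation instructions, see the [Android Studio official website](https://developer.android.com/studio).
2. Prepare an Android phone and enable USB debugging. To enable it: `Phone Settings -> find Developer options -> turn on Developer options and USB debugging` 2. Prepare an Android phone and enable USB debugging. To enable it: `Phone Settings -> find Developer options -> turn on Developer options and USB debugging`
**Note** **Note**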
@ -20,10 +20,10 @@
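2. Connect the phone to the computer, enable USB debugging and file transfer mode, and connect to your phone in Android Studio (the phone must allow installing apps over USB). 2. Connect the phone to the computer, enable USB debugging and file transfer mode, and connect to your phone in Android Studio (the phone must allow installing apps over USB).
**Note:** **Note:**
>1. If you see an NDK configuration error while importing, building, or running the project, open `File > Project Structure > SDK Location` and set `Andriod NDK location` to the path of the NDK installed on your machine. >1. If you see an NDK configuration error while importing, building, or running the project, open `File > Project Structure > SDK Location` and set `Android NDK location` to the path of the NDK installed on your machine.
>2. If you downloaded the NDK through the SDK Tools in Andriod Studio (see "Environment Setup" in this section), you can simply pick the default path from the drop-down list. >2. If you downloaded the NDK through the SDK Tools in Android Studio (see "Environment Setup" in this section), you can simply pick the default path from the drop-down list.
>3. Alternatively, you can configure the NDK by manually adding the NDK path setting `nkd.dir=/root/android-ndk-r20b` to the `TTSAndroid/local.properties` file. >3. Alternatively, you can configure the NDK by manually adding the NDK path setting `nkd.dir=/root/android-ndk-r20b` to the `TTSAndroid/local.properties` file.
>4. If the steps above still do not resolve the NDK configuration error, try updating the Android Gradle plugin version by following the [Updating the Android Gradle plugin](https://developer.android.com/studio/releases/gradle-plugin?hl=zh-cn#updating-plugin) section of the official Andriod Studio documentation. >4. If the steps above still do not resolve the NDK configuration error, try updating the Android Gradle plugin version by following the [Updating the Android Gradle plugin](https://developer.android.com/studio/releases/gradle-plugin?hl=zh-cn#updating-plugin) section of the official Android Studio documentation.
3. Click the Run button to build the app automatically and install it on the phone. (This step automatically downloads the Paddle Lite inference library and models, so an internet connection is required.) 3. Click the Run button to build the app automatically and install it on the phone. (This step automatically downloads the Paddle Lite inference library and models, so an internet connection is required.)
On success, the result looks like this: On success, the result looks like this: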

@ -115,27 +115,27 @@ int FrontEngineInterface::init() {
// Build the dictionary (word-to-phoneme mapping) // Build the dictionary (word-to-phoneme mapping)
if (0 != GenDict(_word2phone_path, &word_phone_map)) { if (0 != GenDict(_word2phone_path, &word_phone_map)) {
LOG(ERROR) << "Genarate word2phone dict failed"; LOG(ERROR) << "Generate word2phone dict failed";
return -1; return -1;
} }
// Build the phoneme dictionary (phoneme-to-phoneme-id mapping) // Build the phoneme dictionary (phoneme-to-phoneme-id mapping)
if (0 != GenDict(_phone2id_path, &phone_id_map)) { if (0 != GenDict(_phone2id_path, &phone_id_map)) {
LOG(ERROR) << "Genarate phone2id dict failed"; LOG(ERROR) << "Generate phone2id dict failed";
return -1; return -1;
} }
// Build the tone dictionary (tone-to-tone-id mapping) // Build the tone dictionary (tone-to-tone-id mapping)
if (_separate_tone == "true") { if (_separate_tone == "true") {
if (0 != GenDict(_tone2id_path, &tone_id_map)) { if (0 != GenDict(_tone2id_path, &tone_id_map)) {
LOG(ERROR) << "Genarate tone2id dict failed"; LOG(ERROR) << "Generate tone2id dict failed";
return -1; return -1;
} }
} }
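// Build the traditional-to-simplified dictionary (traditional-to-simplified-id mapping) // Build the traditional-to-simplified dictionary (traditional-to-simplified-id mapping)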
if (0 != GenDict(_trand2simp_path, &trand_simp_map)) { if (0 != GenDict(_trand2simp_path, &trand_simp_map)) {
LOG(ERROR) << "Genarate trand2simp dict failed"; LOG(ERROR) << "Generate trand2simp dict failed";
return -1; return -1;
} }
@ -263,7 +263,7 @@ int FrontEngineInterface::GetWordsIds(
if (0 != if (0 !=
GetInitialsFinals(word, &word_initials, &word_finals)) { GetInitialsFinals(word, &word_initials, &word_finals)) {
LOG(ERROR) LOG(ERROR)
<< "Genarate the word_initials and word_finals of " << "Generate the word_initials and word_finals of "
<< word << " failed"; << word << " failed";
return -1; return -1;
} }
@ -304,7 +304,7 @@ int FrontEngineInterface::GetWordsIds(
// Phoneme to phoneme id // Phoneme to phoneme id
if (0 != Phone2Phoneid(phone, phoneids, toneids)) { if (0 != Phone2Phoneid(phone, phoneids, toneids)) {
LOG(ERROR) << "Genarate the phone id of " << word << " failed"; LOG(ERROR) << "Generate the phone id of " << word << " failed";
return -1; return -1;
} }
} }
@ -916,11 +916,11 @@ int FrontEngineInterface::NeuralSandhi(const std::string &word,
if (find(must_neural_tone_words.begin(), if (find(must_neural_tone_words.begin(),
must_neural_tone_words.end(), must_neural_tone_words.end(),
word) != must_neural_tone_words.end() || word) != must_neural_tone_words.end() ||
(word_num >= 2 && (word_num >= 2 && find(must_neural_tone_words.begin(),
find(must_neural_tone_words.begin(), must_neural_tone_words.end(),
must_neural_tone_words.end(), ppspeech::wstring2utf8string(
ppspeech::wstring2utf8string(word_wstr.substr( word_wstr.substr(word_num - 2))) !=
word_num - 2))) != must_neural_tone_words.end())) { must_neural_tone_words.end())) {
(*finals).back() = (*finals).back() =
(*finals).back().replace((*finals).back().length() - 1, 1, "5"); (*finals).back().replace((*finals).back().length() - 1, 1, "5");
} }

@ -217,7 +217,7 @@ Then to start the system server, and it provides HTTP backend services.
- memory132G - memory132G
dataset dataset
- CN-Celeb, train size 650,000, test size 10,000, dimention 192, distance L2 - CN-Celeb, train size 650,000, test size 10,000, dimension 192, distance L2
recall and elapsed time statistics are shown in the following figure recall and elapsed time statistics are shown in the following figure
@ -226,7 +226,7 @@ recall and elapsed time statistics are shown in the following figure
The retrieval framework based on Milvus takes about 2.9 milliseconds to retrieve on the premise of 90% recall rate, and it takes about 500 milliseconds for feature extraction (testing audio takes about 5 seconds), that is, a single audio test takes about 503 milliseconds in total, which can meet most application scenarios. The retrieval framework based on Milvus takes about 2.9 milliseconds to retrieve on the premise of 90% recall rate, and it takes about 500 milliseconds for feature extraction (testing audio takes about 5 seconds), that is, a single audio test takes about 503 milliseconds in total, which can meet most application scenarios.
* compute embeding takes 500 ms * compute embedding takes 500 ms
* retrieval with cosine takes 2.9 ms * retrieval with cosine takes 2.9 ms
* total takes 503 ms * total takes 503 ms

@ -1,5 +1,4 @@
diskcache diskcache
dtaidistane
fastapi fastapi
librosa==0.8.0 librosa==0.8.0
numpy==1.22.0 numpy==1.22.0

@ -77,13 +77,13 @@ class MilvusHelper:
field1 = FieldSchema( field1 = FieldSchema(
name="id", name="id",
dtype=DataType.INT64, dtype=DataType.INT64,
descrition="int64", description="int64",
is_primary=True, is_primary=True,
auto_id=True) auto_id=True)
field2 = FieldSchema( field2 = FieldSchema(
name="embedding", name="embedding",
dtype=DataType.FLOAT_VECTOR, dtype=DataType.FLOAT_VECTOR,
descrition="speaker embeddings", description="speaker embeddings",
dim=VECTOR_DIMENSION, dim=VECTOR_DIMENSION,
is_primary=False) is_primary=False)
schema = CollectionSchema( schema = CollectionSchema(

@ -42,7 +42,7 @@ Currently the engine type supports two forms: python and inference (Paddle Infer
paddlespeech_server start --help paddlespeech_server start --help
``` ```
Arguments: Arguments:
- `config_file`: yaml file of the app, defalut: ./conf/application.yaml - `config_file`: yaml file of the app, default: ./conf/application.yaml
- `log_file`: log file. Default: ./log/paddlespeech.log - `log_file`: log file. Default: ./log/paddlespeech.log
Output: Output:

@ -225,7 +225,7 @@ async def websocket_endpoint_online(websocket: WebSocket):
websocket (WebSocket): the websocket instance websocket (WebSocket): the websocket instance
""" """
#1. the interface wait to accept the websocket protocal header #1. the interface wait to accept the websocket protocol header
# and only we receive the header, it establish the connection with specific thread # and only we receive the header, it establish the connection with specific thread
await websocket.accept() await websocket.accept()
@ -238,7 +238,7 @@ async def websocket_endpoint_online(websocket: WebSocket):
connection_handler = None connection_handler = None
try: try:
#4. we do a loop to process the audio package by package according the protocal #4. we do a loop to process the audio package by package according the protocol
# and only if the client send finished signal, we will break the loop # and only if the client send finished signal, we will break the loop
while True: while True:
# careful here, changed the source code from starlette.websockets # careful here, changed the source code from starlette.websockets

@ -61,11 +61,12 @@
} }
}, },
"node_modules/@babel/runtime": { "node_modules/@babel/runtime": {
"version": "7.17.9", "version": "7.26.10",
"resolved": "https://registry.npmmirror.com/@babel/runtime/-/runtime-7.17.9.tgz", "resolved": "https://registry.npmjs.org/@babel/runtime/-/runtime-7.26.10.tgz",
"integrity": "sha512-lSiBBvodq29uShpWGNbgFdKYNiFDo5/HIYsaCEY9ff4sb10x9jizo2+pRrSyF4jKZCXqgzuqBOQKbUm90gQwJg==", "integrity": "sha512-2WJMeRQPHKSPemqk/awGrAiuFfzBmOIPXKizAsVhWH9YJqLZ0H+HS4c8loHGgW6utJ3E/ejXQUsiGaQy2NZ9Fw==",
"license": "MIT",
"dependencies": { "dependencies": {
"regenerator-runtime": "^0.13.4" "regenerator-runtime": "^0.14.0"
}, },
"engines": { "engines": {
"node": ">=6.9.0" "node": ">=6.9.0"
@ -1138,9 +1139,10 @@
"optional": true "optional": true
}, },
"node_modules/regenerator-runtime": { "node_modules/regenerator-runtime": {
"version": "0.13.9", "version": "0.14.1",
"resolved": "https://registry.npmmirror.com/regenerator-runtime/-/regenerator-runtime-0.13.9.tgz", "resolved": "https://registry.npmjs.org/regenerator-runtime/-/regenerator-runtime-0.14.1.tgz",
"integrity": "sha512-p3VT+cOEgxFsRRA9X4lkI1E+k2/CtnKtU4gcxyaCUreilL/vqI6CdZ3wxVUx3UOUg+gnUOQQcRI7BmSI656MYA==" "integrity": "sha512-dYnhHh0nJoMfnkZs6GmmhFknAGRrLznOu5nc9ML+EJxGvrx6H7teuevqVqCuPcPK//3eDrrjQhehXVx9cnkGdw==",
"license": "MIT"
}, },
"node_modules/resize-observer-polyfill": { "node_modules/resize-observer-polyfill": {
"version": "1.5.1", "version": "1.5.1",
@ -1392,11 +1394,11 @@
"integrity": "sha512-vqUSBLP8dQHFPdPi9bc5GK9vRkYHJ49fsZdtoJ8EQ8ibpwk5rPKfvNIwChB0KVXcIjcepEBBd2VHC5r9Gy8ueg==" "integrity": "sha512-vqUSBLP8dQHFPdPi9bc5GK9vRkYHJ49fsZdtoJ8EQ8ibpwk5rPKfvNIwChB0KVXcIjcepEBBd2VHC5r9Gy8ueg=="
}, },
"@babel/runtime": { "@babel/runtime": {
"version": "7.17.9", "version": "7.26.10",
"resolved": "https://registry.npmmirror.com/@babel/runtime/-/runtime-7.17.9.tgz", "resolved": "https://registry.npmjs.org/@babel/runtime/-/runtime-7.26.10.tgz",
"integrity": "sha512-lSiBBvodq29uShpWGNbgFdKYNiFDo5/HIYsaCEY9ff4sb10x9jizo2+pRrSyF4jKZCXqgzuqBOQKbUm90gQwJg==", "integrity": "sha512-2WJMeRQPHKSPemqk/awGrAiuFfzBmOIPXKizAsVhWH9YJqLZ0H+HS4c8loHGgW6utJ3E/ejXQUsiGaQy2NZ9Fw==",
"requires": { "requires": {
"regenerator-runtime": "^0.13.4" "regenerator-runtime": "^0.14.0"
} }
}, },
"@ctrl/tinycolor": { "@ctrl/tinycolor": {
@ -2149,9 +2151,9 @@
"optional": true "optional": true
}, },
"regenerator-runtime": { "regenerator-runtime": {
"version": "0.13.9", "version": "0.14.1",
"resolved": "https://registry.npmmirror.com/regenerator-runtime/-/regenerator-runtime-0.13.9.tgz", "resolved": "https://registry.npmjs.org/regenerator-runtime/-/regenerator-runtime-0.14.1.tgz",
"integrity": "sha512-p3VT+cOEgxFsRRA9X4lkI1E+k2/CtnKtU4gcxyaCUreilL/vqI6CdZ3wxVUx3UOUg+gnUOQQcRI7BmSI656MYA==" "integrity": "sha512-dYnhHh0nJoMfnkZs6GmmhFknAGRrLznOu5nc9ML+EJxGvrx6H7teuevqVqCuPcPK//3eDrrjQhehXVx9cnkGdw=="
}, },
"resize-observer-polyfill": { "resize-observer-polyfill": {
"version": "1.5.1", "version": "1.5.1",

@ -22,42 +22,17 @@
"@ant-design/colors" "^6.0.0" "@ant-design/colors" "^6.0.0"
"@ant-design/icons-svg" "^4.2.1" "@ant-design/icons-svg" "^4.2.1"
"@babel/helper-string-parser@^7.25.9":
version "7.25.9"
resolved "https://registry.yarnpkg.com/@babel/helper-string-parser/-/helper-string-parser-7.25.9.tgz#1aabb72ee72ed35789b4bbcad3ca2862ce614e8c"
integrity sha512-4A/SCr/2KLd5jrtOMFzaKjVtAei3+2r/NChoBNoZ3EyP/+GlhoaEGoWOZUmFmoITP7zOJyHIMm+DYRd8o3PvHA==
"@babel/helper-validator-identifier@^7.25.9":
version "7.25.9"
resolved "https://registry.yarnpkg.com/@babel/helper-validator-identifier/-/helper-validator-identifier-7.25.9.tgz#24b64e2c3ec7cd3b3c547729b8d16871f22cbdc7"
integrity sha512-Ed61U6XJc3CVRfkERJWDz4dJwKe7iLmmJsbOGu9wSloNSFttHV0I8g6UAgb7qnK5ly5bGLPd4oXZlxCdANBOWQ==
"@babel/parser@^7.16.4": "@babel/parser@^7.16.4":
version "7.17.9" version "7.17.9"
resolved "https://registry.npmmirror.com/@babel/parser/-/parser-7.17.9.tgz" resolved "https://registry.npmmirror.com/@babel/parser/-/parser-7.17.9.tgz"
integrity sha512-vqUSBLP8dQHFPdPi9bc5GK9vRkYHJ49fsZdtoJ8EQ8ibpwk5rPKfvNIwChB0KVXcIjcepEBBd2VHC5r9Gy8ueg== integrity sha512-vqUSBLP8dQHFPdPi9bc5GK9vRkYHJ49fsZdtoJ8EQ8ibpwk5rPKfvNIwChB0KVXcIjcepEBBd2VHC5r9Gy8ueg==
"@babel/parser@^7.25.3":
version "7.26.9"
resolved "https://registry.yarnpkg.com/@babel/parser/-/parser-7.26.9.tgz#d9e78bee6dc80f9efd8f2349dcfbbcdace280fd5"
integrity sha512-81NWa1njQblgZbQHxWHpxxCzNsa3ZwvFqpUg7P+NNUU6f3UU2jBEg4OlF/J6rl8+PQGh1q6/zWScd001YwcA5A==
dependencies:
"@babel/types" "^7.26.9"
"@babel/runtime@^7.10.5": "@babel/runtime@^7.10.5":
version "7.17.9" version "7.26.10"
resolved "https://registry.npmmirror.com/@babel/runtime/-/runtime-7.17.9.tgz" resolved "https://registry.yarnpkg.com/@babel/runtime/-/runtime-7.26.10.tgz#a07b4d8fa27af131a633d7b3524db803eb4764c2"
integrity sha512-lSiBBvodq29uShpWGNbgFdKYNiFDo5/HIYsaCEY9ff4sb10x9jizo2+pRrSyF4jKZCXqgzuqBOQKbUm90gQwJg== integrity sha512-2WJMeRQPHKSPemqk/awGrAiuFfzBmOIPXKizAsVhWH9YJqLZ0H+HS4c8loHGgW6utJ3E/ejXQUsiGaQy2NZ9Fw==
dependencies: dependencies:
regenerator-runtime "^0.13.4" regenerator-runtime "^0.14.0"
"@babel/types@^7.26.9":
version "7.26.9"
resolved "https://registry.yarnpkg.com/@babel/types/-/types-7.26.9.tgz#08b43dec79ee8e682c2ac631c010bdcac54a21ce"
integrity sha512-Y3IR1cRnOxOCDvMmNiym7XpXQ93iGDDPHx+Zj+NM+rg0fBaShfQLkg+hKPaZCEvg5N/LeCo4+Rj/i3FuJsIQaw==
dependencies:
"@babel/helper-string-parser" "^7.25.9"
"@babel/helper-validator-identifier" "^7.25.9"
"@ctrl/tinycolor@^3.4.0": "@ctrl/tinycolor@^3.4.0":
version "3.4.1" version "3.4.1"
@ -66,13 +41,13 @@
"@element-plus/icons-vue@^1.1.4": "@element-plus/icons-vue@^1.1.4":
version "1.1.4" version "1.1.4"
resolved "https://registry.npmmirror.com/@element-plus/icons-vue/-/icons-vue-1.1.4.tgz" resolved "https://registry.npmjs.org/@element-plus/icons-vue/-/icons-vue-1.1.4.tgz"
integrity sha512-Iz/nHqdp1sFPmdzRwHkEQQA3lKvoObk8azgABZ81QUOpW9s/lUyQVUSh0tNtEPZXQlKwlSh7SPgoVxzrE0uuVQ== integrity sha512-Iz/nHqdp1sFPmdzRwHkEQQA3lKvoObk8azgABZ81QUOpW9s/lUyQVUSh0tNtEPZXQlKwlSh7SPgoVxzrE0uuVQ==
"@element-plus/icons-vue@^2.0.9": "@element-plus/icons-vue@^2.0.9":
version "2.0.9" version "2.3.1"
resolved "https://registry.npmmirror.com/@element-plus/icons-vue/-/icons-vue-2.0.9.tgz#b7777c57534522e387303d194451d50ff549d49a" resolved "https://registry.npmjs.org/@element-plus/icons-vue/-/icons-vue-2.3.1.tgz"
integrity sha512-okdrwiVeKBmW41Hkl0eMrXDjzJwhQMuKiBOu17rOszqM+LS/yBYpNQNV5Jvoh06Wc+89fMmb/uhzf8NZuDuUaQ== integrity sha512-XxVUZv48RZAd87ucGS48jPf6pKu0yV5UCg9f4FFwtrYxXOwWuVJo6wOvSLKEoMQKjv8GsX/mhP6UsC1lRwbUWg==
"@floating-ui/core@^0.6.1": "@floating-ui/core@^0.6.1":
version "0.6.1" version "0.6.1"
@ -86,11 +61,6 @@
dependencies: dependencies:
"@floating-ui/core" "^0.6.1" "@floating-ui/core" "^0.6.1"
"@jridgewell/sourcemap-codec@^1.5.0":
version "1.5.0"
resolved "https://registry.yarnpkg.com/@jridgewell/sourcemap-codec/-/sourcemap-codec-1.5.0.tgz#3188bcb273a414b0d215fd22a58540b989b9409a"
integrity sha512-gv3ZRaISU3fjPAgNsriBRqGWQL6quFx04YMPW/zD8XMLsU32mhCCbfbO6KZFLjvYpCZ8zyDEgqsgf+PwPaM7GQ==
"@popperjs/core@^2.11.4": "@popperjs/core@^2.11.4":
version "2.11.5" version "2.11.5"
resolved "https://registry.npmmirror.com/@popperjs/core/-/core-2.11.5.tgz" resolved "https://registry.npmmirror.com/@popperjs/core/-/core-2.11.5.tgz"
@ -104,7 +74,7 @@
core-js "^3.15.1" core-js "^3.15.1"
nanopop "^2.1.0" nanopop "^2.1.0"
"@types/lodash-es@^4.17.6": "@types/lodash-es@*", "@types/lodash-es@^4.17.6":
version "4.17.6" version "4.17.6"
resolved "https://registry.npmmirror.com/@types/lodash-es/-/lodash-es-4.17.6.tgz" resolved "https://registry.npmmirror.com/@types/lodash-es/-/lodash-es-4.17.6.tgz"
integrity sha512-R+zTeVUKDdfoRxpAryaQNRKk3105Rrgx2CFRClIgRGaqDTdjsm8h6IYA8ir584W3ePzkZfst5xIgDwYrlh9HLg== integrity sha512-R+zTeVUKDdfoRxpAryaQNRKk3105Rrgx2CFRClIgRGaqDTdjsm8h6IYA8ir584W3ePzkZfst5xIgDwYrlh9HLg==
@ -131,17 +101,6 @@
estree-walker "^2.0.2" estree-walker "^2.0.2"
source-map "^0.6.1" source-map "^0.6.1"
"@vue/compiler-core@3.5.13":
version "3.5.13"
resolved "https://registry.yarnpkg.com/@vue/compiler-core/-/compiler-core-3.5.13.tgz#b0ae6c4347f60c03e849a05d34e5bf747c9bda05"
integrity sha512-oOdAkwqUfW1WqpwSYJce06wvt6HljgY3fGeM9NcVA1HaYOij3mZG9Rkysn0OHuyUAGMbEbARIpsG+LPVlBJ5/Q==
dependencies:
"@babel/parser" "^7.25.3"
"@vue/shared" "3.5.13"
entities "^4.5.0"
estree-walker "^2.0.2"
source-map-js "^1.2.0"
"@vue/compiler-dom@3.2.32": "@vue/compiler-dom@3.2.32":
version "3.2.32" version "3.2.32"
resolved "https://registry.npmmirror.com/@vue/compiler-dom/-/compiler-dom-3.2.32.tgz" resolved "https://registry.npmmirror.com/@vue/compiler-dom/-/compiler-dom-3.2.32.tgz"
@ -150,15 +109,7 @@
"@vue/compiler-core" "3.2.32" "@vue/compiler-core" "3.2.32"
"@vue/shared" "3.2.32" "@vue/shared" "3.2.32"
"@vue/compiler-dom@3.5.13": "@vue/compiler-sfc@^3.1.0", "@vue/compiler-sfc@>=3.1.0", "@vue/compiler-sfc@3.2.32":
version "3.5.13"
resolved "https://registry.yarnpkg.com/@vue/compiler-dom/-/compiler-dom-3.5.13.tgz#bb1b8758dbc542b3658dda973b98a1c9311a8a58"
integrity sha512-ZOJ46sMOKUjO3e94wPdCzQ6P1Lx/vhp2RSvfaab88Ajexs0AHeV0uasYhi99WPaogmBlRHNRuly8xV75cNTMDA==
dependencies:
"@vue/compiler-core" "3.5.13"
"@vue/shared" "3.5.13"
"@vue/compiler-sfc@3.2.32":
version "3.2.32" version "3.2.32"
resolved "https://registry.npmmirror.com/@vue/compiler-sfc/-/compiler-sfc-3.2.32.tgz" resolved "https://registry.npmmirror.com/@vue/compiler-sfc/-/compiler-sfc-3.2.32.tgz"
integrity sha512-uO6+Gh3AVdWm72lRRCjMr8nMOEqc6ezT9lWs5dPzh1E9TNaJkMYPaRtdY9flUv/fyVQotkfjY/ponjfR+trPSg== integrity sha512-uO6+Gh3AVdWm72lRRCjMr8nMOEqc6ezT9lWs5dPzh1E9TNaJkMYPaRtdY9flUv/fyVQotkfjY/ponjfR+trPSg==
@ -174,21 +125,6 @@
postcss "^8.1.10" postcss "^8.1.10"
source-map "^0.6.1" source-map "^0.6.1"
"@vue/compiler-sfc@^3.1.0":
version "3.5.13"
resolved "https://registry.yarnpkg.com/@vue/compiler-sfc/-/compiler-sfc-3.5.13.tgz#461f8bd343b5c06fac4189c4fef8af32dea82b46"
integrity sha512-6VdaljMpD82w6c2749Zhf5T9u5uLBWKnVue6XWxprDobftnletJ8+oel7sexFfM3qIxNmVE7LSFGTpv6obNyaQ==
dependencies:
"@babel/parser" "^7.25.3"
"@vue/compiler-core" "3.5.13"
"@vue/compiler-dom" "3.5.13"
"@vue/compiler-ssr" "3.5.13"
"@vue/shared" "3.5.13"
estree-walker "^2.0.2"
magic-string "^0.30.11"
postcss "^8.4.48"
source-map-js "^1.2.0"
"@vue/compiler-ssr@3.2.32": "@vue/compiler-ssr@3.2.32":
version "3.2.32" version "3.2.32"
resolved "https://registry.npmmirror.com/@vue/compiler-ssr/-/compiler-ssr-3.2.32.tgz" resolved "https://registry.npmmirror.com/@vue/compiler-ssr/-/compiler-ssr-3.2.32.tgz"
@ -197,14 +133,6 @@
"@vue/compiler-dom" "3.2.32" "@vue/compiler-dom" "3.2.32"
"@vue/shared" "3.2.32" "@vue/shared" "3.2.32"
"@vue/compiler-ssr@3.5.13":
version "3.5.13"
resolved "https://registry.yarnpkg.com/@vue/compiler-ssr/-/compiler-ssr-3.5.13.tgz#e771adcca6d3d000f91a4277c972a996d07f43ba"
integrity sha512-wMH6vrYHxQl/IybKJagqbquvxpWCuVYpoUJfCqFZwa/JY1GdATAQ+TgVtgrwwMZ0D07QhA99rs/EAAWfvG6KpA==
dependencies:
"@vue/compiler-dom" "3.5.13"
"@vue/shared" "3.5.13"
"@vue/reactivity-transform@3.2.32": "@vue/reactivity-transform@3.2.32":
version "3.2.32" version "3.2.32"
resolved "https://registry.npmmirror.com/@vue/reactivity-transform/-/reactivity-transform-3.2.32.tgz" resolved "https://registry.npmmirror.com/@vue/reactivity-transform/-/reactivity-transform-3.2.32.tgz"
@ -253,11 +181,6 @@
resolved "https://registry.npmmirror.com/@vue/shared/-/shared-3.2.32.tgz" resolved "https://registry.npmmirror.com/@vue/shared/-/shared-3.2.32.tgz"
integrity sha512-bjcixPErUsAnTQRQX4Z5IQnICYjIfNCyCl8p29v1M6kfVzvwOICPw+dz48nNuWlTOOx2RHhzHdazJibE8GSnsw== integrity sha512-bjcixPErUsAnTQRQX4Z5IQnICYjIfNCyCl8p29v1M6kfVzvwOICPw+dz48nNuWlTOOx2RHhzHdazJibE8GSnsw==
"@vue/shared@3.5.13":
version "3.5.13"
resolved "https://registry.yarnpkg.com/@vue/shared/-/shared-3.5.13.tgz#87b309a6379c22b926e696893237826f64339b6f"
integrity sha512-/hnE/qP5ZoGpol0a5mDi45bOd7t3tjYJBjsgCsivow7D48cJeV5l05RD82lPqi7gRiphZM37rnhW1l6ZoCNNnQ==
"@vueuse/core@^8.2.4": "@vueuse/core@^8.2.4":
version "8.2.5" version "8.2.5"
resolved "https://registry.npmmirror.com/@vueuse/core/-/core-8.2.5.tgz" resolved "https://registry.npmmirror.com/@vueuse/core/-/core-8.2.5.tgz"
@ -318,12 +241,12 @@ async-validator@^4.0.7:
asynckit@^0.4.0: asynckit@^0.4.0:
version "0.4.0" version "0.4.0"
resolved "https://registry.yarnpkg.com/asynckit/-/asynckit-0.4.0.tgz#c79ed97f7f34cb8f2ba1bc9790bcc366474b4b79" resolved "https://registry.npmjs.org/asynckit/-/asynckit-0.4.0.tgz"
integrity sha512-Oei9OH4tRh0YqU3GxhX79dM/mwVgvbZJaSNaRk+bshkj0S5cfHcgYakreBjrHwatXKbz+IoIdYLxrKim2MjW0Q== integrity sha512-Oei9OH4tRh0YqU3GxhX79dM/mwVgvbZJaSNaRk+bshkj0S5cfHcgYakreBjrHwatXKbz+IoIdYLxrKim2MjW0Q==
axios@^1.8.2: axios@^1.8.2:
version "1.8.2" version "1.8.2"
resolved "https://registry.yarnpkg.com/axios/-/axios-1.8.2.tgz#fabe06e241dfe83071d4edfbcaa7b1c3a40f7979" resolved "https://registry.npmjs.org/axios/-/axios-1.8.2.tgz"
integrity sha512-ls4GYBm5aig9vWx8AWDSGLpnpDQRtWAfrjU+EuytuODrFBkqesN2RkOQCBzrA1RQNHw1SmRMSDDDSwzNAYQ6Rg== integrity sha512-ls4GYBm5aig9vWx8AWDSGLpnpDQRtWAfrjU+EuytuODrFBkqesN2RkOQCBzrA1RQNHw1SmRMSDDDSwzNAYQ6Rg==
dependencies: dependencies:
follow-redirects "^1.15.6" follow-redirects "^1.15.6"
@ -332,7 +255,7 @@ axios@^1.8.2:
call-bind-apply-helpers@^1.0.1, call-bind-apply-helpers@^1.0.2: call-bind-apply-helpers@^1.0.1, call-bind-apply-helpers@^1.0.2:
version "1.0.2" version "1.0.2"
resolved "https://registry.yarnpkg.com/call-bind-apply-helpers/-/call-bind-apply-helpers-1.0.2.tgz#4b5428c222be985d79c3d82657479dbe0b59b2d6" resolved "https://registry.npmjs.org/call-bind-apply-helpers/-/call-bind-apply-helpers-1.0.2.tgz"
integrity sha512-Sp1ablJ0ivDkSzjcaJdxEunN5/XvksFJ2sMBFfq6x0ryhQV/2b/KwFe21cMpmHtPOSij8K99/wSfoEuTObmuMQ== integrity sha512-Sp1ablJ0ivDkSzjcaJdxEunN5/XvksFJ2sMBFfq6x0ryhQV/2b/KwFe21cMpmHtPOSij8K99/wSfoEuTObmuMQ==
dependencies: dependencies:
es-errors "^1.3.0" es-errors "^1.3.0"
@ -340,7 +263,7 @@ call-bind-apply-helpers@^1.0.1, call-bind-apply-helpers@^1.0.2:
combined-stream@^1.0.8: combined-stream@^1.0.8:
version "1.0.8" version "1.0.8"
resolved "https://registry.yarnpkg.com/combined-stream/-/combined-stream-1.0.8.tgz#c3d45a8b34fd730631a110a8a2520682b31d5a7f" resolved "https://registry.npmjs.org/combined-stream/-/combined-stream-1.0.8.tgz"
integrity sha512-FQN4MRfuJeHf7cBbBMJFXhKSDq+2kAArBlmRBvcvFE5BB1HZKXtSFASDhdlz9zOYwxh8lDdnvmMOe/+5cdoEdg== integrity sha512-FQN4MRfuJeHf7cBbBMJFXhKSDq+2kAArBlmRBvcvFE5BB1HZKXtSFASDhdlz9zOYwxh8lDdnvmMOe/+5cdoEdg==
dependencies: dependencies:
delayed-stream "~1.0.0" delayed-stream "~1.0.0"
@ -381,7 +304,7 @@ debug@^3.2.6:
delayed-stream@~1.0.0: delayed-stream@~1.0.0:
version "1.0.0" version "1.0.0"
resolved "https://registry.yarnpkg.com/delayed-stream/-/delayed-stream-1.0.0.tgz#df3ae199acadfb7d440aaae0b29e2272b24ec619" resolved "https://registry.npmjs.org/delayed-stream/-/delayed-stream-1.0.0.tgz"
integrity sha512-ZySD7Nf91aLB0RxL4KGrKHBXl7Eds1DAmEdcoVawXnLD7SDhpNgtuII2aAkg7a7QS41jxPSZ17p4VdGnMHk3MQ== integrity sha512-ZySD7Nf91aLB0RxL4KGrKHBXl7Eds1DAmEdcoVawXnLD7SDhpNgtuII2aAkg7a7QS41jxPSZ17p4VdGnMHk3MQ==
dom-align@^1.12.1: dom-align@^1.12.1:
@ -396,7 +319,7 @@ dom-scroll-into-view@^2.0.0:
dunder-proto@^1.0.1: dunder-proto@^1.0.1:
version "1.0.1" version "1.0.1"
resolved "https://registry.yarnpkg.com/dunder-proto/-/dunder-proto-1.0.1.tgz#d7ae667e1dc83482f8b70fd0f6eefc50da30f58a" resolved "https://registry.npmjs.org/dunder-proto/-/dunder-proto-1.0.1.tgz"
integrity sha512-KIN/nDJBQRcXw0MLVhZE9iQHmG68qAVIBg9CqmUYjmQIhgij9U5MFvrqkUL5FbtyyzZuOeOt0zdeRe4UY7ct+A== integrity sha512-KIN/nDJBQRcXw0MLVhZE9iQHmG68qAVIBg9CqmUYjmQIhgij9U5MFvrqkUL5FbtyyzZuOeOt0zdeRe4UY7ct+A==
dependencies: dependencies:
call-bind-apply-helpers "^1.0.1" call-bind-apply-helpers "^1.0.1"
@ -424,11 +347,6 @@ element-plus@^2.1.9:
memoize-one "^6.0.0" memoize-one "^6.0.0"
normalize-wheel-es "^1.1.2" normalize-wheel-es "^1.1.2"
entities@^4.5.0:
version "4.5.0"
resolved "https://registry.yarnpkg.com/entities/-/entities-4.5.0.tgz#5d268ea5e7113ec74c4d033b79ea5a35a488fb48"
integrity sha512-V0hjH4dGPh9Ao5p0MoRY6BVqtwCjhz6vI5LT8AJ55H+4g9/4vbHx1I54fS0XuclLhDHArPQCiMjDxjaL8fPxhw==
errno@^0.1.1: errno@^0.1.1:
version "0.1.8" version "0.1.8"
resolved "https://registry.npmmirror.com/errno/-/errno-0.1.8.tgz" resolved "https://registry.npmmirror.com/errno/-/errno-0.1.8.tgz"
@ -438,24 +356,24 @@ errno@^0.1.1:
es-define-property@^1.0.1: es-define-property@^1.0.1:
version "1.0.1" version "1.0.1"
resolved "https://registry.yarnpkg.com/es-define-property/-/es-define-property-1.0.1.tgz#983eb2f9a6724e9303f61addf011c72e09e0b0fa" resolved "https://registry.npmjs.org/es-define-property/-/es-define-property-1.0.1.tgz"
integrity sha512-e3nRfgfUZ4rNGL232gUgX06QNyyez04KdjFrF+LTRoOXmrOgFKDg4BCdsjW8EnT69eqdYGmRpJwiPVYNrCaW3g== integrity sha512-e3nRfgfUZ4rNGL232gUgX06QNyyez04KdjFrF+LTRoOXmrOgFKDg4BCdsjW8EnT69eqdYGmRpJwiPVYNrCaW3g==
es-errors@^1.3.0: es-errors@^1.3.0:
version "1.3.0" version "1.3.0"
resolved "https://registry.yarnpkg.com/es-errors/-/es-errors-1.3.0.tgz#05f75a25dab98e4fb1dcd5e1472c0546d5057c8f" resolved "https://registry.npmjs.org/es-errors/-/es-errors-1.3.0.tgz"
integrity sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw== integrity sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw==
es-object-atoms@^1.0.0, es-object-atoms@^1.1.1: es-object-atoms@^1.0.0, es-object-atoms@^1.1.1:
version "1.1.1" version "1.1.1"
resolved "https://registry.yarnpkg.com/es-object-atoms/-/es-object-atoms-1.1.1.tgz#1c4f2c4837327597ce69d2ca190a7fdd172338c1" resolved "https://registry.npmjs.org/es-object-atoms/-/es-object-atoms-1.1.1.tgz"
integrity sha512-FGgH2h8zKNim9ljj7dankFPcICIK9Cp5bm+c2gQSYePhpaG5+esrLODihIorn+Pe6FGJzWhXQotPv73jTaldXA== integrity sha512-FGgH2h8zKNim9ljj7dankFPcICIK9Cp5bm+c2gQSYePhpaG5+esrLODihIorn+Pe6FGJzWhXQotPv73jTaldXA==
dependencies: dependencies:
es-errors "^1.3.0" es-errors "^1.3.0"
es-set-tostringtag@^2.1.0: es-set-tostringtag@^2.1.0:
version "2.1.0" version "2.1.0"
resolved "https://registry.yarnpkg.com/es-set-tostringtag/-/es-set-tostringtag-2.1.0.tgz#f31dbbe0c183b00a6d26eb6325c810c0fd18bd4d" resolved "https://registry.npmjs.org/es-set-tostringtag/-/es-set-tostringtag-2.1.0.tgz"
integrity sha512-j6vWzfrGVfyXxge+O0x5sh6cvxAog0a/4Rdd2K36zCMV5eJ+/+tOAngRO8cODMNWbVRdVlmGZQL2YS3yR8bIUA== integrity sha512-j6vWzfrGVfyXxge+O0x5sh6cvxAog0a/4Rdd2K36zCMV5eJ+/+tOAngRO8cODMNWbVRdVlmGZQL2YS3yR8bIUA==
dependencies: dependencies:
es-errors "^1.3.0" es-errors "^1.3.0"
@ -463,106 +381,11 @@ es-set-tostringtag@^2.1.0:
has-tostringtag "^1.0.2" has-tostringtag "^1.0.2"
hasown "^2.0.2" hasown "^2.0.2"
esbuild-android-64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-android-64/-/esbuild-android-64-0.14.36.tgz#fc5f95ce78c8c3d790fa16bc71bd904f2bb42aa1"
integrity sha512-jwpBhF1jmo0tVCYC/ORzVN+hyVcNZUWuozGcLHfod0RJCedTDTvR4nwlTXdx1gtncDqjk33itjO+27OZHbiavw==
esbuild-android-arm64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-android-arm64/-/esbuild-android-arm64-0.14.36.tgz#44356fbb9f8de82a5cdf11849e011dfb3ad0a8a8"
integrity sha512-/hYkyFe7x7Yapmfv4X/tBmyKnggUmdQmlvZ8ZlBnV4+PjisrEhAvC3yWpURuD9XoB8Wa1d5dGkTsF53pIvpjsg==
esbuild-darwin-64@0.14.36: esbuild-darwin-64@0.14.36:
version "0.14.36" version "0.14.36"
resolved "https://registry.npmmirror.com/esbuild-darwin-64/-/esbuild-darwin-64-0.14.36.tgz" resolved "https://registry.npmmirror.com/esbuild-darwin-64/-/esbuild-darwin-64-0.14.36.tgz"
integrity sha512-kkl6qmV0dTpyIMKagluzYqlc1vO0ecgpviK/7jwPbRDEv5fejRTaBBEE2KxEQbTHcLhiiDbhG7d5UybZWo/1zQ== integrity sha512-kkl6qmV0dTpyIMKagluzYqlc1vO0ecgpviK/7jwPbRDEv5fejRTaBBEE2KxEQbTHcLhiiDbhG7d5UybZWo/1zQ==
esbuild-darwin-arm64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-darwin-arm64/-/esbuild-darwin-arm64-0.14.36.tgz#2a8040c2e465131e5281034f3c72405e643cb7b2"
integrity sha512-q8fY4r2Sx6P0Pr3VUm//eFYKVk07C5MHcEinU1BjyFnuYz4IxR/03uBbDwluR6ILIHnZTE7AkTUWIdidRi1Jjw==
esbuild-freebsd-64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-freebsd-64/-/esbuild-freebsd-64-0.14.36.tgz#d82c387b4d01fe9e8631f97d41eb54f2dbeb68a3"
integrity sha512-Hn8AYuxXXRptybPqoMkga4HRFE7/XmhtlQjXFHoAIhKUPPMeJH35GYEUWGbjteai9FLFvBAjEAlwEtSGxnqWww==
esbuild-freebsd-arm64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-freebsd-arm64/-/esbuild-freebsd-arm64-0.14.36.tgz#e8ce2e6c697da6c7ecd0cc0ac821d47c5ab68529"
integrity sha512-S3C0attylLLRiCcHiJd036eDEMOY32+h8P+jJ3kTcfhJANNjP0TNBNL30TZmEdOSx/820HJFgRrqpNAvTbjnDA==
esbuild-linux-32@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-linux-32/-/esbuild-linux-32-0.14.36.tgz#a4a261e2af91986ea62451f2db712a556cb38a15"
integrity sha512-Eh9OkyTrEZn9WGO4xkI3OPPpUX7p/3QYvdG0lL4rfr73Ap2HAr6D9lP59VMF64Ex01LhHSXwIsFG/8AQjh6eNw==
esbuild-linux-64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-linux-64/-/esbuild-linux-64-0.14.36.tgz#4a9500f9197e2c8fcb884a511d2c9d4c2debde72"
integrity sha512-vFVFS5ve7PuwlfgoWNyRccGDi2QTNkQo/2k5U5ttVD0jRFaMlc8UQee708fOZA6zTCDy5RWsT5MJw3sl2X6KDg==
esbuild-linux-arm64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-linux-arm64/-/esbuild-linux-arm64-0.14.36.tgz#c91c21e25b315464bd7da867365dd1dae14ca176"
integrity sha512-24Vq1M7FdpSmaTYuu1w0Hdhiqkbto1I5Pjyi+4Cdw5fJKGlwQuw+hWynTcRI/cOZxBcBpP21gND7W27gHAiftw==
esbuild-linux-arm@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-linux-arm/-/esbuild-linux-arm-0.14.36.tgz#90e23bca2e6e549affbbe994f80ba3bb6c4d934a"
integrity sha512-NhgU4n+NCsYgt7Hy61PCquEz5aevI6VjQvxwBxtxrooXsxt5b2xtOUXYZe04JxqQo+XZk3d1gcr7pbV9MAQ/Lg==
esbuild-linux-mips64le@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-linux-mips64le/-/esbuild-linux-mips64le-0.14.36.tgz#40e11afb08353ff24709fc89e4db0f866bc131d2"
integrity sha512-hZUeTXvppJN+5rEz2EjsOFM9F1bZt7/d2FUM1lmQo//rXh1RTFYzhC0txn7WV0/jCC7SvrGRaRz0NMsRPf8SIA==
esbuild-linux-ppc64le@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-linux-ppc64le/-/esbuild-linux-ppc64le-0.14.36.tgz#9e8a588c513d06cc3859f9dcc52e5fdfce8a1a5e"
integrity sha512-1Bg3QgzZjO+QtPhP9VeIBhAduHEc2kzU43MzBnMwpLSZ890azr4/A9Dganun8nsqD/1TBcqhId0z4mFDO8FAvg==
esbuild-linux-riscv64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-linux-riscv64/-/esbuild-linux-riscv64-0.14.36.tgz#e578c09b23b3b97652e60e3692bfda628b541f06"
integrity sha512-dOE5pt3cOdqEhaufDRzNCHf5BSwxgygVak9UR7PH7KPVHwSTDAZHDoEjblxLqjJYpc5XaU9+gKJ9F8mp9r5I4A==
esbuild-linux-s390x@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-linux-s390x/-/esbuild-linux-s390x-0.14.36.tgz#3c9dab40d0d69932ffded0fd7317bb403626c9bc"
integrity sha512-g4FMdh//BBGTfVHjF6MO7Cz8gqRoDPzXWxRvWkJoGroKA18G9m0wddvPbEqcQf5Tbt2vSc1CIgag7cXwTmoTXg==
esbuild-netbsd-64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-netbsd-64/-/esbuild-netbsd-64-0.14.36.tgz#e27847f6d506218291619b8c1e121ecd97628494"
integrity sha512-UB2bVImxkWk4vjnP62ehFNZ73lQY1xcnL5ZNYF3x0AG+j8HgdkNF05v67YJdCIuUJpBuTyCK8LORCYo9onSW+A==
esbuild-openbsd-64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-openbsd-64/-/esbuild-openbsd-64-0.14.36.tgz#c94c04c557fae516872a586eae67423da6d2fabb"
integrity sha512-NvGB2Chf8GxuleXRGk8e9zD3aSdRO5kLt9coTQbCg7WMGXeX471sBgh4kSg8pjx0yTXRt0MlrUDnjVYnetyivg==
esbuild-sunos-64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-sunos-64/-/esbuild-sunos-64-0.14.36.tgz#9b79febc0df65a30f1c9bd63047d1675511bf99d"
integrity sha512-VkUZS5ftTSjhRjuRLp+v78auMO3PZBXu6xl4ajomGenEm2/rGuWlhFSjB7YbBNErOchj51Jb2OK8lKAo8qdmsQ==
esbuild-windows-32@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-windows-32/-/esbuild-windows-32-0.14.36.tgz#910d11936c8d2122ffdd3275e5b28d8a4e1240ec"
integrity sha512-bIar+A6hdytJjZrDxfMBUSEHHLfx3ynoEZXx/39nxy86pX/w249WZm8Bm0dtOAByAf4Z6qV0LsnTIJHiIqbw0w==
esbuild-windows-64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-windows-64/-/esbuild-windows-64-0.14.36.tgz#21b4ce8b42a4efc63f4b58ec617f1302448aad26"
integrity sha512-+p4MuRZekVChAeueT1Y9LGkxrT5x7YYJxYE8ZOTcEfeUUN43vktSn6hUNsvxzzATrSgq5QqRdllkVBxWZg7KqQ==
esbuild-windows-arm64@0.14.36:
version "0.14.36"
resolved "https://registry.yarnpkg.com/esbuild-windows-arm64/-/esbuild-windows-arm64-0.14.36.tgz#ba21546fecb7297667d0052d00150de22c044b24"
integrity sha512-fBB4WlDqV1m18EF/aheGYQkQZHfPHiHJSBYzXIo8yKehek+0BtBwo/4PNwKGJ5T0YK0oc8pBKjgwPbzSrPLb+Q==
esbuild@^0.14.27: esbuild@^0.14.27:
version "0.14.36" version "0.14.36"
resolved "https://registry.npmmirror.com/esbuild/-/esbuild-0.14.36.tgz" resolved "https://registry.npmmirror.com/esbuild/-/esbuild-0.14.36.tgz"
@ -601,12 +424,12 @@ estree-walker@^2.0.2:
follow-redirects@^1.15.6: follow-redirects@^1.15.6:
version "1.15.9" version "1.15.9"
resolved "https://registry.yarnpkg.com/follow-redirects/-/follow-redirects-1.15.9.tgz#a604fa10e443bf98ca94228d9eebcc2e8a2c8ee1" resolved "https://registry.npmjs.org/follow-redirects/-/follow-redirects-1.15.9.tgz"
integrity sha512-gew4GsXizNgdoRyqmyfMHyAmXsZDk6mHkSxZFCzW9gwlbtOW44CDtYavM+y+72qD/Vq2l550kMF52DT8fOLJqQ== integrity sha512-gew4GsXizNgdoRyqmyfMHyAmXsZDk6mHkSxZFCzW9gwlbtOW44CDtYavM+y+72qD/Vq2l550kMF52DT8fOLJqQ==
form-data@^4.0.0: form-data@^4.0.0:
version "4.0.2" version "4.0.2"
resolved "https://registry.yarnpkg.com/form-data/-/form-data-4.0.2.tgz#35cabbdd30c3ce73deb2c42d3c8d3ed9ca51794c" resolved "https://registry.npmjs.org/form-data/-/form-data-4.0.2.tgz"
integrity sha512-hGfm/slu0ZabnNt4oaRZ6uREyfCj6P4fT/n6A1rGV+Z0VdGXjfOhVUpkn6qVQONHGIFwmveGXyDs75+nr6FM8w== integrity sha512-hGfm/slu0ZabnNt4oaRZ6uREyfCj6P4fT/n6A1rGV+Z0VdGXjfOhVUpkn6qVQONHGIFwmveGXyDs75+nr6FM8w==
dependencies: dependencies:
asynckit "^0.4.0" asynckit "^0.4.0"
@ -619,19 +442,14 @@ fsevents@~2.3.2:
resolved "https://registry.npmmirror.com/fsevents/-/fsevents-2.3.2.tgz" resolved "https://registry.npmmirror.com/fsevents/-/fsevents-2.3.2.tgz"
integrity sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA== integrity sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==
function-bind@^1.1.1: function-bind@^1.1.1, function-bind@^1.1.2:
version "1.1.1"
resolved "https://registry.npmmirror.com/function-bind/-/function-bind-1.1.1.tgz"
integrity sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A==
function-bind@^1.1.2:
version "1.1.2" version "1.1.2"
resolved "https://registry.yarnpkg.com/function-bind/-/function-bind-1.1.2.tgz#2c02d864d97f3ea6c8830c464cbd11ab6eab7a1c" resolved "https://registry.npmjs.org/function-bind/-/function-bind-1.1.2.tgz"
integrity sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA== integrity sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==
get-intrinsic@^1.2.6: get-intrinsic@^1.2.6:
version "1.3.0" version "1.3.0"
resolved "https://registry.yarnpkg.com/get-intrinsic/-/get-intrinsic-1.3.0.tgz#743f0e3b6964a93a5491ed1bffaae054d7f98d01" resolved "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.3.0.tgz"
integrity sha512-9fSjSaos/fRIVIp+xSJlE6lfwhES7LNtKaCBIamHsjr2na1BiABJPo0mOjjz8GJDURarmCPGqaiVg5mfjb98CQ== integrity sha512-9fSjSaos/fRIVIp+xSJlE6lfwhES7LNtKaCBIamHsjr2na1BiABJPo0mOjjz8GJDURarmCPGqaiVg5mfjb98CQ==
dependencies: dependencies:
call-bind-apply-helpers "^1.0.2" call-bind-apply-helpers "^1.0.2"
@ -647,7 +465,7 @@ get-intrinsic@^1.2.6:
get-proto@^1.0.1: get-proto@^1.0.1:
version "1.0.1" version "1.0.1"
resolved "https://registry.yarnpkg.com/get-proto/-/get-proto-1.0.1.tgz#150b3f2743869ef3e851ec0c49d15b1d14d00ee1" resolved "https://registry.npmjs.org/get-proto/-/get-proto-1.0.1.tgz"
integrity sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g== integrity sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g==
dependencies: dependencies:
dunder-proto "^1.0.1" dunder-proto "^1.0.1"
@ -655,7 +473,7 @@ get-proto@^1.0.1:
gopd@^1.2.0: gopd@^1.2.0:
version "1.2.0" version "1.2.0"
resolved "https://registry.yarnpkg.com/gopd/-/gopd-1.2.0.tgz#89f56b8217bdbc8802bd299df6d7f1081d7e51a1" resolved "https://registry.npmjs.org/gopd/-/gopd-1.2.0.tgz"
integrity sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg== integrity sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg==
graceful-fs@^4.1.2: graceful-fs@^4.1.2:
@ -665,12 +483,12 @@ graceful-fs@^4.1.2:
has-symbols@^1.0.3, has-symbols@^1.1.0: has-symbols@^1.0.3, has-symbols@^1.1.0:
version "1.1.0" version "1.1.0"
resolved "https://registry.yarnpkg.com/has-symbols/-/has-symbols-1.1.0.tgz#fc9c6a783a084951d0b971fe1018de813707a338" resolved "https://registry.npmjs.org/has-symbols/-/has-symbols-1.1.0.tgz"
integrity sha512-1cDNdwJ2Jaohmb3sg4OmKaMBwuC48sYni5HUw2DvsC8LjGTLK9h+eb1X6RyuOHe4hT0ULCW68iomhjUoKUqlPQ== integrity sha512-1cDNdwJ2Jaohmb3sg4OmKaMBwuC48sYni5HUw2DvsC8LjGTLK9h+eb1X6RyuOHe4hT0ULCW68iomhjUoKUqlPQ==
has-tostringtag@^1.0.2: has-tostringtag@^1.0.2:
version "1.0.2" version "1.0.2"
resolved "https://registry.yarnpkg.com/has-tostringtag/-/has-tostringtag-1.0.2.tgz#2cdc42d40bef2e5b4eeab7c01a73c54ce7ab5abc" resolved "https://registry.npmjs.org/has-tostringtag/-/has-tostringtag-1.0.2.tgz"
integrity sha512-NqADB8VjPFLM2V0VvHUewwwsw0ZWBaIdgo+ieHtK3hasLz4qeCRjYcqfB6AQrBggRKppKF8L52/VqdVsO47Dlw== integrity sha512-NqADB8VjPFLM2V0VvHUewwwsw0ZWBaIdgo+ieHtK3hasLz4qeCRjYcqfB6AQrBggRKppKF8L52/VqdVsO47Dlw==
dependencies: dependencies:
has-symbols "^1.0.3" has-symbols "^1.0.3"
@ -684,7 +502,7 @@ has@^1.0.3:
hasown@^2.0.2: hasown@^2.0.2:
version "2.0.2" version "2.0.2"
resolved "https://registry.yarnpkg.com/hasown/-/hasown-2.0.2.tgz#003eaf91be7adc372e84ec59dc37252cedb80003" resolved "https://registry.npmjs.org/hasown/-/hasown-2.0.2.tgz"
integrity sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ== integrity sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ==
dependencies: dependencies:
function-bind "^1.1.2" function-bind "^1.1.2"
@ -735,7 +553,7 @@ lamejs@^1.2.1:
dependencies: dependencies:
use-strict "1.0.1" use-strict "1.0.1"
less@^4.1.2: less@*, less@^4.1.2:
version "4.1.2" version "4.1.2"
resolved "https://registry.npmmirror.com/less/-/less-4.1.2.tgz" resolved "https://registry.npmmirror.com/less/-/less-4.1.2.tgz"
integrity sha512-EoQp/Et7OSOVu0aJknJOtlXZsnr8XE8KwuzTHOLeVSEx8pVWUICc8Q0VYRHgzyjX78nMEyC/oztWFbgyhtNfDA== integrity sha512-EoQp/Et7OSOVu0aJknJOtlXZsnr8XE8KwuzTHOLeVSEx8pVWUICc8Q0VYRHgzyjX78nMEyC/oztWFbgyhtNfDA==
@ -752,7 +570,7 @@ less@^4.1.2:
needle "^2.5.2" needle "^2.5.2"
source-map "~0.6.0" source-map "~0.6.0"
lodash-es@^4.17.15, lodash-es@^4.17.21: lodash-es@*, lodash-es@^4.17.15, lodash-es@^4.17.21:
version "4.17.21" version "4.17.21"
resolved "https://registry.npmmirror.com/lodash-es/-/lodash-es-4.17.21.tgz" resolved "https://registry.npmmirror.com/lodash-es/-/lodash-es-4.17.21.tgz"
integrity sha512-mKnC+QJ9pWVzv+C4/U3rRsHapFfHvQFoFB92e52xeyGMcX6/OlIl78je1u8vePzYZSkkogMPJ2yjxxsb89cxyw== integrity sha512-mKnC+QJ9pWVzv+C4/U3rRsHapFfHvQFoFB92e52xeyGMcX6/OlIl78je1u8vePzYZSkkogMPJ2yjxxsb89cxyw==
@ -762,7 +580,7 @@ lodash-unified@^1.0.2:
resolved "https://registry.npmmirror.com/lodash-unified/-/lodash-unified-1.0.2.tgz" resolved "https://registry.npmmirror.com/lodash-unified/-/lodash-unified-1.0.2.tgz"
integrity sha512-OGbEy+1P+UT26CYi4opY4gebD8cWRDxAT6MAObIVQMiqYdxZr1g3QHWCToVsm31x2NkLS4K3+MC2qInaRMa39g== integrity sha512-OGbEy+1P+UT26CYi4opY4gebD8cWRDxAT6MAObIVQMiqYdxZr1g3QHWCToVsm31x2NkLS4K3+MC2qInaRMa39g==
lodash@^4.17.21: lodash@*, lodash@^4.17.21:
version "4.17.21" version "4.17.21"
resolved "https://registry.npmmirror.com/lodash/-/lodash-4.17.21.tgz" resolved "https://registry.npmmirror.com/lodash/-/lodash-4.17.21.tgz"
integrity sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg== integrity sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==
@ -781,13 +599,6 @@ magic-string@^0.25.7:
dependencies: dependencies:
sourcemap-codec "^1.4.8" sourcemap-codec "^1.4.8"
magic-string@^0.30.11:
version "0.30.17"
resolved "https://registry.yarnpkg.com/magic-string/-/magic-string-0.30.17.tgz#450a449673d2460e5bbcfba9a61916a1714c7453"
integrity sha512-sNPKHvyjVf7gyjwS4xGTaW/mCnF8wnjtifKBEhxfZ7E/S8tQ0rssrwGNn6q8JH/ohItJfSQp9mBtQYuTlH5QnA==
dependencies:
"@jridgewell/sourcemap-codec" "^1.5.0"
make-dir@^2.1.0: make-dir@^2.1.0:
version "2.1.0" version "2.1.0"
resolved "https://registry.npmmirror.com/make-dir/-/make-dir-2.1.0.tgz" resolved "https://registry.npmmirror.com/make-dir/-/make-dir-2.1.0.tgz"
@ -798,7 +609,7 @@ make-dir@^2.1.0:
math-intrinsics@^1.1.0: math-intrinsics@^1.1.0:
version "1.1.0" version "1.1.0"
resolved "https://registry.yarnpkg.com/math-intrinsics/-/math-intrinsics-1.1.0.tgz#a0dd74be81e2aa5c2f27e65ce283605ee4e2b7f9" resolved "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz"
integrity sha512-/IXtbwEk5HTPyEwyKX6hGkYXxM9nbj64B+ilVJnC/R6B0pH5G4V3b0pVbL7DBj4tkhBAppbQUlf6F6Xl9LHu1g== integrity sha512-/IXtbwEk5HTPyEwyKX6hGkYXxM9nbj64B+ilVJnC/R6B0pH5G4V3b0pVbL7DBj4tkhBAppbQUlf6F6Xl9LHu1g==
memoize-one@^6.0.0: memoize-one@^6.0.0:
@ -808,12 +619,12 @@ memoize-one@^6.0.0:
mime-db@1.52.0: mime-db@1.52.0:
version "1.52.0" version "1.52.0"
resolved "https://registry.yarnpkg.com/mime-db/-/mime-db-1.52.0.tgz#bbabcdc02859f4987301c856e3387ce5ec43bf70" resolved "https://registry.npmjs.org/mime-db/-/mime-db-1.52.0.tgz"
integrity sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg== integrity sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg==
mime-types@^2.1.12: mime-types@^2.1.12:
version "2.1.35" version "2.1.35"
resolved "https://registry.yarnpkg.com/mime-types/-/mime-types-2.1.35.tgz#381a871b62a734450660ae3deee44813f70d959a" resolved "https://registry.npmjs.org/mime-types/-/mime-types-2.1.35.tgz"
integrity sha512-ZDY+bPm5zTTF+YpCrAU9nK0UgICYPT0QtT1NZWFv4s++TNkcgVaT0g6+4R2uI4MjQjzysHB1zxuWL50hzaeXiw== integrity sha512-ZDY+bPm5zTTF+YpCrAU9nK0UgICYPT0QtT1NZWFv4s++TNkcgVaT0g6+4R2uI4MjQjzysHB1zxuWL50hzaeXiw==
dependencies: dependencies:
mime-db "1.52.0" mime-db "1.52.0"
@ -825,7 +636,7 @@ mime@^1.4.1:
moment@^2.27.0: moment@^2.27.0:
version "2.29.4" version "2.29.4"
resolved "https://registry.yarnpkg.com/moment/-/moment-2.29.4.tgz#3dbe052889fe7c1b2ed966fcb3a77328964ef108" resolved "https://registry.npmjs.org/moment/-/moment-2.29.4.tgz"
integrity sha512-5LC9SOxjSc2HF6vO2CyuTDNivEdoz2IvyJJGj6X8DJ0eFyfszE0QiEd+iXmBvUP3WHxSjFH/vIsA0EN00cgr8w== integrity sha512-5LC9SOxjSc2HF6vO2CyuTDNivEdoz2IvyJJGj6X8DJ0eFyfszE0QiEd+iXmBvUP3WHxSjFH/vIsA0EN00cgr8w==
ms@^2.1.1: ms@^2.1.1:
@ -833,14 +644,9 @@ ms@^2.1.1:
resolved "https://registry.npmmirror.com/ms/-/ms-2.1.3.tgz" resolved "https://registry.npmmirror.com/ms/-/ms-2.1.3.tgz"
integrity sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA== integrity sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==
nanoid@^3.3.1:
version "3.3.2"
resolved "https://registry.npmmirror.com/nanoid/-/nanoid-3.3.2.tgz"
integrity sha512-CuHBogktKwpm5g2sRgv83jEy2ijFzBwMoYA60orPDR7ynsLijJDqgsi4RDGj3OJpy3Ieb+LYwiRmIOGyytgITA==
nanoid@^3.3.8: nanoid@^3.3.8:
version "3.3.9" version "3.3.9"
resolved "https://registry.yarnpkg.com/nanoid/-/nanoid-3.3.9.tgz#e0097d8e026b3343ff053e9ccd407360a03f503a" resolved "https://registry.npmjs.org/nanoid/-/nanoid-3.3.9.tgz"
integrity sha512-SppoicMGpZvbF1l3z4x7No3OlIjP7QJvC9XR7AhZr1kL133KHnKPztkKDc+Ir4aJ/1VhTySrtKhrsycmrMQfvg== integrity sha512-SppoicMGpZvbF1l3z4x7No3OlIjP7QJvC9XR7AhZr1kL133KHnKPztkKDc+Ir4aJ/1VhTySrtKhrsycmrMQfvg==
nanopop@^2.1.0: nanopop@^2.1.0:
@ -877,14 +683,9 @@ path-parse@^1.0.7:
resolved "https://registry.npmmirror.com/path-parse/-/path-parse-1.0.7.tgz" resolved "https://registry.npmmirror.com/path-parse/-/path-parse-1.0.7.tgz"
integrity sha512-LDJzPVEEEPR+y48z93A0Ed0yXb8pAByGWo/k5YYdYgpY2/2EsOsksJrq7lOHxryrVOn1ejG6oAp8ahvOIQD8sw== integrity sha512-LDJzPVEEEPR+y48z93A0Ed0yXb8pAByGWo/k5YYdYgpY2/2EsOsksJrq7lOHxryrVOn1ejG6oAp8ahvOIQD8sw==
picocolors@^1.0.0:
version "1.0.0"
resolved "https://registry.npmmirror.com/picocolors/-/picocolors-1.0.0.tgz"
integrity sha512-1fygroTLlHu66zi26VoTDv8yRgm0Fccecssto+MhsZ0D/DGW2sm8E8AjW7NU5VVTRt5GxbeZ5qBuJr+HyLYkjQ==
picocolors@^1.1.1: picocolors@^1.1.1:
version "1.1.1" version "1.1.1"
resolved "https://registry.yarnpkg.com/picocolors/-/picocolors-1.1.1.tgz#3d321af3eab939b083c8f929a1d12cda81c26b6b" resolved "https://registry.npmjs.org/picocolors/-/picocolors-1.1.1.tgz"
integrity sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA== integrity sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==
pify@^4.0.1: pify@^4.0.1:
@ -892,18 +693,9 @@ pify@^4.0.1:
resolved "https://registry.npmmirror.com/pify/-/pify-4.0.1.tgz" resolved "https://registry.npmmirror.com/pify/-/pify-4.0.1.tgz"
integrity sha512-uB80kBFb/tfd68bVleG9T5GGsGPjJrLAUpR5PZIrhBnIaRTQRjqdJSsIKkOP6OAIFbj7GOrcudc5pNjZ+geV2g== integrity sha512-uB80kBFb/tfd68bVleG9T5GGsGPjJrLAUpR5PZIrhBnIaRTQRjqdJSsIKkOP6OAIFbj7GOrcudc5pNjZ+geV2g==
postcss@^8.1.10: postcss@^8.1.10, postcss@^8.4.13:
version "8.4.12"
resolved "https://registry.npmmirror.com/postcss/-/postcss-8.4.12.tgz"
integrity sha512-lg6eITwYe9v6Hr5CncVbK70SoioNQIq81nsaG86ev5hAidQvmOeETBqs7jm43K2F5/Ley3ytDtriImV6TpNiSg==
dependencies:
nanoid "^3.3.1"
picocolors "^1.0.0"
source-map-js "^1.0.2"
postcss@^8.4.13, postcss@^8.4.48:
version "8.5.3" version "8.5.3"
resolved "https://registry.yarnpkg.com/postcss/-/postcss-8.5.3.tgz#1463b6f1c7fb16fe258736cba29a2de35237eafb" resolved "https://registry.npmjs.org/postcss/-/postcss-8.5.3.tgz"
integrity sha512-dle9A3yYxlBSrt8Fu+IpjGT8SY8hN0mlaA6GY8t0P5PjIOZemULz/E2Bnm/2dcUOena75OTNkHI76uZBNUUq3A== integrity sha512-dle9A3yYxlBSrt8Fu+IpjGT8SY8hN0mlaA6GY8t0P5PjIOZemULz/E2Bnm/2dcUOena75OTNkHI76uZBNUUq3A==
dependencies: dependencies:
nanoid "^3.3.8" nanoid "^3.3.8"
@ -912,7 +704,7 @@ postcss@^8.4.13, postcss@^8.4.48:
proxy-from-env@^1.1.0: proxy-from-env@^1.1.0:
version "1.1.0" version "1.1.0"
resolved "https://registry.yarnpkg.com/proxy-from-env/-/proxy-from-env-1.1.0.tgz#e102f16ca355424865755d2c9e8ea4f24d58c3e2" resolved "https://registry.npmjs.org/proxy-from-env/-/proxy-from-env-1.1.0.tgz"
integrity sha512-D+zkORCbA9f1tdWRK0RaCR3GPv50cMxcrz4X8k5LTSUD1Dkw47mKJEZQNunItRTkWwgtaUSo1RVFRIG9ZXiFYg== integrity sha512-D+zkORCbA9f1tdWRK0RaCR3GPv50cMxcrz4X8k5LTSUD1Dkw47mKJEZQNunItRTkWwgtaUSo1RVFRIG9ZXiFYg==
prr@~1.0.1: prr@~1.0.1:
@ -920,10 +712,10 @@ prr@~1.0.1:
resolved "https://registry.npmmirror.com/prr/-/prr-1.0.1.tgz" resolved "https://registry.npmmirror.com/prr/-/prr-1.0.1.tgz"
integrity sha512-yPw4Sng1gWghHQWj0B3ZggWUm4qVbPwPFcRG8KyxiU7J2OHFSoEHKS+EZ3fv5l1t9CyCiop6l/ZYeWbrgoQejw== integrity sha512-yPw4Sng1gWghHQWj0B3ZggWUm4qVbPwPFcRG8KyxiU7J2OHFSoEHKS+EZ3fv5l1t9CyCiop6l/ZYeWbrgoQejw==
regenerator-runtime@^0.13.4: regenerator-runtime@^0.14.0:
version "0.13.9" version "0.14.1"
resolved "https://registry.npmmirror.com/regenerator-runtime/-/regenerator-runtime-0.13.9.tgz" resolved "https://registry.yarnpkg.com/regenerator-runtime/-/regenerator-runtime-0.14.1.tgz#356ade10263f685dda125100cd862c1db895327f"
integrity sha512-p3VT+cOEgxFsRRA9X4lkI1E+k2/CtnKtU4gcxyaCUreilL/vqI6CdZ3wxVUx3UOUg+gnUOQQcRI7BmSI656MYA== integrity sha512-dYnhHh0nJoMfnkZs6GmmhFknAGRrLznOu5nc9ML+EJxGvrx6H7teuevqVqCuPcPK//3eDrrjQhehXVx9cnkGdw==
resize-observer-polyfill@^1.5.1: resize-observer-polyfill@^1.5.1:
version "1.5.1" version "1.5.1"
@ -940,9 +732,9 @@ resolve@^1.22.0:
supports-preserve-symlinks-flag "^1.0.0" supports-preserve-symlinks-flag "^1.0.0"
"rollup@>=2.59.0 <2.78.0": "rollup@>=2.59.0 <2.78.0":
version "2.77.3" version "2.70.1"
resolved "https://registry.yarnpkg.com/rollup/-/rollup-2.77.3.tgz#8f00418d3a2740036e15deb653bed1a90ee0cc12" resolved "https://registry.npmmirror.com/rollup/-/rollup-2.70.1.tgz"
integrity sha512-/qxNTG7FbmefJWoeeYJFbHehJ2HNWnjkAFRKzWN/45eNBBF/r8lo992CwcJXEzyVxs5FmfId+vTSTQDb+bxA+g== integrity sha512-CRYsI5EuzLbXdxC6RnYhOuRdtz4bhejPMSWjsFLfVM/7w/85n2szZv6yExqUXsBdz5KT8eoubeyDUDjhLHEslA==
optionalDependencies: optionalDependencies:
fsevents "~2.3.2" fsevents "~2.3.2"
@ -973,14 +765,9 @@ shallow-equal@^1.0.0:
resolved "https://registry.npmmirror.com/shallow-equal/-/shallow-equal-1.2.1.tgz" resolved "https://registry.npmmirror.com/shallow-equal/-/shallow-equal-1.2.1.tgz"
integrity sha512-S4vJDjHHMBaiZuT9NPb616CSmLf618jawtv3sufLl6ivK8WocjAo58cXwbRV1cgqxH0Qbv+iUt6m05eqEa2IRA== integrity sha512-S4vJDjHHMBaiZuT9NPb616CSmLf618jawtv3sufLl6ivK8WocjAo58cXwbRV1cgqxH0Qbv+iUt6m05eqEa2IRA==
source-map-js@^1.0.2: source-map-js@^1.2.1:
version "1.0.2"
resolved "https://registry.npmmirror.com/source-map-js/-/source-map-js-1.0.2.tgz"
integrity sha512-R0XvVJ9WusLiqTCEiGCmICCMplcCkIwwR11mOSD9CR5u+IXYdiseeEuXCVAjS54zqwkLcPNnmU4OeJ6tUrWhDw==
source-map-js@^1.2.0, source-map-js@^1.2.1:
version "1.2.1" version "1.2.1"
resolved "https://registry.yarnpkg.com/source-map-js/-/source-map-js-1.2.1.tgz#1ce5650fddd87abc099eda37dcff024c2667ae46" resolved "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.1.tgz"
integrity sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA== integrity sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==
source-map@^0.6.1, source-map@~0.6.0: source-map@^0.6.1, source-map@~0.6.0:
@ -1008,9 +795,9 @@ use-strict@1.0.1:
resolved "https://registry.npmmirror.com/use-strict/-/use-strict-1.0.1.tgz" resolved "https://registry.npmmirror.com/use-strict/-/use-strict-1.0.1.tgz"
integrity sha512-IeiWvvEXfW5ltKVMkxq6FvNf2LojMKvB2OCeja6+ct24S1XOmQw2dGr2JyndwACWAGJva9B7yPHwAmeA9QCqAQ== integrity sha512-IeiWvvEXfW5ltKVMkxq6FvNf2LojMKvB2OCeja6+ct24S1XOmQw2dGr2JyndwACWAGJva9B7yPHwAmeA9QCqAQ==
vite@^2.9.13: vite@^2.5.10, vite@^2.9.13:
version "2.9.18" version "2.9.18"
resolved "https://registry.yarnpkg.com/vite/-/vite-2.9.18.tgz#74e2a83b29da81e602dac4c293312cc575f091c7" resolved "https://registry.npmjs.org/vite/-/vite-2.9.18.tgz"
integrity sha512-sAOqI5wNM9QvSEE70W3UGMdT8cyEn0+PmJMTFvTB8wB0YbYUWw3gUbY62AOyrXosGieF2htmeLATvNxpv/zNyQ== integrity sha512-sAOqI5wNM9QvSEE70W3UGMdT8cyEn0+PmJMTFvTB8wB0YbYUWw3gUbY62AOyrXosGieF2htmeLATvNxpv/zNyQ==
dependencies: dependencies:
esbuild "^0.14.27" esbuild "^0.14.27"
@ -1032,7 +819,7 @@ vue-types@^3.0.0:
dependencies: dependencies:
is-plain-object "3.0.1" is-plain-object "3.0.1"
vue@^3.2.25: "vue@^2.6.0 || ^3.2.0", vue@^3.0.0, "vue@^3.0.0-0 || ^2.6.0", vue@^3.2.0, vue@^3.2.25, vue@>=3.0.3, vue@>=3.1.0, vue@3.2.32:
version "3.2.32" version "3.2.32"
resolved "https://registry.npmmirror.com/vue/-/vue-3.2.32.tgz" resolved "https://registry.npmmirror.com/vue/-/vue-3.2.32.tgz"
integrity sha512-6L3jKZApF042OgbCkh+HcFeAkiYi3Lovi8wNhWqIK98Pi5efAMLZzRHgi91v+60oIRxdJsGS9sTMsb+yDpY8Eg== integrity sha512-6L3jKZApF042OgbCkh+HcFeAkiYi3Lovi8wNhWqIK98Pi5efAMLZzRHgi91v+60oIRxdJsGS9sTMsb+yDpY8Eg==
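
The lockfile hunks above move several `resolved` URLs between registry hosts (`registry.yarnpkg.com`, `registry.npmjs.org`, `registry.npmmirror.com`). A minimal sketch for tallying which hosts a lockfile points at may help when reviewing such changes. It assumes yarn v1 style `resolved "URL"` lines; `count_registry_hosts` and `SAMPLE` are hypothetical names for illustration, not part of this repository.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches lines such as:   resolved "https://registry.npmjs.org/...tgz"
RESOLVED_RE = re.compile(r'^\s*resolved\s+"([^"]+)"')


def count_registry_hosts(lock_text: str) -> Counter:
    """Count how many `resolved` entries point at each registry host."""
    hosts = Counter()
    for line in lock_text.splitlines():
        match = RESOLVED_RE.match(line)
        if match:
            hosts[urlparse(match.group(1)).netloc] += 1
    return hosts


# Small sample assembled from entries shown in the diff (illustrative only).
SAMPLE = '''\
mime-db@1.52.0:
  version "1.52.0"
  resolved "https://registry.npmjs.org/mime-db/-/mime-db-1.52.0.tgz"

ms@^2.1.1:
  resolved "https://registry.npmmirror.com/ms/-/ms-2.1.3.tgz"

moment@^2.27.0:
  version "2.29.4"
  resolved "https://registry.npmjs.org/moment/-/moment-2.29.4.tgz"
'''

if __name__ == "__main__":
    for host, count in count_registry_hosts(SAMPLE).most_common():
        print(host, count)
```

Running the same function over the full `yarn.lock` before and after a merge gives a quick view of whether the registry mix changed.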

@ -51,7 +51,7 @@ wget -c https://paddlespeech.cdn.bcebos.com/PaddleAudio/zh.wav
paddlespeech_server start --help paddlespeech_server start --help
``` ```
Arguments: Arguments:
- `config_file`: yaml file of the app, defalut: `./conf/application.yaml` - `config_file`: yaml file of the app, default: `./conf/application.yaml`
- `log_file`: log file. Default: `./log/paddlespeech.log` - `log_file`: log file. Default: `./log/paddlespeech.log`
Output: Output:
@ -307,7 +307,7 @@ wget -c https://paddlespeech.cdn.bcebos.com/PaddleAudio/zh.wav
- Command Line - Command Line
**Note:** The default deployment of the server is on the 'CPU' device, which can be deployed on the 'GPU' by modifying the 'device' parameter in the service configuration file. **Note:** The default deployment of the server is on the 'CPU' device, which can be deployed on the 'GPU' by modifying the 'device' parameter in the service configuration file.
```bash ```bash
In PaddleSpeech/demos/streaming_asr_server directory to lanuch punctuation service In PaddleSpeech/demos/streaming_asr_server directory to launch punctuation service
paddlespeech_server start --config_file conf/punc_application.yaml paddlespeech_server start --config_file conf/punc_application.yaml
``` ```
@ -414,7 +414,7 @@ wget -c https://paddlespeech.cdn.bcebos.com/PaddleAudio/zh.wav
By default, each server is deployed on the 'CPU' device and speech recognition and punctuation prediction can be deployed on different 'GPU' by modifying the' device 'parameter in the service configuration file respectively. By default, each server is deployed on the 'CPU' device and speech recognition and punctuation prediction can be deployed on different 'GPU' by modifying the' device 'parameter in the service configuration file respectively.
We use `streaming_ asr_server.py` and `punc_server.py` two services to lanuch streaming speech recognition and punctuation prediction services respectively. And the `websocket_client.py` script can be used to call streaming speech recognition and punctuation prediction services at the same time. We use `streaming_ asr_server.py` and `punc_server.py` two services to launch streaming speech recognition and punctuation prediction services respectively. And the `websocket_client.py` script can be used to call streaming speech recognition and punctuation prediction services at the same time.
### 1. Start two server ### 1. Start two server
@ -584,7 +584,7 @@ bash server.sh
By default, each server is deployed on the 'CPU' device and speech recognition and punctuation prediction can be deployed on different 'GPU' by modifying the' device 'parameter in the service configuration file respectively. By default, each server is deployed on the 'CPU' device and speech recognition and punctuation prediction can be deployed on different 'GPU' by modifying the' device 'parameter in the service configuration file respectively.
We use `streaming_ asr_server.py` and `punc_server.py` two services to lanuch streaming speech recognition and punctuation prediction services respectively. And the `websocket_client_srt.py` script can be used to call streaming speech recognition and punctuation prediction services at the same time, and will generate the corresponding subtitle (.srt format). We use `streaming_ asr_server.py` and `punc_server.py` two services to launch streaming speech recognition and punctuation prediction services respectively. And the `websocket_client_srt.py` script can be used to call streaming speech recognition and punctuation prediction services at the same time, and will generate the corresponding subtitle (.srt format).
**need to install ffmpeg before running this script** **need to install ffmpeg before running this script**

@ -52,7 +52,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
paddlespeech_server start --help paddlespeech_server start --help
``` ```
Arguments: Arguments:
- `config_file`: yaml file of the app, defalut: ./conf/tts_online_application.yaml - `config_file`: yaml file of the app, default: ./conf/tts_online_application.yaml
- `log_file`: log file. Default: ./log/paddlespeech.log - `log_file`: log file. Default: ./log/paddlespeech.log
Output: Output:
@ -180,7 +180,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
paddlespeech_server start --help paddlespeech_server start --help
``` ```
Arguments: Arguments:
- `config_file`: yaml file of the app, defalut: ./conf/tts_online_application.yaml - `config_file`: yaml file of the app, default: ./conf/tts_online_application.yaml
- `log_file`: log file. Default: ./log/paddlespeech.log - `log_file`: log file. Default: ./log/paddlespeech.log
Output: Output:

@ -75,7 +75,7 @@ class TritonPythonModel:
def initialize(self, args): def initialize(self, args):
"""`initialize` is called only once when the model is being loaded. """`initialize` is called only once when the model is being loaded.
Implementing `initialize` function is optional. This function allows Implementing `initialize` function is optional. This function allows
the model to intialize any state associated with this model. the model to initialize any state associated with this model.
Parameters Parameters
---------- ----------
args : dict args : dict

@ -99,7 +99,7 @@ The input of this demo should be a text of the specific language that can be pas
Arguments: Arguments:
- `input`(required): Input text to generate.. - `input`(required): Input text to generate..
- `am`: Acoustic model type of tts task. Default: `fastspeech2_csmsc`. - `am`: Acoustic model type of tts task. Default: `fastspeech2_csmsc`.
- `am_config`: Config of acoustic model. Use deault config when it is None. Default: `None`. - `am_config`: Config of acoustic model. Use default config when it is None. Default: `None`.
- `am_ckpt`: Acoustic model checkpoint. Use pretrained model when it is None. Default: `None`. - `am_ckpt`: Acoustic model checkpoint. Use pretrained model when it is None. Default: `None`.
- `am_stat`: Mean and standard deviation used to normalize spectrogram when training acoustic model. Default: `None`. - `am_stat`: Mean and standard deviation used to normalize spectrogram when training acoustic model. Default: `None`.
- `phones_dict`: Phone vocabulary file. Default: `None`. - `phones_dict`: Phone vocabulary file. Default: `None`.
@ -107,7 +107,7 @@ The input of this demo should be a text of the specific language that can be pas
- `speaker_dict`: speaker id map file. Default: `None`. - `speaker_dict`: speaker id map file. Default: `None`.
- `spk_id`: Speaker id for multi speaker acoustic model. Default: `0`. - `spk_id`: Speaker id for multi speaker acoustic model. Default: `0`.
- `voc`: Vocoder type of tts task. Default: `pwgan_csmsc`. - `voc`: Vocoder type of tts task. Default: `pwgan_csmsc`.
- `voc_config`: Config of vocoder. Use deault config when it is None. Default: `None`. - `voc_config`: Config of vocoder. Use default config when it is None. Default: `None`.
- `voc_ckpt`: Vocoder checkpoint. Use pretrained model when it is None. Default: `None`. - `voc_ckpt`: Vocoder checkpoint. Use pretrained model when it is None. Default: `None`.
- `voc_stat`: Mean and standard deviation used to normalize spectrogram when training vocoder. Default: `None`. - `voc_stat`: Mean and standard deviation used to normalize spectrogram when training vocoder. Default: `None`.
- `lang`: Language of tts task. Default: `zh`. - `lang`: Language of tts task. Default: `zh`.

@ -42,7 +42,7 @@ Whisper model trained by OpenAI whisper https://github.com/openai/whisper
- `model`: Model type of asr task. Default: `whisper-large`. - `model`: Model type of asr task. Default: `whisper-large`.
- `task`: Output type. Default: `transcribe`. - `task`: Output type. Default: `transcribe`.
- `lang`: Model language. Default: ``. Use `en` to choice English-only model. Now [medium,base,small,tiny] size can support English-only. - `lang`: Model language. Default: ``. Use `en` to choice English-only model. Now [medium,base,small,tiny] size can support English-only.
- `size`: Model size for decode. Defalut: `large`. Now can support [large,medium,base,small,tiny]. - `size`: Model size for decode. Default: `large`. Now can support [large,medium,base,small,tiny].
- `language`: Set decode language. Default: `None`. Forcibly set the recognized language, which is determined by the model itself by default. - `language`: Set decode language. Default: `None`. Forcibly set the recognized language, which is determined by the model itself by default.
- `sample_rate`: Sample rate of the model. Default: `16000`. Other sampling rates are not supported now. - `sample_rate`: Sample rate of the model. Default: `16000`. Other sampling rates are not supported now.
- `config`: Config of asr task. Use pretrained model when it is None. Default: `None`. - `config`: Config of asr task. Use pretrained model when it is None. Default: `None`.
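
The argument list above states two constraints: `size` must be one of `large`, `medium`, `base`, `small`, `tiny`, and the English-only variant (`lang` set to `en`) is available only for `medium`, `base`, `small`, and `tiny`. The sketch below encodes just those two rules as a pre-flight check. `check_whisper_options` is a hypothetical helper, not a PaddleSpeech function, and it makes no claim about how the toolkit validates its arguments internally.

```python
# Values taken from the argument descriptions above.
SUPPORTED_SIZES = ("large", "medium", "base", "small", "tiny")
ENGLISH_ONLY_SIZES = ("medium", "base", "small", "tiny")


def check_whisper_options(size: str = "large", lang: str = "") -> None:
    """Raise ValueError if the size/lang combination is not documented as supported."""
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported size {size!r}; expected one of {SUPPORTED_SIZES}")
    if lang == "en" and size not in ENGLISH_ONLY_SIZES:
        raise ValueError(
            f"English-only models are documented for {ENGLISH_ONLY_SIZES}, not {size!r}"
        )


if __name__ == "__main__":
    check_whisper_options()                       # documented defaults pass
    check_whisper_options(size="tiny", lang="en")  # English-only tiny model passes
    try:
        check_whisper_options(size="large", lang="en")
    except ValueError as err:
        print("rejected:", err)
```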

@ -303,7 +303,7 @@ The experimental codes in PaddleSpeech TTS are generally organized as follows:
. .
├── README.md (help information) ├── README.md (help information)
├── conf ├── conf
│ └── default.yaml (defalut config) │ └── default.yaml (default config)
├── local ├── local
│ ├── preprocess.sh (script to call data preprocessing.py) │ ├── preprocess.sh (script to call data preprocessing.py)
│ ├── synthesize.sh (script to call synthesis.py) │ ├── synthesize.sh (script to call synthesis.py)

@ -26,8 +26,8 @@ if [ ${seed} != 0 ]; then
export FLAGS_cudnn_deterministic=True export FLAGS_cudnn_deterministic=True
fi fi
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then
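
# The environment setup in this hunk (deterministic cuDNN when a seed is given, plus the `naive_best_fit` allocator strategy) can also be prepared from Python before launching a training subprocess. The following is a rough sketch under that assumption; `build_train_env` is a hypothetical helper and is not part of the script being patched.

```python
import os
from typing import Dict, Optional


def build_train_env(seed: int = 0, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the flags this script exports."""
    env = dict(os.environ if base is None else base)
    if seed != 0:
        # Mirrors: export FLAGS_cudnn_deterministic=True
        env["FLAGS_cudnn_deterministic"] = "True"
    # Mirrors: export FLAGS_allocator_strategy=naive_best_fit
    env["FLAGS_allocator_strategy"] = "naive_best_fit"
    return env


if __name__ == "__main__":
    env = build_train_env(seed=1234, base={})
    for key in sorted(env):
        print(f"{key}={env[key]}")
    # The dict can then be passed as `env=` to subprocess.run(...) when
    # launching the training command.
```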

@ -35,8 +35,8 @@ echo ${ips_config}
mkdir -p exp mkdir -p exp
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -54,8 +54,8 @@ elif [ "${cmd_backend}" = sge ]; then
# "sbatch" (Slurm) # "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf. # The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" for the "partion" for your environment. # You must change "-p cpu" and "-p gpu" for the "partition" for your environment.
# To know the "partion" names, type "sinfo". # To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*" # You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}". # The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".
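
The hunks in this commit are spelling fixes in comments, READMEs, and config annotations. A small sketch of how the same corrections could be applied mechanically is shown below. The mapping lists only misspellings that appear in this diff; `TYPO_FIXES` and `fix_typos` are hypothetical names, and the word-boundary matching is an assumption about what is safe to replace, not the method the author used.

```python
import re

# Misspelling -> correction, collected from the hunks in this diff.
TYPO_FIXES = {
    "defalut": "default",
    "deault": "default",
    "lanuch": "launch",
    "intialize": "initialize",
    "memeory": "memory",
    "exhaused": "exhausted",
    "partion": "partition",
    "postnset": "postnet",
    "leyers": "layers",
    "paramters": "parameters",
}

# Whole-word, case-insensitive match so longer identifiers are left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in TYPO_FIXES) + r")\b",
    re.IGNORECASE,
)


def fix_typos(text: str) -> str:
    """Replace known misspellings, keeping a leading capital if present."""

    def _swap(match: "re.Match[str]") -> str:
        word = match.group(0)
        fixed = TYPO_FIXES[word.lower()]
        return fixed.capitalize() if word[0].isupper() else fixed

    return _PATTERN.sub(_swap, text)


if __name__ == "__main__":
    print(fix_typos("# default memeory allocator strategy may case gpu training hang"))
    print(fix_typos("- `size`: Model size for decode. Defalut: `large`."))
```

Note that the sketch only touches words in the mapping, so other wording in the same lines is left unchanged.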

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
encoder_normalize_before: True # whether to perform layer normalization before the input encoder_normalize_before: True # whether to perform layer normalization before the input
@ -66,14 +66,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: true # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: true # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy
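
# The pitch and energy predictor settings above follow a regular naming pattern (`*_layers`, `*_chans`, `*_kernel_size`, `*_dropout`), which makes them easy to sanity-check before training. The sketch below is illustrative only: `validate_predictor_conf` is a hypothetical helper, the values are copied from this hunk, and the odd-kernel-size rule is an assumption (it suits symmetric padding) rather than something the config states.

```python
from typing import Dict, List, Union

Number = Union[int, float]

# Values copied from the config hunk above.
PREDICTOR_CONF: Dict[str, Number] = {
    "pitch_predictor_layers": 5,
    "pitch_predictor_chans": 256,
    "pitch_predictor_kernel_size": 5,
    "pitch_predictor_dropout": 0.5,
    "energy_predictor_layers": 2,
    "energy_predictor_chans": 256,
    "energy_predictor_kernel_size": 3,
    "energy_predictor_dropout": 0.5,
}


def validate_predictor_conf(conf: Dict[str, Number]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, value in conf.items():
        if key.endswith("_dropout"):
            if not 0.0 <= value < 1.0:
                problems.append(f"{key}={value}: dropout should be in [0, 1)")
        elif key.endswith("_kernel_size"):
            # Assumption: odd kernel sizes, so padding can be symmetric.
            if value < 1 or value % 2 == 0:
                problems.append(f"{key}={value}: expected a positive odd integer")
        elif key.endswith(("_layers", "_chans")):
            if not isinstance(value, int) or value < 1:
                problems.append(f"{key}={value}: expected a positive integer")
    return problems


if __name__ == "__main__":
    print(validate_predictor_conf(PREDICTOR_CONF))
```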

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -60,14 +60,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -60,14 +60,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -60,14 +60,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -39,7 +39,7 @@ generator_params:
use_additional_convs: True # Whether to use additional conv layer in residual blocks. use_additional_convs: True # Whether to use additional conv layer in residual blocks.
bias: True # Whether to use bias parameter in conv. bias: True # Whether to use bias parameter in conv.
nonlinear_activation: "leakyrelu" # Nonlinear activation type. nonlinear_activation: "leakyrelu" # Nonlinear activation type.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
@ -77,7 +77,7 @@ discriminator_params:
max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers. max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers.
bias: True # Whether to use bias parameter in conv layer. bias: True # Whether to use bias parameter in conv layer.
nonlinear_activation: "leakyrelu" # Nonlinear activation. nonlinear_activation: "leakyrelu" # Nonlinear activation.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
use_spectral_norm: False # Whether to apply spectral normalization. use_spectral_norm: False # Whether to apply spectral normalization.

@ -45,7 +45,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -63,14 +63,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -60,14 +60,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: true # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: true # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -99,8 +99,10 @@ pwg_baker_ckpt_0.4
``` ```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`. `./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` selects the vocoder used during synthesis: stages `0` to `4` select {`pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`}, respectively.
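The stage-to-vocoder mapping can be written out as a small lookup. This is only an illustrative sketch: `vocoder_for_stage` is a hypothetical helper, not part of `./local/synthesize.sh`, and the names it prints are the informal ones used in the sentence above rather than the exact `--voc` identifiers.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): map a --stage value
# to the vocoder name listed in the README sentence above.
vocoder_for_stage() {
    case "$1" in
        0) echo "pwgan" ;;
        1) echo "multi band melgan" ;;
        2) echo "style melgan" ;;
        3) echo "hifigan" ;;
        4) echo "wavernn" ;;
        *) echo "unsupported stage: $1" >&2; return 1 ;;
    esac
}

# Print the whole table for the stages documented for synthesize.sh.
for stage in 0 1 2 3 4; do
    printf 'stage %s -> %s\n' "${stage}" "$(vocoder_for_stage "${stage}")"
done
```

To switch vocoders, the same stage number would be passed as `--stage <n>` in the `./local/synthesize.sh` command shown above.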
```text ```text
usage: synthesize.py [-h] usage: synthesize.py [-h]
[--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}] [--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}]
@ -146,9 +148,12 @@ optional arguments:
output dir. output dir.
``` ```
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file. `./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` selects the vocoder used during synthesis: stages `0`, `1`, `3`, and `4` select {`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}, respectively.
```text ```text
usage: synthesize_e2e.py [-h] usage: synthesize_e2e.py [-h]
[--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}] [--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]

@ -27,13 +27,15 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan # synthesize, vocoder is pwgan by default stage 0
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 # stage 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan # synthesize_e2e, vocoder is pwgan by default stage 0
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 # stage 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

@ -116,8 +116,10 @@ pwg_baker_ckpt_0.4
``` ```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`. `./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` selects the vocoder used during synthesis: stages `0` to `4` select {`pwgan`, `multi band melgan`, `style melgan`, `hifigan`, `wavernn`}, respectively.
```text ```text
usage: synthesize.py [-h] usage: synthesize.py [-h]
[--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}] [--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}]
@ -164,8 +166,10 @@ optional arguments:
``` ```
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file. `./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` selects the vocoder used during synthesis: stages `0`, `1`, `3`, and `4` select {`pwgan`, `multi band melgan`, `hifigan`, `wavernn`}, respectively.
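Because stage `2` is not listed for `synthesize_e2e.sh`, a wrapper script might want to check the stage before calling it. The following is a minimal sketch under that assumption; `e2e_stage_supported` is a hypothetical function name and does not exist in the repository.

```bash
#!/usr/bin/env bash
# Hypothetical guard (not part of the repository): succeed only for the
# stages that the README lists for synthesize_e2e.sh.
e2e_stage_supported() {
    case "$1" in
        0|1|3|4) return 0 ;;
        *) return 1 ;;
    esac
}

# Report which stages in the 0-4 range are listed for the e2e script.
for stage in 0 1 2 3 4; do
    if e2e_stage_supported "${stage}"; then
        echo "stage ${stage}: listed for synthesize_e2e.sh"
    else
        echo "stage ${stage}: not listed for synthesize_e2e.sh"
    fi
done
```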
```text ```text
usage: synthesize_e2e.py [-h] usage: synthesize_e2e.py [-h]
[--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}] [--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]

@ -27,13 +27,15 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default # synthesize, vocoder is pwgan by default stage 0
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 # use stage 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default # synthesize_e2e, vocoder is pwgan by default stage 0
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 # use stage 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

@ -43,7 +43,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -65,14 +65,14 @@ model:
cnn_decoder_embedding_dim: 256 cnn_decoder_embedding_dim: 256
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
encoder_normalize_before: True # whether to perform layer normalization before the input encoder_normalize_before: True # whether to perform layer normalization before the input
@ -66,14 +66,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -60,14 +60,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -28,11 +28,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default # synthesize, vocoder is pwgan by default stage 0
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 # use stage 1-4 to select the vocoder to use {multi band melgan, style melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default # synthesize_e2e, vocoder is pwgan by default stage 0
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 # use stage 1,3,4 to select the vocoder to use {multi band melgan, hifigan, wavernn}
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi

@ -8,13 +8,13 @@ FLAGS_allocator_strategy=naive_best_fit \
FLAGS_fraction_of_gpu_memory_to_use=0.01 \ FLAGS_fraction_of_gpu_memory_to_use=0.01 \
python3 ${BIN_DIR}/../../synthesize_e2e.py \ python3 ${BIN_DIR}/../../synthesize_e2e.py \
--am=fastspeech2_csmsc \ --am=fastspeech2_csmsc \
--am_config=${config_path} \ --am_config=fastspeech2_nosil_baker_ckpt_0.4/default.yaml \
--am_ckpt=${train_output_path}/checkpoints/${ckpt_name} \ --am_ckpt=fastspeech2_nosil_baker_ckpt_0.4/snapshot_iter_76000.pdz \
--am_stat=dump/train/speech_stats.npy \ --am_stat=fastspeech2_nosil_baker_ckpt_0.4/speech_stats.npy \
--voc=pwgan_csmsc \ --voc=pwgan_csmsc \
--voc_config=pwg_baker_ckpt_0.4/pwg_default.yaml \ --voc_config=${config_path} \
--voc_ckpt=pwg_baker_ckpt_0.4/pwg_snapshot_iter_400000.pdz \ --voc_ckpt=${train_output_path}/checkpoints/${ckpt_name} \
--voc_stat=pwg_baker_ckpt_0.4/pwg_stats.npy \ --voc_stat=dump/train/feats_stats.npy \
--lang=zh \ --lang=zh \
--text=${BIN_DIR}/../../assets/sentences.txt \ --text=${BIN_DIR}/../../assets/sentences.txt \
--output_dir=${train_output_path}/test_e2e \ --output_dir=${train_output_path}/test_e2e \

@ -86,7 +86,7 @@ Download pretrained MultiBand MelGAN model from [mb_melgan_csmsc_ckpt_0.1.1.zip]
```bash ```bash
unzip mb_melgan_csmsc_ckpt_0.1.1.zip unzip mb_melgan_csmsc_ckpt_0.1.1.zip
``` ```
HiFiGAN checkpoint contains files listed below. MultiBand MelGAN checkpoint contains files listed below.
```text ```text
mb_melgan_csmsc_ckpt_0.1.1 mb_melgan_csmsc_ckpt_0.1.1
├── default.yaml # default config used to train MultiBand MelGAN ├── default.yaml # default config used to train MultiBand MelGAN

@ -4,7 +4,7 @@
# This configuration requires ~ 8GB memory and will finish within 7 days on Titan V. # This configuration requires ~ 8GB memory and will finish within 7 days on Titan V.
# This configuration is based on full-band MelGAN but the hop size and sampling # This configuration is based on full-band MelGAN but the hop size and sampling
# rate is different from the paper (16kHz vs 24kHz). The number of iteraions # rate is different from the paper (16kHz vs 24kHz). The number of iterations
# is not shown in the paper so currently we train 1M iterations (not sure enough # is not shown in the paper so currently we train 1M iterations (not sure enough
# to converge). # to converge).

@ -4,7 +4,7 @@
# This configuration requires ~ 8GB memory and will finish within 7 days on Titan V. # This configuration requires ~ 8GB memory and will finish within 7 days on Titan V.
# This configuration is based on full-band MelGAN but the hop size and sampling # This configuration is based on full-band MelGAN but the hop size and sampling
# rate is different from the paper (16kHz vs 24kHz). The number of iteraions # rate is different from the paper (16kHz vs 24kHz). The number of iterations
# is not shown in the paper so currently we train 1M iterations (not sure enough # is not shown in the paper so currently we train 1M iterations (not sure enough
# to converge). # to converge).

@ -38,7 +38,7 @@ generator_params:
use_additional_convs: True # Whether to use additional conv layer in residual blocks. use_additional_convs: True # Whether to use additional conv layer in residual blocks.
bias: True # Whether to use bias parameter in conv. bias: True # Whether to use bias parameter in conv.
nonlinear_activation: "leakyrelu" # Nonlinear activation type. nonlinear_activation: "leakyrelu" # Nonlinear activation type.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
@ -76,7 +76,7 @@ discriminator_params:
max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers. max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers.
bias: True # Whether to use bias parameter in conv layer. bias: True # Whether to use bias parameter in conv layer.
nonlinear_activation: "leakyrelu" # Nonlinear activation. nonlinear_activation: "leakyrelu" # Nonlinear activation.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
use_spectral_norm: False # Whether to apply spectral normalization. use_spectral_norm: False # Whether to apply spectral normalization.

@ -38,7 +38,7 @@ generator_params:
use_additional_convs: True # Whether to use additional conv layer in residual blocks. use_additional_convs: True # Whether to use additional conv layer in residual blocks.
bias: True # Whether to use bias parameter in conv. bias: True # Whether to use bias parameter in conv.
nonlinear_activation: "leakyrelu" # Nonlinear activation type. nonlinear_activation: "leakyrelu" # Nonlinear activation type.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
@ -76,7 +76,7 @@ discriminator_params:
max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers. max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers.
bias: True # Whether to use bias parameter in conv layer. bias: True # Whether to use bias parameter in conv layer.
nonlinear_activation: "leakyrelu" # Nonlinear activation. nonlinear_activation: "leakyrelu" # Nonlinear activation.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
use_spectral_norm: False # Whether to apply spectral normalization. use_spectral_norm: False # Whether to apply spectral normalization.

@ -42,7 +42,7 @@ generator_params:
use_additional_convs: True # Whether to use additional conv layer in residual blocks. use_additional_convs: True # Whether to use additional conv layer in residual blocks.
bias: True # Whether to use bias parameter in conv. bias: True # Whether to use bias parameter in conv.
nonlinear_activation: "leakyrelu" # Nonlinear activation type. nonlinear_activation: "leakyrelu" # Nonlinear activation type.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
@ -83,7 +83,7 @@ discriminator_params:
max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers. max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers.
bias: True # Whether to use bias parameter in conv layer. bias: True # Whether to use bias parameter in conv layer.
nonlinear_activation: "leakyrelu" # Nonlinear activation. nonlinear_activation: "leakyrelu" # Nonlinear activation.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
use_spectral_norm: False # Whether to apply spectral normalization. use_spectral_norm: False # Whether to apply spectral normalization.

@ -26,8 +26,8 @@ if [ ${seed} != 0 ]; then
export FLAGS_cudnn_deterministic=True export FLAGS_cudnn_deterministic=True
fi fi
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -54,8 +54,8 @@ elif [ "${cmd_backend}" = sge ]; then
# "sbatch" (Slurm) # "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf. # The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" for the "partion" for your environment. # You must change "-p cpu" and "-p gpu" for the "partition" for your environment.
# To know the "partion" names, type "sinfo". # To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*" # You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}". # The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".

@ -29,8 +29,8 @@ fi
# export FLAGS_cudnn_exhaustive_search=true # export FLAGS_cudnn_exhaustive_search=true
# export FLAGS_conv_workspace_size_limit=4000 # export FLAGS_conv_workspace_size_limit=4000
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -54,8 +54,8 @@ elif [ "${cmd_backend}" = sge ]; then
# "sbatch" (Slurm) # "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf. # The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" for the "partion" for your environment. # You must change "-p cpu" and "-p gpu" for the "partition" for your environment.
# To know the "partion" names, type "sinfo". # To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*" # You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}". # The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".

@ -26,8 +26,8 @@ if [ ${seed} != 0 ]; then
export FLAGS_cudnn_deterministic=True export FLAGS_cudnn_deterministic=True
fi fi
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -54,8 +54,8 @@ elif [ "${cmd_backend}" = sge ]; then
# "sbatch" (Slurm) # "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf. # The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" for the "partion" for your environment. # You must change "-p cpu" and "-p gpu" for the "partition" for your environment.
# To know the "partion" names, type "sinfo". # To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*" # You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}". # The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".

@ -54,8 +54,8 @@ elif [ "${cmd_backend}" = sge ]; then
# "sbatch" (Slurm) # "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf. # The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" for the "partion" for your environment. # You must change "-p cpu" and "-p gpu" for the "partition" for your environment.
# To know the "partion" names, type "sinfo". # To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*" # You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}". # The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".

@ -54,8 +54,8 @@ elif [ "${cmd_backend}" = sge ]; then
# "sbatch" (Slurm) # "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf. # The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" for the "partion" for your environment. # You must change "-p cpu" and "-p gpu" for the "partition" for your environment.
# To know the "partion" names, type "sinfo". # To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*" # You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}". # The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".

@ -34,7 +34,7 @@ model: # keyword arguments for the selected model
dunits: 1024 # number of decoder ff units dunits: 1024 # number of decoder ff units
positionwise_layer_type: conv1d # type of position-wise layer positionwise_layer_type: conv1d # type of position-wise layer
positionwise_conv_kernel_size: 1 # kernel size of position wise conv layer positionwise_conv_kernel_size: 1 # kernel size of position wise conv layer
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding

@ -105,8 +105,10 @@ pwg_ljspeech_ckpt_0.5
``` ```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`. `./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` selects the vocoder used during synthesis: `0` uses `pwgan` and `1` uses `hifigan`.
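The `--stage` mapping described above can be sketched as a small shell helper. This is only an illustration, not the contents of `./local/synthesize.sh`: the function name `select_vocoder` and its error handling are hypothetical, and only the `0` to `pwgan` and `1` to `hifigan` mapping comes from the note above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): maps a --stage value
# to the vocoder name stated in the README note.
select_vocoder() {
    case "$1" in
        0) echo "pwgan" ;;      # stage 0: pwgan vocoder
        1) echo "hifigan" ;;    # stage 1: hifigan vocoder
        *) echo "unsupported stage: $1" >&2; return 1 ;;
    esac
}

stage=0
vocoder="$(select_vocoder "${stage}")" || exit 1
echo "stage ${stage} -> vocoder ${vocoder}"
```

Rejecting unknown values explicitly, rather than silently falling through, is a design choice of this sketch; the real script may behave differently.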
```text ```text
usage: synthesize.py [-h] usage: synthesize.py [-h]
[--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}] [--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}]
@ -153,8 +155,10 @@ optional arguments:
``` ```
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file. `./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` selects the vocoder used during synthesis: `0` uses `pwgan` and `1` uses `hifigan`.
```text ```text
usage: synthesize_e2e.py [-h] usage: synthesize_e2e.py [-h]
[--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}] [--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -60,14 +60,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -27,13 +27,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default # synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default # synthesize_e2e, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

@ -38,7 +38,7 @@ generator_params:
use_additional_convs: True # Whether to use additional conv layer in residual blocks. use_additional_convs: True # Whether to use additional conv layer in residual blocks.
bias: True # Whether to use bias parameter in conv. bias: True # Whether to use bias parameter in conv.
nonlinear_activation: "leakyrelu" # Nonlinear activation type. nonlinear_activation: "leakyrelu" # Nonlinear activation type.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
@ -76,7 +76,7 @@ discriminator_params:
max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers. max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers.
bias: True # Whether to use bias parameter in conv layer." bias: True # Whether to use bias parameter in conv layer."
nonlinear_activation: "leakyrelu" # Nonlinear activation. nonlinear_activation: "leakyrelu" # Nonlinear activation.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
use_spectral_norm: False # Whether to apply spectral normalization. use_spectral_norm: False # Whether to apply spectral normalization.

@ -54,8 +54,8 @@ elif [ "${cmd_backend}" = sge ]; then
# "sbatch" (Slurm) # "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf. # The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" for the "partion" for your environment. # You must change "-p cpu" and "-p gpu" for the "partition" for your environment.
# To know the "partion" names, type "sinfo". # To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*" # You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}". # The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".

@ -68,14 +68,14 @@ model:
duration_predictor_dropout_rate: 0.5 # dropout rate in energy predictor duration_predictor_dropout_rate: 0.5 # dropout rate in energy predictor
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -38,7 +38,7 @@ generator_params:
use_additional_convs: True # Whether to use additional conv layer in residual blocks. use_additional_convs: True # Whether to use additional conv layer in residual blocks.
bias: True # Whether to use bias parameter in conv. bias: True # Whether to use bias parameter in conv.
nonlinear_activation: "leakyrelu" # Nonlinear activation type. nonlinear_activation: "leakyrelu" # Nonlinear activation type.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
@ -76,7 +76,7 @@ discriminator_params:
max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers. max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers.
bias: True # Whether to use bias parameter in conv layer." bias: True # Whether to use bias parameter in conv layer."
nonlinear_activation: "leakyrelu" # Nonlinear activation. nonlinear_activation: "leakyrelu" # Nonlinear activation.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
use_spectral_norm: False # Whether to apply spectral normalization. use_spectral_norm: False # Whether to apply spectral normalization.

@ -38,7 +38,7 @@ generator_params:
use_additional_convs: True # Whether to use additional conv layer in residual blocks. use_additional_convs: True # Whether to use additional conv layer in residual blocks.
bias: True # Whether to use bias parameter in conv. bias: True # Whether to use bias parameter in conv.
nonlinear_activation: "leakyrelu" # Nonlinear activation type. nonlinear_activation: "leakyrelu" # Nonlinear activation type.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
@ -76,7 +76,7 @@ discriminator_params:
max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers. max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers.
bias: True # Whether to use bias parameter in conv layer." bias: True # Whether to use bias parameter in conv layer."
nonlinear_activation: "leakyrelu" # Nonlinear activation. nonlinear_activation: "leakyrelu" # Nonlinear activation.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
use_spectral_norm: False # Whether to apply spectral normalization. use_spectral_norm: False # Whether to apply spectral normalization.

@ -97,7 +97,7 @@ def test_full_scores_words():
if w not in model: if w not in model:
print('"{0}" is an OOV'.format(w)) print('"{0}" is an OOV'.format(w))
oov.append(w) oov.append(w)
# zh_giga.no_cna_cmn.prune01244.klm is chinese charactor LM # zh_giga.no_cna_cmn.prune01244.klm is chinese character LM
assert oov == ["盘点", "不怕", "网站", "", "", "海淘", "向来", "便宜", "保真", assert oov == ["盘点", "不怕", "网站", "", "", "海淘", "向来", "便宜", "保真",
""], 'error oov' ""], 'error oov'

@ -24,7 +24,7 @@ mkdir -p ${TARGET_DIR}
#prepare data #prepare data
if [ ${stage} -le -1 ] && [ ${stop_stage} -ge -1 ]; then if [ ${stage} -le -1 ] && [ ${stop_stage} -ge -1 ]; then
if [ ! -d "${MAIN_ROOT}/dataset/tal_cs/TALCS_corpus" ]; then if [ ! -d "${MAIN_ROOT}/dataset/tal_cs/TALCS_corpus" ]; then
echo "${MAIN_ROOT}/dataset/tal_cs/TALCS_corpus does not exist. Please donwload tal_cs data and unpack it from https://ai.100tal.com/dataset first." echo "${MAIN_ROOT}/dataset/tal_cs/TALCS_corpus does not exist. Please download tal_cs data and unpack it from https://ai.100tal.com/dataset first."
echo "data md5 reference: 4c879b3c9c05365fc9dee1fc68713afe" echo "data md5 reference: 4c879b3c9c05365fc9dee1fc68713afe"
exit exit
fi fi

@ -35,8 +35,8 @@ echo ${ips_config}
mkdir -p exp mkdir -p exp
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -54,8 +54,8 @@ elif [ "${cmd_backend}" = sge ]; then
# "sbatch" (Slurm) # "sbatch" (Slurm)
elif [ "${cmd_backend}" = slurm ]; then elif [ "${cmd_backend}" = slurm ]; then
# The default setting is written in conf/slurm.conf. # The default setting is written in conf/slurm.conf.
# You must change "-p cpu" and "-p gpu" for the "partion" for your environment. # You must change "-p cpu" and "-p gpu" for the "partition" for your environment.
# To know the "partion" names, type "sinfo". # To know the "partition" names, type "sinfo".
# You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*" # You can use "--gpu * " by default for slurm and it is interpreted as "--gres gpu:*"
# The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}". # The devices are allocated exclusively using "${CUDA_VISIBLE_DEVICES}".

@ -19,8 +19,8 @@ if [ ${seed} != 0 ]; then
export FLAGS_cudnn_deterministic=True export FLAGS_cudnn_deterministic=True
fi fi
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -32,8 +32,8 @@ fi
mkdir -p exp mkdir -p exp
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -34,8 +34,8 @@ fi
mkdir -p exp mkdir -p exp
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -85,9 +85,12 @@ hifigan_vctk_ckpt_0.2.0
``` ```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`. `./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` controls the vocoder model during synthesis; it can be `0`, which uses the `hifigan` model as vocoder.
## Speech Synthesis and Speech Editing ## Speech Synthesis and Speech Editing
### Prepare ### Prepare
**prepare aligner** **prepare aligner**
```bash ```bash

@ -27,10 +27,11 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan # synthesize, vocoder is hifigan by default
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 # synthesize, task_name is speech synthesize by default stage 0, stage 1 will use speech edit as taskname
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi

@ -108,8 +108,10 @@ pwg_vctk_ckpt_0.1.1
``` ```
`./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`. `./local/synthesize.sh` calls `${BIN_DIR}/../synthesize.py`, which can synthesize waveform from `metadata.jsonl`.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` selects the vocoder used during synthesis: `0` uses `pwgan` and `1` uses `hifigan`.
```text ```text
usage: synthesize.py [-h] usage: synthesize.py [-h]
[--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}] [--am {speedyspeech_csmsc,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech,tacotron2_aishell3}]
@ -156,8 +158,10 @@ optional arguments:
``` ```
`./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file. `./local/synthesize_e2e.sh` calls `${BIN_DIR}/../synthesize_e2e.py`, which can synthesize waveform from text file.
```bash ```bash
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name}
``` ```
`--stage` selects the vocoder used during synthesis: `0` uses `pwgan` and `1` uses `hifigan`.
```text ```text
usage: synthesize_e2e.py [-h] usage: synthesize_e2e.py [-h]
[--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}] [--am {speedyspeech_csmsc,speedyspeech_aishell3,fastspeech2_csmsc,fastspeech2_ljspeech,fastspeech2_aishell3,fastspeech2_vctk,tacotron2_csmsc,tacotron2_ljspeech}]

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -60,14 +60,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -27,13 +27,13 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
fi fi
if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then if [ ${stage} -le 2 ] && [ ${stop_stage} -ge 2 ]; then
# synthesize, vocoder is pwgan by default # synthesize, vocoder is pwgan by default stage 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
# synthesize_e2e, vocoder is pwgan by default # synthesize_e2e, vocoder is pwgan by default 0, stage 1 will use hifigan as vocoder
CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${conf_path} ${train_output_path} ${ckpt_name} || exit -1 CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh --stage 0 ${conf_path} ${train_output_path} ${ckpt_name} || exit -1
fi fi
if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then if [ ${stage} -le 4 ] && [ ${stop_stage} -ge 4 ]; then

@ -39,7 +39,7 @@ generator_params:
use_additional_convs: True # Whether to use additional conv layer in residual blocks. use_additional_convs: True # Whether to use additional conv layer in residual blocks.
bias: True # Whether to use bias parameter in conv. bias: True # Whether to use bias parameter in conv.
nonlinear_activation: "leakyrelu" # Nonlinear activation type. nonlinear_activation: "leakyrelu" # Nonlinear activation type.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
@ -77,7 +77,7 @@ discriminator_params:
max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers. max_downsample_channels: 1024 # Maximum number of channels in downsampling conv layers.
bias: True # Whether to use bias parameter in conv layer." bias: True # Whether to use bias parameter in conv layer."
nonlinear_activation: "leakyrelu" # Nonlinear activation. nonlinear_activation: "leakyrelu" # Nonlinear activation.
nonlinear_activation_params: # Nonlinear activation paramters. nonlinear_activation_params: # Nonlinear activation parameters.
negative_slope: 0.1 negative_slope: 0.1
use_weight_norm: True # Whether to apply weight normalization. use_weight_norm: True # Whether to apply weight normalization.
use_spectral_norm: False # Whether to apply spectral normalization. use_spectral_norm: False # Whether to apply spectral normalization.

@ -32,8 +32,8 @@ def main(args, config):
seed_everything(config.seed) seed_everything(config.seed)
# stage 1: generate the voxceleb csv file # stage 1: generate the voxceleb csv file
# Note: this may occurs c++ execption, but the program will execute fine # Note: this may occurs c++ exception, but the program will execute fine
# so we ignore the execption # so we ignore the exception
# we explicitly pass the vox2 base path to data prepare and generate the audio info # we explicitly pass the vox2 base path to data prepare and generate the audio info
logger.info("start to generate the voxceleb dataset info") logger.info("start to generate the voxceleb dataset info")
train_dataset = VoxCeleb( train_dataset = VoxCeleb(

@ -35,8 +35,8 @@ echo ${ips_config}
mkdir -p exp mkdir -p exp
# default memeory allocator strategy may case gpu training hang # default memory allocator strategy may case gpu training hang
# for no OOM raised when memory exhaused # for no OOM raised when memory exhausted
export FLAGS_allocator_strategy=naive_best_fit export FLAGS_allocator_strategy=naive_best_fit
if [ ${ngpu} == 0 ]; then if [ ${ngpu} == 0 ]; then

@ -42,7 +42,7 @@ model:
duration_predictor_layers: 2 # number of layers of duration predictor duration_predictor_layers: 2 # number of layers of duration predictor
duration_predictor_chans: 256 # number of channels of duration predictor duration_predictor_chans: 256 # number of channels of duration predictor
duration_predictor_kernel_size: 3 # filter size of duration predictor duration_predictor_kernel_size: 3 # filter size of duration predictor
postnet_layers: 5 # number of layers of postnset postnet_layers: 5 # number of layers of postnet
postnet_filts: 5 # filter size of conv layers in postnet postnet_filts: 5 # filter size of conv layers in postnet
postnet_chans: 256 # number of channels of conv layers in postnet postnet_chans: 256 # number of channels of conv layers in postnet
use_scaled_pos_enc: True # whether to use scaled positional encoding use_scaled_pos_enc: True # whether to use scaled positional encoding
@ -60,14 +60,14 @@ model:
transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer transformer_dec_attn_dropout_rate: 0.2 # dropout rate for transformer decoder attention layer
pitch_predictor_layers: 5 # number of conv layers in pitch predictor pitch_predictor_layers: 5 # number of conv layers in pitch predictor
pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor pitch_predictor_chans: 256 # number of channels of conv layers in pitch predictor
pitch_predictor_kernel_size: 5 # kernel size of conv leyers in pitch predictor pitch_predictor_kernel_size: 5 # kernel size of conv layers in pitch predictor
pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor pitch_predictor_dropout: 0.5 # dropout rate in pitch predictor
pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch pitch_embed_kernel_size: 1 # kernel size of conv embedding layer for pitch
pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch pitch_embed_dropout: 0.0 # dropout rate after conv embedding layer for pitch
stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder stop_gradient_from_pitch_predictor: True # whether to stop the gradient from pitch predictor to encoder
energy_predictor_layers: 2 # number of conv layers in energy predictor energy_predictor_layers: 2 # number of conv layers in energy predictor
energy_predictor_chans: 256 # number of channels of conv layers in energy predictor energy_predictor_chans: 256 # number of channels of conv layers in energy predictor
energy_predictor_kernel_size: 3 # kernel size of conv leyers in energy predictor energy_predictor_kernel_size: 3 # kernel size of conv layers in energy predictor
energy_predictor_dropout: 0.5 # dropout rate in energy predictor energy_predictor_dropout: 0.5 # dropout rate in energy predictor
energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy energy_embed_kernel_size: 1 # kernel size of conv embedding layer for energy
energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy energy_embed_dropout: 0.0 # dropout rate after conv embedding layer for energy

@ -79,7 +79,7 @@ def pad_sequence(sequences: List[paddle.Tensor],
# assuming trailing dimensions and type of all the Tensors # assuming trailing dimensions and type of all the Tensors
# in sequences are same and fetching those from sequences[0] # in sequences are same and fetching those from sequences[0]
max_size = paddle.shape(sequences[0]) max_size = paddle.shape(sequences[0])
# (TODO Hui Zhang): slice not supprot `end==start` # (TODO Hui Zhang): slice not support `end==start`
# trailing_dims = max_size[1:] # trailing_dims = max_size[1:]
trailing_dims = tuple( trailing_dims = tuple(
max_size[1:].numpy().tolist()) if sequences[0].ndim >= 2 else () max_size[1:].numpy().tolist()) if sequences[0].ndim >= 2 else ()
@ -93,7 +93,7 @@ def pad_sequence(sequences: List[paddle.Tensor],
length = tensor.shape[0] length = tensor.shape[0]
# use index notation to prevent duplicate references to the tensor # use index notation to prevent duplicate references to the tensor
if batch_first: if batch_first:
# TODO (Hui Zhang): set_value op not supprot `end==start` # TODO (Hui Zhang): set_value op not support `end==start`
# TODO (Hui Zhang): set_value op not support int16 # TODO (Hui Zhang): set_value op not support int16
# TODO (Hui Zhang): set_varbase 2 rank not support [0,0,...] # TODO (Hui Zhang): set_varbase 2 rank not support [0,0,...]
# out_tensor[i, :length, ...] = tensor # out_tensor[i, :length, ...] = tensor
@ -102,7 +102,7 @@ def pad_sequence(sequences: List[paddle.Tensor],
else: else:
out_tensor[i, length] = tensor out_tensor[i, length] = tensor
else: else:
# TODO (Hui Zhang): set_value op not supprot `end==start` # TODO (Hui Zhang): set_value op not support `end==start`
# out_tensor[:length, i, ...] = tensor # out_tensor[:length, i, ...] = tensor
if length != 0: if length != 0:
out_tensor[:length, i] = tensor out_tensor[:length, i] = tensor
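
The `pad_sequence` hunks above touch only comments, but the surrounding code shows the padding logic: each sequence is written into a pre-filled output, and zero-length sequences are skipped because the slice assignment does not support `end==start`. A minimal pure-Python sketch of that idea follows. It assumes 1-D sequences and plain lists instead of `paddle.Tensor`; the name `pad_sequence_sketch` is hypothetical and not part of the repository.

```python
from typing import List, Sequence


def pad_sequence_sketch(sequences: Sequence[Sequence[float]],
                        batch_first: bool = False,
                        padding_value: float = 0.0) -> List[List[float]]:
    """Pad 1-D sequences to a common length (illustration only)."""
    max_len = max((len(seq) for seq in sequences), default=0)
    batch = len(sequences)
    if batch_first:
        # Layout (batch, time)
        out = [[padding_value] * max_len for _ in range(batch)]
    else:
        # Layout (time, batch)
        out = [[padding_value] * batch for _ in range(max_len)]
    for i, seq in enumerate(sequences):
        length = len(seq)
        # Mirror the `if length != 0` guard in the hunk: nothing to write
        # for an empty sequence, so its slot keeps the padding value.
        if length == 0:
            continue
        for t in range(length):
            if batch_first:
                out[i][t] = seq[t]
            else:
                out[t][i] = seq[t]
    return out
```

With `batch_first=True` each row is one padded sequence; with the default, each row is one time step across the batch.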

@ -79,7 +79,7 @@ class ASRExecutor(BaseExecutor):
'--config', '--config',
type=str, type=str,
default=None, default=None,
help='Config of asr task. Use deault config when it is None.') help='Config of asr task. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
'--decode_method', '--decode_method',
type=str, type=str,

@ -51,7 +51,7 @@ class CLSExecutor(BaseExecutor):
'--config', '--config',
type=str, type=str,
default=None, default=None,
help='Config of cls task. Use deault config when it is None.') help='Config of cls task. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
'--ckpt_path', '--ckpt_path',
type=str, type=str,

@ -58,7 +58,7 @@ class KWSExecutor(BaseExecutor):
'--config', '--config',
type=str, type=str,
default=None, default=None,
help='Config of kws task. Use deault config when it is None.') help='Config of kws task. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
'--ckpt_path', '--ckpt_path',
type=str, type=str,

@ -76,7 +76,7 @@ class SSLExecutor(BaseExecutor):
'--config', '--config',
type=str, type=str,
default=None, default=None,
help='Config of asr task. Use deault config when it is None.') help='Config of asr task. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
'--decode_method', '--decode_method',
type=str, type=str,

@ -82,7 +82,7 @@ class STExecutor(BaseExecutor):
"--config", "--config",
type=str, type=str,
default=None, default=None,
help="Config of st task. Use deault config when it is None.") help="Config of st task. Use default config when it is None.")
self.parser.add_argument( self.parser.add_argument(
"--ckpt_path", "--ckpt_path",
type=str, type=str,

@ -63,7 +63,7 @@ class TextExecutor(BaseExecutor):
'--config', '--config',
type=str, type=str,
default=None, default=None,
help='Config of cls task. Use deault config when it is None.') help='Config of cls task. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
'--ckpt_path', '--ckpt_path',
type=str, type=str,

@ -90,7 +90,7 @@ class TTSExecutor(BaseExecutor):
'--am_config', '--am_config',
type=str, type=str,
default=None, default=None,
help='Config of acoustic model. Use deault config when it is None.') help='Config of acoustic model. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
'--am_ckpt', '--am_ckpt',
type=str, type=str,
@ -148,7 +148,7 @@ class TTSExecutor(BaseExecutor):
'--voc_config', '--voc_config',
type=str, type=str,
default=None, default=None,
help='Config of voc. Use deault config when it is None.') help='Config of voc. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
'--voc_ckpt', '--voc_ckpt',
type=str, type=str,

@ -82,7 +82,7 @@ class VectorExecutor(BaseExecutor):
'--config', '--config',
type=str, type=str,
default=None, default=None,
help='Config of asr task. Use deault config when it is None.') help='Config of asr task. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
"--device", "--device",
type=str, type=str,

@ -96,7 +96,7 @@ class WhisperExecutor(BaseExecutor):
'--config', '--config',
type=str, type=str,
default=None, default=None,
help='Config of asr task. Use deault config when it is None.') help='Config of asr task. Use default config when it is None.')
self.parser.add_argument( self.parser.add_argument(
'--decode_method', '--decode_method',
type=str, type=str,
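
The executor hunks above all correct the same help string for a `--config` option that defaults to `None`. As a rough illustration of what "use default config when it is None" can look like with `argparse`, here is a sketch. `DEFAULT_CONFIG` and `resolve_config` are hypothetical names for this example, and the fallback path is invented; the diff does not show how the executors actually resolve the default.

```python
import argparse

# Hypothetical fallback used only in this sketch.
DEFAULT_CONFIG = "default.yaml"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        '--config',
        type=str,
        default=None,
        help='Config of asr task. Use default config when it is None.')
    return parser


def resolve_config(argv):
    """Return the user-supplied config path, or the fallback when omitted."""
    args = build_parser().parse_args(argv)
    return args.config if args.config is not None else DEFAULT_CONFIG
```

Checking `is not None` rather than truthiness keeps an explicitly passed empty string distinguishable from an omitted flag.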

@ -62,7 +62,7 @@ def create_manifest(data_dir, manifest_path_prefix):
if line == '': if line == '':
continue continue
audio_id, text = line.split(' ', 1) audio_id, text = line.split(' ', 1)
# remove withespace, charactor text # remove withespace, character text
text = ''.join(text.split()) text = ''.join(text.split())
transcript_dict[audio_id] = text transcript_dict[audio_id] = text

@ -65,7 +65,7 @@ def create_manifest(data_dir, manifest_path_prefix):
if line == '': if line == '':
continue continue
audio_id, text = line.split(' ', 1) audio_id, text = line.split(' ', 1)
# remove withespace, charactor text # remove withespace, character text
text = ''.join(text.split()) text = ''.join(text.split())
transcript_dict[audio_id] = text transcript_dict[audio_id] = text
@ -159,7 +159,7 @@ def check_dataset(data_dir):
if line == '': if line == '':
continue continue
audio_id, text = line.split(' ', 1) audio_id, text = line.split(' ', 1)
# remove withespace, charactor text # remove withespace, character text
text = ''.join(text.split()) text = ''.join(text.split())
transcript_dict[audio_id] = text transcript_dict[audio_id] = text
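
The three `create_manifest` / `check_dataset` hunks share one parsing step: split a transcript line at the first space into an ID and text, then strip all whitespace from the text so it becomes a plain character sequence. A small sketch of that step, assuming the line is stripped before parsing (the stripping is not visible in the hunks) and using the hypothetical helper name `parse_transcript_line`:

```python
from typing import Optional, Tuple


def parse_transcript_line(line: str) -> Optional[Tuple[str, str]]:
    """Split '<audio_id> <text>' and drop all whitespace inside the text."""
    line = line.strip()  # assumption: not shown in the diff
    if line == '':
        return None
    # Split only once so the text may itself contain spaces.
    audio_id, text = line.split(' ', 1)
    # str.split() with no arguments splits on any whitespace run,
    # so joining the pieces removes spaces, tabs and newlines alike.
    text = ''.join(text.split())
    return audio_id, text
```

A line that holds only an ID with no following space would raise `ValueError` at the unpacking step, in this sketch as in the code shown in the hunks.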

@ -2106,7 +2106,7 @@ g2pw_onnx_models = {
}, },
'1.1': { '1.1': {
'url': 'url':
'https://paddlespeech.cdn.bcebos.com/Parakeet/released_models/g2p/G2PWModel_1.1.zip', 'https://paddlespeech.cdn.bcebos.com/Parakeet/released_models/g2p/new/G2PWModel_1.1.zip',
'md5': 'md5':
'f8b60501770bff92ed6ce90860a610e6', 'f8b60501770bff92ed6ce90860a610e6',
}, },

@ -362,7 +362,7 @@ class HubertASRTrainer(Trainer):
scratch = None scratch = None
if self.args.resume: if self.args.resume:
# just restore ckpt # just restore ckpt
# lr will resotre from optimizer ckpt # lr will restore from optimizer ckpt
resume_json_path = os.path.join(self.checkpoint_dir, resume_json_path = os.path.join(self.checkpoint_dir,
self.args.resume + '.json') self.args.resume + '.json')
with open(resume_json_path, 'r', encoding='utf8') as f: with open(resume_json_path, 'r', encoding='utf8') as f:
@ -370,20 +370,20 @@ class HubertASRTrainer(Trainer):
self.iteration = 0 self.iteration = 0
self.epoch = resume_json["epoch"] self.epoch = resume_json["epoch"]
# resotre model from *.pdparams # restore model from *.pdparams
params_path = os.path.join(self.checkpoint_dir, params_path = os.path.join(self.checkpoint_dir,
"{}".format(self.epoch)) + '.pdparams' "{}".format(self.epoch)) + '.pdparams'
model_dict = paddle.load(params_path) model_dict = paddle.load(params_path)
self.model.set_state_dict(model_dict) self.model.set_state_dict(model_dict)
# resotre optimizer from *.pdopt # restore optimizer from *.pdopt
optimizer_path = os.path.join(self.checkpoint_dir, optimizer_path = os.path.join(self.checkpoint_dir,
"{}".format(self.epoch)) + '.pdopt' "{}".format(self.epoch)) + '.pdopt'
optimizer_dict = paddle.load(optimizer_path) optimizer_dict = paddle.load(optimizer_path)
self.model_optimizer.set_state_dict(optimizer_dict['model']) self.model_optimizer.set_state_dict(optimizer_dict['model'])
self.hubert_optimizer.set_state_dict(optimizer_dict['hubert']) self.hubert_optimizer.set_state_dict(optimizer_dict['hubert'])
# resotre lr_scheduler from *.pdlrs # restore lr_scheduler from *.pdlrs
scheduler_path = os.path.join(self.checkpoint_dir, scheduler_path = os.path.join(self.checkpoint_dir,
"{}".format(self.epoch)) + '.pdlrs' "{}".format(self.epoch)) + '.pdlrs'
if os.path.isfile(os.path.join(scheduler_path)): if os.path.isfile(os.path.join(scheduler_path)):
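
The `HubertASRTrainer` hunks show the resume layout: a `<resume>.json` file stores the epoch, and the model, optimizer and LR-scheduler states live next to it as `<epoch>.pdparams`, `<epoch>.pdopt` and `<epoch>.pdlrs`. The sketch below only reproduces the path resolution with the standard library. It assumes the JSON file is read with `json.load` (the loading line falls outside the hunk), and `resolve_resume_paths` is a hypothetical helper, not a function from the repository.

```python
import json
import os
from typing import Dict, Tuple


def resolve_resume_paths(checkpoint_dir: str,
                         resume_tag: str) -> Tuple[int, Dict[str, str]]:
    """Read the resume JSON and derive the per-epoch checkpoint paths."""
    resume_json_path = os.path.join(checkpoint_dir, resume_tag + '.json')
    with open(resume_json_path, 'r', encoding='utf8') as f:
        resume_json = json.load(f)  # assumption: elided in the diff
    epoch = resume_json["epoch"]
    stem = os.path.join(checkpoint_dir, "{}".format(epoch))
    paths = {
        "params": stem + '.pdparams',   # model weights
        "optimizer": stem + '.pdopt',   # optimizer state
        "scheduler": stem + '.pdlrs',   # lr_scheduler state, may be absent
    }
    return epoch, paths
```

As in the hunk, a caller would check `os.path.isfile(paths["scheduler"])` before loading the scheduler state, since that file is treated as optional.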

Some files were not shown because too many files have changed in this diff.
