PaddleSpeech/third_party/python-pinyin/pypinyin/style/__init__.py

from functools import wraps
from typing import Any
from typing import Callable
from typing import Dict
from typing import Optional
from typing import Text
from typing import Union

from pypinyin.constants import Style

TStyle = Style
TRegisterFunc = Optional[Callable[[Text, Dict[Any, Any]], Text]]
TWrapperFunc = Optional[Callable[[Text, Dict[Any, Any]], Text]]

# Maps each pinyin style to its implementation
_registry = {}  # type: Dict[Union[TStyle, int, str, Any], TRegisterFunc]
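

def convert(pinyin: Text,
            style: TStyle,
            strict: bool,
            default: Optional[Text]=None,
            **kwargs: Any) -> Text:
    """Convert a raw pinyin string to the format of the given pinyin style.

    :param pinyin: a single raw pinyin syllable with tone marks
    :type pinyin: unicode
    :param style: the pinyin style
    :param strict: whether the results of styles that return only the
                   initial or only the final handle initials and finals
                   strictly according to the Hanyu Pinyin scheme;
                   see :ref:`strict`
    :type strict: bool
    :param default: value returned when no implementation is registered
                    for the pinyin style
    :return: the pinyin string after processing for the pinyin style
    :rtype: unicode
    """
    if style in _registry:
        return _registry[style](pinyin, strict=strict, **kwargs)
    return default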


def register(style: Union[TStyle, int, str, Any],
             func: TRegisterFunc=None) -> TWrapperFunc:
    """Register an implementation for a pinyin style.

    ::

        @register('echo')
        def echo(pinyin, **kwargs):
            return pinyin

        # or
        register('echo', echo)
    """
    # Direct call: register(style, func) stores func and returns None.
    if func is not None:
        _registry[style] = func
        return

    # Decorator use: @register(style) stores the decorated function.
    def decorator(func):
        _registry[style] = func

        @wraps(func)
        def wrapper(pinyin, **kwargs):
            return func(pinyin, **kwargs)

        return wrapper

    return decorator


def auto_discover() -> None:
    """Automatically register the built-in pinyin style implementations."""
    from pypinyin.style import (
        initials,
        tone,
        finals,
        bopomofo,
        cyrillic,
        others, )