phrase-pinyin-data

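Pinyin data for Chinese words and phrases.

About the data

Format of the pinyin data:

{phrase}: {pinyin}
  • Lines starting with # are comments

  • A # at the end of a line also starts a comment

  • {phrase} is a word or phrase written in Chinese characters

  • {pinyin} is the pinyin of the phrase, with the pinyin of each character separated by a space

  • Each line holds one reading of one phrase; a phrase with several readings appears on several lines

  • Example: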

    # comment
    中国: zhōng guó
    北京: běi jīng  # comment
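
The format rules above can be sketched as a small parser. This is only an illustration, not the code this repository uses: the function name parse_phrase_pinyin is hypothetical, and the sketch assumes that # never appears inside a phrase or its pinyin and that lines without a colon can be skipped.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_phrase_pinyin(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (phrase, syllables) pairs from lines in `{phrase}: {pinyin}` format."""
    for raw in lines:
        # Everything from the first '#' onward is a comment (whole-line or trailing).
        content = raw.split("#", 1)[0].strip()
        if not content:
            continue  # blank line or comment-only line
        phrase, sep, pinyin = content.partition(":")
        if not sep:
            continue  # assumption: lines without a colon are not data lines
        # One syllable per character, separated by whitespace.
        yield phrase.strip(), pinyin.split()


sample = [
    "# comment",
    "中国: zhōng guó",
    "北京: běi jīng  # comment",
]
entries = list(parse_phrase_pinyin(sample))
print(entries)
```

Because a phrase with several readings occupies several lines, the parser yields one pair per line rather than building a dictionary, so no reading is silently dropped.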
    

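File descriptions:

  • overwrite.txt: manually corrected pinyin data
  • pinyin.txt: the pinyin data produced by merging pinyin.txt + overwrite.txt
  • zdic_cibs.txt: pinyin data from the zdic.net (汉典) Chinese word dictionary
  • zdic_cybs.txt: pinyin data from the zdic.net (汉典) idiom dictionary
  • cc_cedict.txt: pinyin data from cc-cedict.org
  • large_pinyin.txt: the pinyin data produced by merging zdic_cibs.txt + zdic_cybs.txt + cc_cedict.txt + pinyin.txt + overwrite.txt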
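
As a rough illustration of what combining sources in the listed order could look like, here is a minimal sketch. It is not the repository's merge.py. The function names are hypothetical, and the precedence rule (a later source replaces all readings of a phrase that an earlier source provided) is an assumption; the README does not spell out the merge rules. The sample entries are made up for the demonstration.

```python
from typing import Dict, Iterable, List


def read_entries(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group readings by phrase; a phrase with several readings spans several lines."""
    entries: Dict[str, List[str]] = {}
    for raw in lines:
        content = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in content:
            continue
        phrase, pinyin = content.split(":", 1)
        reading = " ".join(pinyin.split())  # normalise whitespace
        readings = entries.setdefault(phrase.strip(), [])
        if reading not in readings:
            readings.append(reading)
    return entries


def merge_sources(sources: Iterable[Iterable[str]]) -> Dict[str, List[str]]:
    """Merge sources in order. ASSUMPTION: later sources override earlier ones."""
    merged: Dict[str, List[str]] = {}
    for lines in sources:
        merged.update(read_entries(lines))
    return merged


# Made-up sample data standing in for a dictionary file and overwrite.txt.
dictionary_lines = ["朝阳: zhāo yáng", "朝阳: cháo yáng", "中国: zhōng guó"]
overwrite_lines = ["朝阳: cháo yáng  # manual correction"]
merged = merge_sources([dictionary_lines, overwrite_lines])
print(merged)
```

Under this assumed rule, placing overwrite.txt last gives the manual corrections the final say, which matches its position at the end of both merge lists above.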

Modifying the data

  • Edit either pinyin.txt or overwrite.txt
  • Run make merge to regenerate the latest pinyin.txt according to the merge rules
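
Before running the merge, a hand-edited file can be sanity-checked against the format described earlier. The sketch below is a hypothetical helper, not part of this repository. It flags lines that have no colon or whose number of pinyin syllables differs from the number of characters in the phrase, and it assumes that phrases contain only Chinese characters (an entry containing punctuation would be reported as a mismatch).

```python
from typing import Iterable, List, Tuple


def find_length_mismatches(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) for data lines that break the one-syllable-per-character rule."""
    problems: List[Tuple[int, str]] = []
    for number, raw in enumerate(lines, start=1):
        content = raw.split("#", 1)[0].strip()  # ignore comments
        if not content:
            continue
        phrase, sep, pinyin = content.partition(":")
        phrase = phrase.strip()
        # A data line needs a colon and exactly one syllable per character.
        if not sep or len(phrase) != len(pinyin.split()):
            problems.append((number, raw.rstrip("\n")))
    return problems


edited = [
    "中国: zhōng guó",
    "北京: běi  # missing a syllable",
    "# comment",
]
problems = find_length_mismatches(edited)
for number, line in problems:
    print(f"line {number}: {line}")
```

To check a real file, the same function can be given an open file object, for example open("overwrite.txt", encoding="utf-8").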

References

Related projects