Hui Zhang
538bf271eb
Name | Last updated
---|---
cn | 4 years ago
en | 4 years ago
number_data | 4 years ago
ru | 4 years ago
universal | 4 years ago
util | 4 years ago
LICENSE | 4 years ago
Makefile | 4 years ago
README.md | 4 years ago
README.md
Text normalization covering grammars
This repository provides covering grammars for English and Russian text normalization as documented in:
Gorman, K., and Sproat, R. 2016. Minimally supervised number normalization. Transactions of the Association for Computational Linguistics 4: 507-519.
Ng, A. H., Gorman, K., and Sproat, R. 2017. Minimally supervised written-to-spoken text normalization. In ASRU, pages 665-670.
If you use these grammars in a publication, we would appreciate it if you cited these works.
Building
The grammars are written in Thrax and compile into OpenFst FAR (FstARchive) files. To compile, run make in the src/ directory.
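As a minimal sketch of the build step, the following shell helper runs make inside the source directory and fails early if that directory is missing. The function name build_grammars is hypothetical and not part of this repository; the sketch also assumes that Thrax and OpenFst are already installed and on your PATH.

```shell
# Hypothetical helper (not part of this repository): run make in the
# grammar source directory, defaulting to src/ as described above.
build_grammars() {
  src_dir="${1:-src}"
  # Fail early with a clear message if the directory does not exist.
  if [ ! -d "$src_dir" ]; then
    echo "no such directory: $src_dir" >&2
    return 1
  fi
  # Run make from inside the directory; the subshell keeps the
  # caller's working directory unchanged.
  (cd "$src_dir" && make)
}
```

From the repository root, calling build_grammars with no argument is equivalent to running make in src/.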
License
See LICENSE.
Mandatory disclaimer
This is not an official Google product.