# PaddleSpeech/.pre-commit-config.yaml

repos:
-   repo: https://github.com/pre-commit/mirrors-yapf.git
    rev: v0.16.0
    hooks:
    -   id: yapf
        files: \.py$
        exclude: (?=runtime/engine/kaldi|audio/paddleaudio/src|third_party).*(\.cpp|\.cc|\.h|\.hpp|\.py)$

-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: a11d9314b22d8f8c7556443875b731ef05965464
    hooks:
    -   id: check-merge-conflict
    -   id: check-symlinks
    -   id: detect-private-key
        files: (?!.*paddle)^.*$
    -   id: end-of-file-fixer
        files: \.md$
    #-   id: trailing-whitespace
    #    files: \.md$
    -   id: requirements-txt-fixer
        exclude: (?=third_party).*$
    -   id: check-yaml
    -   id: check-json
    -   id: pretty-format-json
        args:
        - --no-sort-keys
        - --autofix
    -   id: flake8
        args:
        -  --ignore=E501,E228,E226,E261,E266,E128,E402,W503
        -  --builtins=G,request
        -  --jobs=1
        exclude: (?=runtime/engine/kaldi|audio/paddleaudio/src|third_party).*(\.cpp|\.cc|\.h|\.hpp|\.py)$

-   repo: https://github.com/Lucas-C/pre-commit-hooks
    rev: v1.0.1
    hooks:
    -   id: forbid-crlf
        files: \.md$
    -   id: remove-crlf
        files: \.md$
    -   id: forbid-tabs
        files: \.md$
    -   id: remove-tabs
        files: \.md$

-   repo: local
    hooks:
    -   id: clang-format
        name: clang-format
        description: Format files with ClangFormat
        entry: bash .pre-commit-hooks/clang-format.hook -i
        language: system
        files: \.(h\+\+|h|hh|hxx|hpp|cuh|c|cc|cpp|cu|c\+\+|cxx|tpp|txx)$
        exclude: (?=runtime/engine/kaldi|audio/paddleaudio/src|runtime/patch|runtime/tools/fstbin|runtime/tools/lmbin|third_party/ctc_decoders|runtime/engine/common/utils).*(\.cpp|\.cc|\.h|\.hpp|\.py)$
    -   id: cpplint
        name: cpplint
        description: Static code analysis of C/C++ files
        language: python
        files: \.(h\+\+|h|hh|hxx|hpp|cuh|c|cc|cpp|cu|c\+\+|cxx|tpp|txx)$
        exclude: (?=runtime/engine/kaldi|runtime/engine/common/matrix|audio/paddleaudio/src|runtime/patch|runtime/tools/fstbin|runtime/tools/lmbin|third_party/ctc_decoders|runtime/engine/common/utils).*(\.cpp|\.cc|\.h|\.hpp|\.py)$
        entry: cpplint --filter=-build,-whitespace,+whitespace/comma,-whitespace/indent

-   repo: https://github.com/asottile/reorder_python_imports
    rev: v2.4.0
    hooks:
    -   id: reorder-python-imports
        exclude: (?=runtime/engine/kaldi|audio/paddleaudio/src|runtime/patch|runtime/tools/fstbin|runtime/tools/lmbin|third_party/ctc_decoders).*(\.cpp|\.cc|\.h|\.hpp|\.py)$