lispc
23e4483069
fix a comment in audio_featurizer.py
7 years ago
yangyaming
39dbcb4dfb
Give option to disable converting from transcription text to ids.
7 years ago
Yibing Liu
37a29bf181
fix the core dump bug of DS2's training in docker
7 years ago
Yibing Liu
7e093ed1a3
expose param cutoff_top_n
7 years ago
Yibing Liu
a18e6a7eda
refine by following review comments
7 years ago
Xinghai Sun
0bbb9c3ee2
Re-organize folder structure and hierarchy for DS2.
7 years ago
yangyaming
be37b03f0c
Fix a typo that caused an exception in audio_featurizer.py.
7 years ago
wanghaoshuang
08a6d07811
remove binary files
7 years ago
wanghaoshuang
7e9daa32b7
Merge branch 'develop' of https://github.com/PaddlePaddle/models into ds2_pcloud
7 years ago
wanghaoshuang
c00db21e69
Implement uploading data to PaddleCloud
1. Refine data_utils/data.py, reuse process_utterance function.
2. Modified README.
3. Implement uploading data in cloud/upload_data.py
4. Merge branch 'develop' of https://github.com/PaddlePaddle/models into ds2_pcloud
7 years ago
Yibing Liu
43c483d1b9
Merge pull request #192 from kuke/mfcc_simplify_dev
Update mfcc computation in DS2
7 years ago
Yibing Liu
98f0b6d02d
update the mfcc computation in DS2
7 years ago
yangyaming
14d2fb795c
Unify encoding to 'utf-8' and optimize error rate calculation.
7 years ago
Xinghai Sun
961f6a2963
Accelerate mfcc computation for DS2.
7 years ago
Yibing Liu
ee5abbe37d
add mfcc feature for DS2
7 years ago
Xinghai Sun
13f708739b
Improve audio featurizer and add shift augmentor.
1. Improve audio featurizer.
2. Add shift augmentor.
3. Update default argument to be the current best suggestion.
4. Add checkpoints with pass id.
7 years ago
Xinghai Sun
04a225ae4f
Enable min_batch_num in train.py and update train info print.
8 years ago
Xinghai Sun
b07ee84a1d
Add function, class and module docs for data parts in DS2.
8 years ago
Xinghai Sun
cd3617aeb4
Refactor whole data preprocessor for DS2 (re-design classes, re-organize dir, add augmentation interfaces etc.).
1. Refactor data preprocessor with newly added classes AudioSegment, SpeechSegment, TextFeaturizer, AudioFeaturizer, SpeechFeaturizer.
2. Add data augmentation interfaces and classes AugmentorBase, AugmentationPipeline, VolumnPerturbAugmentor etc.
3. Separate normalizer's mean and std computing from training, by adding FeatureNormalizer and a separate tool compute_mean_std.py.
4. Re-organize directory.
8 years ago