loongw
a9ccc34a98
Make process_utterance accept a file object.
7 years ago
yangyaming
d3dfc3dd23
Decouple data provider from model configuration.
7 years ago
yangyaming
20e225875c
Simplify parallel part for data processing and fix abnormal exit.
7 years ago
Yang yaming
4913cba53c
Merge pull request #444 from pkuyym/fix-442
...
Support padding removal.
7 years ago
yangyaming
f38d948193
Add more comments.
7 years ago
yangyaming
a02b8f8084
Add clean callback.
7 years ago
yangyaming
b3ebf3fd62
Support padding removal.
7 years ago
yangyaming
a084394128
Add docs and adjust some code.
7 years ago
yangyaming
39dbcb4dfb
Give option to disable converting from transcription text to ids.
7 years ago
Xinghai Sun
81207201da
Fix a bug in running tools/compute_meanstd.py with seqbin data.
7 years ago
Xinghai Sun
64ab19c165
Add multiprocess version of xmap_reader to speedup training.
...
Add seqbin data parser to adapt to internal 1w data training.
7 years ago
lispczz
0dcf13f0fc
Fix a Deep Speech 2 speed bug.
7 years ago
Xinghai Sun
0bbb9c3ee2
Re-organize folder structure and hierarchy for DS2.
7 years ago
wanghaoshuang
4b26bf620c
Rename self.local_data to self._local_data in class DataGenerator.
7 years ago
wanghaoshuang
19824a8d98
Move local data from global into class DataGenerator.
7 years ago
wanghaoshuang
e9baaa8613
Fix some syntax errors.
7 years ago
wanghaoshuang
0bc9996633
Merge branch 'ds2_pcloud' of https://github.com/wanghaoshuang/models into ds2_pcloud
7 years ago
wanghaoshuang
c00db21e69
Implement uploading data to PaddleCloud
...
1. Refine data_utils/data.py, reuse process_utterance function.
2. Modified README.
3. Implement uploading data in cloud/upload_data.py
4. Merge branch 'develop' of https://github.com/PaddlePaddle/models into ds2_pcloud
7 years ago
whs
9285551de4
Merge branch 'develop' into ds2_pcloud
7 years ago
yangyaming
14d2fb795c
Unify encoding to 'utf-8' and optimize error rate calculation.
7 years ago
Xinghai Sun
99e819e8ea
Add ImpulseResponseAugmentor and augmentation.config file.
7 years ago
Xinghai Sun
6df0f9bc44
Reset default multi-thread/process number to half of cpu count() for speedup.
7 years ago
Xinghai Sun
0ebf36b98f
Add a realtime ASR demo for users to test their own voice with mic.
8 years ago
wanghaoshuang
9fa9a352ac
Refine submitting scripts for deepspeech2 on paddle cloud.
8 years ago
wanghaoshuang
3c77d369ca
Make ds2 run on paddle cloud
...
1. Refine data_utils/data.py to read bytes from tar file
2. Add scripts to submit paddle cloud job for ds2 training
8 years ago
Xinghai Sun
13f708739b
Improve audio featurizer and add shift augmentor.
...
1. Improve audio featurizer.
2. Add shift augmentor.
3. Update default arguments to the current best suggestion.
4. Add checkpoints with pass id.
8 years ago
Xinghai Sun
d104eccf67
Update the default num_threads for DS2 data generator.
8 years ago
Xinghai Sun
1d8cc4a5a9
Add multi-threading support for DS2 data generator.
8 years ago
Xinghai Sun
ed5f04afb8
Add shuffle type of instance_shuffle and batch_shuffle_clipped.
8 years ago
Xinghai Sun
b07ee84a1d
Add function, class and module docs for data parts in DS2.
8 years ago
Xinghai Sun
cd3617aeb4
Refactor the whole data preprocessor for DS2 (re-design classes, re-organize dir, add augmentation interfaces etc.).
...
1. Refactor data preprocessor with newly added classes AudioSegment, SpeechSegment, TextFeaturizer, AudioFeaturizer, SpeechFeaturizer.
2. Add data augmentation interfaces and classes AugmentorBase, AugmentationPipeline, VolumnPerturbAugmentor etc.
3. Separate normalizer's mean and std computing from training, by adding FeatureNormalizer and a separate tool compute_mean_std.py.
4. Re-organize directory.
8 years ago