1. Refactor data preprocessor with new added class AudioSegment, SpeechSegment, TextFeaturizer, AudioFeaturizer, SpeechFeaturizer.
2. Add data augmentation interfaces and class AugmentorBase, AugmentationPipeline, VolumnPerturbAugmentor etc..
3. Seperate normalizer's mean and std computing from training, by adding FeatureNormalizer and a seperate tool compute_mean_std.py.
4. Re-organize directory.
Summary:
1. Add manifest line check.
2. Avoid re-unpacking if unpacked data already exists.
3. Add full_download (download all 7 sub-datasets of LibriSpeech).
2. Fix incorrect batch-norm usage in RNN.
3. Fix overlapping train/dev/test manfests.
4. Update README.md and requirements.txt.
5. Expose more arguments to users in argparser.
6. Update all other details.