Update modle_arcitecture.md

4 years ago · d905555c3f
parent f35ee4053a
commit d905555c3f
1 changed file with 11 additions and 3 deletions
--- a/doc/src/modle_arcitecture.md
+++ b/doc/src/modle_arcitecture.md
@@ -1,17 +1,25 @@
 # Model Architecture
 The implemented architecture of the Deepspeech2 online model is based on the [Deepspeech2 model](https://arxiv.org/pdf/1512.02595.pdf) with some changes.
- The figure of arcitecture is shown in ![image](../image/ds2onlineModel.png).
+ The model is mainly composed of a 2D convolution subsampling layer and unidirectional RNN layers.
- The model is mainly composed of 2D convolution subsampling layer and single direction rnn layers. To illustrate the model implementation in detail, 5 parts is introduced.  
+ To illustrate the model implementation clearly, 5 parts are described in detail.
     1. Feature Extraction.
     2. 2D Convolution subsampling layer.
     3. RNN layer with only forward direction.
     4. Softmax Layer.
     5. CTC Decoder.
 The architecture of the model is shown in Fig.1.
 <p align="center">
 <img src="../images/ds2onlineModel.png" width=800> 
 <br/>Fig.1 The architecture of the Deepspeech2 online model
 </p>
- # Feature Extraction
+# Feature Extraction
 Three methods of feature extraction are implemented: linear, fbank and mfcc.
 For a single utterance $x^i$ sampled from the training set $S$,
 $S = \{(x^1,y^1),(x^2,y^2),\dots,(x^m,y^m)\}$, where $y^i$ is the label corresponding to $x^i$.
 # Backbone
 The backbone is composed of a 2D convolution subsampling layer.
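
As a rough illustration of what the convolution subsampling layer in the backbone does to the time axis, the sketch below computes how many frames remain after a stack of strided convolutions, using the standard output-size formula `floor((n + 2*padding - kernel) / stride) + 1`. The helper names and the layer settings (two layers, kernel 3, stride 2, no padding) are assumptions made for this example only; the text above does not state the actual hyperparameters.

```python
def conv_out_len(n, kernel, stride, padding=0):
    """Output size of one convolution along a single axis."""
    return (n + 2 * padding - kernel) // stride + 1


def subsampled_frames(num_frames, layers):
    """Apply conv_out_len once per (kernel, stride, padding) layer."""
    for kernel, stride, padding in layers:
        num_frames = conv_out_len(num_frames, kernel, stride, padding)
    return num_frames


# Assumed settings, not taken from the document:
# two conv layers along time, kernel 3, stride 2, no padding.
ASSUMED_LAYERS = [(3, 2, 0), (3, 2, 0)]

if __name__ == "__main__":
    # 100 input frames -> 49 after the first layer -> 24 after the second
    print(subsampled_frames(100, ASSUMED_LAYERS))
```

Under these assumed settings the sequence length shrinks by roughly a factor of 4, which is why the RNN layers after the subsampling step process far fewer time steps than the raw feature sequence contains.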