diff --git a/.github/ISSUE_TEMPLATE/bug-report-s2t.md b/.github/ISSUE_TEMPLATE/bug-report-s2t.md
index 512cdbb01..e9732ad8c 100644
--- a/.github/ISSUE_TEMPLATE/bug-report-s2t.md
+++ b/.github/ISSUE_TEMPLATE/bug-report-s2t.md
@@ -33,7 +33,7 @@ If applicable, add screenshots to help explain your problem.
  - Python Version [e.g. 3.7]
  - PaddlePaddle Version [e.g. 2.0.0]
  - Model Version [e.g. 2.0.0]
- - GPU/DRIVER Informationo [e.g. Tesla V100-SXM2-32GB/440.64.00]
+ - GPU/DRIVER Information [e.g. Tesla V100-SXM2-32GB/440.64.00]
  - CUDA/CUDNN Version [e.g. cuda-10.2]
  - MKL Version
 - TensorRT Version
diff --git a/.github/ISSUE_TEMPLATE/bug-report-tts.md b/.github/ISSUE_TEMPLATE/bug-report-tts.md
index e2322c239..b4c5dabdd 100644
--- a/.github/ISSUE_TEMPLATE/bug-report-tts.md
+++ b/.github/ISSUE_TEMPLATE/bug-report-tts.md
@@ -32,7 +32,7 @@ If applicable, add screenshots to help explain your problem.
  - Python Version [e.g. 3.7]
  - PaddlePaddle Version [e.g. 2.0.0]
  - Model Version [e.g. 2.0.0]
- - GPU/DRIVER Informationo [e.g. Tesla V100-SXM2-32GB/440.64.00]
+ - GPU/DRIVER Information [e.g. Tesla V100-SXM2-32GB/440.64.00]
  - CUDA/CUDNN Version [e.g. cuda-10.2]
  - MKL Version
 - TensorRT Version
diff --git a/README.md b/README.md
index 39cb1bc9d..6594a4b8f 100644
--- a/README.md
+++ b/README.md
@@ -265,6 +265,8 @@ git clone https://github.com/PaddlePaddle/PaddleSpeech.git
 cd PaddleSpeech
 pip install pytest-runner
 pip install .
+# To install in editable mode, pass --use-pep517:
+# pip install -e . --use-pep517
 ```
 
 For more installation problems, such as conda environment, librosa-dependent, gcc problems, kaldi installation, etc., you can refer to this [installation document](./docs/source/install.md). If you encounter problems during installation, you can leave a message on [#2150](https://github.com/PaddlePaddle/PaddleSpeech/issues/2150) and find related problems
diff --git a/README_cn.md b/README_cn.md
index a644e4c9f..5b95a2879 100644
--- a/README_cn.md
+++ b/README_cn.md
@@ -272,6 +272,8 @@ git clone https://github.com/PaddlePaddle/PaddleSpeech.git
 cd PaddleSpeech
 pip install pytest-runner
 pip install .
+# 如果需要在可编辑模式下安装，需要使用 --use-pep517，命令如下
+# pip install -e . --use-pep517
 ```
 
 更多关于安装问题，如 conda 环境，librosa 依赖的系统库，gcc 环境问题，kaldi 安装等，可以参考这篇[安装文档](docs/source/install_cn.md)，如安装上遇到问题可以在 [#2150](https://github.com/PaddlePaddle/PaddleSpeech/issues/2150) 上留言以及查找相关问题
diff --git a/audio/paddleaudio/backends/soundfile_backend.py b/audio/paddleaudio/backends/soundfile_backend.py
index 9195ea097..dcd2b4b1e 100644
--- a/audio/paddleaudio/backends/soundfile_backend.py
+++ b/audio/paddleaudio/backends/soundfile_backend.py
@@ -61,7 +61,7 @@ def resample(y: np.ndarray,
     if mode == 'kaiser_best':
         warnings.warn(
             f'Using resampy in kaiser_best to {src_sr}=>{target_sr}. This function is pretty slow, \
-        we recommend the mode kaiser_fast in large scale audio trainning')
+        we recommend the mode kaiser_fast in large scale audio training')
 
     if not isinstance(y, np.ndarray):
         raise ParameterError(
diff --git a/audio/paddleaudio/compliance/kaldi.py b/audio/paddleaudio/compliance/kaldi.py
index eb92ec1f2..a94ec4053 100644
--- a/audio/paddleaudio/compliance/kaldi.py
+++ b/audio/paddleaudio/compliance/kaldi.py
@@ -233,7 +233,7 @@ def spectrogram(waveform: Tensor,
         round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input
             to FFT. Defaults to True.
         sr (int, optional): Sample rate of input waveform. Defaults to 16000.
-        snip_edges (bool, optional): Drop samples in the end of waveform that cann't fit a singal frame when it
+        snip_edges (bool, optional): Drop samples at the end of waveform that cannot fit a signal frame when it
             is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True.
         subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False.
         window_type (str, optional): Choose type of window for FFT computation. Defaults to "povey".
@@ -443,7 +443,7 @@ def fbank(waveform: Tensor,
         round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input
             to FFT. Defaults to True.
         sr (int, optional): Sample rate of input waveform. Defaults to 16000.
-        snip_edges (bool, optional): Drop samples in the end of waveform that cann't fit a singal frame when it
+        snip_edges (bool, optional): Drop samples at the end of waveform that cannot fit a signal frame when it
             is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True.
         subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False.
         use_energy (bool, optional): Add an dimension with energy of spectrogram to the output. Defaults to False.
@@ -566,7 +566,7 @@ def mfcc(waveform: Tensor,
         round_to_power_of_two (bool, optional): If True, round window size to power of two by zero-padding input
             to FFT. Defaults to True.
         sr (int, optional): Sample rate of input waveform. Defaults to 16000.
-        snip_edges (bool, optional): Drop samples in the end of waveform that cann't fit a singal frame when it
+        snip_edges (bool, optional): Drop samples at the end of waveform that cannot fit a signal frame when it
             is set True. Otherwise performs reflect padding to the end of waveform. Defaults to True.
         subtract_mean (bool, optional): Whether to subtract mean of feature files. Defaults to False.
         use_energy (bool, optional): Add an dimension with energy of spectrogram to the output. Defaults to False.
diff --git a/audio/paddleaudio/datasets/dataset.py b/audio/paddleaudio/datasets/dataset.py
index f1dfc1ea3..170e91669 100644
--- a/audio/paddleaudio/datasets/dataset.py
+++ b/audio/paddleaudio/datasets/dataset.py
@@ -47,7 +47,7 @@ class AudioClassificationDataset(paddle.io.Dataset):
             files (:obj:`List[str]`): A list of absolute path of audio files.
             labels (:obj:`List[int]`): Labels of audio files.
             feat_type (:obj:`str`, `optional`, defaults to `raw`):
-                It identifies the feature type that user wants to extrace of an audio file.
+                It identifies the feature type that the user wants to extract from an audio file.
         """
         super(AudioClassificationDataset, self).__init__()
 
diff --git a/audio/paddleaudio/datasets/esc50.py b/audio/paddleaudio/datasets/esc50.py
index e7477d40e..fd8c8503e 100644
--- a/audio/paddleaudio/datasets/esc50.py
+++ b/audio/paddleaudio/datasets/esc50.py
@@ -117,7 +117,7 @@ class ESC50(AudioClassificationDataset):
             split (:obj:`int`, `optional`, defaults to 1):
                 It specify the fold of dev dataset.
             feat_type (:obj:`str`, `optional`, defaults to `raw`):
-                It identifies the feature type that user wants to extrace of an audio file.
+                It identifies the feature type that the user wants to extract from an audio file.
         """
         files, labels = self._get_data(mode, split)
         super(ESC50, self).__init__(
diff --git a/audio/paddleaudio/datasets/gtzan.py b/audio/paddleaudio/datasets/gtzan.py
index cfea6f37e..a76e9208e 100644
--- a/audio/paddleaudio/datasets/gtzan.py
+++ b/audio/paddleaudio/datasets/gtzan.py
@@ -67,7 +67,7 @@ class GTZAN(AudioClassificationDataset):
             split (:obj:`int`, `optional`, defaults to 1):
                 It specify the fold of dev dataset.
             feat_type (:obj:`str`, `optional`, defaults to `raw`):
-                It identifies the feature type that user wants to extrace of an audio file.
+                It identifies the feature type that the user wants to extract from an audio file.
         """
         assert split <= n_folds, f'The selected split should not be larger than n_fold, but got {split} > {n_folds}'
         files, labels = self._get_data(mode, seed, n_folds, split)
diff --git a/audio/paddleaudio/datasets/tess.py b/audio/paddleaudio/datasets/tess.py
index 8faab9c39..e34eaea37 100644
--- a/audio/paddleaudio/datasets/tess.py
+++ b/audio/paddleaudio/datasets/tess.py
@@ -76,7 +76,7 @@ class TESS(AudioClassificationDataset):
             split (:obj:`int`, `optional`, defaults to 1):
                 It specify the fold of dev dataset.
             feat_type (:obj:`str`, `optional`, defaults to `raw`):
-                It identifies the feature type that user wants to extrace of an audio file.
+                It identifies the feature type that the user wants to extract from an audio file.
         """
         assert split <= n_folds, f'The selected split should not be larger than n_fold, but got {split} > {n_folds}'
         files, labels = self._get_data(mode, seed, n_folds, split)
diff --git a/audio/paddleaudio/datasets/urban_sound.py b/audio/paddleaudio/datasets/urban_sound.py
index d97c4d1dc..43d1b36c4 100644
--- a/audio/paddleaudio/datasets/urban_sound.py
+++ b/audio/paddleaudio/datasets/urban_sound.py
@@ -68,7 +68,7 @@ class UrbanSound8K(AudioClassificationDataset):
             split (:obj:`int`, `optional`, defaults to 1):
                 It specify the fold of dev dataset.
             feat_type (:obj:`str`, `optional`, defaults to `raw`):
-                It identifies the feature type that user wants to extrace of an audio file.
+                It identifies the feature type that the user wants to extract from an audio file.
         """
 
     def _get_meta_info(self):
diff --git a/audio/paddleaudio/datasets/voxceleb.py b/audio/paddleaudio/datasets/voxceleb.py
index b7160b24c..1fafb5176 100644
--- a/audio/paddleaudio/datasets/voxceleb.py
+++ b/audio/paddleaudio/datasets/voxceleb.py
@@ -262,8 +262,8 @@ class VoxCeleb(Dataset):
                      split_chunks: bool=True):
         print(f'Generating csv: {output_file}')
         header = ["id", "duration", "wav", "start", "stop", "spk_id"]
-        # Note: this may occurs c++ execption, but the program will execute fine
-        # so we can ignore the execption 
+        # Note: this may raise a C++ exception, but the program will execute fine
+        # so we can ignore the exception
         with Pool(cpu_count()) as p:
             infos = list(
                 tqdm(
diff --git a/audio/paddleaudio/features/layers.py b/audio/paddleaudio/features/layers.py
index 292363e64..801ae34ce 100644
--- a/audio/paddleaudio/features/layers.py
+++ b/audio/paddleaudio/features/layers.py
@@ -34,7 +34,7 @@ __all__ = [
 
 class Spectrogram(nn.Layer):
     """Compute spectrogram of given signals, typically audio waveforms.
-    The spectorgram is defined as the complex norm of the short-time Fourier transformation.
+    The spectrogram is defined as the complex norm of the short-time Fourier transformation.
 
     Args:
         n_fft (int, optional): The number of frequency components of the discrete Fourier transform. Defaults to 512.
diff --git a/audio/paddleaudio/functional/functional.py b/audio/paddleaudio/functional/functional.py
index 19c63a9ae..7c20f9013 100644
--- a/audio/paddleaudio/functional/functional.py
+++ b/audio/paddleaudio/functional/functional.py
@@ -247,7 +247,7 @@ def create_dct(n_mfcc: int,
     Args:
         n_mfcc (int): Number of mel frequency cepstral coefficients. 
         n_mels (int): Number of mel filterbanks.
-        norm (Optional[str], optional): Normalizaiton type. Defaults to 'ortho'.
+        norm (Optional[str], optional): Normalization type. Defaults to 'ortho'.
         dtype (str, optional): The data type of the return matrix. Defaults to 'float32'.
 
     Returns:
diff --git a/audio/paddleaudio/metric/eer.py b/audio/paddleaudio/metric/eer.py
index a1166d3f9..a55695ac1 100644
--- a/audio/paddleaudio/metric/eer.py
+++ b/audio/paddleaudio/metric/eer.py
@@ -22,8 +22,8 @@ def compute_eer(labels: np.ndarray, scores: np.ndarray) -> List[float]:
     """Compute EER and return score threshold.
 
     Args:
-        labels (np.ndarray): the trial label, shape: [N], one-dimention, N refer to the samples num
-        scores (np.ndarray): the trial scores, shape: [N], one-dimention, N refer to the samples num
+        labels (np.ndarray): the trial labels, shape: [N], one-dimensional, N refers to the number of samples
+        scores (np.ndarray): the trial scores, shape: [N], one-dimensional, N refers to the number of samples
 
     Returns:
         List[float]: eer and the specific threshold
diff --git a/audio/paddleaudio/sox_effects/sox_effects.py b/audio/paddleaudio/sox_effects/sox_effects.py
index cb7e1b0b9..aa282b572 100644
--- a/audio/paddleaudio/sox_effects/sox_effects.py
+++ b/audio/paddleaudio/sox_effects/sox_effects.py
@@ -121,8 +121,8 @@ def apply_effects_tensor(
 
     """
     tensor_np = tensor.numpy()
-    ret = paddleaudio._paddleaudio.sox_effects_apply_effects_tensor(tensor_np, sample_rate,
-                                                       effects, channels_first)
+    ret = paddleaudio._paddleaudio.sox_effects_apply_effects_tensor(
+        tensor_np, sample_rate, effects, channels_first)
     if ret is not None:
         return (paddle.to_tensor(ret[0]), ret[1])
     raise RuntimeError("Failed to apply sox effect")
@@ -139,7 +139,7 @@ def apply_effects_file(
 
     Note:
         This function works in the way very similar to ``sox`` command, however there are slight
-        differences. For example, ``sox`` commnad adds certain effects automatically (such as
+        differences. For example, ``sox`` command adds certain effects automatically (such as
         ``rate`` effect after ``speed``, ``pitch`` etc), but this function only applies the given
         effects. Therefore, to actually apply ``speed`` effect, you also need to give ``rate``
         effect with desired sampling rate, because internally, ``speed`` effects only alter sampling
@@ -228,14 +228,14 @@ def apply_effects_file(
         >>>     pass
     """
     if hasattr(path, "read"):
-        ret = paddleaudio._paddleaudio.apply_effects_fileobj(path, effects, normalize,
-                                                channels_first, format)
+        ret = paddleaudio._paddleaudio.apply_effects_fileobj(
+            path, effects, normalize, channels_first, format)
         if ret is None:
             raise RuntimeError("Failed to load audio from {}".format(path))
         return (paddle.to_tensor(ret[0]), ret[1])
     path = os.fspath(path)
-    ret = paddleaudio._paddleaudio.sox_effects_apply_effects_file(path, effects, normalize,
-                                                     channels_first, format)
+    ret = paddleaudio._paddleaudio.sox_effects_apply_effects_file(
+        path, effects, normalize, channels_first, format)
     if ret is not None:
         return (paddle.to_tensor(ret[0]), ret[1])
     raise RuntimeError("Failed to load audio from {}".format(path))
diff --git a/audio/paddleaudio/src/pybind/kaldi/feature_common_inl.h b/audio/paddleaudio/src/pybind/kaldi/feature_common_inl.h
index 985d586fe..3c62bb0d4 100644
--- a/audio/paddleaudio/src/pybind/kaldi/feature_common_inl.h
+++ b/audio/paddleaudio/src/pybind/kaldi/feature_common_inl.h
@@ -26,7 +26,7 @@ template <class F>
 bool StreamingFeatureTpl<F>::ComputeFeature(
     const std::vector<float>& wav,
     std::vector<float>* feats) {
-    // append remaned waves
+    // append remained waves
     int wav_len = wav.size();
     if (wav_len == 0) return false;
     int left_len = remained_wav_.size();
@@ -38,7 +38,7 @@ bool StreamingFeatureTpl<F>::ComputeFeature(
                 wav.data(),
                 wav_len * sizeof(float));
 
-    // cache remaned waves
+    // cache remained waves
     knf::FrameExtractionOptions frame_opts = computer_.GetFrameOptions();
     int num_frames = knf::NumFrames(waves.size(), frame_opts);
     int frame_shift = frame_opts.WindowShift();
diff --git a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.cc b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.cc
index 8b8ff18be..6fdf68af2 100644
--- a/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.cc
+++ b/audio/paddleaudio/src/pybind/kaldi/kaldi_feature_wrapper.cc
@@ -44,5 +44,5 @@ py::array_t<float> KaldiFeatureWrapper::ComputeFbank(
     return result.reshape(shape);
 }
 
-}  // namesapce kaldi
+}  // namespace kaldi
 }  // namespace paddleaudio
diff --git a/audio/paddleaudio/src/pybind/sox/effects.cpp b/audio/paddleaudio/src/pybind/sox/effects.cpp
index ea77527bb..5b8959f6c 100644
--- a/audio/paddleaudio/src/pybind/sox/effects.cpp
+++ b/audio/paddleaudio/src/pybind/sox/effects.cpp
@@ -12,9 +12,9 @@ using namespace paddleaudio::sox_utils;
 namespace paddleaudio::sox_effects {
 
 // Streaming decoding over file-like object is tricky because libsox operates on
-// FILE pointer. The folloing is what `sox` and `play` commands do
+// FILE pointer. The following is what `sox` and `play` commands do
 //  - file input -> FILE pointer
-//  - URL input -> call wget in suprocess and pipe the data -> FILE pointer
+//  - URL input -> call wget in subprocess and pipe the data -> FILE pointer
 //  - stdin -> FILE pointer
 //
 // We want to, instead, fetch byte strings chunk by chunk, consume them, and
@@ -127,12 +127,12 @@ namespace {
 
 enum SoxEffectsResourceState { NotInitialized, Initialized, ShutDown };
 SoxEffectsResourceState SOX_RESOURCE_STATE = NotInitialized;
-std::mutex SOX_RESOUCE_STATE_MUTEX;
+std::mutex SOX_RESOURCE_STATE_MUTEX;
 
 } // namespace
 
 void initialize_sox_effects() {
-  const std::lock_guard<std::mutex> lock(SOX_RESOUCE_STATE_MUTEX);
+  const std::lock_guard<std::mutex> lock(SOX_RESOURCE_STATE_MUTEX);
 
   switch (SOX_RESOURCE_STATE) {
     case NotInitialized:
@@ -150,7 +150,7 @@ void initialize_sox_effects() {
 };
 
 void shutdown_sox_effects() {
-  const std::lock_guard<std::mutex> lock(SOX_RESOUCE_STATE_MUTEX);
+  const std::lock_guard<std::mutex> lock(SOX_RESOURCE_STATE_MUTEX);
 
   switch (SOX_RESOURCE_STATE) {
     case NotInitialized:
diff --git a/audio/paddleaudio/src/pybind/sox/effects_chain.cpp b/audio/paddleaudio/src/pybind/sox/effects_chain.cpp
index 0204fb309..54f54840f 100644
--- a/audio/paddleaudio/src/pybind/sox/effects_chain.cpp
+++ b/audio/paddleaudio/src/pybind/sox/effects_chain.cpp
@@ -14,7 +14,7 @@ namespace {
 
 /// helper classes for passing the location of input tensor and output buffer
 ///
-/// drain/flow callback functions require plaing C style function signature and
+/// drain/flow callback functions require plain C style function signature and
 /// the way to pass extra data is to attach data to sox_effect_t::priv pointer.
 /// The following structs will be assigned to sox_effect_t::priv pointer which
 /// gives sox_effect_t an access to input Tensor and output buffer object.
@@ -50,7 +50,7 @@ int tensor_input_drain(sox_effect_t* effp, sox_sample_t* obuf, size_t* osamp) {
   *osamp -= *osamp % num_channels;
 
   // Slice the input Tensor
-  // refacor this module, chunk
+  // refactor this module, chunk
   auto i_frame = index / num_channels;
   auto num_frames = *osamp / num_channels;
 
diff --git a/audio/paddleaudio/src/pybind/sox/utils.cpp b/audio/paddleaudio/src/pybind/sox/utils.cpp
index bc32b7407..acdef8040 100644
--- a/audio/paddleaudio/src/pybind/sox/utils.cpp
+++ b/audio/paddleaudio/src/pybind/sox/utils.cpp
@@ -162,7 +162,7 @@ py::dtype get_dtype(
         }
       default:
         // default to float32 for the other formats, including
-        // 32-bit flaoting-point WAV,
+        // 32-bit floating-point WAV,
         // MP3,
         // FLAC,
         // VORBIS etc...
@@ -177,7 +177,7 @@ py::array convert_to_tensor(
     const py::dtype dtype,
     const bool normalize,
     const bool channels_first) {
-  // todo refector later(SGoat)
+  // todo refactor later(SGoat)
   py::array t;
   uint64_t dummy = 0;
   SOX_SAMPLE_LOCALS;
diff --git a/audio/paddleaudio/src/pybind/sox/utils.h b/audio/paddleaudio/src/pybind/sox/utils.h
index 6fce66714..c98e8f9ed 100644
--- a/audio/paddleaudio/src/pybind/sox/utils.h
+++ b/audio/paddleaudio/src/pybind/sox/utils.h
@@ -76,7 +76,7 @@ py::dtype get_dtype(
 /// Tensor.
 /// @param dtype Target dtype. Determines the output dtype and value range in
 /// conjunction with normalization.
-/// @param noramlize Perform normalization. Only effective when dtype is not
+/// @param normalize Perform normalization. Only effective when dtype is not
 /// kFloat32. When effective, the output tensor is kFloat32 type and value range
 /// is [-1.0, 1.0]
 /// @param channels_first When True, output Tensor has shape of [num_channels,
diff --git a/audio/paddleaudio/third_party/sox/CMakeLists.txt b/audio/paddleaudio/third_party/sox/CMakeLists.txt
index 8a5bc55c7..91be289bd 100644
--- a/audio/paddleaudio/third_party/sox/CMakeLists.txt
+++ b/audio/paddleaudio/third_party/sox/CMakeLists.txt
@@ -8,9 +8,9 @@ set(patch_dir ${CMAKE_CURRENT_SOURCE_DIR}/../patches)
 set(COMMON_ARGS --quiet --disable-shared --enable-static --prefix=${INSTALL_DIR} --with-pic --disable-dependency-tracking --disable-debug --disable-examples --disable-doc)
 
 # To pass custom environment variables to ExternalProject_Add command,
-# we need to do `${CMAKE_COMMAND} -E env ${envs} <COMMANAD>`.
+# we need to do `${CMAKE_COMMAND} -E env ${envs} <COMMAND>`.
 # https://stackoverflow.com/a/62437353
-# We constrcut the custom environment variables here
+# We construct the custom environment variables here
 set(envs
   "PKG_CONFIG_PATH=${INSTALL_DIR}/lib/pkgconfig"
   "LDFLAGS=-L${INSTALL_DIR}/lib $ENV{LDFLAGS}"
diff --git a/audio/paddleaudio/utils/download.py b/audio/paddleaudio/utils/download.py
index 07d5eea84..f47345dfc 100644
--- a/audio/paddleaudio/utils/download.py
+++ b/audio/paddleaudio/utils/download.py
@@ -41,14 +41,14 @@ def download_and_decompress(archives: List[Dict[str, str]],
                             path: str,
                             decompress: bool=True):
     """
-    Download archieves and decompress to specific path.
+    Download archives and decompress to specific path.
     """
     if not os.path.isdir(path):
         os.makedirs(path)
 
     for archive in archives:
         assert 'url' in archive and 'md5' in archive, \
-            'Dictionary keys of "url" and "md5" are required in the archive, but got: {list(archieve.keys())}'
+            f'Dictionary keys of "url" and "md5" are required in the archive, but got: {list(archive.keys())}'
         download.get_path_from_url(
             archive['url'], path, archive['md5'], decompress=decompress)
 
diff --git a/audio/paddleaudio/utils/log.py b/audio/paddleaudio/utils/log.py
index 5656b286a..ddc8fd669 100644
--- a/audio/paddleaudio/utils/log.py
+++ b/audio/paddleaudio/utils/log.py
@@ -58,7 +58,7 @@ log_config = {
 
 class Logger(object):
     '''
-    Deafult logger in PaddleAudio
+    Default logger in PaddleAudio
     Args:
         name(str) : Logger name, default is 'PaddleAudio'
     '''
diff --git a/audio/paddleaudio/utils/sox_utils.py b/audio/paddleaudio/utils/sox_utils.py
index 305bb68b0..7665238ef 100644
--- a/audio/paddleaudio/utils/sox_utils.py
+++ b/audio/paddleaudio/utils/sox_utils.py
@@ -55,7 +55,7 @@ def set_use_threads(use_threads: bool):
 
     Args:
         use_threads (bool): When ``True``, enables ``libsox``'s parallel effects channels processing.
-            To use mutlithread, the underlying ``libsox`` has to be compiled with OpenMP support.
+            To use multithreading, the underlying ``libsox`` has to be compiled with OpenMP support.
 
     See Also:
         http://sox.sourceforge.net/sox.html
diff --git a/audio/paddleaudio/utils/tensor_utils.py b/audio/paddleaudio/utils/tensor_utils.py
index cfd490b9a..1448d48a3 100644
--- a/audio/paddleaudio/utils/tensor_utils.py
+++ b/audio/paddleaudio/utils/tensor_utils.py
@@ -11,7 +11,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""Unility functions for Transformer."""
+"""Utility functions for Transformer."""
 from typing import List
 from typing import Tuple
 
@@ -80,7 +80,7 @@ def pad_sequence(sequences: List[paddle.Tensor],
     # assuming trailing dimensions and type of all the Tensors
     # in sequences are same and fetching those from sequences[0]
     max_size = paddle.shape(sequences[0])
-    # (TODO Hui Zhang): slice not supprot `end==start`
+    # (TODO Hui Zhang): slice not support `end==start`
     # trailing_dims = max_size[1:]
     trailing_dims = tuple(
         max_size[1:].numpy().tolist()) if sequences[0].ndim >= 2 else ()
@@ -94,7 +94,7 @@ def pad_sequence(sequences: List[paddle.Tensor],
         length = tensor.shape[0]
         # use index notation to prevent duplicate references to the tensor
         if batch_first:
-            # TODO (Hui Zhang): set_value op not supprot `end==start`
+            # TODO (Hui Zhang): set_value op not support `end==start`
             # TODO (Hui Zhang): set_value op not support int16
             # TODO (Hui Zhang): set_varbase 2 rank not support [0,0,...]
             # out_tensor[i, :length, ...] = tensor
@@ -103,7 +103,7 @@ def pad_sequence(sequences: List[paddle.Tensor],
             else:
                 out_tensor[i, length] = tensor
         else:
-            # TODO (Hui Zhang): set_value op not supprot `end==start`
+            # TODO (Hui Zhang): set_value op not support `end==start`
             # out_tensor[:length, i, ...] = tensor
             if length != 0:
                 out_tensor[:length, i] = tensor
diff --git a/audio/paddleaudio/utils/time.py b/audio/paddleaudio/utils/time.py
index 105208f91..4ea413282 100644
--- a/audio/paddleaudio/utils/time.py
+++ b/audio/paddleaudio/utils/time.py
@@ -21,7 +21,7 @@ __all__ = [
 
 
 class Timer(object):
-    '''Calculate runing speed and estimated time of arrival(ETA)'''
+    '''Calculate running speed and estimated time of arrival(ETA)'''
 
     def __init__(self, total_step: int):
         self.total_step = total_step
diff --git a/audio/tests/backends/base.py b/audio/tests/backends/base.py
index a67191887..c2d53d209 100644
--- a/audio/tests/backends/base.py
+++ b/audio/tests/backends/base.py
@@ -30,5 +30,5 @@ class BackendTest(unittest.TestCase):
                 urllib.request.urlretrieve(url, os.path.basename(url))
             self.files.append(os.path.basename(url))
 
-    def initParmas(self):
+    def initParams(self):
         raise NotImplementedError
diff --git a/audio/tests/backends/soundfile/base.py b/audio/tests/backends/soundfile/base.py
index a67191887..c2d53d209 100644
--- a/audio/tests/backends/soundfile/base.py
+++ b/audio/tests/backends/soundfile/base.py
@@ -30,5 +30,5 @@ class BackendTest(unittest.TestCase):
                 urllib.request.urlretrieve(url, os.path.basename(url))
             self.files.append(os.path.basename(url))
 
-    def initParmas(self):
+    def initParams(self):
         raise NotImplementedError
diff --git a/audio/tests/backends/soundfile/save_test.py b/audio/tests/backends/soundfile/save_test.py
index 4f3df6e48..4b5facd08 100644
--- a/audio/tests/backends/soundfile/save_test.py
+++ b/audio/tests/backends/soundfile/save_test.py
@@ -103,7 +103,7 @@ class MockedSaveTest(unittest.TestCase):
             encoding=encoding,
             bits_per_sample=bits_per_sample, )
 
-        # on +Py3.8 call_args.kwargs is more descreptive
+        # on Py3.8+ call_args.kwargs is more descriptive
         args = mocked_write.call_args[1]
         assert args["file"] == filepath
         assert args["samplerate"] == sample_rate
@@ -191,7 +191,7 @@ class SaveTestBase(TempDirMixin, unittest.TestCase):
     def _assert_non_wav(self, fmt, dtype, sample_rate, num_channels):
         """`soundfile_backend.save` can save non-wav format.
 
-        Due to precision missmatch, and the lack of alternative way to decode the
+        Due to precision mismatch, and the lack of alternative way to decode the
         resulting files without using soundfile, only meta data are validated.
         """
         num_frames = sample_rate * 3
diff --git a/audio/tests/common_utils/data_utils.py b/audio/tests/common_utils/data_utils.py
index b5618618c..16f575701 100644
--- a/audio/tests/common_utils/data_utils.py
+++ b/audio/tests/common_utils/data_utils.py
@@ -81,7 +81,7 @@ def convert_tensor_encoding(
 #dtype = getattr(paddle, dtype)
 #if dtype not in [paddle.float64, paddle.float32, paddle.int32, paddle.int16, paddle.uint8]:
 #raise NotImplementedError(f"dtype {dtype} is not supported.")
-## According to the doc, folking rng on all CUDA devices is slow when there are many CUDA devices,
+## According to the doc, forking rng on all CUDA devices is slow when there are many CUDA devices,
 ## so we only fork on CPU, generate values and move the data to the given device
 #with paddle.random.fork_rng([]):
 #paddle.random.manual_seed(seed)
diff --git a/audio/tests/common_utils/sox_utils.py b/audio/tests/common_utils/sox_utils.py
index 6ceae081e..4c0866ed9 100644
--- a/audio/tests/common_utils/sox_utils.py
+++ b/audio/tests/common_utils/sox_utils.py
@@ -24,20 +24,21 @@ def get_bit_depth(dtype):
 
 
 def gen_audio_file(
-    path,
-    sample_rate,
-    num_channels,
-    *,
-    encoding=None,
-    bit_depth=None,
-    compression=None,
-    attenuation=None,
-    duration=1,
-    comment_file=None,
-):
+        path,
+        sample_rate,
+        num_channels,
+        *,
+        encoding=None,
+        bit_depth=None,
+        compression=None,
+        attenuation=None,
+        duration=1,
+        comment_file=None, ):
     """Generate synthetic audio file with `sox` command."""
     if path.endswith(".wav"):
-        warnings.warn("Use get_wav_data and save_wav to generate wav file for accurate result.")
+        warnings.warn(
+            "Use get_wav_data and save_wav to generate wav file for accurate result."
+        )
     command = [
         "sox",
         "-V3",  # verbose
@@ -81,7 +82,12 @@ def gen_audio_file(
     subprocess.run(command, check=True)
 
 
-def convert_audio_file(src_path, dst_path, *, encoding=None, bit_depth=None, compression=None):
+def convert_audio_file(src_path,
+                       dst_path,
+                       *,
+                       encoding=None,
+                       bit_depth=None,
+                       compression=None):
     """Convert audio file with `sox` command."""
     command = ["sox", "-V3", "--no-dither", "-R", str(src_path)]
     if encoding is not None:
@@ -95,7 +101,7 @@ def convert_audio_file(src_path, dst_path, *, encoding=None, bit_depth=None, com
     subprocess.run(command, check=True)
 
 
-def _flattern(effects):
+def _flatten(effects):
     if not effects:
         return effects
     if isinstance(effects[0], str):
@@ -103,9 +109,14 @@ def _flattern(effects):
     return [item for sublist in effects for item in sublist]
 
 
-def run_sox_effect(input_file, output_file, effect, *, output_sample_rate=None, output_bitdepth=None):
+def run_sox_effect(input_file,
+                   output_file,
+                   effect,
+                   *,
+                   output_sample_rate=None,
+                   output_bitdepth=None):
     """Run sox effects"""
-    effect = _flattern(effect)
+    effect = _flatten(effect)
     command = ["sox", "-V", "--no-dither", input_file]
     if output_bitdepth:
         command += ["--bits", str(output_bitdepth)]
diff --git a/audio/tests/features/base.py b/audio/tests/features/base.py
index 3bb1d1dde..4a44e04bb 100644
--- a/audio/tests/features/base.py
+++ b/audio/tests/features/base.py
@@ -24,7 +24,7 @@ wav_url = 'https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav'
 
 class FeatTest(unittest.TestCase):
     def setUp(self):
-        self.initParmas()
+        self.initParams()
         self.initWavInput()
         self.setUpDevice()
 
@@ -44,5 +44,5 @@ class FeatTest(unittest.TestCase):
         if dim == 1:
             self.waveform = np.expand_dims(self.waveform, 0)
 
-    def initParmas(self):
+    def initParams(self):
         raise NotImplementedError
diff --git a/audio/tests/features/test_istft.py b/audio/tests/features/test_istft.py
index ea1ee5cb6..862a1d753 100644
--- a/audio/tests/features/test_istft.py
+++ b/audio/tests/features/test_istft.py
@@ -23,7 +23,7 @@ from paddlespeech.audio.transform.spectrogram import Stft
 
 
 class TestIstft(FeatTest):
-    def initParmas(self):
+    def initParams(self):
         self.n_fft = 512
         self.hop_length = 128
         self.window_str = 'hann'
diff --git a/audio/tests/features/test_kaldi.py b/audio/tests/features/test_kaldi.py
index 2bd5dc734..50e2571ca 100644
--- a/audio/tests/features/test_kaldi.py
+++ b/audio/tests/features/test_kaldi.py
@@ -18,12 +18,11 @@ import paddle
 import paddleaudio
 import torch
 import torchaudio
-
 from base import FeatTest
 
 
 class TestKaldi(FeatTest):
-    def initParmas(self):
+    def initParams(self):
         self.window_size = 1024
         self.dtype = 'float32'
 
diff --git a/audio/tests/features/test_librosa.py b/audio/tests/features/test_librosa.py
index 8cda25b19..07b117cb0 100644
--- a/audio/tests/features/test_librosa.py
+++ b/audio/tests/features/test_librosa.py
@@ -17,13 +17,12 @@ import librosa
 import numpy as np
 import paddle
 import paddleaudio
-from paddleaudio.functional.window import get_window
-
 from base import FeatTest
+from paddleaudio.functional.window import get_window
 
 
 class TestLibrosa(FeatTest):
-    def initParmas(self):
+    def initParams(self):
         self.n_fft = 512
         self.hop_length = 128
         self.n_mels = 40
diff --git a/audio/tests/features/test_log_melspectrogram.py b/audio/tests/features/test_log_melspectrogram.py
index b2765d3be..6152d6ff2 100644
--- a/audio/tests/features/test_log_melspectrogram.py
+++ b/audio/tests/features/test_log_melspectrogram.py
@@ -22,7 +22,7 @@ from paddlespeech.audio.transform.spectrogram import LogMelSpectrogram
 
 
 class TestLogMelSpectrogram(FeatTest):
-    def initParmas(self):
+    def initParams(self):
         self.n_fft = 512
         self.hop_length = 128
         self.n_mels = 40
diff --git a/audio/tests/features/test_spectrogram.py b/audio/tests/features/test_spectrogram.py
index 6f4609632..c2dced2e7 100644
--- a/audio/tests/features/test_spectrogram.py
+++ b/audio/tests/features/test_spectrogram.py
@@ -22,7 +22,7 @@ from paddlespeech.audio.transform.spectrogram import Spectrogram
 
 
 class TestSpectrogram(FeatTest):
-    def initParmas(self):
+    def initParams(self):
         self.n_fft = 512
         self.hop_length = 128
 
diff --git a/audio/tests/features/test_stft.py b/audio/tests/features/test_stft.py
index 9511a2926..5bab170be 100644
--- a/audio/tests/features/test_stft.py
+++ b/audio/tests/features/test_stft.py
@@ -22,7 +22,7 @@ from paddlespeech.audio.transform.spectrogram import Stft
 
 
 class TestStft(FeatTest):
-    def initParmas(self):
+    def initParams(self):
         self.n_fft = 512
         self.hop_length = 128
         self.window_str = 'hann'
@@ -30,7 +30,7 @@ class TestStft(FeatTest):
     def test_stft(self):
         ps_stft = Stft(self.n_fft, self.hop_length)
         ps_res = ps_stft(
-            self.waveform.T).squeeze(1).T  # (n_fft//2 + 1, n_frmaes)
+            self.waveform.T).squeeze(1).T  # (n_fft//2 + 1, n_frames)
 
         x = paddle.to_tensor(self.waveform)
         window = get_window(self.window_str, self.n_fft, dtype=x.dtype)
diff --git a/dataset/librispeech/librispeech.py b/dataset/librispeech/librispeech.py
index 2f5f9016c..ccf8d4b49 100644
--- a/dataset/librispeech/librispeech.py
+++ b/dataset/librispeech/librispeech.py
@@ -132,7 +132,7 @@ def create_manifest(data_dir, manifest_path):
 
 
 def prepare_dataset(url, md5sum, target_dir, manifest_path):
-    """Download, unpack and create summmary manifest file.
+    """Download, unpack and create summary manifest file.
     """
     if not os.path.exists(os.path.join(target_dir, "LibriSpeech")):
         # download
diff --git a/dataset/ted_en_zh/ted_en_zh.py b/dataset/ted_en_zh/ted_en_zh.py
index 2d1fc6710..66810c85e 100644
--- a/dataset/ted_en_zh/ted_en_zh.py
+++ b/dataset/ted_en_zh/ted_en_zh.py
@@ -13,7 +13,7 @@
 # limitations under the License.
 """Prepare Ted-En-Zh speech translation dataset
 
-Create manifest files from splited datased. 
+Create manifest files from split dataset.
 dev set: tst2010, test set: tst2015
 Manifest file is a json-format file with each line containing the
 meta data (i.e. audio filepath, transcript and audio duration)
diff --git a/dataset/thchs30/thchs30.py b/dataset/thchs30/thchs30.py
index c5c3eb7a8..fc8338984 100644
--- a/dataset/thchs30/thchs30.py
+++ b/dataset/thchs30/thchs30.py
@@ -71,7 +71,7 @@ def read_trn(filepath):
     with open(filepath, 'r') as f:
         lines = f.read().strip().split('\n')
         assert len(lines) == 3, lines
-    # charactor text, remove withespace
+    # character text, remove whitespace
     texts.append(''.join(lines[0].split()))
     texts.extend(lines[1:])
     return texts
@@ -127,7 +127,7 @@ def create_manifest(data_dir, manifest_path_prefix):
                             'utt2spk': spk,
                             'feat': audio_path,
                             'feat_shape': (duration, ),  # second
-                            'text': word_text,  # charactor
+                            'text': word_text,  # character
                             'syllable': syllable_text,
                             'phone': phone_text,
                         },
diff --git a/dataset/timit/timit.py b/dataset/timit/timit.py
index f3889d176..2943ff548 100644
--- a/dataset/timit/timit.py
+++ b/dataset/timit/timit.py
@@ -123,7 +123,7 @@ def read_algin(filepath: str) -> str:
         filepath (str): [description]
 
     Returns:
-        str: token sepearte by <space>
+        str: tokens separated by <space>
     """
     aligns = []  # (start, end, token)
     with open(filepath, 'r') as f:
diff --git a/dataset/timit/timit_kaldi_standard_split.py b/dataset/timit/timit_kaldi_standard_split.py
index 473fc856f..59ce2e64a 100644
--- a/dataset/timit/timit_kaldi_standard_split.py
+++ b/dataset/timit/timit_kaldi_standard_split.py
@@ -13,7 +13,7 @@
 # limitations under the License.
 """Prepare TIMIT dataset (Standard split from Kaldi)
 
-Create manifest files from splited datased.
+Create manifest files from split dataset.
 Manifest file is a json-format file with each line containing the
 meta data (i.e. audio filepath, transcript and audio duration)
 of each audio file in the data set.
diff --git a/dataset/voxceleb/voxceleb1.py b/dataset/voxceleb/voxceleb1.py
index 8d4100678..49a2a6baa 100644
--- a/dataset/voxceleb/voxceleb1.py
+++ b/dataset/voxceleb/voxceleb1.py
@@ -167,7 +167,7 @@ def prepare_dataset(base_url, data_list, target_dir, manifest_path,
 
         # check the target zip file md5sum
         if not check_md5sum(target_name, target_md5sum):
-            raise RuntimeError("{} MD5 checkssum failed".format(target_name))
+            raise RuntimeError("{} MD5 checksum failed".format(target_name))
         else:
             print("Check {} md5sum successfully".format(target_name))
 
diff --git a/dataset/voxceleb/voxceleb2.py b/dataset/voxceleb/voxceleb2.py
index 6df6d1f38..faa3b99bc 100644
--- a/dataset/voxceleb/voxceleb2.py
+++ b/dataset/voxceleb/voxceleb2.py
@@ -179,7 +179,7 @@ def download_dataset(base_url, data_list, target_data, target_dir, dataset):
 
         # check the target zip file md5sum
         if not check_md5sum(target_name, target_md5sum):
-            raise RuntimeError("{} MD5 checkssum failed".format(target_name))
+            raise RuntimeError("{} MD5 checksum failed".format(target_name))
         else:
             print("Check {} md5sum successfully".format(target_name))
 
@@ -187,7 +187,7 @@ def download_dataset(base_url, data_list, target_data, target_dir, dataset):
             # we need make the test directory
             unzip(target_name, os.path.join(target_dir, "test"))
         else:
-            # upzip dev zip pacakge and will create the dev directory
+            # unzip the dev zip package, which creates the dev directory
             unzip(target_name, target_dir)
 
 
diff --git a/demos/audio_content_search/README.md b/demos/audio_content_search/README.md
index f04ac447e..89b1c0d89 100644
--- a/demos/audio_content_search/README.md
+++ b/demos/audio_content_search/README.md
@@ -14,7 +14,7 @@ Now, the search word in demo is:
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from meduim and hard to install paddlespeech.
+You can choose one way from medium and hard to install paddlespeech.
 
 The dependency refers to the requirements.txt, and install the dependency as follows:
 
diff --git a/demos/audio_searching/README.md b/demos/audio_searching/README.md
index 0fc901432..528fce9e8 100644
--- a/demos/audio_searching/README.md
+++ b/demos/audio_searching/README.md
@@ -19,7 +19,7 @@ Note：this demo uses the [CN-Celeb](http://openslr.org/82/) dataset of at least
 ### 1. Prepare PaddleSpeech
 Audio vector extraction requires PaddleSpeech training model, so please make sure that PaddleSpeech has been installed before running. Specific installation steps: See [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).  
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare MySQL and Milvus services by docker-compose
 The audio similarity search system requires Milvus, MySQL services. We can start these containers with one click through [docker-compose.yaml](./docker-compose.yaml), so please make sure you have [installed Docker Engine](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/) before running. then
diff --git a/demos/audio_tagging/README.md b/demos/audio_tagging/README.md
index fc4a334ea..b602c6022 100644
--- a/demos/audio_tagging/README.md
+++ b/demos/audio_tagging/README.md
@@ -11,7 +11,7 @@ This demo is an implementation to tag an audio file with 527 [AudioSet](https://
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare Input File
 The input of this demo should be a WAV file(`.wav`).
diff --git a/demos/automatic_video_subtitiles/README.md b/demos/automatic_video_subtitiles/README.md
index b815425ec..89d8c73c9 100644
--- a/demos/automatic_video_subtitiles/README.md
+++ b/demos/automatic_video_subtitiles/README.md
@@ -10,7 +10,7 @@ This demo is an implementation to automatic video subtitles from a video file. I
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md). 
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare Input
 Get a video file with the speech of the specific language:
diff --git a/demos/keyword_spotting/README.md b/demos/keyword_spotting/README.md
index 6544cf71e..b55c71124 100644
--- a/demos/keyword_spotting/README.md
+++ b/demos/keyword_spotting/README.md
@@ -10,7 +10,7 @@ This demo is an implementation to recognize keyword from a specific audio file.
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare Input File
 The input of this demo should be a WAV file(`.wav`), and the sample rate must be the same as the model.
diff --git a/demos/punctuation_restoration/README.md b/demos/punctuation_restoration/README.md
index 458ab92f9..3544a2060 100644
--- a/demos/punctuation_restoration/README.md
+++ b/demos/punctuation_restoration/README.md
@@ -9,7 +9,7 @@ This demo is an implementation to restore punctuation from raw text. It can be d
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare Input
 The input of this demo should be a text of the specific language that can be passed via argument.
diff --git a/demos/speaker_verification/README.md b/demos/speaker_verification/README.md
index 55f9a7360..37c6bf3b9 100644
--- a/demos/speaker_verification/README.md
+++ b/demos/speaker_verification/README.md
@@ -11,7 +11,7 @@ This demo is an implementation to extract speaker embedding from a specific audi
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare Input File
 The input of this cli demo should be a WAV file(`.wav`), and the sample rate must be the same as the model.
diff --git a/demos/speech_recognition/README.md b/demos/speech_recognition/README.md
index ee2acd6fd..e406590d2 100644
--- a/demos/speech_recognition/README.md
+++ b/demos/speech_recognition/README.md
@@ -10,7 +10,7 @@ This demo is an implementation to recognize text from a specific audio file. It
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare Input File
 The input of this demo should be a WAV file(`.wav`), and the sample rate must be the same as the model.
diff --git a/demos/speech_server/README.md b/demos/speech_server/README.md
index 116f1fd7b..08788a89e 100644
--- a/demos/speech_server/README.md
+++ b/demos/speech_server/README.md
@@ -15,7 +15,7 @@ see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/doc
 
 It is recommended to use **paddlepaddle 2.4rc** or above.
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 **If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
 
diff --git a/demos/speech_ssl/README.md b/demos/speech_ssl/README.md
index ef9b2237d..8677ebc57 100644
--- a/demos/speech_ssl/README.md
+++ b/demos/speech_ssl/README.md
@@ -10,7 +10,7 @@ This demo is an implementation to recognize text or produce the acoustic represe
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare Input File
 The input of this demo should be a WAV file(`.wav`), and the sample rate must be the same as the model.
diff --git a/demos/speech_translation/README.md b/demos/speech_translation/README.md
index 00a9c7932..4866336c0 100644
--- a/demos/speech_translation/README.md
+++ b/demos/speech_translation/README.md
@@ -9,7 +9,7 @@ This demo is an implementation to recognize text from a specific audio file and
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 
 ### 2. Prepare Input File
diff --git a/demos/streaming_asr_server/README.md b/demos/streaming_asr_server/README.md
index 136863b96..423485466 100644
--- a/demos/streaming_asr_server/README.md
+++ b/demos/streaming_asr_server/README.md
@@ -18,7 +18,7 @@ see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/doc
 
 It is recommended to use **paddlepaddle 2.4rc** or above.
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 **If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to 
 
diff --git a/demos/streaming_tts_server/README.md b/demos/streaming_tts_server/README.md
index ca5d6f1f8..ad87bebdc 100644
--- a/demos/streaming_tts_server/README.md
+++ b/demos/streaming_tts_server/README.md
@@ -15,7 +15,7 @@ see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/doc
 
 It is recommended to use **paddlepaddle 2.4rc** or above.
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 **If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.**
 
diff --git a/demos/text_to_speech/README.md b/demos/text_to_speech/README.md
index d7bb8ca1c..b58777def 100644
--- a/demos/text_to_speech/README.md
+++ b/demos/text_to_speech/README.md
@@ -10,7 +10,7 @@ This demo is an implementation to generate audio from the given text. It can be
 ### 1. Installation
 see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
-You can choose one way from easy, meduim and hard to install paddlespeech.
+You can choose one way from easy, medium and hard to install paddlespeech.
 
 ### 2. Prepare Input
 The input of this demo should be a text of the specific language that can be passed via argument.
diff --git a/demos/whisper/README.md b/demos/whisper/README.md
index 9b12554e6..6e1b8011f 100644
--- a/demos/whisper/README.md
+++ b/demos/whisper/README.md
@@ -9,7 +9,7 @@ Whisper model trained by OpenAI whisper https://github.com/openai/whisper
  ### 1. Installation
  see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
 
- You can choose one way from easy, meduim and hard to install paddlespeech.
+ You can choose one way from easy, medium and hard to install paddlespeech.
 
  ### 2. Prepare Input File
  The input of this demo should be a WAV file(`.wav`), and the sample rate must be the same as the model.