From 99f392d5c476b152f7b08775711f0042bafa43c7 Mon Sep 17 00:00:00 2001
From: nyx-c-language
Date: Sat, 12 Apr 2025 23:36:37 +0800
Subject: [PATCH] update the stage of run.sh and synthesize_e2e.sh, to be clear

---
 examples/aishell3/ernie_sat/README.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/examples/aishell3/ernie_sat/README.md b/examples/aishell3/ernie_sat/README.md
index e26808e95..aee732cf8 100644
--- a/examples/aishell3/ernie_sat/README.md
+++ b/examples/aishell3/ernie_sat/README.md
@@ -13,7 +13,7 @@ In ERNIE-SAT, we propose two innovations:
 ## Dataset
 ### Download and Extract
 Download AISHELL-3 from it's [Official Website](http://www.aishelltech.com/aishell_3) and extract it to `~/datasets`. Then the dataset is in the directory `~/datasets/data_aishell3`.
- 
+
 ### Get MFA Result and Extract
 We use [MFA2.x](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner) to get durations for aishell3_fastspeech2. You can download from here [aishell3_alignment_tone.tar.gz](https://paddlespeech.cdn.bcebos.com/MFA/AISHELL-3/with_tone/aishell3_alignment_tone.tar.gz), or train your MFA model reference to [mfa example](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/other/mfa) (use MFA1.x now) of our repo.
@@ -138,7 +138,13 @@ You can check the text of downloaded wavs in `source/README.md`.
 ```bash
 ./run.sh --stage 3 --stop-stage 3 --gpus 0
 ```
-`stage 3` of `run.sh` calls `local/synthesize_e2e.sh`, `stage 0` of it is **Speech Synthesis** and `stage 1` of it is **Speech Editing**.
+`run.sh`'s `stage 3` invokes `synthesize_e2e.sh` and uses the `--stage` parameter to select between tasks. By default, `synthesize_e2e.sh` executes `stage 0`, which performs speech synthesis. To switch to speech editing, use `--stage 1`.
+
+To perform speech editing, modify the command to:
+
+```bash
+./run.sh --stage 3 --stop-stage 3 --gpus 0 --stage 1
+```
 You can modify `--wav_path`、`--old_str` and `--new_str` yourself, `--old_str` should be the text corresponding to the audio of `--wav_path`, `--new_str` should be designed according to `--task_name`, both `--source_lang` and `--target_lang` should be `zh` for model trained with AISHELL3 dataset.

 ## Pretrained Model
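
For context on the change above: stage gating in scripts like `run.sh` and `synthesize_e2e.sh` typically follows the pattern sketched below. This is illustrative only — the function name, flag parsing, and echo messages are assumptions, not the actual contents of `synthesize_e2e.sh`:

```shell
#!/usr/bin/env bash
# Minimal sketch of --stage / --stop-stage gating (illustrative only;
# not the real synthesize_e2e.sh, which parses options via utils scripts).
run_stages() {
  local stage=0 stop_stage=100   # default: stage 0, speech synthesis
  while [ $# -gt 0 ]; do
    case "$1" in
      --stage) stage="$2"; shift 2 ;;
      --stop-stage) stop_stage="$2"; shift 2 ;;
      *) shift ;;                # ignore other flags in this sketch
    esac
  done
  # Each stage runs only if it falls inside [stage, stop_stage].
  if [ "${stage}" -le 0 ] && [ "${stop_stage}" -ge 0 ]; then
    echo "stage 0: speech synthesis"
  fi
  if [ "${stage}" -le 1 ] && [ "${stop_stage}" -ge 1 ]; then
    echo "stage 1: speech editing"
  fi
}

run_stages --stage 1   # skips stage 0, runs only the speech-editing branch
```

This is why passing `--stage 1` selects speech editing: the synthesis branch is skipped once the starting stage is past 0.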