([简体中文](./README_cn.md)|English)

# Style FastSpeech2
## Introduction
[FastSpeech2](https://arxiv.org/abs/2006.04558) is a classical acoustic model for Text-to-Speech synthesis that introduces controllable speech inputs, including `phoneme duration`, `energy`, and `pitch`.

In the prediction phase, you can change these controllable variables to get some interesting results.

For example:

1. The `duration` control in `FastSpeech2` changes the speed of the audio while keeping the `pitch` unchanged. (In some speech tools, increasing the speed also raises the pitch, and vice versa.)

2. When we set the `pitch` of one sentence to a mean value and set the `tones` of phones to `1`, we will get a `robot-style` timbre.

3. When we raise the `pitch` of an adult female (with a fixed scale ratio), we will get a `child-style` timbre.

The `duration` and `pitch` of different phonemes in a sentence can have different scale ratios. You can set different scale ratios to emphasize or weaken the pronunciation of some phonemes.
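
The three controls above can be sketched in plain Python. This is a minimal illustration only: the helper names `scale_durations`, `flatten_pitch`, and `scale_pitch` are hypothetical and are not taken from `style_syn.py`, and it assumes durations are frame counts per phoneme and pitch is one raw value per phoneme. The real model may work on normalized values, and the `tones` change used for the robot style is not modeled here.

```python
def scale_durations(durations, ratios):
    """Scale per-phoneme durations (in frames) by per-phoneme ratios.

    A ratio above 1 lengthens a phoneme (slower speech); a ratio below 1
    shortens it. Each phoneme keeps at least one frame.
    """
    if len(durations) != len(ratios):
        raise ValueError("durations and ratios must have the same length")
    return [max(1, round(d * r)) for d, r in zip(durations, ratios)]


def flatten_pitch(pitch):
    """Replace every pitch value with the sentence mean (robot-style)."""
    mean = sum(pitch) / len(pitch)
    return [mean] * len(pitch)


def scale_pitch(pitch, ratio):
    """Multiply every pitch value by a fixed ratio (ratio > 1 raises the pitch)."""
    return [p * ratio for p in pitch]


# Hypothetical values for a three-phoneme sentence.
durations = [4, 6, 2]          # frames per phoneme
pitch = [100.0, 200.0, 300.0]  # one pitch value per phoneme

emphasized = scale_durations(durations, [1.0, 2.0, 0.5])  # stretch the 2nd phoneme
robot = flatten_pitch(pitch)                              # constant mean pitch
child = scale_pitch(pitch, 1.5)                           # uniformly raised pitch
print(emphasized, robot, child)
```

Passing a different ratio for each phoneme, as in `emphasized`, is what lets individual phonemes be stressed or weakened, while a single shared ratio changes the speed or pitch of the whole sentence.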
## Usage
Run the following command line to get started:
```
./run.sh
```
`run.sh` first executes `source path.sh`, which sets the environment variables.

If you would like to try your own sentences, replace the contents of `sentences.txt`.

For more details, please see `style_syn.py`.

The audio samples are in [style-control-in-fastspeech2](https://paddlespeech.readthedocs.io/en/latest/tts/demo.html#style-control-in-fastspeech2).