Merge branch 'develop' into fix_cli

pull/1074/head
Hui Zhang 3 years ago committed by GitHub
commit 03678c08c5

3
.gitignore vendored

@ -26,5 +26,8 @@ tools/sctk
tools/sctk-20159b5/
tools/kaldi
tools/OpenBLAS/
tools/Miniconda3-latest-Linux-x86_64.sh
tools/activate_python.sh
tools/miniconda.sh
*output/

@ -1,4 +1,4 @@
Welcome to paddle PaddleSpeech documentation !
Welcome to PaddleSpeech
==============================================
**PaddleSpeech** is a speech toolkit implemented with PaddlePaddle.

@ -1,32 +1,20 @@
# Installation
There are 3 ways to use the repository. According to the degree of difficulty, the 3 ways can be divided into Easy, Medium and Hard.
There are 3 ways to use `PaddleSpeech`. According to the degree of difficulty, the 3 ways can be divided into `Easy`, `Medium` and `Hard`.
## Easy: Get the Basic Function Without Your Own Machine
If you are in touch with PaddleSpeech for the first time and want to experience it easily without your own machine, we recommend going to aistudio to experience the PaddleSpeech project. There is a step-by-step tutorial for PaddleSpeech and you can use the basic functions of PaddleSpeech on a free machine.
If you are new to `PaddleSpeech` and want to try it easily without your own machine, we recommend using [AI Studio](https://aistudio.baidu.com/aistudio/index). There is a step-by-step tutorial for `PaddleSpeech` and you can use the basic functions of `PaddleSpeech` on a free machine.
## Prerequisites for Medium and Hard
- Python >= 3.7
- PaddlePaddle latest version (please refer to the [Installation Guide](https://www.paddlepaddle.org.cn/documentation/docs/en/beginners_guide/index_en.html))
- Only Linux is supported
- Hint: Do not use the `sh` command in place of `bash`
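The Python and Linux prerequisites can be checked from the interpreter before installing anything. The sketch below is only an illustration under that assumption; the helper name `meets_prerequisites` is hypothetical and not part of PaddleSpeech, and it does not check the PaddlePaddle version.

```python
import platform
import sys


def meets_prerequisites(version_info=None, system=None):
    """Return True when running on Linux with Python >= 3.7.

    Hypothetical helper, not part of PaddleSpeech. Arguments default to
    the current interpreter and operating system.
    """
    if version_info is None:
        version_info = sys.version_info
    if system is None:
        system = platform.system()
    return tuple(version_info[:2]) >= (3, 7) and system == "Linux"


if __name__ == "__main__":
    print("prerequisites met:", meets_prerequisites())
```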
## Medium: Get the Basic Function on Your Machine
If you want to install the paddlespeech on your own machine, there are 3 steps you need to follow.
If you want to install `paddlespeech` on your own machine, there are 3 steps you need to follow.
### Install Conda
The first step is installing conda. Conda is an environment management system. You can go to [miniconda](https://docs.conda.io/en/latest/miniconda.html) to select a version (py>=3.7) and install it yourself, or you can use the script below:
Conda is an environment management system. You can go to [miniconda](https://docs.conda.io/en/latest/miniconda.html) to select a version (py>=3.7) and install it yourself, or you can use the following command:
```bash
# download the miniconda
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
@ -37,59 +25,39 @@ $HOME/miniconda3/bin/conda init
# activate the conda
bash
```
Then you can create a conda virtual environment using the script:
Then you can create a conda virtual environment using the following command:
```bash
conda create -n py37 python=3.7
conda create -y -p tools/venv python=3.7
```
Activate the conda virtual environment:
```bash
conda activate py37
conda activate tools/venv
```
Install the conda dependencies
Install conda dependencies for `paddlespeech`:
```bash
conda install -c conda-forge sox libsndfile swig bzip2 gcc_linux-64=8.4.0 gxx_linux-64=8.4.0 --yes
conda install -y -c conda-forge sox libsndfile swig bzip2 gcc_linux-64=8.4.0 gxx_linux-64=8.4.0
```
### Install PaddlePaddle
For example, for CUDA 10.2, CuDNN7.5 install paddle 2.2.0:
```bash
python3 -m pip install paddlepaddle-gpu==2.2.0
```
### Install PaddleSpeech Using pip
There are two ways to install PaddleSpeech. You can use the script below:
### Install PaddleSpeech
There are two ways to install `paddlespeech`. You can use the following command:
```bash
pip install paddlespeech
```
If you install `paddlespeech` by `pip`, you can use it to help you build your own model. However, you cannot use the `ready-made` examples in paddlespeech.
If you install paddlespeech by pip, you can use it to help you build your own model. However, you cannot use the ready-made examples in paddlespeech.
If you want to use the ready-made examples in paddlespeech, you need to clone the repository and install the paddlespeech package.
If you want to use the `ready-made` examples in `paddlespeech`, you need to clone this repository and install `paddlespeech` by the foll
```bash
git clone https://github.com/PaddlePaddle/PaddleSpeech.git
# enter the PaddleSpeech directory
cd PaddleSpeech
pip install .
```
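After either install method, one quick way to confirm that the package is visible to the interpreter is to look it up without importing it. This is a minimal sketch, assuming the importable module has the same name as the pip package (`paddlespeech`); `is_installed` is a hypothetical helper, not something the repository provides.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the module name matches the pip package name.
    print("paddlespeech installed:", is_installed("paddlespeech"))
```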
## Hard: Get the Full Function on Your Machine
### Prerequisites
- choice 1: working with `ubuntu` Docker Container.
- choice 1: working with `Ubuntu` Docker Container.
or
@ -98,50 +66,35 @@ pip install .
To avoid the trouble of environment setup, [running in Docker container](#running-in-docker-container) is highly recommended. Otherwise, if you work on `Ubuntu` with `root` privilege, you can skip the next step.
### Choice 1: Running in Docker Container (Recommended)
Docker is an open source tool to build, ship, and run distributed applications in an isolated environment. A Docker image for this project has been provided in [hub.docker.com](https://hub.docker.com) with all the dependencies installed. This Docker image requires an NVIDIA GPU, so please make sure one is available and that [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) has been installed.
Take several steps to launch the Docker image:
- Download the Docker image
For example, pull paddle 2.2.0 image:
```bash
nvidia-docker pull registry.baidubce.com/paddlepaddle/paddle:2.2.0-gpu-cuda10.2-cudnn7
```
- Clone this repository
```
```bash
git clone https://github.com/PaddlePaddle/PaddleSpeech.git
```
- Run the Docker image
```bash
sudo nvidia-docker run --net=host --ipc=host --rm -it -v $(pwd)/PaddleSpeech:/PaddleSpeech registry.baidubce.com/paddlepaddle/paddle:2.2.0-gpu-cuda10.2-cudnn7 /bin/bash
```
Now you can run training, inference and hyper-parameter tuning in the Docker container.
Now you can run training, inference and hyper-parameter tuning in a Docker container.
### Choice 2: Running in Ubuntu with Root Privilege
- Clone this repository
```
```bash
git clone https://github.com/PaddlePaddle/PaddleSpeech.git
```
Install paddle 2.2.0:
```bash
python3 -m pip install paddlepaddle-gpu==2.2.0
```
### Install Conda
```bash
# download and install the miniconda
pushd tools
@ -150,29 +103,23 @@ popd
# use the "bash" command to make the conda environment work
bash
# create a conda virtual environment
conda create -n py37 python=3.7
# -p (prefix) is needed for a path-based environment; -n only accepts a name
conda create -y -p tools/venv python=3.7
# Activate the conda virtual environment:
conda activate py37
conda activate tools/venv
# Install the conda packages
conda install -c conda-forge sox libsndfile swig bzip2 gcc_linux-64=8.4.0 gxx_linux-64=8.4.0 --yes
conda install -y -c conda-forge sox libsndfile swig bzip2 libflac bc gcc_linux-64=8.4.0 gxx_linux-64=8.4.0
```
### Install PaddlePaddle
For example, for CUDA 10.2, CuDNN7.5 install paddle 2.2.0:
```bash
python3 -m pip install paddlepaddle-gpu==2.2.0
```
### Get the Function for Developing PaddleSpeech
```bash
pip install -e .[develop]
pip install .[develop]
```
### Install the Kaldi (Optional)
```bash
pushd tools
bash extras/install_openblas.sh
@ -181,11 +128,8 @@ popd
```
## Setup for Other Platform
- Make sure the libraries and tools listed in [dependencies](./dependencies.md) are installed. For more information, see `setup.py `and ` tools/Makefile`.
- Make sure the libraries and tools listed in [dependencies](./dependencies.md) are installed. For more information, see `setup.py `and `tools/Makefile`.
- The version of `swig` should be >= 3.0
- We will do more to simplify the install process.
- Install PaddleSpeech
- We will simplify the install process in the future.

Binary files not shown: 11 files, 10 of them images shown with a "Before" preview only (sizes: 47 KiB, 117 KiB, 1.5 MiB, 108 KiB, 224 KiB, 1.5 MiB, 581 KiB, 212 KiB, 368 KiB, 52 KiB).

File diff suppressed because one or more lines are too long

@ -24,4 +24,4 @@
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | attention | 6.484564081827799 | 0.044355 |
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | ctc_greedy_search | 6.484564081827799 | 0.050479 |
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | ctc_prefix_beam_search | 6.484564081827799 | 0.049890 |
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | attention_rescoring | 6.484564081827799 | 0.039200 |
| transformer | 32.52 M | conf/transformer.yaml | spec_aug | test-clean | attention_rescoring | 6.484564081827799 | 0.039200 |

@ -133,9 +133,7 @@ class Frontend():
phones.append('sp')
if v and v not in self.punc:
phones.append(v)
# add sp between sentence (replace the last punc with sp)
if initials[-1] in self.punc:
phones.append('sp')
phones_list.append(phones)
if merge_sentences:
merge_list = sum(phones_list, [])
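The `merge_sentences` branch in this hunk relies on `sum(phones_list, [])` to concatenate the per-sentence phone lists into a single list. A minimal standalone sketch of that idiom follows; the function name `merge_phones` and the phone values are illustrative only and do not come from the `Frontend` class.

```python
def merge_phones(phones_list):
    """Flatten a list of per-sentence phone lists into one list."""
    # sum() with an empty-list start value concatenates the inner lists.
    return sum(phones_list, [])


# Illustrative input: two "sentences", the first ending in a pause marker.
phones_list = [["n", "i3", "sp"], ["h", "ao3"]]
print(merge_phones(phones_list))  # prints ['n', 'i3', 'sp', 'h', 'ao3']
```

For many long sentences, `itertools.chain.from_iterable` avoids the repeated list copies that `sum` performs, but for short inputs the difference is unlikely to matter.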

@ -130,6 +130,7 @@ def _post_install(install_lib_dir):
check_call("bash -e setup.sh")
print("ctcdecoder install.")
class DevelopCommand(develop):
def run(self):
develop.run(self)
@ -213,8 +214,7 @@ setup_info = dict(
},
# Package info
packages=find_packages(exclude=('utils', 'tests', 'tests.*', 'examples*',
'paddleaudio*', 'third_party*', 'tools*')),
packages=find_packages(include=('paddlespeech*', )),
zip_safe=True,
classifiers=[
'Development Status :: 3 - Alpha',
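One detail worth checking in any `find_packages(include=...)` call: in Python, a parenthesised string without a trailing comma, such as `('paddlespeech*')`, is a plain `str`, not a tuple. The sketch below assumes that setuptools unpacks `include` into individual glob patterns; `build_filter` is a simplified stand-in written for this example, not the setuptools implementation.

```python
from fnmatch import fnmatchcase


def build_filter(*patterns):
    """Simplified stand-in for a glob-based package name filter."""
    return lambda name: any(fnmatchcase(name, pat) for pat in patterns)


as_string = ('paddlespeech*')    # parentheses only: still a str
as_tuple = ('paddlespeech*', )   # trailing comma: a one-element tuple

# Unpacking a str yields single characters, and the lone "*" matches anything.
loose = build_filter(*as_string)
strict = build_filter(*as_tuple)

print(type(as_string).__name__, type(as_tuple).__name__)  # prints: str tuple
print(loose("tools"), strict("tools"), strict("paddlespeech.cli"))  # prints: True False True
```

Under that assumption, only the tuple form restricts the result to `paddlespeech` and its subpackages.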

@ -8,9 +8,10 @@ CC ?= gcc # used for sph2pipe
WGET ?= wget --no-check-certificate
.PHONY: all clean
.PHONY: all clean
all: apt.done kenlm.done mfa.done sctk.done
all: apt.done kenlm.done mfa.done sclite.done
virtualenv.done:
test -d venv || virtualenv -p $(PYTHON) venv
@ -28,6 +29,7 @@ apt.done:
echo "check_certificate = off" >> ~/.wgetrc
touch apt.done
kenlm.done:
# On Ubuntu 16.04, apt installs boost 1.58.0
# it seems that boost (1.54.0) requires higher version. After I switched to g++-5 it compiles normally.
@ -38,12 +40,6 @@ kenlm.done:
cd kenlm && python3 setup.py install
touch kenlm.done
sox.done:
apt install -y libvorbis-dev libmp3lame-dev libmad-ocaml-dev
test -d sox-14.4.2 || $(WGET) https://nchc.dl.sourceforge.net/project/sox/sox/14.4.2/sox-14.4.2.tar.gz
tar -xvzf sox-14.4.2.tar.gz -C .
cd sox-14.4.2 && ./configure --prefix=/usr/ && make -j4 && make install
touch sox.done
mfa.done:
test -d montreal-forced-aligner || $(WGET) https://paddlespeech.bj.bcebos.com/Parakeet/montreal-forced-aligner_linux.tar.gz
@ -58,48 +54,46 @@ kaldi.done: openblas.done
bash extras/install_kaldi.sh
touch kaldi.done
#== SCTK ===============================================================================
# SCTK official repo does not have version tags. Here's the mapping:
# # 2.4.9 = 659bc36; 2.4.10 = d914e1b; 2.4.11 = 20159b5.
SCTK_GITHASH = 20159b5
SCTK_CXFLAGS = -w -march=native
SCTK_MKENV = CFLAGS="$(CFLAGS) $(SCTK_CXFLAGS)" \
CXXFLAGS="$(CXXFLAGS) -std=c++11 $(SCTK_CXFLAGS)" \
# Keep the existing target 'sclite' to avoid breaking the users who might have
# scripted it in.
.PHONY: sclite.done sctk_cleaned sctk_made
sclite.done sctk_made: sctk/.compiled
touch sclite.done
sctk/.compiled: sctk
rm -f sctk/.compiled
$(SCTK_MKENV) $(MAKE) -C sctk config
$(SCTK_MKENV) $(MAKE) -C sctk all doc
$(MAKE) -C sctk install
touch sctk/.compiled
# The GitHub archive unpacks into SCTK-{40-character-long-hash}/
sctk: sctk-$(SCTK_GITHASH).tar.gz
tar zxvf sctk-$(SCTK_GITHASH).tar.gz
rm -rf sctk-$(SCTK_GITHASH) sctk
mv SCTK-$(SCTK_GITHASH)* sctk-$(SCTK_GITHASH)
ln -s sctk-$(SCTK_GITHASH) sctk
touch sctk-$(SCTK_GITHASH).tar.gz
sctk-$(SCTK_GITHASH).tar.gz:
if [ -d '$(DOWNLOAD_DIR)' ]; then \
cp -p '$(DOWNLOAD_DIR)/sctk-$(SCTK_GITHASH).tar.gz' .; \
else \
$(WGET) -nv -T 10 -t 3 -O sctk-$(SCTK_GITHASH).tar.gz \
https://github.com/usnistgov/SCTK/archive/$(SCTK_GITHASH).tar.gz; \
fi
sctk_cleaned:
-for d in sctk/ sctk-*/; do \
[ ! -f $$d/.compiled ] || $(MAKE) -C $$d clean; \
rm -f $$d/.compiled; \
done
sctk.done:
./extras/install_sclite.sh
touch sctk.done
######################
dev: python conda_packages.done sctk.done
# Use pip for paddle installation even if you have anaconda
ifneq ($(shell test -f ./activate_python.sh && grep 'conda activate' ./activate_python.sh),)
USE_CONDA := 1
else
USE_CONDA :=
endif
python: activate_python.sh
activate_python.sh:
test -f activate_python.sh || { echo "Error: Run ./setup_python.sh or ./setup_anaconda.sh"; exit 1; }
bc.done: activate_python.sh
. ./activate_python.sh && { command -v bc || conda install -y bc -c conda-forge; }
touch bc.done
cmake.done: activate_python.sh
. ./activate_python.sh && { command -v cmake || conda install -y cmake; }
touch cmake.done
flac.done: activate_python.sh
. ./activate_python.sh && { command -v flac || conda install -y libflac -c conda-forge; }
touch flac.done
ffmpeg.done: activate_python.sh
. ./activate_python.sh && { command -v ffmpeg || conda install -y ffmpeg -c conda-forge; }
touch ffmpeg.done
sox.done: activate_python.sh
. ./activate_python.sh && { command -v sox || conda install -y sox -c conda-forge; }
touch sox.done
sndfile.done: activate_python.sh
. ./activate_python.sh && { python3 -c "from ctypes.util import find_library as F; assert F('sndfile') is not None" || conda install -y libsndfile=1.0.28 -c conda-forge; }
touch sndfile.done
ifneq ($(strip $(USE_CONDA)),)
conda_packages.done: bc.done cmake.done flac.done ffmpeg.done sox.done sndfile.done
else
conda_packages.done:
endif
touch conda_packages.done
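The `*.done` targets above follow one pattern: check whether a tool is already available (`command -v ...`, or a `ctypes` lookup for libsndfile) and fall back to `conda install` only when it is missing. The same availability checks can be sketched in Python, for example to report what is missing before running `make`. The helper names here are hypothetical, and the sketch only reports; it installs nothing.

```python
import shutil
from ctypes.util import find_library


def missing_tools(names):
    """Return the subset of `names` that is not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def has_sndfile():
    """Same ctypes-based lookup the sndfile.done recipe uses."""
    return find_library("sndfile") is not None


if __name__ == "__main__":
    # Tool names taken from the bc/cmake/flac/ffmpeg/sox targets.
    print("missing:", missing_tools(["bc", "cmake", "flac", "ffmpeg", "sox"]))
    print("libsndfile found:", has_sndfile())
```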

@ -9,7 +9,7 @@ WGET=${WGET:-wget}
if [ -d "$DOWNLOAD_DIR" ]; then
cp -p "$DOWNLOAD_DIR/Miniconda3-latest-Linux-x86_64.sh" . || exit 1
else
$WGET https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh || exit 1
$WGET -c https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh || exit 1
fi
bash Miniconda3-latest-Linux-x86_64.sh -b

@ -0,0 +1,67 @@
#!/usr/bin/env bash
set -euo pipefail
if [ -z "${PS1:-}" ]; then
PS1=__dummy__
fi
CONDA_URL=https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
if [ $# -gt 3 ]; then
echo "Usage: $0 [output] [conda-env-name] [python-version]"
exit 1;
elif [ $# -eq 3 ]; then
output_dir="$1"
name="$2"
PYTHON_VERSION="$3"
elif [ $# -eq 2 ]; then
output_dir="$1"
name="$2"
PYTHON_VERSION=""
elif [ $# -eq 1 ]; then
output_dir="$1"
name=""
PYTHON_VERSION=""
elif [ $# -eq 0 ]; then
output_dir=venv
name=""
PYTHON_VERSION=""
fi
if [ -e activate_python.sh ]; then
echo "Warning: activate_python.sh already exists. It will be overwritten"
fi
if [ ! -e "${output_dir}/etc/profile.d/conda.sh" ]; then
if [ ! -e miniconda.sh ]; then
wget --tries=3 "${CONDA_URL}" -O miniconda.sh
fi
bash miniconda.sh -b -p "${output_dir}"
fi
# shellcheck disable=SC1090
source "${output_dir}/etc/profile.d/conda.sh"
conda deactivate
# If the env already exists, skip recreation
if [ -n "${name}" ] && ! conda activate ${name}; then
conda create -yn "${name}"
fi
conda activate ${name}
if [ -n "${PYTHON_VERSION}" ]; then
conda install -y conda "python=${PYTHON_VERSION}"
else
conda install -y conda
fi
conda install -y pip setuptools
cat << EOF > activate_python.sh
#!/usr/bin/env bash
# THIS FILE IS GENERATED BY tools/setup_anaconda.sh
if [ -z "\${PS1:-}" ]; then
PS1=__dummy__
fi
. $(cd ${output_dir}; pwd)/etc/profile.d/conda.sh && conda deactivate && conda activate ${name}
EOF
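The positional-argument handling at the top of this script maps zero to three arguments onto `output_dir`, `name` and `PYTHON_VERSION`, with `venv` as the default output directory and empty strings for the other two. A compact Python sketch of the same mapping is shown below; `parse_setup_args` is a hypothetical name, and the sketch treats anything beyond the three documented arguments as a usage error.

```python
def parse_setup_args(argv):
    """Map 0-3 positional args to (output_dir, env_name, python_version)."""
    if len(argv) > 3:
        raise SystemExit(
            "Usage: setup_anaconda.sh [output] [conda-env-name] [python-version]")
    defaults = ["venv", "", ""]
    # Fill the positions the caller did not supply with their defaults.
    values = list(argv) + defaults[len(argv):]
    return tuple(values)


print(parse_setup_args([]))                # prints ('venv', '', '')
print(parse_setup_args(["tools/venv"]))    # prints ('tools/venv', '', '')
```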