(简体中文|English)

Installation

There are three ways to install PaddleSpeech, classified by difficulty as Easy, Medium, and Hard. Choose the one that matches the functions you need.

| Way | Function | Support |
|:---- |:---- |:---- |
| Easy | (1) Use the command-line functions of PaddleSpeech. (2) Experience PaddleSpeech on AI Studio. | Linux, Mac (M1 chip not supported), Windows |
| Medium | Supports the major functions, such as using the ready-made examples and using PaddleSpeech to train your model. | Linux |
| Hard | Supports the full functionality of PaddleSpeech, including the joint CTC decoder with Kaldi, training n-gram language models, Montreal-Forced-Aligner, and so on. This way also suits you best if you want to work as a developer. | Ubuntu |

Prerequisites

Easy: Get the Basic Function (Support Linux, Mac, and Windows)

  • If you are new to PaddleSpeech and want to try it without using your own machine, we recommend AI Studio. It offers a step-by-step tutorial for PaddleSpeech, and you can use the basic functions of PaddleSpeech on a free machine.
  • If you want to use the command-line functions of PaddleSpeech, complete the following steps to install PaddleSpeech. For more information about the command-line functions, see the cli.

Install Conda

Conda is an environment management system. You can go to miniconda (select a version with py>=3.7) to download and install conda. Then install the conda dependencies for paddlespeech:

conda install -y -c conda-forge sox libsndfile bzip2

Install C++ Compilation Environment

(If you already have a C++ compilation environment, you can skip this step.)

Windows

You need to install Visual Studio to set up the C++ compilation environment.

https://visualstudio.microsoft.com/visual-cpp-build-tools/

You can also see #1195 for more help.

Mac

brew install gcc

Linux

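# centos
sudo yum install gcc gcc-c++
# ubuntu
sudo apt install build-essential
# Others (-c requires a channel name; conda-forge is assumed here, as in the other conda commands)
conda install -y -c conda-forge gcc_linux-64=8.4.0 gxx_linux-64=8.4.0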
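
Before moving on, it can help to confirm that a C++ compiler is actually on the PATH. The following is a minimal sketch and not part of PaddleSpeech: `first_available` is a hypothetical helper name, and the candidate list (`g++`, `clang++`, `c++`) is an assumption about common compiler names.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PaddleSpeech): print the first command
# from the argument list that exists on PATH; return 1 if none is found.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            return 0
        fi
    done
    return 1
}

# Assumed candidate names for a C++ compiler.
if cxx=$(first_available g++ clang++ c++); then
    echo "C++ compiler found: $cxx"
else
    echo "No C++ compiler found; install one with the commands for your system."
fi
```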

Install PaddleSpeech

Some users may fail to install kaldiio because of the default download source. If so, install pytest-runner first:

pip install pytest-runner -i https://pypi.tuna.tsinghua.edu.cn/simple

Then you can use the following commands:

pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
pip install paddlespeech -i https://pypi.tuna.tsinghua.edu.cn/simple
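
To confirm that both packages ended up in the Python environment you expect, a quick status check along these lines may be useful. This is a sketch under assumptions: `pkg_status` is a hypothetical helper, and it relies only on `pip show`, which exits with a non-zero status when a package is not installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PaddleSpeech): report whether a pip
# package is installed for the python3 interpreter on PATH.
pkg_status() {
    if python3 -m pip show "$1" >/dev/null 2>&1; then
        echo "$1: installed"
    else
        echo "$1: missing"
    fi
}

# The two packages installed by the commands above.
pkg_status paddlepaddle
pkg_status paddlespeech
```

If either package is reported as missing, check that `pip` and `python3` point to the same environment.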

If you have trouble downloading nltk_data while using paddlespeech, it may be due to a poor network connection. We suggest downloading the nltk_data that we provide and extracting it to your ${HOME}.
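
The extraction step could look like the sketch below. The archive name `nltk_data.tar.gz` and the `.tar.gz` format are assumptions made for illustration (use the name and format of the file you actually downloaded), and `extract_to` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack a .tar.gz archive into a destination directory,
# creating the directory if needed.
extract_to() {
    archive=$1
    dest=$2
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Assumed archive name; adjust it to the file you actually downloaded.
ARCHIVE="nltk_data.tar.gz"
if [ -f "$ARCHIVE" ]; then
    extract_to "$ARCHIVE" "${HOME}"
else
    echo "Archive $ARCHIVE not found in the current directory."
fi
```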

If paddlespeech-ctcdecoders fails to install, you can ignore the failure.

Medium: Get the Major Functions (Support Linux)

If you want to get the major functions of paddlespeech, follow these steps:

Git clone PaddleSpeech

First, clone this repository.

git clone https://github.com/PaddlePaddle/PaddleSpeech.git
cd PaddleSpeech

Install Conda

Conda is an environment management system. You can go to miniconda, select a version (py>=3.7), and install it yourself, or you can use the following commands:

# download the miniconda
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -P tools/
# install the miniconda
bash tools/Miniconda3-latest-Linux-x86_64.sh -b
# conda init
$HOME/miniconda3/bin/conda init
# activate the conda
bash

Then you can create a conda virtual environment using the following command:

conda create -y -p tools/venv python=3.7

Activate the conda virtual environment:

conda activate tools/venv
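
Later steps install packages into whichever environment is active, so it may be worth checking that the activation took effect. The sketch below relies on the `CONDA_PREFIX` variable, which conda sets when an environment is activated; `env_is_active` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the prefix of the active conda environment
# (CONDA_PREFIX) equals the given path or ends with it.
env_is_active() {
    case "${CONDA_PREFIX:-}" in
        */"$1"|"$1") return 0 ;;
        *) return 1 ;;
    esac
}

if env_is_active tools/venv; then
    echo "tools/venv is active"
else
    echo "tools/venv is not active; run: conda activate tools/venv"
fi
```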

Install the conda dependencies for paddlespeech:

conda install -y -c conda-forge sox libsndfile swig bzip2

Install C++ Compilation Environment

(If you already have a C++ compilation environment, you can skip this step.) Do not forget to install gcc and gxx on your system. You can use the commands below to install them.

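# centos
sudo yum install gcc gcc-c++
# ubuntu
sudo apt install build-essential
# Others (-c requires a channel name; conda-forge is assumed here, as in the other conda commands)
conda install -y -c conda-forge gcc_linux-64=8.4.0 gxx_linux-64=8.4.0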

(Tip: Do not use the last command if you plan to install the Hard way.)

Install PaddlePaddle

You can choose the PaddlePaddle version based on your system. For example, for CUDA 10.2 and cuDNN 7.5, install paddlepaddle-gpu 2.2.0:

python3 -m pip install paddlepaddle-gpu==2.2.0 -i https://mirror.baidu.com/pypi/simple
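
After the install, you may want to confirm which PaddlePaddle version Python actually sees. This is a minimal sketch: `paddle_version` is a hypothetical helper, and it assumes `python3` is the interpreter of the active environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the installed PaddlePaddle version,
# or "not installed" if the import fails.
paddle_version() {
    python3 -c "import paddle; print(paddle.__version__)" 2>/dev/null \
        || echo "not installed"
}

echo "PaddlePaddle: $(paddle_version)"
# If PaddlePaddle is installed, its built-in self-check can be run with:
#   python3 -c "import paddle; paddle.utils.run_check()"
```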

Install PaddleSpeech

You can install paddlespeech with the following commands; then you can use the ready-made examples in paddlespeech:

# Some users may fail to install `kaldiio` because of the default download source; installing `pytest-runner` first can help
pip install pytest-runner -i https://pypi.tuna.tsinghua.edu.cn/simple
# Make sure you are in the root directory of PaddleSpeech
pip install . -i https://pypi.tuna.tsinghua.edu.cn/simple
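
Because `pip install .` only works from a directory that contains the project's build files, a small guard like the following can catch a wrong working directory early. It is a sketch, not part of PaddleSpeech: `is_project_root` is a hypothetical helper that looks for `setup.py` or `pyproject.toml`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the directory looks like a pip-installable
# project root (it contains setup.py or pyproject.toml).
is_project_root() {
    [ -f "$1/setup.py" ] || [ -f "$1/pyproject.toml" ]
}

if is_project_root "$(pwd)"; then
    echo "OK: 'pip install .' can be run from here"
else
    echo "Not a project root: cd into the PaddleSpeech directory first"
fi
```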

Hard: Get the Full Function (Support Ubuntu)

Prerequisites

  • Ubuntu >= 16.04.
  • Choice 1: work in an Ubuntu Docker container.
  • Choice 2: work on Ubuntu with root privilege.

To avoid the trouble of environment setup, running in a Docker container is highly recommended. Otherwise, if you work on Ubuntu with root privilege, you can still complete the installation.

Choice 1: Running in Docker Container (Recommended)

Docker is an open-source tool to build, ship, and run distributed applications in an isolated environment. A Docker image for this project is provided on hub.docker.com with the CUDA and cuDNN dependencies installed. The image requires an NVIDIA GPU, so make sure one is available and that nvidia-docker is installed.

Take several steps to launch the Docker image:

  • Download the Docker image

For example, pull the paddle 2.2.0 image:

sudo nvidia-docker pull registry.baidubce.com/paddlepaddle/paddle:2.2.0-gpu-cuda10.2-cudnn7
  • Clone this repository
git clone https://github.com/PaddlePaddle/PaddleSpeech.git
  • Run the Docker image
sudo nvidia-docker run --net=host --ipc=host --rm -it -v $(pwd)/PaddleSpeech:/PaddleSpeech registry.baidubce.com/paddlepaddle/paddle:2.2.0-gpu-cuda10.2-cudnn7 /bin/bash
  • Enter PaddleSpeech directory.
cd /PaddleSpeech

Now you can run training, inference, and hyperparameter tuning in the Docker container.
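
Since the image depends on an NVIDIA GPU, it can be useful to check inside the container that a GPU is visible before starting a long job. The sketch below assumes the `nvidia-smi` tool is available in the container; `gpu_status` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the visible NVIDIA GPUs with `nvidia-smi -L`,
# or print a message if the tool is missing or reports an error.
gpu_status() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        nvidia-smi -L
    else
        echo "no NVIDIA GPU visible"
    fi
}

gpu_status
```

If no GPU is reported, check that the container was started with nvidia-docker rather than plain docker.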

Choice 2: Running in Ubuntu with Root Privilege

  • Install build-essential with apt
sudo apt install build-essential
  • Clone this repository
git clone https://github.com/PaddlePaddle/PaddleSpeech.git
# Enter the PaddleSpeech dir
cd PaddleSpeech

Install Conda

# download the miniconda
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -P tools/
# install the miniconda
bash tools/Miniconda3-latest-Linux-x86_64.sh -b
# conda init
$HOME/miniconda3/bin/conda init
# start a new bash shell so that the conda initialization takes effect
bash
# create a conda virtual environment
conda create -y -p tools/venv python=3.7
# Activate the conda virtual environment:
conda activate tools/venv
# Install the conda packages
conda install -y -c conda-forge sox libsndfile swig bzip2 libflac bc

Install PaddlePaddle

Some users may fail to install kaldiio because of the default download source. If so, install pytest-runner first:

pip install pytest-runner -i https://pypi.tuna.tsinghua.edu.cn/simple

Make sure you have a GPU and that the paddlepaddle version matches your setup. For example, for CUDA 10.2 and cuDNN 7.5, install paddlepaddle-gpu 2.2.0:

python3 -m pip install paddlepaddle-gpu==2.2.0 -i https://mirror.baidu.com/pypi/simple

Install PaddleSpeech in Developing Mode

pip install -e .[develop] -i https://pypi.tuna.tsinghua.edu.cn/simple

Install Kaldi (Optional)

pushd tools
bash extras/install_openblas.sh
bash extras/install_kaldi.sh
popd