Merge pull request #1063 from Jackwaterveg/install

[Setup]optimize the setup.py and setup.sh
4 years ago · ab9bc3c31e
parent 9c57808fb7 829b7758de
commit ab9bc3c31e
6 changed files with 204 additions and 80 deletions
--- a/docs/source/install.md
+++ b/docs/source/install.md
@@ -1,44 +1,103 @@
 # Installation
-To avoid the trouble of environment setup, [running in Docker container](#running-in-docker-container) is highly recommended. Otherwise follow the guidelines below to install the dependencies manually.
+There are three ways to use the repository. In order of increasing difficulty, they are Easy, Medium, and Hard.
 ## Easy: Get the Basic Function Without Your Own Machine
 If you are using PaddleSpeech for the first time and want to try it without your own machine, we recommend the PaddleSpeech project on aistudio. It provides a step-by-step tutorial, and you can use the basic functions of PaddleSpeech on a free machine.
 ## Prerequisites for Medium and Hard
 ## Prerequisites
 - Python >= 3.7
 - PaddlePaddle latest version (please refer to the [Installation Guide](https://www.paddlepaddle.org.cn/documentation/docs/en/beginners_guide/index_en.html))
 - Only Linux is supported
 - Tip: Run the scripts with `bash`; do not substitute `sh`.
 ## Simple Setup
 For users working on `Ubuntu` with `root` privilege.
-```python
+## Medium: Get the Basic Function on Your Machine
 git clone https://github.com/PaddlePaddle/DeepSpeech.git
 cd DeepSpeech
 pip install -e .
 ```
-For user who only needs the basic function of paddlespeech, using conda to do installing is recommended.
+To install paddlespeech on your own machine, follow the three steps below.
 You can go to [miniconda](https://docs.conda.io/en/latest/miniconda.html) to select a version and install it yourself, or you can use the scripts below to install the latest miniconda version.
-```python
+### Install the Conda
-pushd tools
+
-bash extras/install_miniconda.sh
+The first step is to install conda, an environment management system. You can go to [miniconda](https://docs.conda.io/en/latest/miniconda.html) to select a version (py>=3.7) and install it yourself, or you can use the script below:
-popd
+
 ```bash
 # download the miniconda
 wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
 # install the miniconda
 bash Miniconda3-latest-Linux-x86_64.sh -b
 # conda init
 $HOME/miniconda3/bin/conda init
 # activate the conda
 bash
 ```
-After installing the conda, run the setup.sh to complete the installing process.
+Then create a conda virtual environment:
-```python
+
-bash setup.sh
+```bash
 conda create -n py37 python=3.7
 ```
 Activate the conda virtual environment:
 ```bash
 conda activate py37
 ```
 Install the conda dependencies:
 ```bash
 conda install -c conda-forge sox libsndfile swig bzip2 gcc_linux-64=8.4.0 gxx_linux-64=8.4.0 --yes
 ```
 ### Install PaddlePaddle
 For example, for CUDA 10.2 and cuDNN 7.5, install paddle 2.2.0:
 ```bash
 python3 -m pip install paddlepaddle-gpu==2.2.0
 ```
 ### Install PaddleSpeech Using pip
 There are two ways to install PaddleSpeech. The first is to install the released package with pip:
 ```bash
 pip install paddlespeech
 ```
 If you install paddlespeech with pip, you can use it to build your own models, but you cannot use the ready-made examples in the repository.
 To use the ready-made examples, clone the repository and install the paddlespeech package from source:
 ```bash
 git clone https://github.com/PaddlePaddle/PaddleSpeech.git
 # Enter the PaddleSpeech directory
 cd PaddleSpeech
 pip install .
 ```
 ## Setup (Other Platforms)
 - Make sure the libraries and tools listed in [dependencies](./dependencies.md) are installed. For more information, see `setup.py` and `tools/Makefile`.
 - The version of `swig` should be >= 3.0.
 - We will do more to simplify the installation process.
-## Running in Docker Container (optional)
+## Hard: Get the Full Function on Your Machine
 ### Prerequisites
 - Choice 1: working in an `ubuntu` Docker container.
  or
 - Choice 2: working on `Ubuntu` with `root` privilege.
 To avoid the trouble of environment setup, [running in Docker container](#running-in-docker-container) is highly recommended. Otherwise, if you work on `Ubuntu` with `root` privilege, you can skip the next step.
 ### Choice 1: Running in Docker Container (Recommended)
 Docker is an open source tool to build, ship, and run distributed applications in an isolated environment. A Docker image for this project is provided on [hub.docker.com](https://hub.docker.com) with all the dependencies installed. This Docker image requires an NVIDIA GPU, so please make sure one is available and that [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) is installed.
@@ -46,35 +105,87 @@ Take several steps to launch the Docker image:
 - Download the Docker image
-For example, pull paddle 2.0.0 image:
+For example, pull the paddle 2.2.0 image:
 ```bash
-nvidia-docker pull registry.baidubce.com/paddlepaddle/paddle:2.0.0-gpu-cuda10.1-cudnn7
+nvidia-docker pull registry.baidubce.com/paddlepaddle/paddle:2.2.0-gpu-cuda10.2-cudnn7
 ```
 - Clone this repository
 ```
-git clone https://github.com/PaddlePaddle/DeepSpeech.git
+git clone https://github.com/PaddlePaddle/PaddleSpeech.git
 ```
 - Run the Docker image
 ```bash
-sudo nvidia-docker run --rm -it -v $(pwd)/DeepSpeech:/DeepSpeech registry.baidubce.com/paddlepaddle/paddle:2.0.0-gpu-cuda10.1-cudnn7 /bin/bash
+sudo nvidia-docker run --net=host --ipc=host --rm -it -v $(pwd)/PaddleSpeech:/PaddleSpeech registry.baidubce.com/paddlepaddle/paddle:2.2.0-gpu-cuda10.2-cudnn7 /bin/bash
 ```
 Now you can run training, inference, and hyper-parameter tuning in the Docker container.
 ### Choice 2: Running in Ubuntu with Root Privilege
 - Clone this repository
 ```
 git clone https://github.com/PaddlePaddle/PaddleSpeech.git
 ```
 Install paddle 2.2.0:
 ```bash
 python3 -m pip install paddlepaddle-gpu==2.2.0
 ```
 ### Install the Conda
 ```bash
 # download and install the miniconda
 pushd tools
 bash extras/install_miniconda.sh
 popd
 # start a new bash shell so that the conda setup takes effect
 bash
 # create a conda virtual environment
 conda create -n py37 python=3.7
 # activate the conda virtual environment
 conda activate py37
 # install the conda packages
 conda install -c conda-forge sox libsndfile swig bzip2 gcc_linux-64=8.4.0 gxx_linux-64=8.4.0 --yes
 ```
 ### Install PaddlePaddle
 For example, for CUDA 10.2 and cuDNN 7.5, install paddle 2.2.0:
 ```bash
 python3 -m pip install paddlepaddle-gpu==2.2.0
 ```
 ### Get the Functions for Developing PaddleSpeech
- Install PaddlePaddle
+```bash
 pip install -e .[develop]
 ```
-For example, for CUDA 10.1, CuDNN7.5 install paddle 2.0.0:
+### Install the Kaldi (Optional)
 ```bash
-python3 -m pip install paddlepaddle-gpu==2.0.0
+pushd tools
 bash extras/install_openblas.sh
 bash extras/install_kaldi.sh
 popd
 ```
 - Install Deepspeech
-Please see [Setup](#setup)  section.
+
 ## Setup for Other Platforms
 - Make sure the libraries and tools listed in [dependencies](./dependencies.md) are installed. For more information, see `setup.py` and `tools/Makefile`.
 - The version of `swig` should be >= 3.0.
 - We will do more to simplify the installation process.
 - Install Paddlespeech
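The prerequisites stated in `install.md` above (Python >= 3.7, Linux only) could be checked before starting an install. The following is a minimal sketch, not part of this commit; the function name `check_prerequisites` and its parameters are hypothetical and exist only for illustration.

```python
import platform
import sys


def check_prerequisites(version_info=None, system=None):
    """Return a list of unmet prerequisites; an empty list means all are met.

    Hypothetical helper: the defaults read the running interpreter and OS,
    and both can be overridden to try other combinations.
    """
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system

    problems = []
    # install.md asks for Python >= 3.7
    if tuple(version_info[:2]) < (3, 7):
        problems.append("Python >= 3.7 is required")
    # install.md states that only Linux is supported
    if system != "Linux":
        problems.append("only Linux is supported")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("prerequisite not met:", problem)
```

Running the file directly prints one line per unmet prerequisite and nothing when the environment matches the documented requirements.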
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,23 +1,19 @@
 ConfigArgParse
 coverage
 distro
 editdistance
 g2p_en
 g2pM
 gpustat
 GPUtil
 h5py
 inflect
 jieba
 jsonlines
 kaldiio
 librosa
 llvmlite
 loguru
 matplotlib
 nara_wpe
 nltk
 numba
 paddlespeech_ctcdecoders
 paddlespeech_feat
 pandas
@@ -25,9 +21,7 @@ phkit
 Pillow
 praatio~=4.1
 pre-commit
 psutil
 pybind11
 pynvml
 pypi-kenlm
 pypinyin
 python-dateutil
--- a/setup.cfg
+++ b/setup.cfg
@@ -7,3 +7,6 @@ description-file = README.md
 [magformat]
 formatters=yapf
 [easy_install]
 index-url=https://pypi.tuna.tsinghua.edu.cn/simple
--- a/setup.py
+++ b/setup.py
@@ -27,6 +27,58 @@ from setuptools.command.install import install
 HERE = Path(os.path.abspath(os.path.dirname(__file__)))
 requirements = {
    "install": [
        "editdistance",
        "g2p_en",
        "g2pM",
        "h5py",
        "inflect",
        "jieba",
        "jsonlines",
        "kaldiio",
        "librosa",
        "loguru",
        "matplotlib",
        "nara_wpe",
        "nltk",
        "pandas",
        "paddlespeech_ctcdecoders",
        "paddlespeech_feat",
        "praatio~=4.1",
        "pypi-kenlm",
        "pypinyin",
        "python-dateutil",
        "pyworld",
        "resampy==0.2.2",
        "sacrebleu",
        "scipy",
        "sentencepiece~=0.1.96",
        "soundfile~=0.10",
        "sox",
        "soxbindings",
        "textgrid",
        "timer",
        "tqdm",
        "typeguard",
        "visualdl",
        "webrtcvad",
        "yacs",
    ],
    "develop": [
        "ConfigArgParse",
        "coverage",
        "gpustat",
        "phkit",
        "Pillow",
        "pybind11",
        "snakeviz",
        "unidecode",
        "yq",
        "pre-commit",
    ]
 }
@contextlib.contextmanager
 def pushd(new_dir):
@@ -72,25 +124,12 @@ def _post_install(install_lib_dir):
        check_call("make")
    print("tools install.")
    # install autolog
    tools_extrs_dir = HERE / 'tools/extras'
    with pushd(tools_extrs_dir):
        print(os.getcwd())
        check_call("./install_autolog.sh")
    print("autolog install.")
    # ctcdecoder
    ctcdecoder_dir = HERE / 'paddlespeech/s2t/decoders/ctcdecoder/swig'
    with pushd(ctcdecoder_dir):
        check_call("bash -e setup.sh")
    print("ctcdecoder install.")
    # install third_party
    third_party_dir = HERE / 'third_party'
    with pushd(third_party_dir):
        check_call("bash -e install.sh")
    print("third_party install.")
 class DevelopCommand(develop):
    def run(self):
        develop.run(self)
@@ -130,7 +169,7 @@ class UploadCommand(Command):
 setup_info = dict(
    # Metadata
    name='paddlespeech',
-    version='0.0.1a',
+    version='0.1.0a',
    author='PaddlePaddle Speech and Language Team',
    author_email='paddlesl@baidu.com',
    url='https://github.com/PaddlePaddle/PaddleSpeech',
@@ -158,8 +197,10 @@ setup_info = dict(
        "gan",
    ],
    python_requires='>=3.6',
-    install_requires=[d.strip() for d in read('requirements.txt').split()],
+    install_requires=requirements["install"],
    extras_require={
        'develop':
        requirements["develop"],
        'doc': [
            "sphinx", "sphinx-rtd-theme", "numpydoc", "myst_parser",
            "recommonmark>=0.5.0", "sphinx-markdown-tables", "sphinx-autobuild"
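The `setup.py` change above replaces the flat `requirements.txt` read with a `requirements` dict whose `"install"` list feeds `install_requires` and whose `"develop"` list feeds `extras_require`. The sketch below is a simplified model of what that split means for `pip install .` versus `pip install -e .[develop]`. The package lists are abbreviated from the diff, and the `resolve` helper is hypothetical: it illustrates the selection logic only and is not how pip is implemented.

```python
# Abbreviated copies of the two lists introduced in setup.py (illustration only).
requirements = {
    "install": ["editdistance", "librosa", "praatio~=4.1"],
    "develop": ["coverage", "pre-commit", "pybind11"],
}


def resolve(extras=()):
    """Return the requirement strings selected for the given extras.

    Hypothetical helper: the base "install" list is always included, and
    each requested extra appends its own list, mirroring how
    install_requires and extras_require are combined.
    """
    selected = list(requirements["install"])
    for extra in extras:
        selected.extend(requirements[extra])
    return selected


if __name__ == "__main__":
    print("pip install .            ->", resolve())
    print("pip install -e .[develop] ->", resolve(["develop"]))
```

Under this model, a plain install pulls only the runtime packages, while the `develop` extra adds tooling such as `coverage` and `pre-commit` on top of them.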
--- a/setup.sh
+++ b/setup.sh
@@ -1,20 +0,0 @@
 # Install conda dependencies
 conda install -c conda-forge sox libsndfile swig bzip2 bottleneck gcc_linux-64=8.4.0 gxx_linux-64=8.4.0 --yes
 # Install the python lib
 pip install -r requirements.txt
 # Install the auto_log
 pushd tools/extras
 bash install_autolog.sh
 popd
 # Install the ctcdecoder
 pushd paddlespeech/s2t/decoders/ctcdecoder/swig
 bash -e setup.sh
 popd
 # Install the python_speech_features
 pushd third_party
 bash -e install.sh
 popd
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -10,7 +10,7 @@ WGET ?= wget --no-check-certificate
 .PHONY: all clean
-all: virtualenv.done apt.done kenlm.done sox.done soxbindings.done mfa.done sclite.done
+all: apt.done kenlm.done mfa.done sclite.done
 virtualenv.done:
 	test -d venv || virtualenv -p $(PYTHON) venv
@@ -35,7 +35,7 @@ kenlm.done:
 	apt-get install -y gcc-5 g++-5 && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 50  && update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 50
 	test -d kenlm || $(WGET) -O - https://kheafield.com/code/kenlm.tar.gz | tar xz
 	rm -rf kenlm/build && mkdir -p kenlm/build && cd kenlm/build && cmake .. && make -j4 && make install
-	cd kenlm && python setup.py install
+	cd kenlm && python3 setup.py install
 	touch kenlm.done
 sox.done:
@@ -45,11 +45,6 @@ sox.done:
 	cd sox-14.4.2 && ./configure --prefix=/usr/ && make -j4 && make install
 	touch sox.done
 soxbindings.done:
 	test -d soxbindings || git clone https://github.com/pseeth/soxbindings.git
 	cd soxbindings && python setup.py install
 	touch soxbindings.done
 mfa.done:
 	test -d montreal-forced-aligner || $(WGET) https://paddlespeech.bj.bcebos.com/Parakeet/montreal-forced-aligner_linux.tar.gz
 	tar xvf montreal-forced-aligner_linux.tar.gz