Quick Start of Audio Classification

The shell scripts in ./examples/esc50/cls0 let you quickly try out the major modules, including data preparation, model training, and model evaluation, on the ESC50 dataset.

Some of the scripts in ./examples are not configured to use GPUs. To train with 8 GPUs, set CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7. If no GPU is available, set CUDA_VISIBLE_DEVICES= (an empty value) to use CPUs instead.
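As a minimal sketch of the two settings described above, the variable can be exported once so that it applies to every later command in the same shell session. The `echo` lines are only there to show the current value; in practice you would pick one of the two settings, not both.

```shell
# Option 1: make all eight GPUs visible (device IDs separated by commas).
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
echo "visible devices: '${CUDA_VISIBLE_DEVICES}'"

# Option 2: no GPU available, so set an empty value to run on CPUs.
export CUDA_VISIBLE_DEVICES=
echo "visible devices: '${CUDA_VISIBLE_DEVICES}'"
```

Prefixing a single command instead, as in `CUDA_VISIBLE_DEVICES=0 ./run.sh 1`, limits the setting to that one invocation.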

Let's start an audio classification task with the following steps:

  • Go to the directory

    cd examples/esc50/cls0
    
  • Source env

    source path.sh
    
  • Main entry point

    CUDA_VISIBLE_DEVICES=0 ./run.sh 1
    

This demo covers fine-tuning, evaluating, and deploying an audio classification model. The following sections provide more detail.

Fine-tuning a model

PANNs (PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition) are models pretrained on AudioSet. They can easily be used to extract audio embeddings for audio classification tasks.

To start fine-tuning a model, run:

ngpu=$(echo $CUDA_VISIBLE_DEVICES | awk -F "," '{print NF}')
feat_backend=numpy
./local/train.sh ${ngpu} ${feat_backend}
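The `ngpu` line above counts the comma-separated fields in `CUDA_VISIBLE_DEVICES`. The sketch below isolates that step in a helper so its behavior is easy to check; `count_gpus` is a hypothetical name used only for illustration and is not part of the example scripts.

```shell
# Hypothetical helper: count device IDs the same way the ngpu line does.
# awk splits the input on "," and NF is the number of resulting fields.
count_gpus() {
    echo "$1" | awk -F "," '{print NF}'
}

count_gpus "0,1,2,3"   # prints 4
count_gpus "0"         # prints 1
count_gpus ""          # prints 0, because an empty line has no fields
```

An empty `CUDA_VISIBLE_DEVICES` therefore yields `ngpu=0`, which matches the CPU-only setting described earlier.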

Deploy a model

Once you have saved a model checkpoint, you can export it to a static graph and deploy it with a Python script:

  • Export to a static graph

    ./local/export.sh ${ckpt_dir} ./export
    

    The argument ckpt_dir should be a directory in which a model checkpoint is stored, for example checkpoint/epoch_50.

    The static graph will be exported to ./export.

  • Inference

    ./local/static_model_infer.sh ${infer_device} ./export ${audio_file}
    

    The argument infer_device can be cpu or gpu and specifies which device is used for inference. The argument audio_file should be a WAV file named *.wav.
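The argument constraints above can be checked before calling the inference script. The following is a sketch only: `check_infer_args` is a hypothetical helper, not something shipped with the example, and it validates nothing beyond what this page states (device name, file extension, and that the file exists).

```shell
# Hypothetical helper: validate infer_device and audio_file before inference.
check_infer_args() {
    infer_device="$1"
    audio_file="$2"

    # infer_device must be exactly "cpu" or "gpu".
    case "${infer_device}" in
        cpu|gpu) ;;
        *) echo "infer_device must be cpu or gpu" >&2; return 1 ;;
    esac

    # audio_file must carry a .wav extension.
    case "${audio_file}" in
        *.wav) ;;
        *) echo "audio_file must be a *.wav file" >&2; return 1 ;;
    esac

    # The file must actually exist.
    if [ ! -f "${audio_file}" ]; then
        echo "audio_file not found: ${audio_file}" >&2
        return 1
    fi
    return 0
}
```

A call could then look like `check_infer_args cpu ./dog.wav && ./local/static_model_infer.sh cpu ./export ./dog.wav`, where `./dog.wav` is a placeholder file name.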