[wip][vec] update readme and yaml, test=doc #1543

4 years ago · 508f2f5b62
parent 7fd0b98eee
commit 508f2f5b62
6 changed files with 91 additions and 59 deletions
--- a/demos/audio_searching/README.md
+++ b/demos/audio_searching/README.md
@@ -11,7 +11,9 @@ Audio retrieval (speech, music, speaker, etc.) enables querying and finding simi
 In this demo, you will learn how to build an audio retrieval system that retrieves similar sound snippets.  The uploaded audio clips are converted into vector data using PaddleSpeech pretrained models (audio classification model, speaker recognition model, etc.) and stored in Milvus.  Milvus automatically generates a unique ID for each vector, and the ID and the corresponding audio information (audio ID, audio speaker ID, etc.) are then stored in MySQL to complete the library construction.  During retrieval, users upload a test audio clip, which is converted into a vector and used for a vector similarity search in Milvus. Milvus returns vector IDs, and the corresponding audio information can be queried in MySQL by ID.
-The demo uses the [CN-Celeb](http://openslr.org/82/) dataset of at least 650,000 audio entries and 3000 speakers to build the audio vector library, which is then retrieved using a preset distance calculation. The dataset can also use other,  Adjust as needed, e.g. Librispeech, VoxCeleb, UrbanSound, etc
+![Workflow of an audio searching system](./img/audio_searching.png)
 Note: this demo uses the [CN-Celeb](http://openslr.org/82/) dataset of at least 650,000 audio entries and 3000 speakers to build the audio vector library, which is then searched using a preset distance calculation. Other datasets, e.g. Librispeech, VoxCeleb, or UrbanSound, can be used instead as needed.
 ## Usage
 ### 1. Prepare MySQL and Milvus services by docker-compose
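The retrieval flow described in the README overview above (embed a clip, store the vector under a generated ID, keep ID-to-audio metadata separately, then search by distance) can be illustrated with a minimal, dependency-free sketch. Everything here is a stand-in and an assumption: `toy_embedding` replaces the PaddleSpeech model, `InMemoryVectorStore` replaces Milvus, and a plain dict replaces the MySQL table. None of these names come from the demo's source code.

```python
import math


def toy_embedding(samples):
    """Placeholder for the PaddleSpeech model (NOT a real audio embedding).

    Returns [RMS energy, zero-crossing rate, mean absolute value].
    """
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (n - 1)
    mav = sum(abs(s) for s in samples) / n
    return [rms, zcr, mav]


class InMemoryVectorStore:
    """Hypothetical stand-in for Milvus: one integer ID per stored vector."""

    def __init__(self):
        self._vectors = []

    def insert(self, vector):
        # The list index plays the role of the auto-generated vector ID.
        self._vectors.append(list(vector))
        return len(self._vectors) - 1

    def search(self, query, top_k=3):
        # Brute-force Euclidean distance, smallest first.
        scored = sorted((math.dist(query, v), i) for i, v in enumerate(self._vectors))
        return [(i, d) for d, i in scored[:top_k]]


def build_library(clips):
    """Embed every clip, store the vector, and record ID -> audio path."""
    store = InMemoryVectorStore()
    audio_info = {}  # stand-in for the MySQL table
    for path, samples in clips.items():
        vector_id = store.insert(toy_embedding(samples))
        audio_info[vector_id] = path
    return store, audio_info


def search_similar(store, audio_info, samples, top_k=3):
    """Embed the query, search the store, then resolve IDs to audio paths."""
    hits = store.search(toy_embedding(samples), top_k)
    return [(audio_info[i], d) for i, d in hits]


def tone(freq, n=256):
    """Synthetic sine clip used only to exercise the sketch."""
    return [math.sin(2 * math.pi * freq * k / n) for k in range(n)]


if __name__ == "__main__":
    clips = {"low.wav": tone(2), "mid.wav": tone(20), "high.wav": tone(60)}
    store, info = build_library(clips)
    for path, distance in search_similar(store, info, tone(21), top_k=2):
        print(path, distance)
```

Keeping the ID-to-metadata mapping outside the vector store mirrors the split the README describes: the vector index only returns IDs, and a second lookup resolves them to audio information.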
--- a/demos/audio_searching/README_cn.md
+++ b/demos/audio_searching/README_cn.md
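@@ -12,11 +12,13 @@
 In this demo, you will learn how to build an audio retrieval system that retrieves similar sound clips. Uploaded audio clips are converted into vector data by PaddleSpeech pretrained models (audio classification model, speaker recognition model, etc.) and stored in Milvus. Milvus automatically generates a unique ID for each vector; the ID and the corresponding audio information (audio ID, speaker ID, etc.) are then stored in MySQL, which completes the library construction. At retrieval time, the user uploads a test audio clip, which is converted into a vector and used for a vector similarity search in Milvus. Milvus returns vector IDs, and the corresponding audio information can be looked up in MySQL by ID.
-This demo uses the [CN-Celeb](http://openslr.org/82/) dataset, which contains at least 650,000 audio entries and 3000 speakers, to build the audio vector library, which is then searched using a preset distance calculation. Other datasets can be used instead as needed, e.g. Librispeech, VoxCeleb, UrbanSound.
+![Audio retrieval workflow](./img/audio_searching.png)
 Note: this demo uses the [CN-Celeb](http://openslr.org/82/) dataset, which contains at least 650,000 audio entries and 3000 speakers, to build the audio vector library (audio features or speaker features), and then performs audio (or speaker) retrieval using a preset distance calculation. Other datasets can be used instead as needed, e.g. Librispeech, VoxCeleb, UrbanSound.
 ## Usage
 ### 1. Install MySQL and Milvus
-The audio similarity search system requires the Milvus and MySQL services. We can start these containers in one step with [Docker-Compose.yaml](./ Docker-Compose.yaml), so please make sure that [Docker Engine](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/) are installed before running:
+The audio similarity search system requires the Milvus and MySQL services. We can start these containers in one step with [docker-compose.yaml](./docker-compose.yaml), so please make sure that [Docker Engine](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/) are installed before running: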
 ```bash
 docker-compose -f docker-compose.yaml up -d
@@ -30,6 +32,7 @@ Creating milvus-minio    ... done
 Creating milvus-etcd     ... done
 Creating audio-mysql     ... done
 Creating milvus-standalone ... done
 Creating audio-webclient     ... done
 ```
 Use 'docker ps' to list all containers, and 'docker logs audio-mysql' to view the logs of the server container:
@@ -40,6 +43,7 @@ b2bcf279e599  milvusdb/milvus:v2.0.1  "/tini -- milvus run…"  22 hours ago  Up
 d8ef4c84e25c  mysql:5.7 "docker-entrypoint.s…"  22 hours ago  Up 22 hours 0.0.0.0:3306->3306/tcp, 33060/tcp audio-mysql
 8fb501edb4f3  quay.io/coreos/etcd:v3.5.0  "etcd -advertise-cli…"  22 hours ago  Up 22 hours 2379-2380/tcp milvus-etcd
 ffce340b3790  minio/minio:RELEASE.2020-12-03T00-03-10Z  "/usr/bin/docker-ent…"  22 hours ago  Up 22 hours (healthy) 9000/tcp  milvus-minio
 15c84a506754  iregistry.baidu-int.com/paddlespeech/audio-search-client:1.0  "/bin/bash -c '/usr/…"  22 hours ago  Up 22 hours (healthy) 0.0.0.0:8068->80/tcp  audio-webclient
 ```
@@ -48,77 +52,88 @@ ffce340b3790  minio/minio:RELEASE.2020-12-03T00-03-10Z  "/usr/bin/docker-ent…"
 - Install the Python packages required by the service
-```bash
+  ```bash
-pip install -r requirements.txt
+  pip install -r requirements.txt
-```
+  ```
 - Modify the configuration
-```bash
+  ```bash
-vim src/config.py
+  vim src/config.py
-```
+  ```
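-Please adjust the settings to match your environment. Some of the parameters that need to be set are listed here; for more information, see [config.py](./src/config.py)  
+  Please adjust the settings to match your environment. Some of the parameters that need to be set are listed here; for more information, see [config.py](./src/config.py)  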
-| **Parameter**    | **Description**                                       | **Default setting** |
+  | **Parameter**    | **Description**                                       | **Default setting** |
-| ---------------- | ----------------------------------------------------- | ------------------- |
+  | ---------------- | ----------------------------------------------------- | ------------------- |
-| MILVUS_HOST      | The IP address of Milvus; you can get it with ifconfig. If everything runs on one machine, this is most likely 127.0.0.1 | 127.0.0.1           |
+  | MILVUS_HOST      | The IP address of Milvus; you can get it with ifconfig. If everything runs on one machine, this is most likely 127.0.0.1 | 127.0.0.1           |
-| MILVUS_PORT      | Port of Milvus.                                       | 19530               |
+  | MILVUS_PORT      | Port of Milvus.                                       | 19530               |
-| VECTOR_DIMENSION | Dimension of the vectors.                             | 2048                |
+  | VECTOR_DIMENSION | Dimension of the vectors.                             | 2048                |
-| MYSQL_HOST       | The IP address of MySQL.                              | 127.0.0.1           |
+  | MYSQL_HOST       | The IP address of MySQL.                              | 127.0.0.1           |
-| MYSQL_PORT       | Port of MySQL.                                        | 3306                |
+  | MYSQL_PORT       | Port of MySQL.                                        | 3306                |
-| DEFAULT_TABLE    | Default collection name in Milvus and MySQL.          | audio_table         |
+  | DEFAULT_TABLE    | Default collection name in Milvus and MySQL.          | audio_table         |
 - Run the program
-Start the service built with FastAPI
+  Start the service built with FastAPI
-```bash
+  ```bash
-python src/main.py
+  python src/main.py
-```
+  ```
-You should then see the application start:
+  You should then see the application start:
-```bash
+  ```bash
-INFO:     Started server process [3949]
+  INFO:     Started server process [3949]
-2022-03-07 17:39:14,864 ｜ INFO ｜ server.py ｜ serve ｜ 75 ｜ Started server process [3949]
+  2022-03-07 17:39:14,864 ｜ INFO ｜ server.py ｜ serve ｜ 75 ｜ Started server process [3949]
-INFO:     Waiting for application startup.
+  INFO:     Waiting for application startup.
-2022-03-07 17:39:14,865 ｜ INFO ｜ on.py ｜ startup ｜ 45 ｜ Waiting for application startup.
+  2022-03-07 17:39:14,865 ｜ INFO ｜ on.py ｜ startup ｜ 45 ｜ Waiting for application startup.
-INFO:     Application startup complete.
+  INFO:     Application startup complete.
-2022-03-07 17:39:14,866 ｜ INFO ｜ on.py ｜ startup ｜ 59 ｜ Application startup complete.
+  2022-03-07 17:39:14,866 ｜ INFO ｜ on.py ｜ startup ｜ 59 ｜ Application startup complete.
-INFO:     Uvicorn running on http://127.0.0.1:8002 (Press CTRL+C to quit)
+  INFO:     Uvicorn running on http://127.0.0.1:8002 (Press CTRL+C to quit)
-2022-03-07 17:39:14,867 ｜ INFO ｜ server.py ｜ _log_started_message ｜ 206 ｜ Uvicorn running on http://127.0.0.1:8002 (Press CTRL+C to quit)
+  2022-03-07 17:39:14,867 ｜ INFO ｜ server.py ｜ _log_started_message ｜ 206 ｜ Uvicorn running on http://127.0.0.1:8002 (Press CTRL+C to quit)
-```
+  ```
-### 3. Usage
+### 3. Testing
 - Prepare the data
  ```bash
  wget -c https://www.openslr.org/resources/82/cn-celeb_v2.tar.gz && tar -xvf cn-celeb_v2.tar.gz 
  ```
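-  Note: to set up the demo quickly, you can use the 20 audio clips provided by ./src/test_main.py:download_audio_data; the results shown below use this set
+  Note: to set up the demo quickly, you can use the 20 audio clips provided by ./src/test_main.py:download_audio_data; in addition, the results shown below use this set
- - Run the test program
+ - Script test (recommended)
-  Internally, it downloads the data, loads the paddlespeech model, extracts embeddings, builds the library, searches it, and drops the library, in that order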
+
-  ```bash
+    ```bash
-  python ./src/test_main.py
+    python ./src/test_main.py
-  ```
+    ```
-
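+    Note: internally, it downloads the data, loads the paddlespeech model, extracts embeddings, builds the library, searches it, and drops the library, in that order
-  Output: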
+
-  ```bash
+    Output:
-  Checkpoint path: %your model path%
+    ```bash
-  Extracting feature from audio No. 1 , 20 audios in total
+    Checkpoint path: %your model path%
-  Extracting feature from audio No. 2 , 20 audios in total
+    Extracting feature from audio No. 1 , 20 audios in total
-  ...
+    Extracting feature from audio No. 2 , 20 audios in total
-  2022-03-09 17:22:13,870 ｜ INFO ｜ main.py ｜ load_audios ｜ 85 ｜ Successfully loaded data, total count: 20
+    ...
-  2022-03-09 17:22:13,898 ｜ INFO ｜ main.py ｜ count_audio ｜ 147 ｜ Successfully count the number of data!
+    2022-03-09 17:22:13,870 ｜ INFO ｜ main.py ｜ load_audios ｜ 85 ｜ Successfully loaded data, total count: 20
-  2022-03-09 17:22:13,918 ｜ INFO ｜ main.py ｜ audio_path ｜ 57 ｜ Successfully load audio: ./example_audio/test.wav
+    2022-03-09 17:22:13,898 ｜ INFO ｜ main.py ｜ count_audio ｜ 147 ｜ Successfully count the number of data!
-  ...
+    2022-03-09 17:22:13,918 ｜ INFO ｜ main.py ｜ audio_path ｜ 57 ｜ Successfully load audio: ./example_audio/test.wav
-  2022-03-09 17:22:32,580 ｜ INFO ｜ main.py ｜ search_local_audio ｜ 131 ｜ search result http://testserver/data?audio_path=./example_audio/test.wav, distance 0.0
+    ...
-  2022-03-09 17:22:32,580 ｜ INFO ｜ main.py ｜ search_local_audio ｜ 131 ｜ search result http://testserver/data?audio_path=./example_audio/knife_chopping.wav, distance 0.021805256605148315
+    2022-03-09 17:22:32,580 ｜ INFO ｜ main.py ｜ search_local_audio ｜ 131 ｜ search result http://testserver/data?audio_path=./example_audio/test.wav, distance 0.0
-  2022-03-09 17:22:32,580 ｜ INFO ｜ main.py ｜ search_local_audio ｜ 131 ｜ search result http://testserver/data?audio_path=./example_audio/knife_cut_into_flesh.wav, distance 0.052762262523174286
+    2022-03-09 17:22:32,580 ｜ INFO ｜ main.py ｜ search_local_audio ｜ 131 ｜ search result http://testserver/data?audio_path=./example_audio/knife_chopping.wav, distance 0.021805256605148315
-  ...
+    2022-03-09 17:22:32,580 ｜ INFO ｜ main.py ｜ search_local_audio ｜ 131 ｜ search result http://testserver/data?audio_path=./example_audio/knife_cut_into_flesh.wav, distance 0.052762262523174286
-  2022-03-09 17:22:32,582 ｜ INFO ｜ main.py ｜ search_local_audio ｜ 135 ｜ Successfully searched similar audio!
+    ...
-  2022-03-09 17:22:33,658 ｜ INFO ｜ main.py ｜ drop_tables ｜ 159 ｜ Successfully drop tables in Milvus and MySQL!
+    2022-03-09 17:22:32,582 ｜ INFO ｜ main.py ｜ search_local_audio ｜ 135 ｜ Successfully searched similar audio!
-  ```
+    2022-03-09 17:22:33,658 ｜ INFO ｜ main.py ｜ drop_tables ｜ 159 ｜ Successfully drop tables in Milvus and MySQL!
    ```
  - Front-end test (optional)
    Open 127.0.0.1:8068 in a browser to access the front-end page
    - Upload audio
      ![](./img/insert.png)
    - Search for similar audio
      ![](./img/search.png)
 ### 4. Pretrained models
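The configuration table in the README_cn.md hunk above lists six parameters and their defaults. As a rough illustration of how such settings could be loaded with environment overrides, here is a minimal sketch. Whether the real `src/config.py` reads environment variables at all is an assumption, and `load_config`, `DEFAULTS`, and `INT_KEYS` are hypothetical names; only the parameter names and default values are taken from the table.

```python
import os

# Defaults copied from the parameter table in the README.
DEFAULTS = {
    "MILVUS_HOST": "127.0.0.1",
    "MILVUS_PORT": "19530",
    "VECTOR_DIMENSION": "2048",
    "MYSQL_HOST": "127.0.0.1",
    "MYSQL_PORT": "3306",
    "DEFAULT_TABLE": "audio_table",
}

# Settings that are numeric and should be converted to int.
INT_KEYS = {"MILVUS_PORT", "VECTOR_DIMENSION", "MYSQL_PORT"}


def load_config(environ=None):
    """Return the settings, letting environment variables override defaults.

    This is an illustrative helper, not the demo's actual config loader.
    """
    environ = os.environ if environ is None else environ
    config = {}
    for key, default in DEFAULTS.items():
        raw = environ.get(key, default)
        config[key] = int(raw) if key in INT_KEYS else raw
    return config


if __name__ == "__main__":
    for name, value in load_config().items():
        print(f"{name} = {value!r}")
```

Passing a plain dict as `environ` makes the helper easy to exercise without touching the real process environment, for example `load_config({"MILVUS_HOST": "172.16.23.10"})` (the address here is only a placeholder).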
--- a/demos/audio_searching/docker-compose.yaml
+++ b/demos/audio_searching/docker-compose.yaml
@@ -48,7 +48,6 @@ services:
    depends_on:
      - "etcd"
      - "minio"
  mysql:
    container_name: audio-mysql
@@ -63,6 +62,22 @@ services:
    ports:
      - "3306:3306"
  webclient:
    container_name: audio-webclient
    image: iregistry.baidu-int.com/paddlespeech/audio-search-client:1.0
    networks:
      app_net:
        ipv4_address: 172.16.23.13
    environment:
      API_URL: 'http://127.0.0.1:8002'
    ports:
      - "8068:80"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/"]
      interval: 30s
      timeout: 20s
      retries: 3
 networks:
  app_net:
    driver: bridge
--- a/demos/audio_searching/img/audio_searching.png
+++ b/demos/audio_searching/img/audio_searching.png
--- a/demos/audio_searching/img/insert.png
+++ b/demos/audio_searching/img/insert.png
--- a/demos/audio_searching/img/search.png
+++ b/demos/audio_searching/img/search.png
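After `docker-compose -f docker-compose.yaml up -d` and `python src/main.py`, it can be handy to confirm that the expected ports accept connections before opening the front end. The following is a minimal sketch using only the standard library. The port numbers are taken from this diff (3306 and 8068 from docker-compose.yaml, 8002 from the Uvicorn log, 19530 from the MILVUS_PORT default); that Milvus is published on 19530 on the host is an assumption, since that part of the compose file is not shown here. `is_port_open` and `check_services` are hypothetical helper names.

```python
import socket

# Service name -> host port, as gathered from the diff (see assumptions above).
SERVICES = {
    "milvus-standalone": 19530,
    "audio-mysql": 3306,
    "audio-webclient": 8068,
    "fastapi-server": 8002,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_services(host="127.0.0.1", services=None):
    """Map each service name to whether its port accepts connections."""
    services = SERVICES if services is None else services
    return {name: is_port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

A successful TCP connection only shows that something is listening on the port; it does not replace the container-level `healthcheck` defined for the webclient service or a look at `docker ps`.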