diff --git a/demos/audio_searching/README.md b/demos/audio_searching/README.md
index c0df12ece..8a6f38639 100644
--- a/demos/audio_searching/README.md
+++ b/demos/audio_searching/README.md
@@ -25,10 +25,14 @@ You can choose one way from easy, meduim and hard to install paddlespeech.
 The audio similarity search system requires Milvus, MySQL services. We can start these containers with one click through [docker-compose.yaml](./docker-compose.yaml), so please make sure you have [installed Docker Engine](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/) before running. then
 ```bash
+## Enter the audio_searching directory for the following example
+cd ~/PaddleSpeech/demos/audio_searching/
+
+## Then start the related services within the container
 docker-compose -f docker-compose.yaml up -d
 ```
-Then you will see the that all containers are created:
+You will see that all containers are created:
 ```bash
 Creating network "quick_deploy_app_net" with driver "bridge"
@@ -47,7 +51,7 @@ b2bcf279e599 milvusdb/milvus:v2.0.1 "/tini -- milvus run…" 22 hours ago Up
 d8ef4c84e25c mysql:5.7 "docker-entrypoint.s…" 22 hours ago Up 22 hours 0.0.0.0:3306->3306/tcp, 33060/tcp audio-mysql
 8fb501edb4f3 quay.io/coreos/etcd:v3.5.0 "etcd -advertise-cli…" 22 hours ago Up 22 hours 2379-2380/tcp milvus-etcd
 ffce340b3790 minio/minio:RELEASE.2020-12-03T00-03-10Z "/usr/bin/docker-ent…" 22 hours ago Up 22 hours (healthy) 9000/tcp milvus-minio
-15c84a506754 qingen1/paddlespeech-audio-search-client:2.3 "/bin/bash -c '/usr/…" 22 hours ago Up 22 hours (healthy) 0.0.0.0:8068->80/tcp audio-webclient
+15c84a506754 paddlepaddle/paddlespeech-audio-search-client:2.3 "/bin/bash -c '/usr/…" 22 hours ago Up 22 hours (healthy) 0.0.0.0:8068->80/tcp audio-webclient
 ```
 ### 3. Start API Server
@@ -58,22 +62,27 @@ Then to start the system server, and it provides HTTP backend services.
     ```bash
     pip install -r requirements.txt
     ```
-- Set configuration
+- Set configuration (when running locally, you can skip this step)
     ```bash
+    ## Method 1: Modify the source file
     vim src/config.py
+
+    ## Method 2: Modify the environment variables, as shown below
+    export MILVUS_HOST=127.0.0.1
+    export MYSQL_HOST=127.0.0.1
     ```
-    Modify the parameters according to your own environment. Here listing some parameters that need to be set, for more information please refer to [config.py](./src/config.py).
+    Listed here are some parameters that need to be set. For more information, please refer to [config.py](./src/config.py).
-    | **Parameter** | **Description** | **Default setting** |
-    | ---------------- | ----------------------------------------------------- | ------------------- |
-    | MILVUS_HOST | The IP address of Milvus, you can get it by ifconfig. If running everything on one machine, most likely 127.0.0.1 | 127.0.0.1 |
-    | MILVUS_PORT | Port of Milvus. | 19530 |
-    | VECTOR_DIMENSION | Dimension of the vectors. | 2048 |
-    | MYSQL_HOST | The IP address of Mysql. | 127.0.0.1 |
-    | MYSQL_PORT | Port of Milvus. | 3306 |
-    | DEFAULT_TABLE | The milvus and mysql default collection name. | audio_table |
+    | **Parameter**    | **Description** | **Default setting** |
+    | ---------------- | --------------- | ------------------- |
+    | MILVUS_HOST      | The IP address of Milvus; you can get it with ifconfig. If everything runs on one machine, most likely 127.0.0.1. | 127.0.0.1 |
+    | MILVUS_PORT      | Port of Milvus. | 19530 |
+    | VECTOR_DIMENSION | Dimension of the vectors. | 2048 |
+    | MYSQL_HOST       | The IP address of MySQL. | 127.0.0.1 |
+    | MYSQL_PORT       | Port of MySQL. | 3306 |
+    | DEFAULT_TABLE    | Default collection name used by Milvus and MySQL. | audio_table |
 - Run the code
@@ -102,7 +111,13 @@ Then to start the system server, and it provides HTTP backend services.
     ```bash
     wget -c https://www.openslr.org/resources/82/cn-celeb_v2.tar.gz && tar -xvf cn-celeb_v2.tar.gz
     ```
-    Note: If you want to build a quick demo, you can use ./src/test_main.py:download_audio_data function, it downloads 20 audio files , Subsequent results show this collection as an example
+    **Note**: To build a quick demo, you can use the download_audio_data function in ./src/test_main.py, which downloads 20 audio files. The subsequent results use this collection as an example.
+
+- Prepare model (skip this step if you use the default model)
+    ```bash
+    ## Modify the model configuration parameters. Currently only ecapatdnn_voxceleb12 is supported; more model types will be supported in the future
+    vim ./src/encode.py
+    ```
 - Scripts test (Recommended)
@@ -179,7 +194,7 @@ Then to start the system server, and it provides HTTP backend services.
     Navigate to 127.0.0.1:8068 in your browser to access the front-end interface.
-    Note: If the browser and the service are not on the same machine, then the IP needs to be changed to the IP of the machine where the service is located, and the corresponding API_URL in docker-compose.yaml needs to be changed, and the docker-compose.yaml file needs to be re-executed for the change to take effect.
+    **Note**: If the browser and the service are not on the same machine, replace the IP with the IP of the machine where the service is located, change the corresponding API_URL in docker-compose.yaml, and re-run the docker-compose.yaml file for the change to take effect.
 - Insert data
@@ -218,6 +233,3 @@ Here is a list of pretrained models released by PaddleSpeech :
 | Model | Sample Rate
 | :--- | :---:
 | ecapa_tdnn | 16000
-| panns_cnn6| 32000
-| panns_cnn10| 32000
-| panns_cnn14| 32000
diff --git a/demos/audio_searching/README_cn.md b/demos/audio_searching/README_cn.md
index c851bd0f6..0d0f42a0f 100644
--- a/demos/audio_searching/README_cn.md
+++ b/demos/audio_searching/README_cn.md
@@ -20,16 +20,20 @@
 ### 1. PaddleSpeech 安装
 音频向量的提取需要用到基于 PaddleSpeech 训练的模型,所以请确保在运行之前已经安装了 PaddleSpeech,具体安装步骤,详见[安装文档](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install_cn.md)。
-你可以从 easy,medium,hard 三中方式中选择一种方式安装。
+你可以从 easy,medium,hard 三种方式中选择一种方式安装。
 ### 2. MySQL 和 Milvus 安装
 音频相似性的检索需要用到 Milvus, MySQL 服务。 我们可以通过 [docker-compose.yaml](./docker-compose.yaml) 一键启动这些容器,所以请确保在运行之前已经安装了 [Docker Engine](https://docs.docker.com/engine/install/) 和 [Docker Compose](https://docs.docker.com/compose/install/)。 即
 ```bash
+## 先进入到 audio_searching 目录,如下示例
+cd ~/PaddleSpeech/demos/audio_searching/
+
+## 然后启动容器内的相关服务
 docker-compose -f docker-compose.yaml up -d
 ```
-然后你会看到所有的容器都被创建:
+你会看到所有的容器都被创建:
 ```bash
 Creating network "quick_deploy_app_net" with driver "bridge"
@@ -48,7 +52,7 @@ b2bcf279e599 milvusdb/milvus:v2.0.1 "/tini -- milvus run…" 22 hours ago Up
 d8ef4c84e25c mysql:5.7 "docker-entrypoint.s…" 22 hours ago Up 22 hours 0.0.0.0:3306->3306/tcp, 33060/tcp audio-mysql
 8fb501edb4f3 quay.io/coreos/etcd:v3.5.0 "etcd -advertise-cli…" 22 hours ago Up 22 hours 2379-2380/tcp milvus-etcd
 ffce340b3790 minio/minio:RELEASE.2020-12-03T00-03-10Z "/usr/bin/docker-ent…" 22 hours ago Up 22 hours (healthy) 9000/tcp milvus-minio
-15c84a506754 qingen1/paddlespeech-audio-search-client:2.3 "/bin/bash -c '/usr/…" 22 hours ago Up 22 hours (healthy) 0.0.0.0:8068->80/tcp audio-webclient
+15c84a506754 paddlepaddle/paddlespeech-audio-search-client:2.3 "/bin/bash -c '/usr/…" 22 hours ago Up 22 hours (healthy) 0.0.0.0:8068->80/tcp audio-webclient
 ```
@@ -60,22 +64,27 @@ ffce340b3790 minio/minio:RELEASE.2020-12-03T00-03-10Z "/usr/bin/docker-ent…"
     ```bash
     pip install -r requirements.txt
     ```
-- 修改配置
+- 修改配置(本地运行情况下,一般不用修改,可以跳过该步骤)
     ```bash
+    ## 方法一:修改源码文件
     vim src/config.py
+
+    ## 方法二:修改环境变量,如下所示
+    export MILVUS_HOST=127.0.0.1
+    export MYSQL_HOST=127.0.0.1
     ```
-    请根据实际环境进行修改。 这里列出了一些需要设置的参数,更多信息请参考 [config.py](./src/config.py)
+    这里列出了一些需要设置的参数,更多信息请参考 [config.py](./src/config.py)
-    | **Parameter** | **Description** | **Default setting** |
-    | ---------------- | ----------------------------------------------------- | ------------------- |
-    | MILVUS_HOST | The IP address of Milvus, you can get it by ifconfig. If running everything on one machine, most likely 127.0.0.1 | 127.0.0.1 |
-    | MILVUS_PORT | Port of Milvus. | 19530 |
-    | VECTOR_DIMENSION | Dimension of the vectors. | 2048 |
-    | MYSQL_HOST | The IP address of Mysql. | 127.0.0.1 |
-    | MYSQL_PORT | Port of Milvus. | 3306 |
-    | DEFAULT_TABLE | The milvus and mysql default collection name. | audio_table |
+    | **参数** | **描述** | **默认设置** |
+    | ---------------- | -------------------- | ------------------- |
+    | MILVUS_HOST | Milvus 服务的 IP 地址 | 127.0.0.1 |
+    | MILVUS_PORT | Milvus 服务的端口号 | 19530 |
+    | VECTOR_DIMENSION | 特征向量的维度 | 192 |
+    | MYSQL_HOST | Mysql 服务的 IP 地址 | 127.0.0.1 |
+    | MYSQL_PORT | Mysql 服务的端口号 | 3306 |
+    | DEFAULT_TABLE | 默认存储的表名 | audio_table |
 - 运行程序
@@ -104,7 +113,13 @@ ffce340b3790 minio/minio:RELEASE.2020-12-03T00-03-10Z "/usr/bin/docker-ent…"
     ```bash
     wget -c https://www.openslr.org/resources/82/cn-celeb_v2.tar.gz && tar -xvf cn-celeb_v2.tar.gz
     ```
-    注:如果希望快速搭建 demo,可以采用 ./src/test_main.py:download_audio_data 内部的 20 条音频,另外后续结果展示以该集合为例
+    **注**:如果希望快速搭建 demo,可以采用 ./src/test_main.py:download_audio_data 内部的 20 条音频,另外后续结果展示以该集合为例
+
+- 准备模型(如果使用默认模型,可以跳过此步骤)
+    ```bash
+    ## 修改模型配置参数,目前 model 仅支持 ecapatdnn_voxceleb12,后续将支持多种类型
+    vim ./src/encode.py
+    ```
 - 脚本测试(推荐)
@@ -182,7 +197,7 @@ ffce340b3790 minio/minio:RELEASE.2020-12-03T00-03-10Z "/usr/bin/docker-ent…"
     在浏览器中输入 127.0.0.1:8068 访问前端页面
-    注:如果浏览器和服务不在同一台机器上,那么 IP 需要修改成服务所在的机器 IP,并且 docker-compose.yaml 中相应的 API_URL 也要修改,然后重新执行 docker-compose.yaml 文件,使修改生效。
+    **注**:如果浏览器和服务不在同一台机器上,那么 IP 需要修改成服务所在的机器 IP,并且 docker-compose.yaml 中相应的 API_URL 也要修改,然后重新执行 docker-compose.yaml 文件,使修改生效。
 - 上传音频
@@ -220,6 +235,3 @@ ffce340b3790 minio/minio:RELEASE.2020-12-03T00-03-10Z "/usr/bin/docker-ent…"
 | 模型 | 采样率
 | :--- | :---:
 | ecapa_tdnn| 16000
-| panns_cnn6| 32000
-| panns_cnn10| 32000
-| panns_cnn14| 32000
diff --git a/demos/audio_searching/src/encode.py b/demos/audio_searching/src/encode.py
index 358057841..f67184c29 100644
--- a/demos/audio_searching/src/encode.py
+++ b/demos/audio_searching/src/encode.py
@@ -12,8 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import numpy as np
-from logs import LOGGER

+from logs import LOGGER
 from paddlespeech.cli import VectorExecutor

 vector_executor = VectorExecutor()
@@ -24,7 +24,8 @@ def get_audio_embedding(path):
     Use vpr_inference to generate embedding of audio
     """
     try:
-        embedding = vector_executor(audio_file=path)
+        embedding = vector_executor(
+            audio_file=path, model='ecapatdnn_voxceleb12')
         embedding = embedding / np.linalg.norm(embedding)
         embedding = embedding.tolist()
         return embedding
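
The `get_audio_embedding` change above pins the model name and then scales the embedding to unit length before converting it to a list. A minimal sketch of that normalization step follows. It uses a random 192-dimensional vector as a stand-in for the `VectorExecutor` output (an assumption, so the snippet runs without PaddleSpeech installed), and the helper name `normalize_embedding` and its zero-vector check are hypothetical and not part of the patch.

```python
import numpy as np


def normalize_embedding(embedding):
    """Scale a vector to unit L2 norm and return it as a plain Python list."""
    vector = np.asarray(embedding, dtype=np.float64)
    norm = np.linalg.norm(vector)
    if norm == 0:
        # Assumption: reject an all-zero vector instead of dividing by zero.
        raise ValueError("cannot normalize a zero vector")
    return (vector / norm).tolist()


# Stand-in for the model output: a random 192-dimensional vector.
rng = np.random.default_rng(0)
fake_embedding = rng.standard_normal(192)

unit = normalize_embedding(fake_embedding)
print(len(unit), float(np.linalg.norm(unit)))
```

With unit-length vectors, the inner product of two embeddings equals their cosine similarity, which is convenient when the vector store compares embeddings by inner product.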