From 2c5121c53299089e15f87c51e5f6808c4ed7853e Mon Sep 17 00:00:00 2001
From: qingen
Date: Tue, 26 Apr 2022 14:35:03 +0800
Subject: [PATCH 1/3] [vec] update readme, test=doc

---
 examples/ami/README.md     |  2 +-
 examples/ami/sd0/README.md | 18 +++++++++++++++++-
 examples/ami/sd0/run.sh    | 12 ------------
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/examples/ami/README.md b/examples/ami/README.md
index a038eaebe..adc9dc4b0 100644
--- a/examples/ami/README.md
+++ b/examples/ami/README.md
@@ -1,3 +1,3 @@
 # Speaker Diarization on AMI corpus
-* sd0 - speaker diarization by AHC,SC base on x-vectors
+* sd0 - speaker diarization by AHC and SC, based on embeddings

diff --git a/examples/ami/sd0/README.md b/examples/ami/sd0/README.md
index ffe95741a..e9ecc2854 100644
--- a/examples/ami/sd0/README.md
+++ b/examples/ami/sd0/README.md
@@ -7,7 +7,23 @@
 The script performs diarization using x-vectors(TDNN,ECAPA-TDNN) on the AMI mix-headset data. We demonstrate the use of different clustering methods: AHC, spectral.
 
 ## How to Run
+### Prepare the annotations and audio
+Download the AMI corpus. You need around 10 GB of free space to get the whole dataset.
+The signals are too large to package in this way, so you need to use the chooser to indicate which ones you wish to download.
+
+```bash
+## download annotations
+wget http://groups.inf.ed.ac.uk/ami/AMICorpusAnnotations/ami_public_manual_1.6.2.zip && unzip ami_public_manual_1.6.2.zip
+```
+
+Then follow https://groups.inf.ed.ac.uk/ami/download/ to download the signals:
+1) Select one or more AMI meetings: for the meeting IDs, refer to ./ami_split.py
+2) Select media streams: just select Headset mix
+
+### Start running
 Use the following command to run diarization on AMI corpus.
-`bash ./run.sh`
+```bash
+./run.sh --data_folder ./amicorpus --manual_annot_folder ./ami_public_manual_1.6.2
+```
 ## Results (DER)
 coming soon! :)

diff --git a/examples/ami/sd0/run.sh b/examples/ami/sd0/run.sh
index 9035f5955..1fcec269d 100644
--- a/examples/ami/sd0/run.sh
+++ b/examples/ami/sd0/run.sh
@@ -17,18 +17,6 @@ device=gpu
 
 . ${MAIN_ROOT}/utils/parse_options.sh || exit 1;
 
-if [ $stage -le 0 ]; then
-  # Prepare data
-  # Download AMI corpus, You need around 10GB of free space to get whole data
-  # The signals are too large to package in this way,
-  # so you need to use the chooser to indicate which ones you wish to download
-  echo "Please follow https://groups.inf.ed.ac.uk/ami/download/ to download the data."
-  echo "Annotations: AMI manual annotations v1.6.2 "
-  echo "Signals: "
-  echo "1) Select one or more AMI meetings: the IDs please follow ./ami_split.py"
-  echo "2) Select media streams: Just select Headset mix"
-fi
-
 if [ $stage -le 1 ]; then
   # Download the pretrained model
   wget https://paddlespeech.bj.bcebos.com/vector/voxceleb/sv0_ecapa_tdnn_voxceleb12_ckpt_0_1_1.tar.gz

From 758d5fc5e22cd55960e7fa2e13e7ef10deeadfa4 Mon Sep 17 00:00:00 2001
From: qingen
Date: Tue, 17 May 2022 15:35:13 +0800
Subject: [PATCH 2/3] [vec][doc] add ppvpr doc, test=doc

---
 docs/source/vpr/PPVPR.md    | 79 ++++++++++++++++++++++++++++++++++++
 docs/source/vpr/PPVPR_cn.md | 80 +++++++++++++++++++++++++++++++++++++
 2 files changed, 159 insertions(+)
 create mode 100644 docs/source/vpr/PPVPR.md
 create mode 100644 docs/source/vpr/PPVPR_cn.md

diff --git a/docs/source/vpr/PPVPR.md b/docs/source/vpr/PPVPR.md
new file mode 100644
index 000000000..2c0ed8f54
--- /dev/null
+++ b/docs/source/vpr/PPVPR.md
@@ -0,0 +1,79 @@
+([简体中文](./PPVPR_cn.md)|English)
+# PP-VPR
+
+## Catalogue
+- [1. Introduction](#1)
+- [2. Characteristics](#2)
+- [3. Tutorials](#3)
+  - [3.1 Pre-trained Models](#31)
+  - [3.2 Training](#32)
+  - [3.3 Inference](#33)
+  - [3.4 Service Deployment](#34)
+- [4. Quick Start](#4)
+
+## 1. Introduction
+
+PP-VPR is a tool that provides voiceprint feature extraction and retrieval functions.
+It provides a variety of quasi-industrial solutions that make it easy to solve difficult problems in complex scenarios, and it supports model inference from the command line. PP-VPR also supports GUI-based operation and containerized deployment.
+
+## 2. Characteristics
+The basic process of VPR is shown in the figure below:
+
+The main characteristics of PP-VPR are shown below:
+- Provides pre-trained models on the open-source English dataset VoxCeleb. The models include ecapa-tdnn.
+- Complete quasi-industrial solutions, including labelless training, cross-domain adaptive, super-large scale speaker training, data long tail problem solving, etc.
+- Support model training/evaluation.
+- Support model inference using the command line. You can run `paddlespeech vector --task spk --input xxx.wav` to do inference with the pre-trained model.
+- Support GUI-based operation and containerized deployment.
+
+## 3. Tutorials
+
+## 3.1 Pre-trained Models
+The list of supported pre-trained models: [released_model](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/released_model.md).
+For more information about model design, you can refer to the AIStudio tutorial:
+- [ecapa-tdnn](https://aistudio.baidu.com/aistudio/projectdetail/4027664)
+
+## 3.2 Training
+The reference scripts for model training are stored in [examples](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples) and organized as `examples/dataset/model`. The dataset mainly supports VoxCeleb, and the model supports ecapa-tdnn.
+The specific steps for executing the scripts are recorded in `run.sh`.
+
+For more information, you can refer to [sv0](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/voxceleb/sv0)
+
+## 3.3 Inference
+
+After installing `paddlespeech` via `pip install paddlespeech`, PP-VPR supports running `paddlespeech vector --task spk --input xxx.wav` to do inference with the pre-trained model.
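For the `spk` task, the command extracts a fixed-dimensional speaker embedding from the input audio; two utterances can then be compared by the cosine similarity of their embeddings. A minimal sketch of that scoring step in plain Python (an illustration of the idea only, not PaddleSpeech's actual implementation; the toy vectors below are made up):

```python
import math

def cosine_score(emb1, emb2):
    """Cosine similarity between two speaker embeddings (plain lists of floats)."""
    dot = sum(a * b for a, b in zip(emb1, emb2))
    norm1 = math.sqrt(sum(a * a for a in emb1))
    norm2 = math.sqrt(sum(b * b for b in emb2))
    return dot / (norm1 * norm2)

# Toy 3-dimensional vectors for illustration; real ECAPA-TDNN embeddings
# have a fixed, much higher dimensionality.
same = cosine_score([0.9, 0.1, 0.2], [0.8, 0.15, 0.25])
diff = cosine_score([0.9, 0.1, 0.2], [-0.5, 0.7, 0.1])
print(same > diff)  # a higher score suggests the same speaker
```

In practice, a threshold tuned on a development set decides whether two utterances belong to the same speaker.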
+Specific supported functions include:
+
+- Prediction on a single audio file
+- Scoring the similarity between two audio files
+- RTF calculation
+
+For specific usage, please refer to: [speaker_verification](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speaker_verification/README_cn.md)
+
+## 3.4 Service Deployment
+
+PP-VPR supports Docker containerized service deployment, with high-performance database building and retrieval through Milvus and MySQL.
+
+Demo of the VPR server: [audio_searching](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/audio_searching)
+
+![arch](https://ai-studio-static-online.cdn.bcebos.com/7b32dd0200084866863095677e8b40d3b725b867d2e6439e9cf21514e235dfd5)
+
+For more information about service deployment, you can refer to the AIStudio tutorial:
+- [speaker_recognition](https://aistudio.baidu.com/aistudio/projectdetail/4027664)
+
+## 4. Quick Start
+
+To use PP-VPR, see [install](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install_cn.md), which provides three ways to install `paddlespeech`: **Easy**, **Medium** and **Hard**. If you just want to try the inference function of paddlespeech, you can use the **Easy** installation method.

diff --git a/docs/source/vpr/PPVPR_cn.md b/docs/source/vpr/PPVPR_cn.md
new file mode 100644
index 000000000..87e8897f2
--- /dev/null
+++ b/docs/source/vpr/PPVPR_cn.md
@@ -0,0 +1,80 @@
+(简体中文|[English](./PPVPR.md))
+# PP-VPR
+
+## 目录
+- [1. 简介](#1)
+- [2. 特点](#2)
+- [3. 使用教程](#3)
+  - [3.1 预训练模型](#31)
+  - [3.2 模型训练](#32)
+  - [3.3 模型推理](#33)
+  - [3.4 服务部署](#34)
+- [4. 快速开始](#4)
+
+## 1. 简介
+
+PP-VPR 是一个提供声纹特征提取、检索功能的工具。它提供了多种准工业化的方案,轻松解决复杂场景中的难题,并支持使用命令行的方式进行模型推理。PP-VPR 也支持界面化的操作、容器化的部署。
+
+## 2. 特点
+VPR 的基本流程如下图所示:
+
+PP-VPR 的主要特点如下:
+- 提供在英文开源数据集 VoxCeleb(英文)上的预训练模型,ecapa-tdnn。
+- 完备的准工业化方案,包括无标签训练,跨域自适应,超大规模说话人训练,解决数据长尾问题等。
+- 支持模型训练评估功能。
+- 支持命令行方式的模型推理,可使用 `paddlespeech vector --task spk --input xxx.wav` 方式调用预训练模型进行推理。
+- 支持 VPR 的服务容器化部署,界面化操作。
+
+## 3. 使用教程
+
+## 3.1 预训练模型
+支持的预训练模型列表:[released_model](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/released_model.md)。
+更多关于模型设计的部分,可以参考 AIStudio 教程:
+- [ecapa-tdnn](https://aistudio.baidu.com/aistudio/projectdetail/4027664)
+
+## 3.2 模型训练
+
+模型训练的参考脚本存放在 [examples](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples) 中,并按照 `examples/数据集/模型` 存放,数据集主要支持 VoxCeleb,模型支持 ecapa-tdnn。
+具体的执行脚本的步骤记录在 `run.sh` 当中。具体可参考:[sv0](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/examples/voxceleb/sv0)
+
+## 3.3 模型推理
+
+PP-VPR 支持在使用 `pip install paddlespeech` 安装后,使用命令行的方式调用预训练模型进行推理。
+
+具体支持的功能包括:
+
+- 对单条音频进行预测
+- 对两条音频进行相似度打分
+- 支持 RTF 的计算
+
+具体的使用方式可以参考:[speaker_verification](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speaker_verification/README_cn.md)
+
+## 3.4 服务部署
+
+PP-VPR 支持 Docker 容器化服务部署,通过 Milvus、MySQL 进行高性能建库检索。
+
+VPR server 的 demo:[audio_searching](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos/audio_searching)
+
+![arch](https://ai-studio-static-online.cdn.bcebos.com/7b32dd0200084866863095677e8b40d3b725b867d2e6439e9cf21514e235dfd5)
+
+关于服务部署方面的更多资料,可以参考 AIStudio 教程:
+- [speaker_recognition](https://aistudio.baidu.com/aistudio/projectdetail/4027664)
+
+## 4. 快速开始
+
+关于如何使用 PP-VPR,可以看这里的 [install](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install_cn.md),其中提供了 **简单**、**中等**、**困难** 三种安装方式。如果想体验 paddlespeech 的推理功能,可以用 **简单** 安装方式。

From 1fd9430737839f47d14b0de5619c43f48b57ad34 Mon Sep 17 00:00:00 2001
From: qingen
Date: Wed, 18 May 2022 11:16:40 +0800
Subject: [PATCH 3/3] [vec][doc] add ppvpr doc, test=doc

---
 docs/source/vpr/PPVPR.md    | 1 -
 docs/source/vpr/PPVPR_cn.md | 1 -
 2 files changed, 2 deletions(-)

diff --git a/docs/source/vpr/PPVPR.md b/docs/source/vpr/PPVPR.md
index 2c0ed8f54..a87dd621b 100644
--- a/docs/source/vpr/PPVPR.md
+++ b/docs/source/vpr/PPVPR.md
@@ -24,7 +24,6 @@ The basic process of VPR is shown in the figure below:
 The main characteristics of PP-VPR are shown below:
 - Provides pre-trained models on the open-source English dataset VoxCeleb. The models include ecapa-tdnn.
-- Complete quasi-industrial solutions, including labelless training, cross-domain adaptive, super-large scale speaker training, data long tail problem solving, etc.
 - Support model training/evaluation.
 - Support model inference using the command line. You can run `paddlespeech vector --task spk --input xxx.wav` to do inference with the pre-trained model.

diff --git a/docs/source/vpr/PPVPR_cn.md b/docs/source/vpr/PPVPR_cn.md
index 87e8897f2..f0e562d1e 100644
--- a/docs/source/vpr/PPVPR_cn.md
+++ b/docs/source/vpr/PPVPR_cn.md
@@ -24,7 +24,6 @@ VPR 的基本流程如下图所示:
 PP-VPR 的主要特点如下:
 - 提供在英文开源数据集 VoxCeleb(英文)上的预训练模型,ecapa-tdnn。
-- 完备的准工业化方案,包括无标签训练,跨域自适应,超大规模说话人训练,解决数据长尾问题等。
 - 支持模型训练评估功能。
 - 支持命令行方式的模型推理,可使用 `paddlespeech vector --task spk --input xxx.wav` 方式调用预训练模型进行推理。
 - 支持 VPR 的服务容器化部署,界面化操作。