Merge pull request #1791 from qingen/cluster

[vec] update readme
4 years ago · 5f62c84cb0
parent 0aed9fdf72 648cc5823b
commit 5f62c84cb0
3 changed files with 18 additions and 14 deletions
--- a/examples/ami/README.md
+++ b/examples/ami/README.md
@ -1,3 +1,3 @@
 # Speaker Diarization on AMI corpus

-* sd0 - speaker diarization by AHC,SC base on x-vectors
+* sd0 - speaker diarization by AHC,SC base on embeddings
--- a/examples/ami/sd0/README.md
+++ b/examples/ami/sd0/README.md
@ -7,7 +7,23 @@
 The script performs diarization using x-vectors(TDNN,ECAPA-TDNN) on the AMI mix-headset data. We demonstrate the use of different clustering methods: AHC, spectral.

 ## How to Run
+### prepare annotations and audios
+Download AMI corpus, You need around 10GB of free space to get whole data
+The signals are too large to package in this way, so you need to use the chooser to indicate which ones you wish to download
+
+```bash
+## download  annotations
+wget http://groups.inf.ed.ac.uk/ami/AMICorpusAnnotations/ami_public_manual_1.6.2.zip && unzip ami_public_manual_1.6.2.zip
+```
+
+then please follow https://groups.inf.ed.ac.uk/ami/download/ to download the Signals:
+1) Select one or more AMI meetings: the IDs please follow ./ami_split.py
+2) Select media streams: Just select Headset mix
+
+### start running
 Use the following command to run diarization on AMI corpus.
-`bash ./run.sh` 
+```bash
+./run.sh  --data_folder ./amicorpus  --manual_annot_folder ./ami_public_manual_1.6.2
+```

 ## Results (DER) coming soon! :)
--- a/examples/ami/sd0/run.sh
+++ b/examples/ami/sd0/run.sh
@ -17,18 +17,6 @@ device=gpu

 . ${MAIN_ROOT}/utils/parse_options.sh || exit 1;

-if [ $stage -le 0 ]; then
-    # Prepare data
-    # Download AMI corpus, You need around 10GB of free space to get whole data
-    # The signals are too large to package in this way,
-    # so you need to use the chooser to indicate which ones you wish to download
-    echo "Please follow https://groups.inf.ed.ac.uk/ami/download/ to download the data."
-    echo "Annotations: AMI manual annotations v1.6.2 "
-    echo "Signals: "
-    echo "1) Select one or more AMI meetings: the IDs please follow ./ami_split.py"
-    echo "2) Select media streams: Just select Headset mix"
-fi
-
 if [ $stage -le 1 ]; then
    # Download the pretrained model
    wget https://paddlespeech.bj.bcebos.com/vector/voxceleb/sv0_ecapa_tdnn_voxceleb12_ckpt_0_1_1.tar.gz