diff --git a/examples/timit/README.md b/examples/timit/README.md
index 51fcfd57c..4f376cc38 100644
--- a/examples/timit/README.md
+++ b/examples/timit/README.md
@@ -1,7 +1,3 @@
 # TIMIT
 
-asr model with phone unit
-
-* asr0 - deepspeech2 Streaming/Non-Streaming
-* asr1 - transformer/conformer Streaming/Non-Streaming
-* asr2 - transformer/conformer Streaming/Non-Streaming with Kaldi feature
+* asr1 - transformer Streaming/Non-Streaming
diff --git a/examples/timit/asr1/README.md b/examples/timit/asr1/README.md
index 99bda6691..b725894cd 100644
--- a/examples/timit/asr1/README.md
+++ b/examples/timit/asr1/README.md
@@ -1,5 +1,5 @@
-# Transformer/Conformer ASR with Timit
-The phoneme-based continuous speech corpus is a collaboration between Texas Instruments, MIT, and SRI International. The [Timit](https://catalog.ldc.upenn.edu/docs/LDC93S1/) dataset has a voice sampling frequency of 16 khz and contains a total of 6,300 sentences, with 630 people from eight major U.S. dialects speaking a given 10 sentences each, all sentences are manually segmented and marked at the phone level. Seventy percent of the speakers are male; most of the speakers are white adults.
+# Transformer ASR with Timit
+The phoneme-based continuous speech corpus is a collaboration between Texas Instruments, MIT, and SRI International. The [Timit](https://catalog.ldc.upenn.edu/docs/LDC93S1/) dataset has a voice sampling frequency of 16 khz and contains a total of 6,300 sentences, with 630 people from 8 major U.S. dialects speaking a given 10 sentences each, all sentences are manually segmented and marked at the phone level. Seventy percent of the speakers are male; most of the speakers are white adults.
 
 ## Dataset
 ### Download and Extract