# [Aidatatang_200zh](http://openslr.elda.org/62/)

Aidatatang_200zh is a free Mandarin Chinese speech corpus provided by Beijing DataTang Technology Co., Ltd under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License.

The contents of the corpus and their descriptions are as follows:

* The corpus contains 200 hours of acoustic data, most of it recorded on mobile devices.
* 600 speakers from different accent areas in China took part in the recording.
* The transcription accuracy of each sentence is above 98%.
* Recordings were made in a quiet indoor environment.
* The database is divided into training, validation, and test sets in a ratio of 7:1:2.
* Detailed information such as speech data coding and speaker information is preserved in the metadata file.
* Segmented transcripts are also provided.

The corpus aims to support researchers in speech recognition, machine translation, voiceprint recognition, and other speech-related fields. It is therefore completely free for academic use.
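
As a rough illustration of the 7:1:2 split, the sketch below works out the nominal size of each subset. The description does not say whether the ratio applies to hours, speakers, or utterances, so applying it to the 200 hours and the 600 speakers is an assumption, and `split_by_ratio` is a hypothetical helper, not part of the corpus tooling. The actual subset sizes should be read from the distributed data.

```python
def split_by_ratio(total, ratio=(7, 1, 2)):
    """Divide `total` into shares proportional to `ratio` (integer division)."""
    parts = sum(ratio)
    return tuple(total * r // parts for r in ratio)


# Nominal figures taken from the corpus description above.
# Assumption: the 7:1:2 ratio is applied to total hours.
train_h, dev_h, test_h = split_by_ratio(200)
print(train_h, dev_h, test_h)  # 140 20 40

# Assumption: the same ratio applied to the speaker count instead.
train_s, dev_s, test_s = split_by_ratio(600)
print(train_s, dev_s, test_s)  # 420 60 120
```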