VidTIMIT Audio-Video Dataset (Free)

jsaifc 19 2021-08-24 Speech Recognition

Resource Description


# Context

This dataset is suited to various types of audio-video analysis. **Note: the video files are not included in this upload because of their size; those interested can download them [here][1].**

# Content

The VidTIMIT dataset comprises video and corresponding audio recordings of 35 people reciting short sentences. (The original data covers 43 people, but some of the download links were missing.) It can be useful for research on topics such as automatic lip reading, multi-view face recognition, multi-modal speech recognition and person identification.

The dataset was recorded in 3 sessions, with a mean delay of 7 days between Sessions 1 and 2, and 6 days between Sessions 2 and 3.

The sentences were chosen from the test section of the TIMIT corpus. There are 10 sentences per person. The first six sentences (sorted alphanumerically by filename) are assigned to Session 1. The next two sentences are assigned to Session 2, and the remaining two to Session 3. The first two sentences are the same for all persons; the remaining eight generally differ from person to person.

The corresponding audio is stored as a mono, 16-bit, 32 kHz WAV file.

# Acknowledgements

The VidTIMIT dataset is Copyright © 2001 Conrad Sanderson. For more details, [refer here][2].

# Inspiration

Audio-video data is relatively hard to obtain free of cost (and even at a cost), which is the main reason for uploading this dataset. It can be used to build models for speaker recognition, person verification and much more.

[1]: http://conradsanderson.id.au/vidtimit/
[2]: http://conradsanderson.id.au/vidtimit/
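The session split and the audio format described under Content can be checked with a few lines of Python. This is a minimal sketch, not part of the dataset's own tooling: the function names `assign_sessions` and `matches_expected_format` are hypothetical, and it assumes that a plain lexicographic sort reproduces the "alphanumeric" filename order mentioned above, which may not hold if the original ordering was a natural (number-aware) sort.

```python
import wave
from typing import Dict, List


def assign_sessions(filenames: List[str]) -> Dict[int, List[str]]:
    """Split one speaker's 10 sentence files into the 3 recording sessions.

    Assumption: Python's default string sort matches the dataset's
    "sorted alphanumerically by filename" ordering.
    """
    ordered = sorted(filenames)
    if len(ordered) != 10:
        raise ValueError(f"expected 10 sentence files, got {len(ordered)}")
    # First six -> Session 1, next two -> Session 2, last two -> Session 3.
    return {1: ordered[:6], 2: ordered[6:8], 3: ordered[8:]}


def matches_expected_format(source) -> bool:
    """Return True if a WAV file is mono, 16-bit, 32 kHz.

    `source` may be a path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return (
            wav.getnchannels() == 1        # mono
            and wav.getsampwidth() == 2    # 2 bytes per sample = 16 bit
            and wav.getframerate() == 32000
        )
```

Calling `assign_sessions` on the ten audio filenames of one speaker returns a dictionary keyed by session number, which makes it easy to build, for example, a train split from Session 1 and test splits from Sessions 2 and 3.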

