Action Recognition in Temporally Untrimmed Videos
Automatically recognizing and localizing a large number of action categories in videos in the wild is of significant importance for video understanding and multimedia event detection. The THUMOS workshop and challenge aims at exploring new challenges and approaches for large-scale action recognition with a large number of classes from open-source videos in a realistic setting.
Most existing action recognition datasets are composed of videos that have been manually trimmed to bound the action of interest. This has been identified as a considerable limitation, as it poorly matches how action recognition is applied in practical settings. Therefore, THUMOS 2015 will conduct the challenge on temporally untrimmed videos. Participants may train their methods using trimmed clips but will be required to test their systems on untrimmed data.
A new forward-looking dataset, containing over 430 hours of video data and 45 million frames (70% larger than THUMOS'14), is made available for this challenge with the following components:
All videos are collected from YouTube. We will evaluate the success of the proposed methods based on their performance on the new THUMOS 2015 Dataset in two tasks:
Participants may submit either a notebook paper that briefly describes their system or a research paper detailing their approach. All submission results will be summarized during the workshop and included in the workshop proceedings. Additionally, the top performers will be invited to give oral presentations, and the remaining entries will be encouraged to present their work in the poster session.
For more details, please see the Evaluation Setup document or the released resources.
Please fill out this form in order to receive the password required for unzipping some of the shared data.
Training Data (13320 trimmed videos) -- each includes one action:
- UCF101 videos (zipped folder): [Download] (updated Apr. 08, 2015)
- UCF101 videos (individual files): [Link] (updated Apr. 08, 2015)
- Description of UCF101: [Link]
- List of action classes and their numerical index: [Download](http://crcv.ucf.edu/THUMOS14/Class Index.txt)
Background Data (2980 untrimmed videos) -- each guaranteed not to include any instance of the 101 actions:
- Videos (zipped folder - complete): [Download] (updated Apr. 13, 2015)
- Videos (zipped folder - 25GB splits): [Part0] [Part1] [Part2] [Part3] [Part4] (updated Apr. 08, 2015)
- Videos (individual files): [Link] (updated Apr. 08, 2015)
- Metadata and annotations (primary action class for each video): [Download] (updated Apr. 08, 2015)
Validation Data (2104 untrimmed videos) -- each includes one/multiple instances of one/multiple actions:
- Videos (zipped folder - complete): [Download] (updated Apr. 13, 2015)
- Videos (zipped folder - 25GB splits): [Part0] [Part1] [Part2] [Part3] [Part4] [Part5] [Part6] (updated Apr. 08, 2015)
- Videos (individual files): [Link] (updated Apr. 08, 2015)
- Metadata and class-level annotations (action classes in each video): [Download] (updated Apr. 08, 2015)
- Temporal annotations of actions (videos of [20 classes](http://crcv.ucf.edu/THUMOS14/Class Index_Detection.txt)): [Download] (updated Apr. 08, 2015)
Development Kit (evaluation code & additional software): [Link] (updated Apr. 08, 2015)
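The official evaluation protocol is defined by the development kit above. As an illustrative sketch only (the function names and the greedy matching rule below are our own, not the official code), temporal action detection is commonly scored by matching each predicted segment to an unclaimed ground-truth segment whose temporal intersection-over-union (IoU) exceeds a threshold:

```python
def temporal_iou(seg_a, seg_b):
    """Intersection-over-union of two temporal segments (start, end) in seconds."""
    start_a, end_a = seg_a
    start_b, end_b = seg_b
    inter = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - inter
    return inter / union if union > 0 else 0.0

def match_detections(predictions, ground_truth, iou_threshold=0.5):
    """Greedily mark predictions as true/false positives.

    predictions:  list of (start, end, confidence) tuples for one class
    ground_truth: list of (start, end) tuples for the same class
    Returns one boolean per prediction, in descending confidence order.
    Each ground-truth segment can be claimed by at most one prediction.
    """
    claimed = [False] * len(ground_truth)
    results = []
    for start, end, _score in sorted(predictions, key=lambda p: -p[2]):
        best_iou, best_idx = 0.0, -1
        for i, gt in enumerate(ground_truth):
            iou = temporal_iou((start, end), gt)
            if iou > best_iou and not claimed[i]:
                best_iou, best_idx = iou, i
        if best_iou >= iou_threshold:
            claimed[best_idx] = True
            results.append(True)
        else:
            results.append(False)
    return results
```

From these true/false-positive flags, per-class average precision and the mean over classes (mAP) can be computed; consult the development kit for the exact thresholds and metric used in the challenge.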
Additional Data:
- Class-level attributes: [Download]
- Bounding box annotations of humans (24 classes of UCF101): [Download]
- Evaluation Setup Document: [Download] (updated May 04, 2015)
- Sample submission files: [Temporal Action Detection], [Action