Action Recognition in the Dark Dataset (ARID Dataset)

Action recognition in dark videos is useful in various scenarios, e.g., night surveillance and self-driving at night. Although progress has been made in action recognition for videos under normal illumination, few works have studied action recognition in the dark. This is partly due to the lack of sufficient datasets for the task. In this paper, we explore the task of action recognition in dark videos. We address the lack of data for this task by collecting a new dataset: the Action Recognition in the Dark (ARID) dataset. It consists of over 3,780 video clips with 11 action categories. To the best of our knowledge, it is the first dataset focused on human actions in dark videos. To gain further understanding of ARID, we analyze the dataset in detail and show its necessity over synthetic dark videos. Additionally, we benchmark the performance of several current action recognition models on our dataset and explore potential methods for improving their performance. Our results show that current action recognition models and frame enhancement methods may not be effective solutions for action recognition in dark videos.

Basic Statistics

The distribution of clips among the 11 classes is as follows:

[Figure: distribution of clips among the 11 action classes]

Comparisons with HMDB51(-dark)

We compare the ARID dataset statistically with HMDB51 and HMDB51-dark. The results and sampled frames are shown below:

[Figures: statistical comparison and sampled frames for ARID and HMDB51/HMDB51-dark]

Benchmark Results

Benchmark results of existing action recognition models on ARID (across three splits):

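| Method          | Top-1 Accuracy | Top-5 Accuracy |
|-----------------|----------------|----------------|
| VGG-Two Stream  | 32.08%         | 90.76%         |
| TSN             | 57.96%         | 94.17%         |
| C3D             | 40.34%         | 94.17%         |
| I3D-RGB         | 54.68%         | 97.69%         |
| I3D-Two Stream  | 72.78%         | 99.39%         |
| 3D-ResNet (50)  | 71.08%         | 99.39%         |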
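
For readers unfamiliar with the two metrics, the following is a minimal sketch of how top-1 and top-5 accuracy are commonly computed from per-clip class scores. It is not the evaluation code used for the benchmark; the function names `top_k_accuracy` and `mean_over_splits`, the toy scores, and the assumption that per-split results are combined by a simple mean are all illustrative.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Fraction of clips whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one clip is required")
    hits = 0
    for clip_scores, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        ranked = sorted(
            range(len(clip_scores)), key=lambda c: clip_scores[c], reverse=True
        )
        hits += label in ranked[:k]
    return hits / len(labels)


def mean_over_splits(per_split: Sequence[float]) -> float:
    """Simple mean of a metric over splits (assumed aggregation, see lead-in)."""
    return sum(per_split) / len(per_split)


# Toy example with 3 classes and 4 clips (hypothetical scores, not ARID data).
toy_scores = [
    [0.10, 0.70, 0.20],
    [0.50, 0.30, 0.20],
    [0.20, 0.30, 0.50],
    [0.40, 0.35, 0.25],
]
toy_labels = [1, 1, 2, 2]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
print(f"top-1: {top1:.2%}, top-2: {top2:.2%}")
```

With 11 action classes, a top-5 prediction covers almost half of the label space, which is one reason top-5 accuracy in the table is much higher than top-1 for every model.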

