
2021 | OriginalPaper | Chapter

ARID: A New Dataset for Recognizing Action in the Dark

Authors: Yuecong Xu, Jianfei Yang, Haozhi Cao, Kezhi Mao, Jianxiong Yin, Simon See

Published in: Deep Learning for Human Activity Recognition

Publisher: Springer Singapore


Abstract

The task of action recognition in dark videos is useful in various scenarios, e.g., night surveillance and self-driving at night. Though progress has been made on action recognition for videos under normal illumination, few have studied action recognition in the dark, partly due to the lack of sufficient datasets for the task. In this paper, we explore action recognition in dark videos. We bridge the data gap by collecting a new dataset: the Action Recognition in the Dark (ARID) dataset. It consists of 3,784 video clips spanning 11 action categories. To the best of our knowledge, it is the first dataset focused on human actions in dark videos. To gain further understanding of ARID, we analyze the dataset in detail and show its necessity over synthetic dark videos. Additionally, we benchmark current action recognition models on our dataset and explore potential methods for improving their performance. We show that current action recognition models and frame enhancement methods may not be effective solutions for action recognition in dark videos (data available at https://xuyu0010.github.io/arid).
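The abstract mentions frame enhancement methods as one candidate remedy for dark videos. As an illustration only (not the paper's actual pipeline), the sketch below implements two classic enhancement baselines of the kind commonly benchmarked for low-light footage, gamma intensity correction and histogram equalization, in plain NumPy; the function names and parameters here are our own assumptions.

```python
import numpy as np

def gamma_correct(frame: np.ndarray, gamma: float = 0.4) -> np.ndarray:
    """Brighten a dark uint8 frame via gamma intensity correction (gamma < 1)."""
    normalized = frame.astype(np.float32) / 255.0
    return (np.power(normalized, gamma) * 255.0).astype(np.uint8)

def equalize_histogram(frame: np.ndarray) -> np.ndarray:
    """Classic histogram equalization on a single-channel uint8 frame."""
    hist, _ = np.histogram(frame.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum()
    # Ignore empty bins so the lowest occupied intensity maps to 0.
    cdf_masked = np.ma.masked_equal(cdf, 0)
    cdf_scaled = (cdf_masked - cdf_masked.min()) * 255.0 / (cdf_masked.max() - cdf_masked.min())
    lookup = np.ma.filled(cdf_scaled, 0).astype(np.uint8)
    return lookup[frame]

# A synthetic "dark" frame: intensities clustered near zero, as in night footage.
dark = np.random.default_rng(0).integers(0, 40, size=(64, 64), dtype=np.uint8)
bright = gamma_correct(dark)
print(dark.mean(), bright.mean())  # mean intensity rises after correction
```

Such per-frame transforms raise visibility but also amplify sensor noise, which is one reason (as the paper's benchmarks suggest) enhancement alone may not suffice for recognition in the dark.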


Metadata
Title
ARID: A New Dataset for Recognizing Action in the Dark
Authors
Yuecong Xu
Jianfei Yang
Haozhi Cao
Kezhi Mao
Jianxiong Yin
Simon See
Copyright Year
2021
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-16-0575-8_6
