Abstract
Fire is one of the most dangerous disasters threatening human life and property worldwide. To reduce fire losses, research on video analysis for early smoke detection has become increasingly important. However, extracting stable features for smoke recognition remains challenging, largely because smoke varies in color, shape, and texture. Classical convolutional neural networks can automatically learn appearance features from a single frame but fail to capture motion information between frames. To address this issue, we propose a spatial-temporal convolutional neural network for video smoke detection. For real-time detection, we further propose an enhanced architecture that uses a multitask learning strategy to jointly recognize smoke and estimate optical flow, capturing intra-frame appearance features and inter-frame motion features simultaneously. Experiments on our self-created dataset validate the effectiveness and efficiency of the proposed method, which achieves a 97.0% detection rate and a 3.5% false alarm rate with a processing time of 5 ms per frame, clearly outperforming existing methods.
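As a rough illustration of the multitask idea described above, the sketch below combines a smoke classification loss with an optical flow loss into a single training objective. The abstract does not state the loss functions or their weighting, so everything here is an assumption: cross-entropy for classification, mean endpoint error for flow, and the names `cross_entropy`, `endpoint_error`, `multitask_loss`, and `flow_weight` are all hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class (0 = no smoke, 1 = smoke)."""
    return -math.log(max(probs[label], 1e-12))

def endpoint_error(pred_flow, true_flow):
    """Mean Euclidean distance between predicted and reference flow vectors."""
    dists = [math.hypot(pu - tu, pv - tv)
             for (pu, pv), (tu, tv) in zip(pred_flow, true_flow)]
    return sum(dists) / len(dists)

def multitask_loss(probs, label, pred_flow, true_flow, flow_weight=0.5):
    """Weighted sum of the two task losses.

    flow_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    return (cross_entropy(probs, label)
            + flow_weight * endpoint_error(pred_flow, true_flow))

# Toy example: one frame pair labeled as smoke, with two flow vectors (u, v).
loss = multitask_loss(
    probs=[0.2, 0.8], label=1,
    pred_flow=[(1.0, 0.0), (0.0, 2.0)],
    true_flow=[(1.0, 0.0), (3.0, 6.0)],
)
```

Minimizing a joint objective of this kind pushes shared layers to encode both appearance and motion cues, which is the general motivation for multitask training; the actual architecture and losses may differ.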
Acknowledgements
The authors would like to thank the editor and the anonymous reviewers for their valuable comments and constructive suggestions. This work was supported by the National Key Science & Technology Pillar Program of China (No. 2014BAG01B03), the National Natural Science Foundation of China (No. 61374194), Key Research and Development Program of Jiangsu Province (No. BE2016739), and a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.
About this article
Cite this article
Hu, Y., Lu, X. Real-time video fire smoke detection by utilizing spatial-temporal ConvNet features. Multimed Tools Appl 77, 29283–29301 (2018). https://doi.org/10.1007/s11042-018-5978-5
DOI: https://doi.org/10.1007/s11042-018-5978-5