20 December 2019 | Review

Convolutional neural network: a review of models, methodologies and applications to object detection

Authors: Anamika Dhillon, Gyanendra K. Verma

Published in: Progress in Artificial Intelligence | Issue 2/2020

Abstract

Deep learning has emerged as an effective machine learning method that learns multiple layers of features or representations of the data and provides state-of-the-art results. It has shown impressive performance in various application areas, particularly in image classification, segmentation and object detection. Recent advances in deep learning techniques have brought encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories. This task is extremely challenging due to high intra-class and low inter-class variance. In this paper, we provide a detailed review of various deep architectures and models, highlighting the characteristics of each model. First, we describe the functioning of CNN architectures and their components, followed by a detailed description of various CNN models, from the classical LeNet model to AlexNet, ZFNet, GoogleNet, VGGNet, ResNet, ResNeXt, SENet, DenseNet, Xception and PNAS/ENAS. We mainly focus on the application of deep learning architectures to three major applications, namely (i) wild animal detection, (ii) small arms detection and (iii) human detection. For each model, a detailed summary covering the system, database, application and claimed accuracy is also provided to serve as a guideline for future work in these application areas.

Title: Convolutional neural network: a review of models, methodologies and applications to object detection
Authors: Anamika Dhillon, Gyanendra K. Verma
Publication date: 20 December 2019
Publisher: Springer Berlin Heidelberg
Published in: Progress in Artificial Intelligence, Issue 2/2020
Print ISSN: 2192-6352
Electronic ISSN: 2192-6360
DOI: https://doi.org/10.1007/s13748-019-00203-0
