2018 | Original Paper | Book Chapter

Hierarchy of Alternating Specialists for Scene Recognition

Authors: Hyo Jin Kim, Jan-Michael Frahm

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing

Abstract

We introduce a method for improving convolutional neural networks (CNNs) for scene classification. We present a hierarchy of specialist networks that disentangles intra-class variation and inter-class similarity in a coarse-to-fine manner. Our key insight is that each subset within a class is often associated with a different type of inter-class similarity. This suggests that existing network-of-experts approaches, which organize classes into coarse categories, are suboptimal. In contrast, we group images based on high-level appearance features rather than their class membership and dedicate a specialist model to each group. In addition, we propose an alternating architecture with a global ordered representation and a global orderless representation to account for both the coarse layout of the scene and its transient objects. We demonstrate that this leads to better performance than using either a single type of representation or fused features. We also introduce a mini-batch soft k-means that allows end-to-end fine-tuning, as well as a novel routing function for assigning images to specialists. Experimental results show that the proposed approach achieves a significant improvement over baselines, including existing tree-structured CNNs with class-based grouping.
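
To make the mini-batch soft k-means idea more concrete, here is a minimal sketch in plain Python. It is not the authors' formulation, which the abstract does not spell out. It assumes a generic soft k-means in which responsibilities are a softmax over negative squared distances, scaled by an assumed stiffness parameter `beta`, and in which centroids follow a running weighted mean across mini-batches. The function names, the `mass` accumulator, and the toy 2-D data are all hypothetical.

```python
import math
import random


def soft_assign(x, centroids, beta=1.0):
    """Return soft responsibilities of point x over all centroids.

    beta is an assumed stiffness (inverse temperature): larger values
    push the assignment toward hard k-means.
    """
    # Logit per centroid: negative scaled squared Euclidean distance.
    logits = [-beta * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
              for c in centroids]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def minibatch_soft_kmeans_step(batch, centroids, mass, beta=1.0):
    """Update centroids in place from one mini-batch.

    mass[j] accumulates the soft weight seen so far by centroid j, so
    each centroid tracks a running weighted mean of its points.
    Returns the responsibilities computed for this batch.
    """
    dim = len(centroids[0])
    # Responsibilities are computed once, before any centroid moves.
    resp = [soft_assign(x, centroids, beta) for x in batch]
    for j in range(len(centroids)):
        w = sum(r[j] for r in resp)
        if w == 0.0:
            continue  # no point carries weight for this centroid
        mean = [sum(r[j] * x[d] for r, x in zip(resp, batch)) / w
                for d in range(dim)]
        mass[j] += w
        step = w / mass[j]  # step size shrinks as evidence accumulates
        centroids[j] = [c + step * (mu - c)
                        for c, mu in zip(centroids[j], mean)]
    return resp


# Toy usage: two synthetic 2-D groups standing in for appearance features.
random.seed(0)


def sample(center, std=0.5):
    return [random.gauss(c, std) for c in center]


centroids = [[1.0, 1.0], [4.0, 4.0]]
mass = [0.0, 0.0]
for _ in range(20):
    batch = ([sample((0.0, 0.0)) for _ in range(16)]
             + [sample((5.0, 5.0)) for _ in range(16)])
    resp = minibatch_soft_kmeans_step(batch, centroids, mass)

# Stand-in routing: send each item to its highest-responsibility group.
routes = [max(range(len(r)), key=r.__getitem__) for r in resp]
print(centroids)
```

Because the assignments are a smooth function of the distances, a layer built this way is differentiable, which is what would allow clustering to take part in end-to-end fine-tuning. The argmax routing on the last lines is only a placeholder; the paper's routing function is not described in the abstract.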

Title: Hierarchy of Alternating Specialists for Scene Recognition
Authors: Hyo Jin Kim, Jan-Michael Frahm
Publisher: Springer International Publishing
Book: Computer Vision – ECCV 2018
Print ISBN: 978-3-030-01251-9
Electronic ISBN: 978-3-030-01252-6
Copyright year: 2018
DOI: https://doi.org/10.1007/978-3-030-01252-6_28
