International Journal of Machine Learning and Cybernetics

12-04-2022 | Original Article

Exploring correlation of relationship reasoning for scene graph generation

Authors: Peng Tian, Hongwei Mo, Laihao Jiang

Published in: International Journal of Machine Learning and Cybernetics | Issue 9/2022

Abstract

Accurately reasoning about the relationships between objects plays a central role in scene understanding. Due to the complexity of modeling visual relationships and the unbalanced distribution of relationship types, the results obtained by existing methods are far from satisfactory. In this work, we find that the interplay between the contextual information of object pairs and their relationships can effectively regularize the space of visual relationship types and thereby improve the accuracy of relationship reasoning. To this end, we incorporate this interplay into deep neural networks to facilitate scene graph generation by developing a Relationship Reasoning Network (ReRN). Specifically, the model uses a feature updating structure to mutually connect and iteratively update the semantic features of objects and relationships, exploring the contextual information between objects. A graph attention mechanism is then used to obtain the correlation information between object pairs and their relationships. Finally, our model uses this correlation information to facilitate the recognition of interactions between objects, while leveraging the mutual connection and joint refinement of different semantic features to improve the accuracy of scene graph generation. Extensive experiments on the Visual Genome dataset demonstrate that our method outperforms other state-of-the-art methods.
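
As a rough illustration of the kind of attention step the abstract mentions, the sketch below weights a set of neighbor features by their similarity to a query feature and aggregates them. It is a generic dot-product attention written for this page, not the ReRN formulation: the function names (`softmax`, `attend`), the dot-product scoring, and the toy two-dimensional features are all assumptions, since the abstract does not specify how the correlation is computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, neighbors):
    """Return (weights, context) for one query feature.

    weights: softmax of the dot products between the query and each neighbor.
    context: the neighbors' weighted sum, one value per feature dimension.
    """
    scores = [sum(q * k for q, k in zip(query, n)) for n in neighbors]
    weights = softmax(scores)
    context = [
        sum(w * n[d] for w, n in zip(weights, neighbors))
        for d in range(len(query))
    ]
    return weights, context


# Hypothetical toy features: one relationship node attending to the two
# objects of its pair (subject and object).
relationship = [1.0, 0.0]
object_pair = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attend(relationship, object_pair)
```

Here the first object is more similar to the relationship feature, so it receives the larger weight and dominates the aggregated context vector. A learned model would replace the raw dot product with trainable projections.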

Title: Exploring correlation of relationship reasoning for scene graph generation
Authors: Peng Tian, Hongwei Mo, Laihao Jiang
Publication date: 12-04-2022
Publisher: Springer Berlin Heidelberg
Published in: International Journal of Machine Learning and Cybernetics / Issue 9/2022
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-022-01538-2
