Published in: International Journal of Computer Vision | Issue 8-9/2020

30.07.2020

Multi-task Compositional Network for Visual Relationship Detection

Authors: Yibing Zhan, Jun Yu, Ting Yu, Dacheng Tao

Abstract

Previous methods treat visual relationship detection as a combination of object detection and predicate detection. However, a natural image may contain hundreds of objects and thousands of object pairs, so relying on object detection and predicate detection alone is insufficient for effective visual relationship detection: the few significant relationships are easily overwhelmed by the far more numerous less-significant ones. In this paper, we propose a novel subtask for visual relationship detection, significance detection, as a complement to object detection and predicate detection. Significance detection refers to the task of identifying object pairs with significant relationships. We further propose a novel multi-task compositional network (MCN) that simultaneously performs object detection, predicate detection, and significance detection. MCN consists of three modules: an object detector, a relationship generator, and a relationship predictor. The object detector detects objects, the relationship generator provides useful candidate relationships, and the relationship predictor produces significance scores and predicts predicates. Furthermore, MCN adopts a multimodal feature fusion strategy based on visual, spatial, and label features and a novel correlated loss function to deeply combine object detection, predicate detection, and significance detection. MCN is validated on two datasets: the Visual Relationship Detection dataset and the Visual Genome dataset. Experimental comparisons with state-of-the-art methods verify the competitiveness of MCN and the usefulness of significance detection for visual relationship detection.
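To make the three-subtask decomposition concrete, the sketch below illustrates how a relationship-predictor stage of the kind described in the abstract could look. This is a minimal, hypothetical PyTorch illustration, not the authors' implementation (see the source-code link in the footnotes): the fusion-by-concatenation, layer sizes, predicate count, and the final significance-weighted ranking step are all illustrative assumptions.

```python
# Hypothetical sketch of a relationship-predictor stage: multimodal fusion of
# visual, spatial, and label features, followed by a significance head and a
# predicate head. Not the authors' implementation (https://github.com/Atmegal/MCN).
import torch
import torch.nn as nn


class RelationshipPredictor(nn.Module):
    """Jointly scores significance and predicts predicates for candidate object pairs."""

    def __init__(self, visual_dim=512, spatial_dim=64, label_dim=300,
                 hidden_dim=256, num_predicates=70):
        super().__init__()
        # Multimodal fusion: plain concatenation + MLP stands in here for the
        # paper's fusion strategy over visual, spatial, and label features.
        self.fuse = nn.Sequential(
            nn.Linear(visual_dim + spatial_dim + label_dim, hidden_dim),
            nn.ReLU(),
        )
        self.significance_head = nn.Linear(hidden_dim, 1)            # significance detection
        self.predicate_head = nn.Linear(hidden_dim, num_predicates)  # predicate detection

    def forward(self, visual_feat, spatial_feat, label_feat):
        # Each input: (num_pairs, dim) features for candidate object pairs
        # proposed by the relationship generator.
        h = self.fuse(torch.cat([visual_feat, spatial_feat, label_feat], dim=-1))
        significance = torch.sigmoid(self.significance_head(h)).squeeze(-1)
        predicate_logits = self.predicate_head(h)
        return significance, predicate_logits


if __name__ == "__main__":
    predictor = RelationshipPredictor()
    n_pairs = 8
    sig, logits = predictor(torch.randn(n_pairs, 512),
                            torch.randn(n_pairs, 64),
                            torch.randn(n_pairs, 300))
    # A plausible ranking step: weight predicate scores by significance so that
    # less-significant pairs do not overwhelm the final relationship ranking.
    scores = sig.unsqueeze(-1) * torch.softmax(logits, dim=-1)
    print(scores.shape)  # torch.Size([8, 70])
```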


Footnotes
1
The source code is provided at https://github.com/Atmegal/MCN.
 
Metadata
Title
Multi-task Compositional Network for Visual Relationship Detection
Authors
Yibing Zhan
Jun Yu
Ting Yu
Dacheng Tao
Publication date
30.07.2020
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 8-9/2020
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-020-01353-8
