Published in: International Journal of Computer Vision 12/2020

24 July 2020

Zero-Shot Object Detection: Joint Recognition and Localization of Novel Concepts

Authors: Shafin Rahman, Salman H. Khan, Fatih Porikli

Abstract

Zero-shot learning (ZSL) identifies unseen objects for which no training images are available. Conventional ZSL approaches are restricted to a recognition setting where each test image is categorized into one of several unseen object classes. We posit that this setting is ill-suited for real-world applications where unseen objects appear only as part of a complete scene, warranting both ‘recognition’ and ‘localization’ of the unseen category. To address this limitation, we introduce a new ‘Zero-Shot Detection’ (ZSD) problem setting, which aims at simultaneously recognizing and locating object instances belonging to novel categories, without any training samples. We introduce an integrated solution to the ZSD problem that jointly models the complex interplay between visual and semantic domain information. Ours is an end-to-end trainable deep network for ZSD that effectively overcomes the noise in the unsupervised semantic descriptions. To this end, we utilize the concept of meta-classes to design an original loss function that achieves synergy between max-margin class separation and semantic domain clustering. To set a benchmark for ZSD, we propose an experimental protocol for the large-scale ILSVRC dataset that adheres to practical challenges, e.g., rare classes are more likely to be the unseen ones. Furthermore, we present a baseline approach extended from conventional recognition to the ZSD setting. Our extensive experiments show a significant boost in performance (in terms of mAP and Recall) on the important yet difficult ZSD problem on the ImageNet detection, MSCOCO and FashionZSD datasets.
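
To make the idea of pairing max-margin class separation with meta-class structure more concrete, here is a minimal sketch in Python. It is not the loss from the paper: the function name `margin_meta_loss`, the `margin` and `same_meta_weight` parameters, and the rule of down-weighting violations that come from classes in the same meta-class are all assumptions made for illustration. The class scores stand in for similarities between a projected region feature and each class's word vector.

```python
def margin_meta_loss(scores, true_idx, meta_of, margin=1.0, same_meta_weight=0.5):
    """Illustrative hinge loss, softened inside the true class's meta-class.

    scores:   similarity of one region to every class embedding (hypothetical input)
    true_idx: index of the ground-truth class
    meta_of:  meta-class id of every class, e.g. from clustering word vectors
    """
    true_score = scores[true_idx]
    loss = 0.0
    for c, score in enumerate(scores):
        if c == true_idx:
            continue
        # Max-margin term: the true class should beat class c by at least `margin`.
        violation = max(0.0, margin - true_score + score)
        # Assumption: semantically close classes (same meta-class) are penalized less,
        # so they are allowed to stay near each other in the embedding space.
        weight = same_meta_weight if meta_of[c] == meta_of[true_idx] else 1.0
        loss += weight * violation
    return loss


scores = [2.0, 1.5, 0.2]   # class 0 is the ground truth
meta_of = [0, 0, 1]        # classes 0 and 1 share a meta-class
loss = margin_meta_loss(scores, true_idx=0, meta_of=meta_of)
```

In this toy case only class 1 violates the margin, and because it shares a meta-class with the true class its violation is halved. The paper's actual formulation may differ in both the form of the margin term and the way meta-classes enter the objective.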

Footnotes
1
Meta-classes are obtained by clustering semantically similar classes.

2
We acknowledge, however, that Recall@100 remains an appropriate measure for large-scale datasets that are not fully labeled (such as Visual Genome; see Sect. 5.5).
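
As a rough illustration of the Recall@100 measure mentioned in footnote 2, the sketch below computes Recall@K for a single image. The function names, the `(score, label, box)` detection format, the IoU threshold of 0.5 and the simplified matching rule (a ground-truth box counts as recalled if any of the top-K detections with the same label overlaps it enough) are assumptions, not details taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_k(detections, ground_truth, k=100, iou_thresh=0.5):
    """Fraction of ground-truth boxes hit by the K highest-scoring detections.

    detections:   list of (score, label, box)
    ground_truth: list of (label, box)
    """
    if not ground_truth:
        return 0.0
    # Keep only the K most confident detections.
    top = sorted(detections, key=lambda d: d[0], reverse=True)[:k]
    hits = 0
    for gt_label, gt_box in ground_truth:
        if any(label == gt_label and iou(box, gt_box) >= iou_thresh
               for _, label, box in top):
            hits += 1
    return hits / len(ground_truth)


gt = [("cat", (0, 0, 10, 10)), ("dog", (20, 20, 30, 30))]
dets = [
    (0.9, "cat", (1, 1, 10, 10)),
    (0.8, "dog", (100, 100, 110, 110)),
    (0.1, "dog", (20, 20, 30, 30)),
]
recall_top2 = recall_at_k(dets, gt, k=2)
recall_top3 = recall_at_k(dets, gt, k=3)
```

Because recall only asks whether annotated objects were found, unlabeled objects that the detector also fires on do not count against it, which is why the measure suits datasets that are not fully labeled.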
 
Metadata
Title
Zero-Shot Object Detection: Joint Recognition and Localization of Novel Concepts
Authors
Shafin Rahman
Salman H. Khan
Fatih Porikli
Publication date
24 July 2020
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 12/2020
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-020-01355-6
