Published in: International Journal of Computer Vision, Issue 1-2/2014

01.05.2014

The Ignorant Led by the Blind: A Hybrid Human–Machine Vision System for Fine-Grained Categorization

Authors: Steve Branson, Grant Van Horn, Catherine Wah, Pietro Perona, Serge Belongie

Abstract

We present a visual recognition system for fine-grained visual categorization. The system is composed of a human and a machine working together and combines the complementary strengths of computer vision algorithms and (non-expert) human users. The human users provide two heterogeneous forms of information: object part clicks and answers to multiple-choice questions. The machine intelligently selects the most informative question to pose to the user in order to identify the object class as quickly as possible. By leveraging computer vision and analyzing the user responses, the overall amount of human effort required, measured in seconds, is minimized. Our formalism shows how to incorporate many different types of computer vision algorithms into a human-in-the-loop framework, including standard multiclass methods, part-based methods, and localized multiclass and attribute methods. We explore our ideas by building a field guide for bird identification. The experimental results demonstrate the strength of combining ignorant humans with poor-sighted machines: the hybrid system achieves quick and accurate bird identification on a dataset containing 200 bird species.
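
As a rough illustration of the question-selection step described in the abstract, the sketch below implements a greedy expected-information-gain criterion: maintain a posterior over classes (seeded by computer vision scores) and, at each turn, ask the question whose answer is expected to shrink class uncertainty the most. This is a minimal sketch of the general idea, not the authors' implementation; the answer model `p_answer_given_class`, the array shapes, and the helper names are illustrative assumptions.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in bits) of a discrete distribution."""
    p = np.clip(p, eps, 1.0)
    return float(-np.sum(p * np.log2(p)))

def select_question(class_posterior, p_answer_given_class):
    """Pick the question with the largest expected information gain.

    class_posterior:       (C,) current belief over C classes, e.g. seeded
                           from computer-vision scores.
    p_answer_given_class:  (Q, C, A) probability of each of A answers to
                           each of Q questions, per class (a hypothetical
                           user-response model estimated from training data).
    Returns (best_question_index, expected_gain_in_bits).
    """
    h_now = entropy(class_posterior)
    best_q, best_gain = None, -np.inf
    for q, answer_model in enumerate(p_answer_given_class):
        # Marginal probability of each answer under the current belief.
        p_answer = answer_model.T @ class_posterior             # (A,)
        # Posterior over classes for each hypothetical answer (Bayes rule).
        joint = answer_model.T * class_posterior[None, :]       # (A, C)
        posteriors = joint / np.clip(p_answer[:, None], 1e-12, None)
        # Expected entropy after observing the answer to question q.
        h_expected = sum(pa * entropy(post) for pa, post in zip(p_answer, posteriors))
        gain = h_now - h_expected
        if gain > best_gain:
            best_q, best_gain = q, gain
    return best_q, best_gain

def update_posterior(class_posterior, answer_model, observed_answer):
    """Fold an observed answer back into the class belief."""
    likelihood = answer_model[:, observed_answer]               # (C,)
    posterior = class_posterior * likelihood
    return posterior / posterior.sum()
```

Per the abstract, the full system measures human effort in seconds, so in practice the information gain of a question would also be traded off against the time a user needs to answer it.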


Appendices
Accessible only with authorization.
Footnotes
1
Our user model assumes binary or multinomial attributes; however, one could use continuous attribute values for the computer vision component described in this section.
 
2
The integral in Eq. 26 involves a bottom-up traversal of \(T=(V,E)\), at each step convolving a spatial score map with a unary score map (this takes \(O(n \log n)\) time in the number of pixels).
 
3
Maximum likelihood inference involves a bottom-up traversal of \(T\), performing a distance transform operation (Felzenszwalb et al. 2008) for each part in the tree (this takes \(O(n)\) time in the number of pixels); a sketch of this transform follows these footnotes.
 
4
In practice, we also computed an average segmentation mask for each part-aspect and used it to weight each extracted patch; see the supplementary material.
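
Footnote 3 relies on the generalized distance transform (Felzenszwalb et al. 2008) to make the per-part maximization linear in the number of pixels. Below is a minimal 1-D sketch of that transform, assuming a quadratic deformation cost with an illustrative weight `w`; it is standard textbook code, not the authors' implementation. Applying it to every row of a 2-D score map and then to every column of the result gives the 2-D version used in pictorial-structure inference.

```python
import numpy as np

def distance_transform_1d(cost, w=1.0):
    """Generalized distance transform of Felzenszwalb & Huttenlocher:
    out[p] = min_q ( cost[q] + w * (p - q)^2 ), computed in O(n) by
    tracking the lower envelope of the parabolas rooted at each q."""
    n = len(cost)
    out = np.empty(n)
    v = np.zeros(n, dtype=int)      # indices of parabolas in the envelope
    z = np.empty(n + 1)             # boundaries between envelope parabolas
    z[0], z[1] = -np.inf, np.inf
    k = 0
    for q in range(1, n):
        # Intersection of the parabola at q with the rightmost envelope parabola.
        s = ((cost[q] + w * q * q) - (cost[v[k]] + w * v[k] * v[k])) / (2.0 * w * (q - v[k]))
        while s <= z[k]:
            k -= 1
            s = ((cost[q] + w * q * q) - (cost[v[k]] + w * v[k] * v[k])) / (2.0 * w * (q - v[k]))
        k += 1
        v[k] = q
        z[k], z[k + 1] = s, np.inf
    k = 0
    for p in range(n):
        while z[k + 1] < p:
            k += 1
        out[p] = cost[v[k]] + w * (p - v[k]) ** 2
    return out
```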
 
References
Belhumeur, P., Chen, D., Feiner, S., Jacobs, D., Kress, W., Ling, H., Lopez, I., Ramamoorthi, R., Sheorey, S., White, S., & Zhang, L. (2008). Searching the world's herbaria. In ECCV.
Berg, T., & Belhumeur, P. N. (2013). POOF: Part-based one-vs-one features for fine-grained categorization, face verification, and attribute estimation. In CVPR.
Biederman, I., Subramaniam, S., Bar, M., Kalocsai, P., & Fiser, J. (1999). Subordinate-level object classification reexamined. Psychological Research, 63(2–3), 131–153.
Bourdev, L., & Malik, J. (2009). Poselets: Body part detectors trained using 3D annotations. In ICCV.
Branson, S., Perona, P., & Belongie, S. (2011). Strong supervision from weak annotation. In ICCV.
Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., & Belongie, S. (2010). Visual recognition with humans in the loop. In ECCV.
Chai, Y., Lempitsky, V., & Zisserman, A. (2011). BiCoS: A bi-level co-segmentation method. In ICCV.
Chai, Y., Lempitsky, V., & Zisserman, A. (2013). Symbiotic segmentation and part localization for fine-grained categorization. In ICCV.
Chai, Y., Rahtu, E., Lempitsky, V., Van Gool, L., & Zisserman, A. (2012). TriCoS. In ECCV.
Cox, I. J., Miller, M. L., Minka, T. P., Papathomas, T. V., & Yianilos, P. N. (2000). The Bayesian image retrieval system, PicHunter: Theory, implementation, and psychophysical experiments. Image Processing.
Donahue, J., & Grauman, K. (2011). Annotator rationales for visual recognition. In ICCV.
Douze, M., Ramisa, A., & Schmid, C. (2011). Combining attributes and Fisher vectors for efficient image retrieval. In CVPR.
Duan, K., Parikh, D., Crandall, D., & Grauman, K. (2012). Discovering localized attributes for fine-grained recognition. In CVPR.
Fang, Y., & Geman, D. (2005). Experiments in mental face retrieval. In AVBPA.
Farhadi, A., Endres, I., & Hoiem, D. (2010). Attribute-centric recognition for generalization. In CVPR.
Farhadi, A., Endres, I., Hoiem, D., & Forsyth, D. (2009). Describing objects by attributes. In CVPR.
Farrell, R., Oza, O., Zhang, N., Morariu, V., Darrell, T., & Davis, L. (2011). Birdlets. In ICCV.
Felzenszwalb, P., & Huttenlocher, D. (2002). Efficient matching of pictorial structures. In CVPR.
Felzenszwalb, P., McAllester, D., & Ramanan, D. (2008). A discriminatively trained, multiscale, deformable part model. In CVPR.
Ferecatu, M., & Geman, D. (2007). Interactive search by mental matching. In ICCV.
Ferecatu, M., & Geman, D. (2009). A statistical framework for image category search from a mental picture. In PAMI.
Gavves, E., Fernando, B., Snoek, C., Smeulders, A., & Tuytelaars, T. (2013). Fine-grained categorization by alignments. In ICCV.
Geman, D., & Jedynak, B. (1993). Shape recognition and twenty questions. Belmont: Wadsworth.
Geman, D., & Jedynak, B. (1996). An active testing model for tracking roads in satellite images. In PAMI.
Jedynak, B., Frazier, P. I., & Sznitman, R. (2012). Twenty questions with noise: Bayes optimal policies for entropy loss. Journal of Applied Probability, 49(1), 114–136.
Khosla, A., Jayadevaprakash, N., Yao, B., & Li, F. F. (2011). Novel dataset for FGVC: Stanford dogs. In CVPR Workshop on FGVC, San Diego.
Kumar, N., Belhumeur, P., Biswas, A., Jacobs, D., Kress, W., Lopez, I., & Soares, J. (2012). Leafsnap: A computer vision system for automatic plant species identification. In ECCV.
Kumar, N., Belhumeur, P., & Nayar, S. (2008). FaceTracer: A search engine for large collections of images with faces. In ECCV.
Kumar, N., Berg, A. C., Belhumeur, P. N., & Nayar, S. K. (2009). Attribute and simile classifiers for face verification. In ICCV.
Lampert, C., Nickisch, H., & Harmeling, S. (2009). Learning to detect unseen object classes. In CVPR.
Larios, N., Soran, B., Shapiro, L. G., Martinez-Munoz, G., Lin, J., & Dietterich, T. G. (2010). Haar random forest features and SVM spatial matching kernel for stonefly species identification. In ICPR.
Lazebnik, S., Schmid, C., & Ponce, J. (2005). A maximum entropy framework for part-based texture and object recognition. In ICCV.
Levin, A., Lischinski, D., & Weiss, Y. (2007). A closed-form solution to natural image matting. In PAMI.
Liu, J., Kanazawa, A., Jacobs, D., & Belhumeur, P. (2012). Dog breed classification using part localization. In ECCV.
Lu, Y., Hu, C., Zhu, X., Zhang, H., & Yang, Q. (2000). A unified framework for semantics and feature based relevance feedback in image retrieval systems. In ACM Multimedia.
Maji, S. (2012). Discovering a lexicon of parts and attributes. In ECCV Parts and Attributes.
Maji, S., & Shakhnarovich, G. (2012). Part annotations via pairwise correspondence. In Conference on Artificial Intelligence Workshop.
Martínez-Muñoz, G., et al. (2009). Dictionary-free categorization of very similar objects. In CVPR.
Mervis, C. B., & Crisafi, M. A. (1982). Order of acquisition of subordinate-, basic-, and superordinate-level categories. Child Development, 53(1), 256–266.
Nilsback, M., & Zisserman, A. (2008). Automated flower classification. In ICVGIP.
Nilsback, M. E., & Zisserman, A. (2006). A visual vocabulary for flower classification. In CVPR.
Ott, P., & Everingham, M. (2011). Shared parts for deformable part-based models. In CVPR.
Parikh, D., & Grauman, K. (2011). Interactively building a vocabulary of attributes. In CVPR.
Parikh, D., & Grauman, K. (2011). Relative attributes. In ICCV.
Parikh, D., & Grauman, K. (2013). Implied feedback: Learning nuances of user behavior in image search. In ICCV.
Parikh, D., & Zitnick, C. L. (2011a). Finding the weakest link in person detectors. In CVPR.
Parikh, D., & Zitnick, C. L. (2011b). Human-debugging of machines. In NIPS Wisdom of Crowds.
Parkash, A., & Parikh, D. (2012). Attributes for classifier feedback. In ECCV.
Parkhi, O., Vedaldi, A., Zisserman, A., & Jawahar, C. (2012). Cats and dogs. In CVPR.
Parkhi, O. M., Vedaldi, A., Jawahar, C., & Zisserman, A. (2011). The truth about cats and dogs. In ICCV.
Perronnin, F., Sánchez, J., & Mensink, T. (2010). Improving the Fisher kernel. In ECCV.
Platt, J. C. (1999). Probabilistic outputs for SVMs. In ALMC.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. Burlington: Morgan Kaufmann.
Rasiwasia, N., Moreno, P. J., & Vasconcelos, N. (2007). Bridging the gap: Query by semantic example. In Multimedia.
Rosch, E. (1999). Principles of categorization. In Concepts: Core Readings.
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology.
Rother, C., Kolmogorov, V., & Blake, A. (2004). GrabCut: Interactive foreground extraction. In TOG.
Settles, B. (2008). Curious machines: Active learning with structured instances.
Stark, M., Krause, J., Pepik, B., Meger, D., Little, J. J., Schiele, B., & Koller, D. (2012). Fine-grained categorization for 3D scene understanding. In BMVC.
Sznitman, R., Basu, A., Richa, R., Handa, J., Gehlbach, P., Taylor, R. H., Jedynak, B., & Hager, G. D. (2011). Unified detection and tracking in retinal microsurgery. In MICCAI.
Sznitman, R., & Jedynak, B. (2010). Active testing for face detection and localization. In PAMI.
Tsiligkaridis, T., Sadler, B., & Hero, A. (2013). A collaborative 20 questions model for target search with human-machine interaction. In ICASSP.
Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2006). Large margin methods for structured and interdependent output variables. In JMLR.
Vijayanarasimhan, S., & Grauman, K. (2009). What's it going to cost you? In CVPR.
Vijayanarasimhan, S., & Grauman, K. (2011). Large-scale live active learning. In CVPR.
Vondrick, C., & Ramanan, D. (2011). Video annotation and tracking with active learning. In NIPS.
Vondrick, C., Ramanan, D., & Patterson, D. (2010). Efficiently scaling up video annotation. In ECCV.
Wah, C., Branson, S., Perona, P., & Belongie, S. (2011). Multiclass recognition and part localization with humans in the loop. In ICCV.
Wah, C., Branson, S., Welinder, P., Perona, P., & Belongie, S. (2011). The Caltech-UCSD Birds-200-2011 dataset. Tech. Rep. CNS-TR-2011-001, Caltech, Pasadena.
Wang, G., & Forsyth, D. (2009). Joint learning of visual attributes, object classes. In ICCV.
Wang, J., Markert, K., & Everingham, M. (2009). Learning models for object recognition from natural language descriptions. In BMVC.
Wu, W., & Yang, J. (2006). SmartLabel: An object labeling tool. In Multimedia.
Yang, Y., & Ramanan, D. (2011). Articulated pose estimation using mixtures of parts. In CVPR.
Yao, B., Bradski, G., & Fei-Fei, L. (2012). A codebook and annotation-free approach for FGVC. In CVPR.
Yao, B., Khosla, A., & Fei-Fei, L. (2011). Combining randomization and discrimination for FGVC. In CVPR.
Zhang, N., Farrell, R., & Darrell, T. (2012). Pose pooling kernels for sub-category recognition. In CVPR.
Zhang, N., Farrell, R., Iandola, F., & Darrell, T. (2013). Deformable part descriptors for fine-grained recognition and attribute prediction. In ICCV.
Zhou, X., & Huang, T. (2003). Relevance feedback in image retrieval. In Multimedia.
Metadata
Title
The Ignorant Led by the Blind: A Hybrid Human–Machine Vision System for Fine-Grained Categorization
Authors
Steve Branson
Grant Van Horn
Catherine Wah
Pietro Perona
Serge Belongie
Publication date
01.05.2014
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 1-2/2014
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-014-0698-4
