2016 | OriginalPaper | Chapter

Harvesting Training Images for Fine-Grained Object Categories Using Visual Descriptions

Authors: Josiah Wang, Katja Markert, Mark Everingham

Published in: Advances in Information Retrieval

Publisher: Springer International Publishing

Abstract

We harvest training images for visual object recognition by casting it as an IR task. In contrast to previous work, we concentrate on fine-grained object categories, such as the large number of particular animal subspecies, for which manual annotation is expensive. We use ‘visual descriptions’ from nature guides as a novel augmentation to the well-known use of category names. We use these descriptions both in the query process, to find potential category images, and in image reranking, where an image is ranked more highly if the web page text surrounding it is similar to the visual descriptions. We show the potential of this method when harvesting images for 10 butterfly categories: when compared to a method that relies on the category name only, using visual descriptions improves precision for many categories.
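The reranking idea in the abstract (rank an image higher when the text around it resembles the visual description) can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors use, so TF-IDF cosine similarity is assumed here as a common stand-in. The function names, file names, and sample texts are all hypothetical, and no stemming or stop-word removal is applied.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (no stemming)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF so that terms found in every document keep some weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(description, candidates):
    """Sort (image_id, surrounding_text) pairs by similarity to the description.

    Assumption: similarity is TF-IDF cosine; the source does not specify this.
    """
    docs = [tokenize(description)] + [tokenize(text) for _, text in candidates]
    vectors = tfidf_vectors(docs)
    query, rest = vectors[0], vectors[1:]
    scored = [(image_id, cosine(query, vec))
              for (image_id, _), vec in zip(candidates, rest)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: one visual description and three candidate images,
# each paired with the text found near it on its web page.
description = "Wings orange with black veins and a black border with white spots."
candidates = [
    ("img_a.jpg", "Cheap flights, hotel deals, holiday offers."),
    ("img_b.jpg", "The orange wings have black veins and white spots along the border."),
    ("img_c.jpg", "A butterfly resting on a flower in the garden."),
]

for image_id, score in rerank(description, candidates):
    print(f"{image_id}\t{score:.3f}")
```

In this toy setup, the candidate whose surrounding text shares the most vocabulary with the description is ranked first, and text with no shared terms scores zero. A fuller pipeline would likely add stemming and document-length normalization before comparing texts.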

Previous work [1, 5, 15] has used visual descriptions for object recognition without any training images but not for the discovery of training images itself.

http://trac.webkit.org/wiki/QtWebKit.

1. Ba, J.L., Swersky, K., Fidler, S., Salakhutdinov, R.: Predicting deep zero-shot convolutional neural networks using textual descriptions. In: Proceedings of the IEEE International Conference on Computer Vision (2015)

2. Berg, T.L., Forsyth, D.A.: Animals on the web. In: Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, vol. 2, pp. 1463–1470 (2006)

3. Collins, B., Deng, J., Li, K., Fei-Fei, L.: Towards scalable dataset construction: an active learning approach. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 86–98. Springer, Heidelberg (2008)

4. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, pp. 248–255 (2009)

5. Elhoseiny, M., Saleh, B., Elgammal, A.: Write a classifier: Zero-shot learning using purely textual descriptions. In: Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition (2013)

6. Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from Google’s image search. In: Proceedings of the IEEE International Conference on Computer Vision, vol. 2, pp. 1816–1823 (2005)

7. George, M., Ghanem, N., Ismail, M.A.: Learning-based incremental creation of web image databases. In: Proceedings of the 12th IEEE International Conference on Machine Learning and Applications (ICMLA 2013), pp. 424–429 (2013)

8. Krapac, J., Allan, M., Verbeek, J., Jurie, F.: Improving web-image search results using query-relative classifiers. In: Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, pp. 1094–1101 (2010)

9. Li, L.J., Wang, G., Fei-Fei, L.: OPTIMOL: Automatic Object Picture collecTion via Incremental MOdel Learning. In: Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, pp. 1–8 (2007)

10. Nilsback, M.E., Zisserman, A.: Automated flower classification over a large number of classes. In: Proceedings of the Indian Conference on Computer Vision, Graphics and Image Processing, pp. 722–729 (2008)

11. Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)

12. Schroff, F., Criminisi, A., Zisserman, A.: Harvesting image databases from the Web. IEEE Trans. Pattern Anal. Mach. Intell. 33(4), 754–766 (2011)

13. Singhal, A., Salton, G., Buckley, C.: Length normalization in degraded text collections. In: Proceedings of Fifth Annual Symposium on Document Analysis and Information Retrieval, pp. 149–162 (1996)

14. Vijayanarasimhan, S., Grauman, K.: Keywords to visual categories: Multiple-instance learning for weakly supervised object categorization. In: Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition (2008)

15. Wang, J., Markert, K., Everingham, M.: Learning models for object recognition from natural language descriptions. In: Proceedings of the British Machine Vision Conference, pp. 2.1-2.11. BMVA Press (2009)

16. Welinder, P., Branson, S., Mita, T., Wah, C., Schroff, F., Belongie, S., Perona, P.: Caltech-UCSD Birds 200. Technical Report CNS-TR-2010-001. California Institute of Technology (2010)

17. Zhou, N., Fan, J.: Automatic image-text alignment for large-scale web image indexing and retrieval. Pattern Recogn. 48(1), 205–219 (2015)

Title: Harvesting Training Images for Fine-Grained Object Categories Using Visual Descriptions
Authors: Josiah Wang, Katja Markert, Mark Everingham
Publisher: Springer International Publishing
Book: Advances in Information Retrieval
Print ISBN: 978-3-319-30670-4

Electronic ISBN: 978-3-319-30671-1

Copyright Year: 2016
DOI: https://doi.org/10.1007/978-3-319-30671-1_40
