nach oben

Erschienen in:

2016 | OriginalPaper | Buchkapitel

The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition

verfasst von : Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, Li Fei-Fei

Erschienen in: Computer Vision – ECCV 2016

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Current approaches for fine-grained recognition do the following: First, recruit experts to annotate a dataset of images, optionally also collecting more structured data in the form of part annotations and bounding boxes. Second, train a model utilizing this data. Toward the goal of solving fine-grained recognition, we introduce an alternative approach, leveraging free, noisy data from the web and simple, generic methods of recognition. This approach has benefits in both performance and scalability. We demonstrate its efficacy on four fine-grained datasets, greatly exceeding existing state of the art without the manual collection of even a single label, and furthermore show first results at scaling to more than 10,000 fine-grained categories. Quantitatively, we achieve top-1 accuracies of \(92.3\,\%\) on CUB-200-2011, \(85.4\,\%\) on Birdsnap, \(93.4\,\%\) on FGVC-Aircraft, and \(80.8\,\%\) on Stanford Dogs without using their annotated training sets. We compare our approach to an active learning approach for expanding fine-grained datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Indoor-Outdoor 3D Reconstruction Alignment

Nächstes Kapitel A Simple Hierarchical Pooling Data Structure for Loop Closure

Nur mit Berechtigung zugänglich

Google image search: http://images.google.com.

To be released.

Angelova, A., Zhu, S., Lin, Y.: Image segmentation for large-scale subcategory flower recognition. In: Workshop on Applications of Computer Vision (WACV), pp. 39–45. IEEE (2013)

Balcan, M.-F., Broder, A., Zhang, T.: Margin based active learning. In: Bshouty, N.H., Gentile, C. (eds.) COLT 2007. LNCS (LNAI), vol. 4539, pp. 35–50. Springer, Heidelberg (2007). doi:10.1007/978-3-540-72927-3_5 CrossRef

Berg, T., Belhumeur, P.N.: Poof: Part-based one-vs.-one features for fine-grained categorization, face verification, and attribute estimation. In: Computer Vision and Pattern Recognition (CVPR), pp. 955–962. IEEE (2013)

Berg, T., Liu, J., Lee, S.W., Alexander, M.L., Jacobs, D.W., Belhumeur, P.N.: Birdsnap: large-scale fine-grained visual categorization of birds. In: Computer Vision and Pattern Recognition (CVPR), June 2014

Branson, S., Van Horn, G., Perona, P., Belongie, S.: Improved bird species recognition using pose normalized deep convolutional nets. In: British Machine Vision Conference (BMVC) (2014)

Branson, S., Van Horn, G., Wah, C., Perona, P., Belongie, S.: The ignorant led by the blind: a hybrid human-machine vision system for fine-grained categorization. Int. J. Comput. Vision (IJCV), 1–27 (2014)

Chai, Y., Lempitsky, V., Zisserman, A.: Bicos: A bi-level co-segmentation method for image classification. In: International Conference on Computer Vision (ICCV). IEEE (2011)

Chai, Y., Lempitsky, V., Zisserman, A.: Symbiotic segmentation and part localization for fine-grained categorization. In: International Conference on Computer Vision (ICCV), pp. 321–328. IEEE (2013)

Chai, Y., Rahtu, E., Lempitsky, V., Gool, L., Zisserman, A.: TriCoS: a tri-level class-discriminative co-segmentation method for image classification. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7572, pp. 794–807. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33718-5_57

10.

Chen, X., Gupta, A.: Webly supervised learning of convolutional networks. In: International Conference on Computer Vision (ICCV). IEEE (2015)

11.

Chen, X., Shrivastava, A., Gupta, A.: Neil: Extracting visual knowledge from web data. In: International Conference on Computer Vision (ICCV), pp. 1409–1416. IEEE (2013)

12.

Collins, B., Deng, J., Li, K., Fei-Fei, L.: Towards scalable dataset construction: an active learning approach. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008. LNCS, vol. 5302, pp. 86–98. Springer, Heidelberg (2008). doi:10.1007/978-3-540-88682-2_8 CrossRef

13.

Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Computer Vision and Pattern Recognition (CVPR) (2009)

14.

Deng, J., Krause, J., Fei-Fei, L.: Fine-grained crowdsourcing for fine-grained recognition. In: Computer Vision and Pattern Recognition (CVPR), pp. 580–587 (2013)

15.

Divvala, S.K., Farhadi, A., Guestrin, C.: Learning everything about anything: webly-supervised visual concept learning. In: Computer Vision and Pattern Recognition (CVPR), pp. 3270–3277. IEEE (2014)

16.

Duan, K., Parikh, D., Crandall, D., Grauman, K.: Discovering localized attributes for fine-grained recognition. In: Computer Vision and Pattern Recognition (CVPR), pp. 3474–3481. IEEE

17.

Erkan, A.N.: Semi-supervised learning via generalized maximum entropy. Ph.D. thesis, New York University (2010)

18.

Farrell, R., Oza, O., Zhang, N., Morariu, V.I., Darrell, T., Davis, L.S.: Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance. In: International Conference on Computer Vision (ICCV), pp. 161–168. IEEE (2011)

19.

Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from internet image searches. Proc. IEEE 98(8), 1453–1466 (2010)CrossRef

20.

Gavves, E., Fernando, B., Snoek, C.G., Smeulders, A.W., Tuytelaars, T.: Fine-grained categorization by alignments. In: International Conference on Computer Vision (ICCV), pp. 1713–1720. IEEE

21.

Gavves, E., Fernando, B., Snoek, C.G., Smeulders, A.W., Tuytelaars, T.: Local alignments for fine-grained categorization. Int. J. Comput. Vision (IJCV), 1–22 (2014)

22.

Goering, C., Rodner, E., Freytag, A., Denzler, J.: Nonparametric part transfer for fine-grained recognition. In: Computer Vision and Pattern Recognition (CVPR), pp. 2489–2496. IEEE (2014)

23.

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Computer Vision and Pattern Recognition (CVPR). IEEE (2016)

24.

Hinchliff, C.E., Smith, S.A., Allman, J.F., Burleigh, J.G., Chaudhary, R., Coghill, L.M., Crandall, K.A., Deng, J., Drew, B.T., Gazis, R., Gude, K., Hibbett, D.S., Katz, L.A., Laughinghouse, H.D., McTavish, E.J., Midford, P.E., Owen, C.L., Ree, R.H., Rees, J.A., Soltis, D.E., Williams, T., Cranston, K.A.: Synthesis of phylogeny and taxonomy into a comprehensive tree of life. Proc. Nat. Acad. Sci. (2015). http://www.pnas.org/content/early/2015/09/16/1423041112.abstract

25.

Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning (ICML) (2015)

26.

Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. In: Neural Information Processing Systems (NIPS) (2015)

27.

Khosla, A., Jayadevaprakash, N., Yao, B., Fei-Fei, L.: Novel dataset for fine-grained image categorization. In: First Workshop on Fine-Grained Visual Categorization, Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, June 2011

28.

Krause, J., Gebru, T., Deng, J., Li, L.J., Fei-Fei, L.: Learning features and parts for fine-grained recognition. In: International Conference on Pattern Recognition (ICPR), Stockholm, Sweden, August 2014

29.

Krause, J., Jin, H., Yang, J., Fei-Fei, L.: Fine-grained recognition without part annotations. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE

30.

Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3d object representations for fine-grained categorization. In: 4th International IEEE Workshop on 3D Representation and Recognition (3dRR-13). IEEE (2013)

31.

Kumar, N., Belhumeur, P.N., Biswas, A., Jacobs, D.W., Kress, W.J., Lopez, I.C., Soares, J.V.: Leafsnap: a computer vision system for automatic plant species identification. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) European Conference on Computer Vision (ECCV), vol. 7573, pp. 502–516. Springer, Heidelberg (2012)

32.

LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)CrossRef

33.

Lewis, D.D., Catlett, J.: Heterogeneous uncertainty sampling for supervised learning. In: International Conference on Machine Learning (ICML), pp. 148–156 (1994)

34.

Li, L.J., Fei-Fei, L.: Optimol: automatic online picture collection via incremental model learning. Int. J. Comput. Vision (IJCV) 88(2), 147–168 (2010)CrossRef

35.

Lin, T.Y., RoyChowdhury, A., Maji, S.: Bilinear cnn models for fine-grained visual recognition. In: International Conference on Computer Vision (ICCV). IEEE

36.

Liu, J., Kanazawa, A., Jacobs, D., Belhumeur, P.: Dog breed classification using part localization. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7572, pp. 172–185. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33718-5_13

37.

Maji, S., Kannala, J., Rahtu, E., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. Technical report (2013)

38.

Mnih, V., Hinton, G.E.: Learning to label aerial images from noisy data. In: International Conference on Machine Learning (ICML), pp. 567–574 (2012)

39.

Mozafari, B., Sarkar, P., Franklin, M., Jordan, M., Madden, S.: Scaling up crowd-sourcing to very large datasets: a case for active learning. Proc. VLDB Endowment 8(2), 125–136 (2014)CrossRef

40.

Nilsback, M.E., Zisserman, A.: A visual vocabulary for flower classification. In: Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 1447–1454. IEEE (2006)

41.

Pu, J., Jiang, Y.-G., Wang, J., Xue, X.: Which looks like which: exploring inter-class relationships in fine-grained visual categorization. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8691, pp. 425–440. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10578-9_28

42.

Reed, S., Lee, H., Anguelov, D., Szegedy, C., Erhan, D., Rabinovich, A.: Training deep neural networks on noisy labels with bootstrapping (2014). arXiv preprint arXiv:1412.6596

43.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vision (IJCV), 1–42, April 2015

44.

Schroff, F., Criminisi, A., Zisserman, A.: Harvesting image databases from the web. Pattern Anal. Mach. Intell. (PAMI) 33(4), 754–766 (2011)CrossRef

45.

Sermanet, P., Frome, A., Real, E.: Attention for fine-grained categorization (2014). arXiv preprint arXiv:1412.7054

46.

Settles, B.: Active learning literature survey. Univ. Wis. Madison 52(55–66), 11 (2010)

47.

Settles, B., Craven, M., Ray, S.: Multiple-instance active learning. In: Advances in Neural Information Processing Systems (NIPS), pp. 1289–1296 (2008)

48.

Shih, K.J., Mallya, A., Singh, S., Hoiem, D.: Part localization using multi-proposal consensus for fine-grained categorization. In: British Machine Vision Conference (BMVC) (2015)

49.

Simon, M., Rodner, E.: Neural activation constellations: unsupervised part model discovery with convolutional networks. In: ICCV (2015)

50.

Simon, M., Rodner, E., Denzler, J.: Part detector discovery in deep convolutional neural networks. In: Asian Conference on Computer Vision (ACCV), vol. 2, pp.162–177 (2014)

51.

Sukhbaatar, S., Fergus, R.: Learning from noisy labels with deep neural networks (2014). arXiv preprint arXiv:1406.2080

52.

Szegedy, C., Ioffe, S., Vanhoucke, V.: Inception-v4, inception-resnet and the impact of residual connections on learning (2016). arXiv preprint arXiv:1602.07261

53.

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Computer Vision and Pattern Recognition (CVPR) (2015)

54.

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Computer Vision and Pattern Recognition (CVPR). IEEE (2016)

55.

Thomee, B., Shamma, D.A., Friedland, G., Elizalde, B., Ni, K., Poland, D., Borth, D., Li, L.J.: The new data and new challenges in multimedia research (2015). arXiv preprint arXiv:1503.01817

56.

Torralba, A., Efros, A., et al.: Unbiased look at dataset bias. In: Computer Vision and Pattern Recognition (CVPR), pp. 1521–1528. IEEE (2011)

57.

Van Horn, G., Branson, S., Farrell, R., Haber, S., Barry, J., Ipeirotis, P., Perona, P., Belongie, S.: Building a bird recognition app. and large scale dataset with citizen scientists: the fine print in fine-grained dataset collection. In: Computer Vision and Pattern Recognition (CVPR). IEEE (2015)

58.

Vedaldi, A., Mahendran, S., Tsogkas, S., Maji, S., Girshick, B., Kannala, J., Rahtu, E., Kokkinos, I., Blaschko, M.B., Weiss, D., Taskar, B., Simonyan, K., Saphra, N., Mohamed, S.: Understanding objects in detail with fine-grained attributes. In: Computer Vision and Pattern Recognition (CVPR) (2014)

59.

Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 dataset. Technical report CNS-TR-2011-001, California Institute of Technology (2011)

60.

Wah, C., Belongie, S.: Attribute-based detection of unfamiliar classes with humans in the loop. In: Computer Vision and Pattern Recognition (CVPR), pp. 779–786. IEEE (2013)

61.

Wah, C., Branson, S., Perona, P., Belongie, S.: Multiclass recognition and part localization with humans in the loop. In: International Conference on Computer Vision (ICCV), pp. 2524–2531. IEEE (2011)

62.

Wah, C., Horn, G., Branson, S., Maji, S., Perona, P., Belongie, S.: Similarity comparisons for interactive fine-grained categorization. In: Computer Vision and Pattern Recognition (CVPR) (2014)

63.

Wang, J., Song, Y., Leung, T., Rosenberg, C., Wang, J., Philbin, J., Chen, B., Wu, Y.: Learning fine-grained image similarity with deep ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1386–1393 (2014)

64.

Welinder, P., Branson, S., Mita, T., Wah, C., Schroff, F., Belongie, S., Perona, P.: Caltech-UCSD Birds 200. Technical report CNS-TR-2010-001, California Institute of Technology (2010)

65.

Xiao, T., Xu, Y., Yang, K., Zhang, J., Peng, Y., Zhang, Z.: The application of two-level attention models in deep convolutional neural network for fine-grained image classification. In: Computer Vision and Pattern Recognition (CVPR). IEEE

66.

Xiao, T., Xia, T., Yang, Y., Huang, C., Wang, X.: Learning from massive noisy labeled data for image classification. In: Computer Vision and Pattern Recognition (CVPR). IEEE

67.

Xie, S., Yang, T., Wang, X., Lin, Y.: Hyper-class augmented and regularized deep learning for fine-grained image classification. In: Computer Vision and Pattern Recognition (CVPR). IEEE

68.

Xu, Z., Huang, S., Zhang, Y., Tao, D.: Augmenting strong supervision using web data for fine-grained categorization. In: International Conference on Computer Vision (ICCV) (2015)

69.

Yang, L., Luo, P., Loy, C.C., Tang, X.: A large-scale car dataset for fine-grained categorization and verification. In: Computer Vision and Pattern Recognition (CVPR). IEEE

70.

Yang, S., Bo, L., Wang, J., Shapiro, L.G.: Unsupervised template learning for fine-grained object recognition. In: Advances in Neural Information Processing Systems (NIPS), pp. 3122–3130 (2012)

71.

Yao, B., Bradski, G., Fei-Fei, L.: A codebook-free and annotation-free approach for fine-grained image categorization. In: Computer Vision and Pattern Recognition (CVPR), pp. 3466–3473. IEEE (2012)

72.

Yao, B., Khosla, A., Fei-Fei, L.: Combining randomization and discrimination for fine-grained image categorization. In: Computer Vision and Pattern Recognition (CVPR), pp. 1577–1584. IEEE (2011)

73.

Yu, F., Zhang, Y., Song, S., Seff, A., Xiao, J.: Construction of a large-scale image dataset using deep learning with humans in the loop (2015). arXiv preprint arXiv:1506.03365

74.

Zhang, N., Donahue, J., Girshick, R., Darrell, T.: Part-based r-cnns for fine-grained category detection. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 834–849. Springer, Heidelberg (2014). doi:10.1007/978-3-319-10590-1_54

75.

Zhang, N., Farrell, R., Darrell, T.: Pose pooling kernels for sub-category recognition. In: Computer Vision and Pattern Recognition (CVPR), pp. 3665–3672. IEEE (2012)

76.

Zhang, N., Farrell, R., Iandola, F., Darrell, T.: Deformable part descriptors for fine-grained recognition and attribute prediction. In: International Conference on Computer Vision (ICCV), pp. 729–736. IEEE (2013)

77.

Zhang, Y., Wei, X-S., Wu, J., Cai, J., Lu, J., Nguyen, V.A., Do, M.N.: Weakly supervised fine-grained image categorization (2015). arXiv preprint arXiv:1504.04943

Titel: The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition
verfasst von: Jonathan Krause
Benjamin Sapp
Andrew Howard
Howard Zhou
Alexander Toshev
Tom Duerig
James Philbin
Li Fei-Fei
Verlag: Springer International Publishing
Buch: Computer Vision – ECCV 2016
Print ISBN: 978-3-319-46486-2

Electronic ISBN: 978-3-319-46487-9

Copyright-Jahr: 2016
DOI: https://doi.org/10.1007/978-3-319-46487-9_19

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"