Published in: International Journal of Computer Vision 1/2016

1 August 2016

Do We Need More Training Data?

Authors: Xiangxin Zhu, Carl Vondrick, Charless C. Fowlkes, Deva Ramanan


Abstract

Datasets for training object recognition systems are steadily increasing in size. This paper investigates whether existing detectors will continue to improve as data grows, or whether their performance will saturate because of limited model complexity and the Bayes risk associated with the feature spaces in which they operate. We focus on the popular paradigm of discriminatively trained templates defined on oriented gradient features. We investigate the performance of mixtures of templates as the number of mixture components and the amount of training data grow. Surprisingly, even with proper treatment of regularization and “outliers”, the performance of classic mixture models appears to saturate quickly (\(\sim 10\) templates and \(\sim 100\) positive training examples per template). This is not a limitation of the feature space, as compositional mixtures that share template parameters via parts, and that can synthesize new templates not encountered during training, yield significantly better performance. Based on our analysis, we conjecture that the greatest gains in detection performance will continue to derive from improved representations and learning algorithms that can make efficient use of large datasets.
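To make the term "mixture of templates" concrete, the following is a minimal sketch of how such a model might score a candidate window, assuming each template is a linear filter (a weight vector plus a bias) applied to a feature vector and that the mixture takes the best-scoring template. The names `template_score` and `mixture_score`, the tuple representation, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: one template = (weights, bias).
Template = Tuple[Sequence[float], float]


def template_score(weights: Sequence[float], bias: float,
                   features: Sequence[float]) -> float:
    """Linear response of a single template on one feature vector."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mixture_score(templates: List[Template],
                  features: Sequence[float]) -> Tuple[float, int]:
    """Score a window with every template and keep the best one.

    Returns (best_score, index_of_best_template).
    """
    scores = [template_score(w, b, features) for w, b in templates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best


# Toy example with two 2-dimensional templates (made-up values).
templates = [([1.0, 0.0], 0.0), ([0.0, 1.0], -0.5)]
score, index = mixture_score(templates, [0.2, 2.0])
print(score, index)
```

Read this way, the rough saturation figures quoted in the abstract (about 10 templates with about 100 positive examples each) would correspond to on the order of 1,000 positive examples per category; that product is only a back-of-the-envelope reading of the abstract, not a number stated by the authors.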


Footnotes
1
The dataset can be downloaded from http://vision.ics.uci.edu/datasets/.
 
Metadata
Title
Do We Need More Training Data?
Authors
Xiangxin Zhu
Carl Vondrick
Charless C. Fowlkes
Deva Ramanan
Publication date
1 August 2016
Publisher
Springer US
Published in
International Journal of Computer Vision / Issue 1/2016
Print ISSN: 0920-5691
Electronic ISSN: 1573-1405
DOI
https://doi.org/10.1007/s11263-015-0812-2
