
2018 | Original Paper | Book Chapter

Dataset Augmentation with Synthetic Images Improves Semantic Segmentation

Authors: Manik Goyal, Param Rajpura, Hristo Bojinov, Ravi Hegde

Published in: Computer Vision, Pattern Recognition, Image Processing, and Graphics

Publisher: Springer Singapore


Abstract

Although Deep Convolutional Neural Networks trained with strong pixel-level annotations have significantly pushed the performance in semantic segmentation, the annotation effort required to create training data remains a roadblock for further improvements. We show that augmenting a weakly annotated training dataset with synthetic images reduces both the annotation effort and the cost of capturing images with sufficient variety. Evaluation on the PASCAL 2012 validation dataset shows an increase in mean IoU from 52.80% to 55.47% from adding just 100 synthetic images per object class. Our approach is thus a promising solution to the problems of annotation and dataset collection.
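The reported gain is measured as mean intersection-over-union (mean IoU), the standard PASCAL VOC segmentation metric: per-class IoU averaged over classes. As a minimal sketch of how that number is computed (assuming integer-labeled prediction and ground-truth masks; classes absent from both masks are skipped):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union across classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        p = pred == c
        g = gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # class absent from both masks; skip it
        inter = np.logical_and(p, g).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 1], [1, 1]])
gt   = np.array([[0, 1], [0, 1]])
# class 0: inter=1, union=2 -> 0.50; class 1: inter=2, union=3 -> 0.667
print(round(mean_iou(pred, gt, num_classes=2), 4))  # -> 0.5833
```

Benchmark implementations typically accumulate a confusion matrix over the whole validation set before taking the per-class ratios; this per-image version illustrates only the metric itself.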


DOI: https://doi.org/10.1007/978-981-13-0020-2_31