Skip to main content

2016 | OriginalPaper | Buchkapitel

An R-CNN Based Method to Localize Speech Balloons in Comics

verfasst von : Yongtao Wang, Xicheng Liu, Zhi Tang

Erschienen in: MultiMedia Modeling

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Comic books enjoy great popularity around the world. More and more people choose to read comic books on digital devices, especially on mobile ones. However, the screen size of most mobile devices is not big enough to display an entire comic page directly. As a consequence, without any reflow or adaption to the original books, users often find that the texts on comic pages are hard to recognize when reading comics on mobile devices. Given the positions of speech balloons, it becomes quite easy to do further processing on texts to make them easier to read on mobile devices. Because the texts on a comic page often come along with surrounding speech balloons. Therefore, it is important to devise an effective method to localize speech balloons in comics. However, only a few studies have been done in this direction. In this paper, we propose a Regions with Convolutional Neural Network (R-CNN) based method to localize speech balloons in comics. Experimental results have demonstrated that the proposed method can localize the speech balloons in comics effectively and accurately.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Rigaud, C., Burie, J., Ogier, J., Karatzas, D., Weijer, J.: An active contour model for speech balloon detection in comics. In: International Conference on Document Analysis and Recognition, Washington, DC, pp. 1240–1244 (2013) Rigaud, C., Burie, J., Ogier, J., Karatzas, D., Weijer, J.: An active contour model for speech balloon detection in comics. In: International Conference on Document Analysis and Recognition, Washington, DC, pp. 1240–1244 (2013)
2.
Zurück zum Zitat Arai, K., Tolle, H.: Method for real time text extraction of digital manga comic. Int. J. Image Process. 4(6), 669676 (2011) Arai, K., Tolle, H.: Method for real time text extraction of digital manga comic. Int. J. Image Process. 4(6), 669676 (2011)
3.
Zurück zum Zitat Ho, A.N., Burie, J., Ogier, J.: Panel and speech balloon extraction from comic books. In: International Workshop on Document Analysis Systems, Gold Cost, QLD, pp. 424–428 (2012) Ho, A.N., Burie, J., Ogier, J.: Panel and speech balloon extraction from comic books. In: International Workshop on Document Analysis Systems, Gold Cost, QLD, pp. 424–428 (2012)
4.
Zurück zum Zitat Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Computer Vision and Pattern Recognition, Columbus, OH, pp. 580–587 (2014) Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Computer Vision and Pattern Recognition, Columbus, OH, pp. 580–587 (2014)
5.
Zurück zum Zitat Gu, C., Lim, J.J., Arbelaez, P., Malik, J.: Recognition using regions. In: Computer Vision and Pattern Recognition, Miami, FL, pp. 1030–1037 (2009) Gu, C., Lim, J.J., Arbelaez, P., Malik, J.: Recognition using regions. In: Computer Vision and Pattern Recognition, Miami, FL, pp. 1030–1037 (2009)
6.
Zurück zum Zitat Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, South Lake Tahoe, Nevada, pp. 1097–1105 (2012) Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, South Lake Tahoe, Nevada, pp. 1097–1105 (2012)
8.
Zurück zum Zitat Uijlings, J.R.R., van de Sande, K.E.A., Gevers, T., Smeulders, A.W.M.: Selective search for object recognition. Int. J. Comput. Vision 104(2), 154–171 (2013)CrossRef Uijlings, J.R.R., van de Sande, K.E.A., Gevers, T., Smeulders, A.W.M.: Selective search for object recognition. Int. J. Comput. Vision 104(2), 154–171 (2013)CrossRef
9.
Zurück zum Zitat Alexe, B., Deselaers, T., Ferrari, V.: Measuring the objectness of image windows. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2189–2202 (2012)CrossRef Alexe, B., Deselaers, T., Ferrari, V.: Measuring the objectness of image windows. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2189–2202 (2012)CrossRef
11.
Zurück zum Zitat Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2009)CrossRef Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2009)CrossRef
12.
Zurück zum Zitat Sung, K.-K., Poggio, T.: Example-based learning for view-based human face detection. IEEE Trans. Pattern Anal. Mach. Intell. 20(1), 39–51 (1998)CrossRef Sung, K.-K., Poggio, T.: Example-based learning for view-based human face detection. IEEE Trans. Pattern Anal. Mach. Intell. 20(1), 39–51 (1998)CrossRef
Metadaten
Titel
An R-CNN Based Method to Localize Speech Balloons in Comics
verfasst von
Yongtao Wang
Xicheng Liu
Zhi Tang
Copyright-Jahr
2016
DOI
https://doi.org/10.1007/978-3-319-27671-7_37

Neuer Inhalt