2019 | OriginalPaper | Chapter

Multi-task Model for Comic Book Image Analysis

Authors: Nhu-Van Nguyen, Christophe Rigaud, Jean-Christophe Burie

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

Comic book image analysis methods often rely on separate algorithms or models for separate tasks such as panel and character detection, balloon segmentation, and text recognition. In this work, we aim to reduce the complexity of comic book image analysis by proposing a single model, called Comic MTL, that can learn multiple tasks. In addition to the detection and segmentation tasks, we integrate a relation analysis task for balloons and characters into the Comic MTL model. The experiments are carried out on the eBDtheque dataset, which contains annotations for panels, balloons, and characters as well as balloon-character relations. We show that the Comic MTL model can detect the association between balloons and their speakers (comic characters) and handle other tasks such as panel and character detection and balloon segmentation with promising results.

Metadata
Title
Multi-task Model for Comic Book Image Analysis
Authors
Nhu-Van Nguyen
Christophe Rigaud
Jean-Christophe Burie
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-05716-9_57