Published in: International Journal on Document Analysis and Recognition (IJDAR) 3/2019

17 July 2019 | Special Issue Paper

Comic MTL: optimized multi-task learning for comic book image analysis

Authors: Nhu-Van Nguyen, Christophe Rigaud, Jean-Christophe Burie


Abstract

Comic book image analysis methods often propose separate algorithms or models for separate tasks such as panel and character (body and face) detection, balloon segmentation, and text recognition. In this work, we aim to reduce the processing time of comic book image analysis by proposing a single model, called Comic MTL, that learns multiple tasks instead of using one model per task. In addition to the detection and segmentation tasks, we integrate the relation analysis task for balloons and characters into the Comic MTL model. The experiments are carried out on the public DCM772 and eBDtheque datasets, which contain annotations for panels, balloons, and characters, as well as the associations between balloons and characters. We show that the Comic MTL model can detect the associations between balloons and their speakers (comic characters) and handle other tasks such as panel detection, character detection, and balloon segmentation with promising results.
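The general idea of replacing one model per task with a single multi-task model can be illustrated with a toy sketch. This is not the Comic MTL architecture, which the abstract does not detail. It only shows, under simplifying assumptions, why sharing one feature extractor across several task heads can save processing time: the shared part runs once per image. The names `toy_backbone`, `MultiTaskModel`, `combined_loss`, the head functions, and the loss weights are all hypothetical.

```python
from typing import Callable, Dict, List

Image = List[List[float]]  # toy grayscale image as nested lists (assumption)
Features = List[float]


def toy_backbone(image: Image) -> Features:
    """Hypothetical shared feature extractor: mean intensity of each row."""
    return [sum(row) / len(row) for row in image]


class MultiTaskModel:
    """One shared backbone feeding several task-specific heads."""

    def __init__(
        self,
        backbone: Callable[[Image], Features],
        heads: Dict[str, Callable[[Features], float]],
    ) -> None:
        self.backbone = backbone
        self.heads = heads
        self.backbone_calls = 0  # counts how often the shared part runs

    def predict(self, image: Image) -> Dict[str, float]:
        self.backbone_calls += 1
        features = self.backbone(image)  # computed once, reused by every head
        return {task: head(features) for task, head in self.heads.items()}


def combined_loss(task_losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-task losses, one common way to train such a model."""
    return sum(weights[task] * loss for task, loss in task_losses.items())


# Hypothetical heads returning a single score each. Real heads would output
# boxes, masks, or balloon-to-speaker links instead.
heads = {
    "panel": lambda f: max(f),
    "character": lambda f: min(f),
    "balloon": lambda f: sum(f) / len(f),
}

model = MultiTaskModel(toy_backbone, heads)
outputs = model.predict([[0.0, 1.0], [0.5, 0.5], [1.0, 1.0]])
```

With three separate single-task models, the feature extraction step would run three times for the same page. Here `model.backbone_calls` is 1 after one `predict` call, while `outputs` still holds one result per task.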

Metadata
Title
Comic MTL: optimized multi-task learning for comic book image analysis
Authors
Nhu-Van Nguyen
Christophe Rigaud
Jean-Christophe Burie
Publication date
17 July 2019
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Document Analysis and Recognition (IJDAR) / Issue 3/2019
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI
https://doi.org/10.1007/s10032-019-00330-3
