2017 | OriginalPaper | Book chapter

Book Page Identification Using Convolutional Neural Networks Trained by Task-Unrelated Dataset

Authors: Leyuan Liu, Yi Zhao, Huabing Zhou, Jingying Chen

Published in: Image and Graphics

Publisher: Springer International Publishing

Abstract

This paper presents a pipeline that makes convolutional neural networks (CNNs) trained for an unrelated task usable for book page identification. The pipeline has five building blocks: (1) an image segmentation module that separates the book page from the background; (2) an image correction module that corrects geometric and color distortions; (3) a feature extraction module that extracts discriminative image features with a pre-trained CNN; (4) a feature compression module that reduces the feature dimensionality to speed up processing; and (5) a feature matching module that computes the similarity between a query image and each reference image and then finds the most similar reference image. Experimental results on a challenging test dataset show that the proposed book page identification method achieves a top-5 hit rate of 98.93%.
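
The last two blocks of the pipeline (feature compression and feature matching) and the top-5 hit rate metric can be illustrated with a minimal sketch. The abstract does not say which compression method or similarity measure the authors use, so PCA and cosine similarity are assumptions made here purely for illustration, random vectors stand in for CNN features, and all function names are hypothetical.

```python
import numpy as np


def fit_pca(features, n_components):
    """Fit a PCA projection (assumed stand-in for the compression module)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def compress(features, mean, components):
    """Project features to the low-dimensional space and L2-normalise each row."""
    projected = (features - mean) @ components.T
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    return projected / np.maximum(norms, 1e-12)


def top_k_matches(query, references, k=5):
    """Indices of the k references most similar to the query.

    Cosine similarity is assumed; rows are unit length, so a dot product suffices.
    """
    scores = references @ query
    return np.argsort(-scores)[:k]


def top_k_hit_rate(queries, references, true_ids, k=5):
    """Fraction of queries whose true reference appears among the top k matches."""
    hits = sum(
        true_ids[i] in top_k_matches(q, references, k)
        for i, q in enumerate(queries)
    )
    return hits / len(queries)


# Synthetic demo: random vectors stand in for CNN features of reference pages,
# and each query is a slightly perturbed copy of one reference.
rng = np.random.default_rng(0)
reference_feats = rng.normal(size=(50, 256))
query_feats = reference_feats[:10] + 0.05 * rng.normal(size=(10, 256))

mean, components = fit_pca(reference_feats, n_components=32)
refs_small = compress(reference_feats, mean, components)
queries_small = compress(query_feats, mean, components)

rate = top_k_hit_rate(queries_small, refs_small, true_ids=list(range(10)), k=5)
print(f"top-5 hit rate on synthetic data: {rate:.2%}")
```

In this sketch the projection is fitted once on the reference features and reused for every query, so the per-query cost is one small matrix product plus a sort over the reference set.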

Metadata
Title
Book Page Identification Using Convolutional Neural Networks Trained by Task-Unrelated Dataset
Authors
Leyuan Liu
Yi Zhao
Huabing Zhou
Jingying Chen
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-71607-7_57