Skip to main content
Erschienen in: International Journal on Digital Libraries 2/2019

08.03.2018

Content-based video retrieval in historical collections of the German Broadcasting Archive

verfasst von: Markus Mühling, Manja Meister, Nikolaus Korfhage, Jörg Wehling, Angelika Hörth, Ralph Ewerth, Bernd Freisleben

Erschienen in: International Journal on Digital Libraries | Ausgabe 2/2019

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The German Broadcasting Archive maintains the cultural heritage of radio and television broadcasts of the former German Democratic Republic (GDR). The uniqueness and importance of the video material fosters a large scientific interest in the video content. In this paper, we present a system for automatic video content analysis and retrieval to facilitate search in historical collections of GDR television recordings. It relies on a distributed, service-oriented architecture and includes video analysis algorithms for shot boundary detection, concept classification, person recognition, text recognition and similarity search. The combination of different search modalities allows users to obtain answers for a wide range of queries, leading to satisfactory results in short time. The performance of the system is evaluated using 2500 h of GDR television recordings.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Ahonen, T., Hadid, A., Pietikainen, M.: Face recognition with local binary patterns. In: Proceedings of the IEEE European Conference on Computer Vision. pp. 469–481 (2004) Ahonen, T., Hadid, A., Pietikainen, M.: Face recognition with local binary patterns. In: Proceedings of the IEEE European Conference on Computer Vision. pp. 469–481 (2004)
2.
Zurück zum Zitat Albertson, D., Ju, B.: Design criteria for video digital libraries: categories of important features emerging from users’ responses. Online Inf. Rev. 39(2), 214–228 (2015)CrossRef Albertson, D., Ju, B.: Design criteria for video digital libraries: categories of important features emerging from users’ responses. Online Inf. Rev. 39(2), 214–228 (2015)CrossRef
3.
Zurück zum Zitat Belhumeur, P.N., Kriegman, D.J.: Eigenfaces versus fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997)CrossRef Belhumeur, P.N., Kriegman, D.J.: Eigenfaces versus fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711–720 (1997)CrossRef
4.
Zurück zum Zitat Breuel, T.M., Ul-Hasan, A., Al-Azawi, M.A., Shafait, F.: High-performance OCR for printed English and Fraktur using LSTM networks. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 683–687 (2013) Breuel, T.M., Ul-Hasan, A., Al-Azawi, M.A., Shafait, F.: High-performance OCR for printed English and Fraktur using LSTM networks. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 683–687 (2013)
5.
Zurück zum Zitat Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A.: Return of the devil in the details: delving deep into convolutional nets. In: Proceedings of the British Machine Vision Conference, pp. 1–11 (2014) Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A.: Return of the devil in the details: delving deep into convolutional nets. In: Proceedings of the British Machine Vision Conference, pp. 1–11 (2014)
6.
Zurück zum Zitat Christel, M., Kanade, T., Mauldin, M., Reddy, R., Sirbu, M., Stevens, S., Wactlar, H.: Informedia digital video library. Commun. ACM 38(4), 57–58 (1995)CrossRef Christel, M., Kanade, T., Mauldin, M., Reddy, R., Sirbu, M., Stevens, S., Wactlar, H.: Informedia digital video library. Commun. ACM 38(4), 57–58 (1995)CrossRef
7.
Zurück zum Zitat Chua, T.S., Tang, J., Hong, R., Li, H., Luo, Z., Zheng, Y.: NUS-WIDE: A real-world web image database from National University of Singapore. In: Proceedings of the ACM International Conference on Image and Video Retrieval, pp. 48:1–48:9 (2009) Chua, T.S., Tang, J., Hong, R., Li, H., Luo, Z., Zheng, Y.: NUS-WIDE: A real-world web image database from National University of Singapore. In: Proceedings of the ACM International Conference on Image and Video Retrieval, pp. 48:1–48:9 (2009)
8.
Zurück zum Zitat Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’09), pp. 2–9 (2009) Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR ’09), pp. 2–9 (2009)
9.
Zurück zum Zitat Ewerth, R., Ballafkir, K., Mühling, M., Seiler, D., Freisleben, B.: Long-term incremental web-supervised learning of visual concepts via random savannas. IEEE Trans. Multimed. 14(4), 1008–1020 (2012)CrossRef Ewerth, R., Ballafkir, K., Mühling, M., Seiler, D., Freisleben, B.: Long-term incremental web-supervised learning of visual concepts via random savannas. IEEE Trans. Multimed. 14(4), 1008–1020 (2012)CrossRef
10.
Zurück zum Zitat Ewerth, R., Freisleben, B.: Video cut detection without thresholds. In: Proceedings of the 11th International Workshop on Signals, Systems and Image Processing (IWSSIP ’04), pp. 227–230. Poznan, Poland (2004) Ewerth, R., Freisleben, B.: Video cut detection without thresholds. In: Proceedings of the 11th International Workshop on Signals, Systems and Image Processing (IWSSIP ’04), pp. 227–230. Poznan, Poland (2004)
11.
Zurück zum Zitat Ewerth, R., Freisleben, B.: Unsupervised detection of gradual video shot changes with motion-based false alarm removal. In: Proceedings of the 11th Conference on Advanced Concepts for Intelligent Vision Systems, pp. 253–264 (2009) Ewerth, R., Freisleben, B.: Unsupervised detection of gradual video shot changes with motion-based false alarm removal. In: Proceedings of the 11th Conference on Advanced Concepts for Intelligent Vision Systems, pp. 253–264 (2009)
12.
Zurück zum Zitat Ewerth, R., Mühling, M., Freisleben, B.: Self-supervised learning of face appearances in TV casts and movies. Int. J. Semant. Comput. 1(2), 185–204 (2007)CrossRef Ewerth, R., Mühling, M., Freisleben, B.: Self-supervised learning of face appearances in TV casts and movies. Int. J. Semant. Comput. 1(2), 185–204 (2007)CrossRef
13.
Zurück zum Zitat Ewerth, R., Mühling, M., Freisleben, B.: Robust video content analysis via transductive learning. ACM Trans. Intell. Syst. Technol. (TIST) 3(3), 1–26 (2011) Ewerth, R., Mühling, M., Freisleben, B.: Robust video content analysis via transductive learning. ACM Trans. Intell. Syst. Technol. (TIST) 3(3), 1–26 (2011)
14.
Zurück zum Zitat Ewerth, R., Schwalb, M., Tessmann, P., Freisleben, B.: Segmenting Moving Objects in MPEG Videos in the Presence of Camera Motion. In: Image Analysis and Processing, 2007. ICIAP 2007. 14th International Conference on IEEE, pp. 819–824 (2007) Ewerth, R., Schwalb, M., Tessmann, P., Freisleben, B.: Segmenting Moving Objects in MPEG Videos in the Presence of Camera Motion. In: Image Analysis and Processing, 2007. ICIAP 2007. 14th International Conference on IEEE, pp. 819–824 (2007)
15.
Zurück zum Zitat Gllavata, J., Ewerth, R.: Text detection in images based on unsupervised classification of high-frequency wavelet coefficients. In: Proceedings of 17th International Conference on Pattern Recognition (ICPR ’04), pp. 425–428. IEEE (2004) Gllavata, J., Ewerth, R.: Text detection in images based on unsupervised classification of high-frequency wavelet coefficients. In: Proceedings of 17th International Conference on Pattern Recognition (ICPR ’04), pp. 425–428. IEEE (2004)
16.
Zurück zum Zitat Gong, Y., Jia, Y., Leung, T., Toshev, A., Ioffe, S.: Deep Convolutional Ranking for Multilabel Image Annotation. arXiv preprint arXiv:1312.4894 (2013) Gong, Y., Jia, Y., Leung, T., Toshev, A., Ioffe, S.: Deep Convolutional Ranking for Multilabel Image Annotation. arXiv preprint arXiv:​1312.​4894 (2013)
17.
Zurück zum Zitat Graves, A., Mohamed, A., Hinton, G.: Speech recognition with deep recurrent neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013), pp. 6645–6649 (2013) Graves, A., Mohamed, A., Hinton, G.: Speech recognition with deep recurrent neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013), pp. 6645–6649 (2013)
18.
Zurück zum Zitat He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016) He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
19.
Zurück zum Zitat Hentschel, C., Blümel, I., Sack, H.: Automatic annotation of scientific video material based on visual concept detection. In: Proceedings of the 13th International Conference on Knowledge Management and Knowledge Technologies, p. 16 (2013) Hentschel, C., Blümel, I., Sack, H.: Automatic annotation of scientific video material based on visual concept detection. In: Proceedings of the 13th International Conference on Knowledge Management and Knowledge Technologies, p. 16 (2013)
20.
Zurück zum Zitat Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the ACM International Conference on Multimedia, pp. 675–678 (2014) Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the ACM International Conference on Multimedia, pp. 675–678 (2014)
21.
Zurück zum Zitat Krizhevsky, A., Hinton, G.: Using very deep Autoencoders for content-based image retrieval. In: Proceedings of the European Symposium on Artificial Neural Networks, pp. 1–7 (2011) Krizhevsky, A., Hinton, G.: Using very deep Autoencoders for content-based image retrieval. In: Proceedings of the European Symposium on Artificial Neural Networks, pp. 1–7 (2011)
22.
Zurück zum Zitat Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1–9 (2012) Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1–9 (2012)
23.
Zurück zum Zitat Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Sov. Phys. Doklady. 10, 707–710 (1966)MathSciNet Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Sov. Phys. Doklady. 10, 707–710 (1966)MathSciNet
24.
Zurück zum Zitat Lin, K., Yang, H., Hsiao, J., Chen, C.: Deep learning of binary hash codes for fast image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 27–35 (2015) Lin, K., Yang, H., Hsiao, J., Chen, C.: Deep learning of binary hash codes for fast image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 27–35 (2015)
25.
Zurück zum Zitat Liu, L., Wang, L., Liu, X.: In defense of soft-assignment coding. In: Proceedings of the 13th IEEE International Conference on Computer Vision, pp. 2486–2493 (2011) Liu, L., Wang, L., Liu, X.: In defense of soft-assignment coding. In: Proceedings of the 13th IEEE International Conference on Computer Vision, pp. 2486–2493 (2011)
26.
Zurück zum Zitat Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of the 7th IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157 (1999) Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of the 7th IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157 (1999)
27.
Zurück zum Zitat Marchionini, G., Geisler, G.: The open video digital library. D-Lib. Mag. 8(12), 1082–9873 (2002) Marchionini, G., Geisler, G.: The open video digital library. D-Lib. Mag. 8(12), 1082–9873 (2002)
28.
Zurück zum Zitat Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide-baseline stereo from maximally stable extremal regions. Image Vis. Comput. 22(10), 761–767 (2004)CrossRef Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide-baseline stereo from maximally stable extremal regions. Image Vis. Comput. 22(10), 761–767 (2004)CrossRef
29.
Zurück zum Zitat Mühling, M., Markus, M., Ewerth, R., Freisleben, B.: Improving cross-domain concept detection via object-based features. In: Proceedings of the International Conference on Computer Analysis of Images and Patterns (CAIP ’15) (2015) Mühling, M., Markus, M., Ewerth, R., Freisleben, B.: Improving cross-domain concept detection via object-based features. In: Proceedings of the International Conference on Computer Analysis of Images and Patterns (CAIP ’15) (2015)
30.
Zurück zum Zitat Mühling, M., Ewerth, R., Freisleben, B.: On the spatial extents of SIFT descriptors for visual concept detection. In: Proceedings of the 8th International Conference on Computer Vision Systems, pp. 71–80. Springer (2011) Mühling, M., Ewerth, R., Freisleben, B.: On the spatial extents of SIFT descriptors for visual concept detection. In: Proceedings of the 8th International Conference on Computer Vision Systems, pp. 71–80. Springer (2011)
31.
Zurück zum Zitat Mühling, M., Ewerth, R., Shi, B., Freisleben, B.: Multi-class object detection with hough forests using local histograms of visual words. In: Proceedings of 14th International Conference on Computer Analysis of Images and Patterns, pp. 386–393. Springer (2011) Mühling, M., Ewerth, R., Shi, B., Freisleben, B.: Multi-class object detection with hough forests using local histograms of visual words. In: Proceedings of 14th International Conference on Computer Analysis of Images and Patterns, pp. 386–393. Springer (2011)
32.
Zurück zum Zitat Mühling, M., Ewerth, R., Zhou, J., Freisleben, B.: Multimodal video concept detection via bag of auditory words and multiple kernel learning. In: Proceedings of the 18th International Conference on Advances in Multimedia Modeling, pp. 40–50. Springer (2012) Mühling, M., Ewerth, R., Zhou, J., Freisleben, B.: Multimodal video concept detection via bag of auditory words and multiple kernel learning. In: Proceedings of the 18th International Conference on Advances in Multimedia Modeling, pp. 40–50. Springer (2012)
33.
Zurück zum Zitat Neumann, L., Matas, J.: Real-time scene text localization and recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2012) Neumann, L., Matas, J.: Real-time scene text localization and recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2012)
34.
Zurück zum Zitat Sack, H., Plank, M.: AV-Portal: The German National Library of Science and Technology’s Semantic Video Portal. ERCIM News 96 (2014) Sack, H., Plank, M.: AV-Portal: The German National Library of Science and Technology’s Semantic Video Portal. ERCIM News 96 (2014)
35.
Zurück zum Zitat Salakhutdinov, R., Hinton, G.: Semantic hashing. Int. J. Approx. Reason. 50(7), 969–978 (2009)CrossRef Salakhutdinov, R., Hinton, G.: Semantic hashing. Int. J. Approx. Reason. 50(7), 969–978 (2009)CrossRef
36.
Zurück zum Zitat Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: A unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823 (2015) Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: A unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823 (2015)
37.
Zurück zum Zitat Smeulders, A.W., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Trans. Pattern. Anal. Mach. Intell. 22(12), 1349–1380 (2000)CrossRef Smeulders, A.W., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Trans. Pattern. Anal. Mach. Intell. 22(12), 1349–1380 (2000)CrossRef
38.
Zurück zum Zitat Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015) Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
39.
Zurück zum Zitat Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: DeepFace: Closing the gap to human-level performance in face verification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2014) Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: DeepFace: Closing the gap to human-level performance in face verification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2014)
40.
Zurück zum Zitat Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 511–518 (2001) Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 511–518 (2001)
41.
Zurück zum Zitat Wan, J., Wang, D., Hoi, S.C.H., Wu, P.: Deep learning for content-based image retrieval: a comprehensive study. In: Proceedings of the ACM International Conference on Multimedia, pp. 157–166 (2014) Wan, J., Wang, D., Hoi, S.C.H., Wu, P.: Deep learning for content-based image retrieval: a comprehensive study. In: Proceedings of the ACM International Conference on Multimedia, pp. 157–166 (2014)
42.
Zurück zum Zitat Yianilos, P.N.: Data structures and algorithms for nearest neighbor search in general metric spaces. In: Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 311–321 (1993) Yianilos, P.N.: Data structures and algorithms for nearest neighbor search in general metric spaces. In: Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 311–321 (1993)
43.
Zurück zum Zitat Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. Adv. Neural Inf. Process. Syst. 27, 487–495 (2014) Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. Adv. Neural Inf. Process. Syst. 27, 487–495 (2014)
Metadaten
Titel
Content-based video retrieval in historical collections of the German Broadcasting Archive
verfasst von
Markus Mühling
Manja Meister
Nikolaus Korfhage
Jörg Wehling
Angelika Hörth
Ralph Ewerth
Bernd Freisleben
Publikationsdatum
08.03.2018
Verlag
Springer Berlin Heidelberg
Erschienen in
International Journal on Digital Libraries / Ausgabe 2/2019
Print ISSN: 1432-5012
Elektronische ISSN: 1432-1300
DOI
https://doi.org/10.1007/s00799-018-0236-z

Weitere Artikel der Ausgabe 2/2019

International Journal on Digital Libraries 2/2019 Zur Ausgabe