Published in: Multimedia Systems 5/2011

1 October 2011 | Interactive Multimedia Computing

Personalized video similarity measure

Authors: Jialie Shen, Zhiyong Cheng

Abstract

As an effective technique for managing and exploring large-scale video collections, personalized video search has received considerable attention in recent years. A key problem in developing the technique is how to design and evaluate similarity measures. Most existing approaches simply adopt the traditional Euclidean distance or its variants. Consequently, they generally suffer from two main disadvantages: (1) low effectiveness: retrieval accuracy is poor, one of the main reasons being that very little research has been carried out on designing an effective fusion scheme for integrating multimodal information (e.g., text, audio and visual) from video sequences; and (2) poor scalability: the development of video similarity metrics is largely disconnected from that of the relevant database access methods (indexing structures). This article reports a new distance metric, called personalized video distance, that effectively fuses information about individual preference and multimodal properties into a compact signature. Moreover, a novel hashing-based indexing structure has been designed to facilitate fast retrieval and better scalability. A set of comprehensive empirical studies has been carried out on two large video test collections with carefully designed queries of different complexities. We observe significant improvements over existing techniques in various aspects.
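
To make the contrast with a plain Euclidean distance concrete, the following is a minimal sketch of one possible way to fuse per-modality distances using user preference weights. It is not the personalized video distance defined in the article, whose formulation is not given in the abstract. The function names, the weight dictionaries, and the toy feature vectors are all hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fused_distance(video_a, video_b, weights):
    """Weighted sum of per-modality Euclidean distances.

    `weights` maps a modality name to a non-negative preference weight.
    Weights are normalized here so they sum to 1 (an assumption of this
    sketch, not something stated in the abstract).
    """
    total = sum(weights.values())
    return sum(
        (w / total) * euclidean(video_a[m], video_b[m])
        for m, w in weights.items()
    )


# Toy feature vectors (hypothetical): the two clips differ slightly in
# their text features and strongly in their audio features.
query = {"text": [1.0, 0.0], "audio": [0.0, 0.0], "visual": [0.0, 0.0]}
clip = {"text": [0.0, 0.0], "audio": [3.0, 4.0], "visual": [0.0, 0.0]}

# Two hypothetical user profiles with different modality preferences.
text_fan = {"text": 0.8, "audio": 0.1, "visual": 0.1}
audio_fan = {"text": 0.1, "audio": 0.8, "visual": 0.1}

print(fused_distance(query, clip, text_fan))
print(fused_distance(query, clip, audio_fan))
```

In this sketch the same pair of clips is closer for the user who cares mostly about text than for the user who cares mostly about audio, which is the kind of preference-dependent behavior a single unweighted Euclidean distance cannot express.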

Footnotes
1. Variance must be larger than 80%.

Metadata
Title
Personalized video similarity measure
Authors
Jialie Shen
Zhiyong Cheng
Publication date
1 October 2011
Publisher
Springer-Verlag
Published in
Multimedia Systems / Issue 5/2011
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-010-0223-8
