Published in: Multimedia Systems 2/2015

01.03.2015 | Special Issue Paper

Multi-order visual phrase for scalable partial-duplicate visual search

Authors: Shiliang Zhang, Qi Tian, Qingming Huang, Wen Gao, Yong Rui

Abstract

A visual phrase combines multiple visual words and captures additional spatial clues among them. Visual phrases therefore show better discriminative power than single visual words in image retrieval and matching. Notwithstanding their success, existing visual phrases still show obvious shortcomings: (1) limited flexibility, i.e., visual phrases are considered for matching only if they contain the same number of visual words; (2) large quantization error and low repeatability, i.e., quantization errors in visual words are aggregated in visual word combinations and visual phrases, making them harder to match than single visual words. To avoid these issues, we propose the multi-order visual phrase (MVP), which contains two complementary clues: a center visual word quantized from the local descriptor of each image keypoint, and the visual and spatial clues of multiple nearby keypoints. Two MVPs are flexibly matched by first matching their center visual words, then estimating a match confidence by checking the spatial and visual consistency of their neighbor keypoints. Center visual word matching is therefore equivalent to traditional visual word matching, while the check of neighbor spatial and visual clues significantly boosts the discriminative power. MVP does not sacrifice the repeatability of the single visual word and is more robust to quantization error than existing visual phrases. We test our approach in three image retrieval tasks on UKbench, Oxford5K, and 1 million distractor images collected from Flickr. Comparisons with recent retrieval approaches and existing visual phrase features clearly demonstrate the competitive accuracy and significantly better efficiency of MVP.
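
The two-stage matching described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how neighbor clues are encoded or how the confidence is computed, so the `Neighbor` fields (`visual_word`, `spatial_bin`), the `MVP` container, and the scoring rule in `match_confidence` are all assumptions chosen for illustration. The sketch only shows the overall shape: a center visual word match is required first, and consistent neighbors then raise the confidence, with no requirement that both phrases hold the same number of neighbors.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Neighbor:
    # Hypothetical encoding of one nearby keypoint (not given in the abstract).
    visual_word: int  # visual clue: quantized descriptor of the neighbor
    spatial_bin: int  # spatial clue: coarse position relative to the center


@dataclass(frozen=True)
class MVP:
    center_word: int                 # visual word of the center keypoint
    neighbors: Tuple[Neighbor, ...]  # any number of neighbors is allowed


def match_confidence(a: MVP, b: MVP) -> float:
    """Illustrative score (an assumption, not the paper's formula).

    Returns 0.0 if the center visual words differ. Otherwise returns
    1.0 plus the number of neighbors that agree in both visual word
    and spatial bin, counting each neighbor at most once.
    """
    # Stage 1: plain visual word matching on the center keypoint.
    if a.center_word != b.center_word:
        return 0.0
    # Stage 2: multiset intersection counts one-to-one consistent neighbors.
    shared = Counter(a.neighbors) & Counter(b.neighbors)
    return 1.0 + sum(shared.values())
```

Under these assumptions, two phrases with the same center word but no neighbors in common still score 1.0, which mirrors the abstract's point that center matching alone behaves like traditional visual word matching, while each consistent neighbor adds to the confidence.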

Metadata
Title
Multi-order visual phrase for scalable partial-duplicate visual search
Authors
Shiliang Zhang
Qi Tian
Qingming Huang
Wen Gao
Yong Rui
Publication date
01.03.2015
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 2/2015
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-014-0369-x
