Published in: Multimedia Systems 1/2017

03-01-2015 | Special Issue Paper

Multiple level visual semantic fusion method for image re-ranking

Authors: Shuhan Qi, Fanglin Wang, Xuan Wang, Yue Guan, Jia Wei, Jian Guan

Abstract

Mid-level semantic attributes have had some success in image retrieval and re-ranking. However, because of the semantic gap between low-level features and intermediate semantic concepts, considerable information is lost when low-level features are converted to semantic concepts. To tackle this problem, we try to bridge the semantic gap by exploiting the complementarity of different mid-level features. In this paper, we propose a framework that improves image re-ranking by fusing multiple mid-level features. The framework contains three mid-level features (DCNN-ImageNet attributes, Fisher vector, and sparse coding spatial pyramid matching) and a semi-supervised multigraph-based model that combines them. In addition, the framework can easily be extended to use an arbitrary number of features for image re-ranking. Experiments are conducted on the a-Pascal dataset, and our approach, which fuses different features, boosts image re-ranking performance.
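
To make the idea of multigraph-based fusion more concrete, the following is a minimal sketch of one common graph-regularized formulation, not the authors' implementation. It assumes each mid-level feature yields one affinity graph, that the graphs are combined with fixed weights, and that relevance scores f minimize the sum of weighted smoothness terms f^T L_g f plus mu * ||f - y||^2, where y holds the initial relevance scores. The function names, the Gaussian kernel, the median bandwidth heuristic, and the parameter mu are all illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np


def affinity(X, sigma=None):
    """Gaussian-kernel affinity matrix with a zero diagonal (assumed graph construction)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if sigma is None:
        positive = d2[d2 > 0]
        # Median heuristic for the bandwidth; fall back to 1.0 if all points coincide.
        sigma = np.sqrt(np.median(positive)) if positive.size else 1.0
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def normalized_laplacian(W):
    """Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.eye(W.shape[0]) - S


def fuse_and_rerank(features, y, weights=None, mu=1.0):
    """Fuse one graph per feature and return (ranking, scores).

    features: list of (n, d_g) arrays, one per mid-level feature.
    y:        (n,) initial relevance scores (e.g. 1 for a known relevant image).
    weights:  per-graph weights; equal weights are assumed if omitted.
    """
    n = y.shape[0]
    if weights is None:
        weights = np.full(len(features), 1.0 / len(features))
    L = sum(w * normalized_laplacian(affinity(X)) for w, X in zip(weights, features))
    # Setting the gradient of sum_g w_g f^T L_g f + mu * ||f - y||^2 to zero
    # gives (I + L / mu) f = y.
    f = np.linalg.solve(np.eye(n) + L / mu, y)
    return np.argsort(-f), f
```

Because the fused Laplacian is just a weighted sum, adding a fourth or fifth feature only means appending another array to `features`, which mirrors the extensibility claimed in the abstract. A fuller treatment would typically also learn the graph weights rather than fixing them.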

Metadata
Title
Multiple level visual semantic fusion method for image re-ranking
Authors
Shuhan Qi
Fanglin Wang
Xuan Wang
Yue Guan
Jia Wei
Jian Guan
Publication date
03-01-2015
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 1/2017
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-014-0448-z
