Multimedia Systems

03-01-2015 | Special Issue Paper

Multiple level visual semantic fusion method for image re-ranking

Authors: Shuhan Qi, Fanglin Wang, Xuan Wang, Yue Guan, Jia Wei, Jian Guan

Published in: Multimedia Systems | Issue 1/2017

Abstract

Mid-level semantic attributes have had some success in image retrieval and re-ranking. However, because of the semantic gap between low-level features and intermediate semantic concepts, considerable information is lost when low-level features are converted to semantic concepts. To tackle this problem, we try to bridge the semantic gap by exploiting the complementarity of different mid-level features. In this paper, we propose a framework that improves image re-ranking by fusing multiple mid-level features. The framework contains three mid-level features (DCNN-ImageNet attributes, Fisher vectors, and sparse coding spatial pyramid matching) and a semi-supervised multigraph-based model that combines them. In addition, the framework can easily be extended to use an arbitrary number of features for image re-ranking. Experiments on the a-Pascal dataset show that our approach, which fuses different features, is able to boost image re-ranking performance efficiently.
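The abstract does not give the formulation of the multigraph model, so the following is only a rough sketch of how graph-based fusion of several features might look, not the authors' method. It assumes a generic manifold-ranking style propagation: one Gaussian-kernel affinity graph per feature, a fixed convex combination of the normalized graphs, and a closed-form solve that spreads the initial relevance scores over the fused graph. The function names `affinity` and `multigraph_rerank`, the parameters `sigma`, `alpha`, and `graph_weights`, and the use of fixed rather than learned graph weights are all assumptions made for illustration.

```python
import numpy as np


def affinity(features, sigma=1.0):
    """Symmetrically normalized Gaussian-kernel affinity matrix (assumed kernel)."""
    sq = np.sum(features ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * features @ features.T, 0.0)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)  # no self-loops
    inv_sqrt_deg = 1.0 / np.sqrt(np.maximum(w.sum(axis=1), 1e-12))
    return w * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


def multigraph_rerank(feature_sets, initial_scores, graph_weights=None, alpha=0.9):
    """Propagate initial relevance scores over a weighted sum of feature graphs.

    feature_sets:   list of (n, d_k) arrays, one per feature type
    initial_scores: length-n vector, e.g. 1.0 for known relevant images, else 0.0
    graph_weights:  hypothetical fixed fusion weights (defaults to equal weights)
    """
    y = np.asarray(initial_scores, dtype=float)
    n = y.shape[0]
    if graph_weights is None:
        graph_weights = np.full(len(feature_sets), 1.0 / len(feature_sets))
    # Fuse the per-feature graphs into one normalized affinity matrix.
    fused = sum(w * affinity(f) for w, f in zip(graph_weights, feature_sets))
    # Closed form of manifold ranking: f = (1 - alpha) (I - alpha S)^-1 y.
    return np.linalg.solve(np.eye(n) - alpha * fused, (1.0 - alpha) * y)
```

Passing a list with one array per feature type keeps the sketch open to any number of features; sorting the images by the returned scores in descending order would give the re-ranked list.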

Title: Multiple level visual semantic fusion method for image re-ranking
Authors: Shuhan Qi, Fanglin Wang, Xuan Wang, Yue Guan, Jia Wei, Jian Guan
Publication date: 03-01-2015
Publisher: Springer Berlin Heidelberg
Published in: Multimedia Systems / Issue 1/2017
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI: https://doi.org/10.1007/s00530-014-0448-z
