Abstract
Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatment of three closely linked problems (i.e., image tag assignment, tag refinement, and tag-based image retrieval) is presented. While existing works vary in their targeted tasks and methodology, they all rely on the key functionality of tag relevance, that is, estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how such information is exploited, this article introduces a two-dimensional taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations. For a head-to-head comparison with the state of the art, a new experimental protocol is presented, with training sets containing 10,000, 100,000, and 1 million images, and an evaluation on three test sets contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future.
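To make the notion of a tag relevance function concrete, the following is a minimal sketch of one possible instantiation, in the spirit of neighbor voting: a tag is considered relevant to an image if it occurs among that image's visual neighbors more often than its overall frequency would predict. The function name `tag_relevance`, the toy two-dimensional features, and the `collection` format are assumptions made for illustration only. The sketch uses visual content and tag statistics but ignores social context such as user identity.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def tag_relevance(query_feat, tag, collection, k=3):
    """Neighbor-voting style relevance of `tag` for an image with feature `query_feat`.

    `collection` is a list of (feature_vector, set_of_tags) pairs; this toy
    format is an assumption of the sketch, not something the survey prescribes.
    Returns the votes cast by the k visually nearest images minus the number
    of votes expected from the tag's overall frequency.
    """
    # Rank the collection by distance in the (assumed) visual feature space.
    neighbors = sorted(collection, key=lambda item: dist(query_feat, item[0]))[:k]
    # Each neighbor labeled with the tag casts one vote.
    votes = sum(1 for _, tags in neighbors if tag in tags)
    # Prior: fraction of all images in the collection that carry the tag.
    prior = sum(1 for _, tags in collection if tag in tags) / len(collection)
    return votes - k * prior


# Hypothetical toy collection: three "dog" images and three "car" images.
collection = [
    ((0.0, 0.0), {"dog", "grass"}),
    ((0.1, 0.0), {"dog"}),
    ((0.0, 0.2), {"dog", "2015"}),
    ((5.0, 5.0), {"car", "2015"}),
    ((5.1, 4.9), {"car"}),
    ((4.8, 5.2), {"car", "2015"}),
]

query = (0.05, 0.05)  # visually close to the "dog" images
for tag in ("dog", "2015", "car"):
    print(tag, tag_relevance(query, tag, collection))
```

In this toy setting, the visually grounded tag "dog" scores highest, the content-independent tag "2015" scores below zero, and "car" scores lowest. Scores of this kind could then be used to rank tags for one image (assignment and refinement) or to rank images for one tag (retrieval).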
Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement, and Retrieval