Abstract
The plethora of social actions and annotations (tags, comments, ratings) from online media sharing Websites and collaborative games have induced a paradigm shift in the research on image semantic interpretation. Social inputs with their added context represent a strong substitute for expert annotations. Novel algorithms have been designed to fuse visual features with noisy social labels and behavioral signals. In this survey, we review nearly 200 representative papers to identify the current trends, challenges, and opportunities presented by social inputs for research on image semantics. Our study builds on an interdisciplinary confluence of insights from image processing, data mining, human-computer interaction, and sociology to describe the folksonomic features of users, annotations and images. Applications are categorized into four types: concept semantics, person identification, location semantics and event semantics. The survey concludes with a summary of principal research directions for the present and the future.
Notes
More than 200,000 players of the ESP game (later renamed Google ImageLabeler) contributed over 50 million image labels, with a number of players spending more than 40 hours a week playing the game. Peekaboom recorded more than 500,000 human-hours of play [183].
References
Adomavicius G, Tuzhilin A (2005) Towards the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749
Aesthetics. Definition from Oxford dictionaries. http://www.oxforddictionaries.com/definition/aesthetics
Agichtein E, Brill E, Dumais S (2006) Improving web search ranking by incorporating user behavior information. In: Proc res dev inf ret, pp 19–26
Ahern S, Naaman M, Nair R, Yang J (2007) World explorer: visualizing aggregate data from unstructured text in geo-referenced collections. In: Proc jt conf digit libr, pp 1–10
Allan J (ed) (2002) Topic detection and tracking: event-based information organization. Kluwer, Boston
Allan M, Verbeek J (2009) Ranking user-annotated images for multiple query terms. In: Proc Br mach vis conf
Ames M, Naaman M (2007) Why we tag: motivations for annotation in mobile and online media. In: Proc hum factors comput syst, pp 971–980
Andrienko N (2003) Exploratory spatio-temporal visualization: an analytical review. J Vis Lang Comput 14(6):503–541
Anguelov D, Lee K, Gokturk S, Sumengen B (2007) Contextual identity recognition in personal photo albums. In: Proc comput vis pattern recognit, pp 1–7
Bailloeul T, Zhu C, Xu Y (2008) Automatic image tagging as a random walk with priors on the canonical correlation subspace. In: Proc ACM multimed inf ret, pp 75–82
Barnard K, Duygulu P, Forsyth D, de Freitas N, Blei DM, Jordan MI (2003) Matching words and pictures. J Mach Learn Res 3:1107–1135
Barnard K, Fan Q, Swaminathan R, Hoogs A, Collins R, Rondot P, Kaufhold J (2008) Evaluation of localized semantics: data, methodology, and experiments. Int J Comput Vis 77(1–3):199–217
Berg T, Forsyth D (2007) Automatic ranking in iconic images. Tech rep, University of California, Berkeley
Bian J, Liu Y, Agichtein E, Zha H (2008) A few bad votes too many?: towards robust ranking in social media. In: Proc workshop on advers inf ret on the Web, pp 53–60
Bian J, Liu Y, Zhou D, Agichtein E, Zha H (2009) Learning to recognize reliable users and content in social media with coupled mutual reinforcement. In: Proc World Wide Web, pp 51–60
Böttcher M, Höppner F, Spiliopoulou M (2008) On exploiting the power of time in data mining. SIGKDD Explor Newsl 10(2):3–11
Boutell M, Luo J (2004) Photo classification by integrating image content and camera metadata. In: Proc pattern recognit, pp 901–904
Brinker K, Hüllermeier E (2007) Case-based multilabel ranking. In: Proc int jt conf artificial intell, pp 702–707
Budanitsky A, Hirst G (2006) Evaluating WordNet-based measures of lexical semantic relatedness. Comput Linguist 32(1):13–47
Budiu R, Pirolli P, Hong L (2009) Remembrance of things tagged: how tagging effort affects tag production and human memory. In: Proc hum factors comput syst, pp 615–624
Cai D, He X, Li Z, Ma WY, Wen JR (2004) Hierarchical clustering of www image search results using visual, textual and link information. In: Proc ACM multimed, pp 952–959
Cao L, Luo J, Huang TS (2008) Annotating photo collections by label propagation according to multiple similarity cues. In: Proc ACM multimed, pp 121–130
Cao L, Luo J, Kautz H, Huang T (2008) Annotating collections of photos using hierarchical event and scene models. In: Proc comput vis pattern recognit
Cattuto C, Benz D, Hotho A, Stumme G (2008) Semantic analysis of tag similarity measures in collaborative tagging systems. In: Proc workshop ontol learn & popul, pp 39–43
Cattuto C, Schmitz C, Baldassarri A, Servedio V, Loreto V, Hotho A, Grahl M, Stumme G (2007) Network properties of folksonomies. AI Commun 20(4):245–262
Chandola V, Banerjee A, Kumar V (2007) Outlier detection: a survey. Technical report, University of Minnesota
Chandramouli K, Izquierdo E (2010) Semantic structuring and retrieval of event chapters in social photo collections. In: Proc ACM multimed inf ret, pp 507–516
Chatzilari E, Nikolopoulos S, Kompatsiaris I, Giannakidou E, Vakali A (2009) Leveraging social media for training object detectors. In: Proc digit signal proc, pp 1–8
Chen HM, Chang MH, Chang PC, Tien MC, Hsu WH, Wu JL (2008) Sheepdog: group and tag recommendation for Flickr photos by automatic search-based learning. In: Proc ACM multimed, pp 737–740
Chen L, Roy A (2009) Event detection from Flickr data through wavelet-based spatial analysis. In: Proc ACM inf knowl manag, pp 523–532
Chen WC, Battestini A, Gelfand N, Setlur V (2009) Visual summaries of popular landmarks from community photo collections. In: Proc ACM multimed, pp 789–792
Chi E, Mytkowicz T (2008) Understanding the efficiency of social tagging systems using information theory. In: Proc ACM hypertext and hypermedia, pp 81–88
Choi JY, Yang S, Ro YM, Plataniotis K (2008) Face annotation for personal photos using context-assisted face recognition. In: Proc ACM multimed inf ret, pp 44–51
Chua TS, Tang J, Hong R, Li H, Luo Z, Zheng YT (2009) NUS-WIDE: a real-world web image database from National University of Singapore. In: Proc ACM image and video ret, pp 1–9
Cilibrasi R, Vitanyi P (2007) The Google similarity distance. IEEE Trans Knowl Data Eng 19(3):370–383
Cristani M, Perina A, Castellani U, Murino V (2008) Content visualization and management of geo-located image databases. In: Ext abstr on hum factors comput syst, pp 2823–2828
Cristani M, Perina A, Castellani U, Murino V (2008) Geo-located image analysis using latent representations. In: Proc comput vis pattern recognit
Cutting DR, Karger DR, Pedersen JO, Tukey JW (1992) Scatter/gather: a cluster-based approach to browsing large document collections. In: Proc res and dev inf ret, pp 318–329
Datta R, Joshi D, Li J, Wang JZ (2006) Studying aesthetics in photographic images using a computational approach. Lect Notes Comput Sci: Proc Eur Conf Comput Vis 3953(3):288–301
Datta R, Joshi D, Li J, Wang JZ (2007) Tagging over time: real-world image annotation by lightweight meta-learning. In: Proc ACM multimed, pp 393–402
Datta R, Joshi D, Li J, Wang JZ (2008) Image retrieval: ideas, influences, and trends of the new age. ACM Comput Surv 40(2):1–60
Datta R, Wang JZ (2010) Acquine: Aesthetic quality inference engine—real-time automatic rating of photo aesthetics. In: Proc ACM multimed inf ret, demo, pp 421–424
Davis M, Smith M, Canny J, Good N, King S, Janakiraman R (2005) Towards context-aware face recognition. In: Proc ACM multimed, pp 483–486
Davis M, Smith M, Stentiford F, Bamidele A, Canny J, Good N, King S, Janakiraman R (2006) Using context and similarity for face and location identification. In: Proc symp on electron imaging sci and tech
Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: Proc symp on oper syst design & implement, p 10
Deconstructing Flickr interestingness. http://wes2.wordpress.com/2006/05/12/deconstructing-flickrs-interestingness
Deng J, Li K, Do M, Su H, Fei-Fei L (2009) Construction and analysis of a large scale image ontology. Vision Sciences Society (VSS)
Dong W, Fu WT (2010) Cultural difference in image tagging. In: Proc hum factors comput syst, pp 981–984
Donmez P, Carbonell J, Schneider J (2009) Efficiently learning the accuracy of labeling sources for selective sampling. In: Proc ACM knowl discov data mining, pp 259–268
DPChallenge—a digital photography contest. http://www.dpchallenge.com
Dubinko M, Kumar R, Magnani J, Novak J, Raghavan P, Tomkins A (2006) Visualizing tags over time. In: Proc World Wide Web, pp 193–202
Duda R, Hart P, Stork D (2000) Pattern classification, 2nd edn. Wiley-Interscience, New York
Ester M, Kriegel H, Sander J (1999) Knowledge discovery in spatial databases. In: Proc Ger conf artif intell, pp 61–74
Exif and related resources. http://www.exif.org
Fei-Fei L, Fergus R, Perona P (2006) One-shot learning of object categories. IEEE Trans Pattern Anal Mach Intell 28(4):594–611
Fei-Fei L, Fergus R, Perona P (2007) Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. Comput Vis Image Underst 106(1):59–70
Fellbaum C (ed) (1998) WordNet: an electronic lexical database (language, speech, and communication). MIT Press, Cambridge
4,000,000,000. http://blog.flickr.net/en/2009/10/12/4000000000/
Flickr interestingness. http://www.flickr.com/explore/interesting
Fu WT (2008) The microstructures of social tagging: a rational model. In: Proc ACM comput support co-op work, pp 229–238
Furnas G, Landauer T, Gomez L, Dumais S (1987) The vocabulary problem in human-system communication. Commun ACM 30(11):964–971
Furnas G, Landauer T, Gomez L, Dumais S (1984) Statistical semantics: analysis of the potential performance of keyword information systems. In: Proc SIGCHI conf hum factors comput syst, pp 187–242
Gaber M, Zaslavsky A, Krishnaswamy S (2005) Mining data streams: a review. ACM SIGMOD Rec 34(2):18–26
Garg N, Weber I (2008) Personalized, interactive tag recommendation for Flickr. In: Proc ACM recomm sys, pp 67–74
Geonames. http://www.geonames.org
Giannakidou E, Kompatsiaris I, Vakali A (2008) Semsoc: semantic social and content-based clustering in multimedia collaborative tagging systems. In: Proc IEEE int conf semant comput, pp 128–135
Girgensohn A, Adcock J, Wilcox L (2004) Leveraging face recognition technology to find and organize photos. In: Proc ACM SIGMM int workshop multimed inf ret, pp 99–106
Golder S (2008) Measuring social networks with digital photograph collections. In: Proc ACM hypertext and hypermedia, pp 43–48
Golder S, Huberman B (2006) Usage patterns of collaborative tagging systems. J Inf Sci 32(2):198–208
Gonçalves D, Jesus R, Correia N (2008) A gesture based game for image tagging. In: Ext abstr on hum factors comput syst, pp 2685–2690
Griffin G, Holub A, Perona P (2007) Caltech-256 object category dataset. Tech rep 7694, California Institute of Technology
Games with a purpose. http://www.gwap.com/gwap/about
Halpin H, Robu V, Shepherd H (2007) The complex dynamics of collaborative tagging. In: Proc World Wide Web, pp 211–220
Hamilton JD (1994) Time series analysis, 1st edn. Princeton University Press, Princeton
Hodge V, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22(2):85–126
Hofmann T (2001) Unsupervised learning by probabilistic latent semantic analysis. Mach Learn 42(1-2):177–196
Hotho A, Jäschke R, Schmitz C, Stumme G (2006) Information retrieval in folksonomies: search and ranking. In: Proc Eur semant web conf, vol 4011, pp 411–426
Hotho A, Jäschke R, Schmitz C, Stumme G (2006) Folkrank: a ranking algorithm for folksonomies. In: Proc conf Fachgruppe inf ret, pp 111–114
Huiskes M, Lew M (2008) The mir Flickr retrieval evaluation. In: Proc ACM multimed inf ret, pp 39–43
Ihler A, Hutchins J, Smyth P (2007) Learning to detect events with markov-modulated poisson processes. ACM Trans Knowl Discov Data 1(3):13
Ivanov I, Vajda P, Goldmann L, Lee J, Ebrahimi T (2010) Object-based tag propagation for semi-automatic annotation of images. In: Proc ACM multimed inf ret, pp 497–506
Jaffe A, Naaman M, Tassa T, Davis M (2006) Generating summaries and visualization for large collections of geo-referenced photographs. In: Proc ACM workshop on multimed inf ret, pp 89–98
Jäschke R, Marinho L, Hotho A, Schmidt-Thieme L, Stumme G (2007) Tag recommendations in folksonomies. In: Proc Eur conf princ. and pract of knowl discov databases, pp 506–514
Jeon J, Lavrenko V, Manmatha R (2003) Automatic image annotation and retrieval using cross-media relevance models. In: Proc res dev inf ret, pp 119–126
Jiang J, Conrath D (1997) Semantic similarity based on corpus statistics and lexical taxonomy. In: Proc res comput linguist, pp 19–33
Jin X, Luo J, Yu J, Wang G, Joshi D, Han J (2010) iRIN: image retrieval in image-rich information networks. In: Proc World Wide Web, pp 1261–1264
Jin Y, Khan L, Wang L, Awad M (2005) Image annotations by combining multiple evidence & wordnet. In: Proc ACM multimed, pp 706–715
Jones W (1986) The memory extender personal filing system. In: Proc hum factors comput syst, pp 298–305
Joshi D, Luo J (2008) Inferring generic activities and events from image content and bags of geo-tags. In: Proc content-based image and video ret, pp 37–46
Ke Y, Tang X, Jing F (2006) The design of high-level features for photo quality assessment. In: Proc comput vis pattern recognit, pp 419–426
Kennedy L, Naaman M, Ahern S, Nair R, Rattenbury T (2007) How Flickr helps us make sense of the world: context and content in community-contributed media collections. In: Proc ACM multimed, pp 631–640
Kennedy L, Slaney M, Weinberger K (2009) Reliable tags using image similarity: mining specificity and expertise from large-scale multimedia databases. In: Proc workshop web-scale multimed corpus, pp 17–24
Kennedy L, Chang S, Kozintsev I (2006) To search or to label?: predicting the performance of search-based automatic image classifiers. In: Proc ACM workshop on multimed inf ret, pp 249–258
Kleinberg J (1999) Authoritative sources in a hyperlinked environment. J ACM 46(5):604–632
Kleinberg J (2003) Bursty and hierarchical structure in streams. In: Proc knowl discov and data mining, vol 7(4), pp 373–397
Koperski K, Adhikary J, Han J (1996) Spatial data mining: progress and challenges—survey paper. In: Proc workshop res issues data mining knowl discov, pp 1–10
Koutrika G, Effendi F, Gyöngyi Z, Heymann P, Garcia-Molina H (2007) Combating spam in tagging systems. In: Proc workshop advers inf ret web, pp 57–64
Krestel R, Chen L (2008) Using co-occurence of tags and resources to identify spammers. In: Proc conf pract knowl discov databases
Kucuktunc O, Sevil S, Tosun A, Zitouni H, Duygulu P, Can F (2008) Tag suggestr: automatic photo tag expansion using visual information for photo sharing websites. In: Proc semant digit media technol: semant multimed
Kustanowitz J, Shneiderman B (2005) Motivating annotation for personal digital photo libraries: lowering barriers while raising incentives. Tech rep, University of Maryland
LabelMe: the open annotation tool. http://labelme.csail.mit.edu
Lambiotte R, Ausloos M (2006) Collaborative tagging as a tripartite network. In: Proc comput sci, pp 1114–1117
Landauer T, Foltz P, Laham D (1998) An introduction to latent semantic analysis. In: Discourse proc, vol 25, pp 259–284
Lee L (1999) Measures of distributional similarity. In: Proc comput linguist, pp 25–32
Leskovec J, Lang K, Dasgupta A, Mahoney M (2008) Statistical properties of community structure in large social and information networks. In: Proc World Wide Web, pp 695–704
Levi K, Weiss Y (2004) Learning object detection from a small number of examples: the importance of good features. In: Proc comput vis pattern recognit, vol 2, pp 53–60
Li J, Wang JZ (2003) Automatic linguistic indexing of pictures by a statistical modeling approach. IEEE Trans Pattern Anal Mach Intell 25(9):1075–1088
Li J, Wang JZ (2008) Real-time computerized annotation of pictures. IEEE Trans Pattern Anal Mach Intell 30(6):985–1002
Li Q, Lu S (2008) Collaborative tagging applications and approaches. IEEE Multimed 15(3):14–21
Li X, Chen L, Zhang L, Lin F, Ma WY (2006) Image annotation by large-scale content-based image retrieval. In: Proc ACM multimed, pp 607–610
Li X, Snoek CG, Worring M (2008) Learning tag relevance by neighbor voting for social image retrieval. In: Proc ACM multimed inf ret, pp 180–187
Lindstaedt S, Mörzinger R, Sorschag R, Pammer V, Thallinger G (2009) Automatic image annotation using visual content and folksonomies. Multimed Tools Appl 42(1):97–113
Lindstaedt S, Pammer V, Mörzinger R, Kern R, Mülner H, Wagner C (2008) Recommending tags for pictures based on text, visual content and user context. In: Proc Internet & Web appl & serv, pp 506–511
Liu D, Hua XS, Yang L, Wang M, Zhang HJ (2009) Tag ranking. In: Proc World Wide Web, pp 351–360
Liu D, Wang M, Hua XS, Zhang HJ (2009) Smart batch tagging of photo albums. In: Proc ACM multimed, pp 809–812
Liu D, Wang M, Yang L, Hua XS, Zhang H (2009) Tag quality improvement for social images. In: Proc IEEE multimed and expo, pp 350–353
Liu Y, Zhang D, Lu G, Ma WY (2007) A survey of content-based image retrieval with high-level semantics. Pattern Recogn 40(1):262–282
Lu Y, Zhang L, Tian Q, Ma WY (2008) What are the high-level concepts with small semantic gaps? In: Proc comput vis pattern recognit
Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
Manning C, Raghavan P, Schütze H (2008) Index compression. In: Introduction to information retrieval. Cambridge University Press, Cambridge, pp 85–108
Marlow C, Naaman M, Boyd D, Davis M (2006) Ht06, tagging paper, taxonomy, Flickr, academic article, to read. In: Proc conf hypertext hypermedia, pp 31–40
Marques O, Lux M (2008) An exploratory study on joint analysis of visual classification in narrow domains and the discriminative power of tags. In: Proc ACM workshop multimed semant, pp 40–47
Amazon mechanical turk. http://www.mturk.com/mturk
Miller G (1983) Informavores. In: The study of information: interdisciplinary messages. Wiley-Interscience, New York, pp 111–113
Miller H, Han J (2001) Geographic data mining and knowledge discovery. Taylor & Francis, New York
Moxley E, Kleban J, Manjunath BS (2008) Spirittagger: a geo-aware tag suggestion tool mined from Flickr. In: Proc ACM multimed inf ret, pp 24–30
Naaman M, Nair R (2008) Zonetag’s collaborative tag suggestions: what is this person doing in my phone? IEEE Multimed 15(3):34–40
Naaman M, Paepcke A, Garcia-Molina H (2003) From where to what: metadata sharing for digital photographs with geographic coordinates. In: On the move to meaningful internet syst: CoopIS, DOA, and ODBASE, pp 196–217
Naaman M, Yeh R, Garcia-Molina H, Paepcke A (2005) Leveraging context to resolve identity in photo albums. In: Proc jt conf digit libr, pp 178–187
Navigli R (2009) Word sense disambiguation: a survey. ACM Comput Surv 41(2):1–69
Negoescu RA, Gatica-Perez D (2008) Analyzing Flickr groups. In: Proc content-based image & video ret, pp 417–426
Ng R, Han J (1994) Efficient and effective clustering methods for spatial data mining. In: Proc very large databases, pp 144–155
Nisbett RE, Peng K, Choi I, Norenzayan A (2001) Culture and systems of thought: holistic versus analytic cognition. Psychol Rev 108(2):291–310
Noll M, Au Yeung C, Gibbins N, Meinel C, Shadbolt N (2009) Telling experts from spammers: expertise ranking in folksonomies. In: Proc res dev inf ret, pp 612–619
Nov O, Naaman M, Ye C (2008) What drives content tagging: the case of photos on Flickr. In: Proc hum factors comput syst, pp 1097–1100
Nov O, Ye C (2010) Why do people tag? Motivations for photo tagging. Commun ACM 53(7):128–131
Nowak S, Rüger S (2010) How reliable are annotations via crowdsourcing: a study about inter-annotator agreement for multi-label image annotation. In: Proc ACM multimed inf ret, pp 557–566
Obrador P, Anguera X, deOliveira R, Oliver N (2009) The role of tags and image aesthetics in social image search. In: Proc SIGMM workshop social media, pp 65–72
Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Ret 2(1–2):1–135
Pedersen T, Patwardhan S, Michelizzi J (2004) Wordnet: similarity—measuring the relatedness of concepts. In: Demo papers North Am chap assoc for comput linguist—hum lang technol, pp 38–41
Photography community, including forums, reviews, and galleries from photonet. http://photo.net
Pirolli P (2007) Information foraging theory: adaptive interaction with information. Oxford University Press, London
Pirolli P (2009) An elementary social information foraging model. In: Proc hum factors comput syst, pp 605–614
Ramakrishnan G, Chitrapura KP, Krishnapuram R, Bhattacharyya P (2005) A model for handling approximate, noisy or incomplete labeling in text classification. In: Proc mach learn, pp 681–688
Rattenbury T, Good N, Naaman M (2007) Towards automatic extraction of event and place semantics from Flickr tags. In: Proc res dev inf ret, pp 103–110
Rebbapragada U, Brodley C (2007) Class noise mitigation through instance weighting. In: Proc Eur conf mach learn, pp 708–715
Robertson S, Vojnovic M, Weber I (2009) Rethinking the ESP game. In: Proc ext abstr hum factors comput syst, pp 3937–3942
Roddick J, Spiliopoulou M (2002) A survey of temporal knowledge discovery paradigms and methods. IEEE Trans Knowl Data Eng 14(4):750–767
Roddick J, Spiliopoulou M, Lister D, Ceglar A (2008) Higher order mining. SIGKDD Explor Newsl 10(1):5–17
Rui Y, Huang TS, Ortega M, Mehrotra S (1998) Relevance feedback: a power tool for interactive content-based image retrieval. In: Proc circuits syst video tech, vol 8(5), pp 644–655
Russell B, Torralba A, Murphy K, Freeman W (2008) LabelMe: a database and web-based tool for image annotation. Int J Comput Vis 77(1):157–173
Salton G, Buckley C (1987) Term weighting approaches in automatic text retrieval. Tech rep, Cornell University
Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620
Santini S, Gupta A, Jain R (2001) Emergent semantics through interaction in image databases. IEEE Trans Knowl Data Eng 13(3):337–351
Sarwar B, Karypis G, Konstan J, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. In: Proc World Wide Web, pp 285–295
Sawant N, Datta R, Li J, Wang JZ (2010) Quest for relevant tags using local interaction networks and visual content. In: Proc ACM multimed inf ret, pp 231–240
Sen S, Harper FM, LaPitz A, Riedl J (2007) The quest for quality tags. In: Proc Int ACM support group work, pp 361–370
Sen S, Lam S, Rashid A, Cosley D, Frankowski D, Osterhouse J, Harper F, Riedl J (2006) Tagging, communities, vocabulary, evolution. In: Proc comput support co-op work, pp 181–190
Sen S, Vig J, Riedl J (2009) Learning to recognize valuable tags. In: Proc intell user interfaces, pp 87–96
Seneviratne L, Izquierdo E (2010) An interactive framework for image annotation through gaming. In: Proc ACM multimed inf ret, pp 517–526
Serdyukov P, Murdock V, van Zwol R (2009) Placing Flickr photos on a map. In: Proc res dev inf ret, pp 484–491
Shepitsen A, Gemmell J, Mobasher B, Burke R (2008) Personalized recommendation in social tagging systems using hierarchical clustering. In: Proc ACM recomm sys, pp 259–266
Sigurbjörnsson B, van Zwol R (2008) Flickr tag recommendation based on collective knowledge. In: Proc World Wide Web, pp 327–336
Sinha P, Jain R (2008) Classification and annotation of digital photos using optical context data. In: Proc content-based image & video ret, pp 309–318
Sivic J, Zitnick C, Szeliski R (2006) Finding people in repeated shots of the same scene. In: Proc Br mach vis conf, pp 909–918
Smeulders A, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380
Smith M, Kollock P (1999) Communities in cyberspace, Chapters 5 and 6. Routledge, Evanston, pp 107–165
Snavely N, Seitz S, Szeliski R (2006) Photo tourism: exploring photo collections in 3d. ACM Trans Graph 25(3). doi:10.1145/1141911.1141964
Solli M, Lenz R (2010) Emotion related structures in large image databases. In: Proc ACM image and video ret, pp 398–405
Song Y, Leung T (2006) Context-aided human recognition: clustering. In: Proc Eur conf comput vis, pp 382–395
Sorokin A, Forsyth D (2008) Utility data annotation with amazon mechanical turk. In: Comput vis pattern recognit workshops
Suchanek F, Vojnovic M, Gunawardena D (2008) Social tags: meaning and suggestions. In: Proc ACM inf knowl manag, pp 223–232
Suh B, Bederson B (2007) Semi-automatic photo annotation strategies using event based clustering and clothing based person recognition. Interact Comput 19(4):524–544
Sun A, Bhowmick S (2009) Image tag clarity: in search of visual-representative tags for social images. In: Proc SIGMM workshop social media, pp 19–26
Surowiecki J (2004) The wisdom of crowds: why the many are smarter than the few and how collective wisdom shapes business, economies, societies and nations. Doubleday, New York
Sylvan E (2010) Predicting influence in an online community of creators. In: Proc hum factors comput syst, pp 1913–1916
Tang J, Yan S, Hong R, Qi GJ, Chua TS (2009) Inferring semantic concepts from community-contributed images and noisy tags. In: Proc ACM multimed, pp 223–232
Truran M, Goulding J, Ashman H (2005) Co-active intelligence for image retrieval. In: Proc ACM multimed, pp 547–550
Tuytelaars T, Mikolajczyk K (2008) Local invariant feature detectors: a survey. Found Trends Comput Graph Vis 3(3):177–280
Verbeek J, Guillaumin M, Mensink T, Schmid C (2010) Image annotation with tagprop on the MIRFlickr set. In: Proc ACM multimed inf ret, pp 537–546
Vlachos M, Meek C, Vagena Z, Gunopulos D (2004) Identifying similarities, periodicities and bursts for online search queries. In: Proc ACM SIGMOD int conf manag data, pp 131–142
von Ahn L, Dabbish L (2004) Labeling images with a computer game. In: Proc hum factors comput syst, pp 319–326
von Ahn L, Dabbish L (2008) Designing games with a purpose. Commun ACM 51(8):58–67
von Ahn L, Liu R, Blum M (2006) Peekaboom: a game for locating objects in images. In: Proc hum factors comput syst, pp 55–64
Wang C, Jing F, Zhang L, Zhang HJ (2006) Image annotation refinement using random walk with restarts. In: Proc ACM multimed, pp 647–650
Wang C, Jing F, Zhang L, Zhang HJ (2007) Content-based image annotation refinement. In: Proc comput vis pattern recognit, pp 1–8
Wang M, Yang K, Hua XS, Zhang HJ (2009) Visual tag dictionary: interpreting tags with visual words. In: Proc workshop web-scale multimed corpus, pp 1–8
Wang S, Jing F, He J, Du Q, Zhang L (2007) Igroup: presenting web image search results in semantic clusters. In: Proc hum factors comput syst, pp 587–596
Wang X, Zhang L, Jing F, Ma WY (2006) Annosearch: image auto-annotation by search. In: Proc comput vis pattern recognit, pp 1483–1490
Weinberger K, Slaney M, Van Zwol R (2008) Resolving tag ambiguity. In: Proc ACM multimed, pp 111–120
Wordnet. http://wordnet.princeton.edu
Wu L, Hua XS, Yu N, Ma WY, Li S (2008) Flickr distance. In: Proc ACM multimed, pp 31–40
Wu L, Yang L, Yu N, Hua XS (2009) Learning to tag. In: Proc World Wide Web, pp 361–370
Xue GR, Zeng HJ, Chen Z, Yu Y, Ma WY, Xi W, Fan W (2004) Optimizing web search using web click-through data. In: Proc ACM inf knowl manag, pp 118–126
Yacoob Y, Davis L (2006) Detection and analysis of hair. IEEE Trans Pattern Anal Mach Intell 28(7):1164–1169
Yahoo! Flickr. http://www.flickr.com
Yan R, Natsev A, Campbell M (2009) Hybrid tagging and browsing approaches for efficient manual image annotation. IEEE Multimed 16(2):26–41
Yang Q, Chen X, Wang G (2008) Web 2.0 dictionary. In: Proc content-based image and video ret, pp 591–600
Yang Q, Jian B, Chen X (2010) Tag dictionary and its applications. In: Proc ACM multimed inf ret, pp 397–400
Yang Y, Pierce T, Carbonell J (1998) A study of retrospective and on-line event detection. In: Proc int ACM SIGIR conf res and dev in inf ret, pp 28–36
Yao B, Yang X, Zhu SC (2007) Introduction to a large-scale general purpose ground truth database: methodology, annotation tool and benchmarks. In: Proc energy minimization methods comput vis pattern recognit, pp 169–183
Yuan J, Luo J, Kautz H, Wu Y (2008) Mining gps traces and visual words for event classification. In: Proc ACM multimed inf ret, pp 2–9
Zeng HJ, He QC, Chen Z, Ma WY, Ma J (2004) Learning to cluster web search results. In: Proc res dev inf ret, pp 210–217
Zhang L, Hu Y, Li M, Ma W, Zhang H (2004) Efficient propagation for face annotation in family albums. In: Proc ACM multimed, pp 716–723
Zhang S, Farooq U, Carroll JM (2009) Enhancing information scent: identifying and recommending quality tags. In: Proc ACM support group work, pp 1–10
Zhao M, Liu S (2006) Automatic person annotation of family photo album. In: Proc image & video ret, pp 163–172
Zhao W, Chellappa R, Phillips PJ, Rosenfeld A (2003) Face recognition: a literature survey. ACM Comput Surv 35(4):399–458
Zunjarwad A, Sundaram H, Xie L (2007) Contextual wisdom: social relations and correlations for multimedia event annotation. In: Proc ACM multimed, pp 615–624
Additional information
The material was based upon work supported in part by the National Science Foundation under Grant Nos. IIS-0949891 and IIS-0347148, and by The Pennsylvania State University.
Cite this article
Sawant, N., Li, J. & Wang, J.Z. Automatic image semantic interpretation using social action and tagging data. Multimed Tools Appl 51, 213–246 (2011). https://doi.org/10.1007/s11042-010-0650-8
DOI: https://doi.org/10.1007/s11042-010-0650-8