Published in: Journal of Intelligent Information Systems, Issue 2/2014

October 1, 2014

Image understanding and the web: a state-of-the-art review

Authors: Fariza Fauzi, Mohammed Belkhatir

Abstract

The contextual information of Web images is investigated to address the issue of characterizing their content with semantic descriptors and thereby bridge the semantic gap, i.e. the gap between their automated low-level representation in terms of colors, textures, shapes, etc., and their semantic interpretation. Such characterization allows for understanding the image content and is crucial in important Web-based tasks such as image indexing and retrieval. Although we are highly motivated by the availability of rich knowledge on the Web and the relative success achieved by commercial search engines in automatically characterizing image content using contextual information in Web pages, we are aware that the unpredictable quality of this contextual information is a major limiting factor. Among the reasons that make an image's contextual information difficult to leverage, some are related to the characterization and extraction of this information. The first issue is the lack of large-scale studies highlighting what is considered the relevant contextual information of an image, where it is located in a Web page, and whether it is consistent across Web pages of different types, content layouts and domains. The extraction of this contextual information is also a topical matter, as state-of-the-art automated extraction tools are unable to handle the heterogeneous Web. As far as the processing of the contextual information is concerned, problems linked to the syntactic and semantic characterization of the textual components must be addressed in order to tackle the semantic gap. Furthermore, questions arise about how to organize these textual components into coherent structures that are usable in image indexing and retrieval frameworks.
To address these issues, we lay down the anatomy of a generic context-based Web image understanding framework and propose its stage-based decomposition, covering topical issues from information indexing and retrieval, image description models, natural language processing, webpage segmentation and automated information extraction. For each of the identified stages, we review state-of-the-art solutions in the literature, categorized and analyzed in light of the techniques used.
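
As a rough illustration of what "contextual information of a Web image" can mean in practice, the following minimal Python sketch collects a few textual cues for each image in an HTML page: the alt attribute, the page title, and a fixed window of words before and after the image. It is not the framework reviewed in the article. The class name `ImageContextParser`, the function `extract_image_context`, and the fixed word window are assumptions made for this example only, and a fixed window is exactly the kind of naive heuristic whose reliability the abstract calls into question.

```python
from html.parser import HTMLParser


class ImageContextParser(HTMLParser):
    """Collects visible words in document order and records where each <img> occurs.

    Hypothetical helper for illustration; not taken from the article.
    """

    def __init__(self):
        super().__init__()
        self.words = []        # visible body words, in document order
        self.title_words = []  # words of the <title> element
        self.images = []       # one dict per <img>, with its index into self.words
        self._in_title = False
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "img":
            attributes = dict(attrs)
            self.images.append({
                "src": attributes.get("src") or "",
                "alt": attributes.get("alt") or "",
                "position": len(self.words),
            })

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore script and style content
        if self._in_title:
            self.title_words.extend(data.split())
        else:
            self.words.extend(data.split())


def extract_image_context(html, window=10):
    """Return alt text, page title and nearby words for every image in `html`.

    `window` (an assumed parameter) is the number of words kept on each side.
    """
    parser = ImageContextParser()
    parser.feed(html)
    parser.close()
    page_title = " ".join(parser.title_words)
    results = []
    for image in parser.images:
        pos = image["position"]
        results.append({
            "src": image["src"],
            "alt": image["alt"],
            "page_title": page_title,
            "text_before": " ".join(parser.words[max(0, pos - window):pos]),
            "text_after": " ".join(parser.words[pos:pos + window]),
        })
    return results


if __name__ == "__main__":
    sample = (
        "<html><head><title>Kyoto trip</title></head><body>"
        "<p>We visited the temple.</p>"
        '<img src="t.jpg" alt="Golden temple">'
        "<p>It was crowded.</p></body></html>"
    )
    for context in extract_image_context(sample):
        print(context)
```

The sketch deliberately ignores page layout: it treats the page as a flat word sequence, so text that is visually far from an image but adjacent in the markup is still counted as context. The webpage segmentation and information extraction stages named in the abstract address that limitation.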

Literatur
Zurück zum Zitat Ait-Mokhtar, S., Chanod, J.-P., Roux, C. (2002). Robustness beyond shallowness: incremental deep parsing. Natural Language Engineering, 8(3), 121–144. Ait-Mokhtar, S., Chanod, J.-P., Roux, C. (2002). Robustness beyond shallowness: incremental deep parsing. Natural Language Engineering, 8(3), 121–144.
Zurück zum Zitat Alcic, S., & Conrad, S. (2011). A clustering-based approach to web image context extraction. In Proceedings of the third international conferences on advances in multimedia (pp. 74–79). Alcic, S., & Conrad, S. (2011). A clustering-based approach to web image context extraction. In Proceedings of the third international conferences on advances in multimedia (pp. 74–79).
Zurück zum Zitat Anick, P.G. (1991). Integrating “Natural Language” and Boolean query: an application of computational linguistics to full-text information retrieval. In Proceedings of the AAAI-91 workshop on natural language text retrieval. Anick, P.G. (1991). Integrating “Natural Language” and Boolean query: an application of computational linguistics to full-text information retrieval. In Proceedings of the AAAI-91 workshop on natural language text retrieval.
Zurück zum Zitat Arasu, A., & Garcia-Molina, H. (2003). Extracting structured data from web pages. In Proceedings of the 2003 ACM SIGMOD international conference on management of data (pp. 337–348). Arasu, A., & Garcia-Molina, H. (2003). Extracting structured data from web pages. In Proceedings of the 2003 ACM SIGMOD international conference on management of data (pp. 337–348).
Zurück zum Zitat Armitage, L.H., & Enser, P.G.B. (1997). Analysis of user need in image archives. Journal of Information Science, 23(4), 287–299.CrossRef Armitage, L.H., & Enser, P.G.B. (1997). Analysis of user need in image archives. Journal of Information Science, 23(4), 287–299.CrossRef
Zurück zum Zitat Aslandogan, Y.A., & Yu, C.T. (2000). Evaluating strategies and systems for content based indexing of person images on the web. In Proceedings of the ACM international conference on multimedia (pp. 313–321). Aslandogan, Y.A., & Yu, C.T. (2000). Evaluating strategies and systems for content based indexing of person images on the web. In Proceedings of the ACM international conference on multimedia (pp. 313–321).
Zurück zum Zitat Aslandogan, Y.A., Thier, C., Yu, C., Zou, J., Rishe, N. (1997). Using semantic contents and WordNet in image retrieval. In Proceedings of SIGIR. Aslandogan, Y.A., Thier, C., Yu, C., Zou, J., Rishe, N. (1997). Using semantic contents and WordNet in image retrieval. In Proceedings of SIGIR.
Zurück zum Zitat Blaschko, M.B., & Lampert, C.H. (2008). Correlational spectral clustering. In IEEE conference on computer vision and pattern recognition 2008. CVPR 2008 (pp. 1–8). Blaschko, M.B., & Lampert, C.H. (2008). Correlational spectral clustering. In IEEE conference on computer vision and pattern recognition 2008. CVPR 2008 (pp. 1–8).
Zurück zum Zitat Cai, D., Yu, S., Wen, J.-R. (2003). VIPS?: a vision-based page segmentation algorithm. Cai, D., Yu, S., Wen, J.-R. (2003). VIPS?: a vision-based page segmentation algorithm.
Zurück zum Zitat Cai, D., He, X., Li, Z., Ma, W.Y., Wen, J.R. (2004). Hierarchical clustering of WWW image search results using visual, textual and link information. In Proceedings of the 12th annual ACM international conference on multimedia (pp. 952–959). Cai, D., He, X., Li, Z., Ma, W.Y., Wen, J.R. (2004). Hierarchical clustering of WWW image search results using visual, textual and link information. In Proceedings of the 12th annual ACM international conference on multimedia (pp. 952–959).
Zurück zum Zitat Chakrabarti, D., Kumar, R., Punera, K. (2008). A graph-theoretic approach to web page segmentation. In Proceedings of the 17th international conference on World Wide Web (pp. 377–386). Chakrabarti, D., Kumar, R., Punera, K. (2008). A graph-theoretic approach to web page segmentation. In Proceedings of the 17th international conference on World Wide Web (pp. 377–386).
Zurück zum Zitat Chang, C.-H., Kayed, M., Girgis, M. R., Shaalan, K.F. (2006). A survey of web information extraction systems. IEEE Transactions on Knowledge and Data Engineering, 18(10), 1411–1428.CrossRef Chang, C.-H., Kayed, M., Girgis, M. R., Shaalan, K.F. (2006). A survey of web information extraction systems. IEEE Transactions on Knowledge and Data Engineering, 18(10), 1411–1428.CrossRef
Zurück zum Zitat Chen, Z., Wenyin, L., Hu, C., Li, M., Zhang, H.J. (2001). Ifind: a web image search engine. In Proceedings of ACM SIGIR (p. 450). Chen, Z., Wenyin, L., Hu, C., Li, M., Zhang, H.J. (2001). Ifind: a web image search engine. In Proceedings of ACM SIGIR (p. 450).
Zurück zum Zitat Chen, Y., Ma, W. Y., Zhang, H.J. (2003). Detecting web page structure for adaptive viewing on small form factor devices. In Proceedings of the 12th international conference on World Wide Web (pp. 20–24). Chen, Y., Ma, W. Y., Zhang, H.J. (2003). Detecting web page structure for adaptive viewing on small form factor devices. In Proceedings of the 12th international conference on World Wide Web (pp. 20–24).
Zurück zum Zitat Choi, Y., & Rasmussen, E.M. (2003). Searching for images: the analysis of users’ queries for image retrieval in American History. Journal of the American Society for Information Science and Technology, 54(6), 498–511.CrossRef Choi, Y., & Rasmussen, E.M. (2003). Searching for images: the analysis of users’ queries for image retrieval in American History. Journal of the American Society for Information Science and Technology, 54(6), 498–511.CrossRef
Zurück zum Zitat Chowdhury, G. G. (2003). Natural language processing. Annual Review of Information Science and Technology, 37(1), 51–89.CrossRef Chowdhury, G. G. (2003). Natural language processing. Annual Review of Information Science and Technology, 37(1), 51–89.CrossRef
Zurück zum Zitat Chung, E., & Yoon, J. (2010). An exploratory analysis on unsuccessful image searches. Proceedings of the American Society for Information Science and Technology, 47(1), 1–2.CrossRef Chung, E., & Yoon, J. (2010). An exploratory analysis on unsuccessful image searches. Proceedings of the American Society for Information Science and Technology, 47(1), 1–2.CrossRef
Zurück zum Zitat Coelho, T.A.S., Calado, P.P., Souza, L.V., Ribeiro-Neto, B. (2004). Using multiple evidence ranking. IEEE Transactions on Knowledge and Data Engineering, 16(4), 408–417.CrossRef Coelho, T.A.S., Calado, P.P., Souza, L.V., Ribeiro-Neto, B. (2004). Using multiple evidence ranking. IEEE Transactions on Knowledge and Data Engineering, 16(4), 408–417.CrossRef
Zurück zum Zitat Crescenzi, V., & Mecca, G. (1998). Grammars have exceptions. Information Systems, 23(8), 539–565.CrossRef Crescenzi, V., & Mecca, G. (1998). Grammars have exceptions. Information Systems, 23(8), 539–565.CrossRef
Zurück zum Zitat Crescenzi, V., Mecca, G., Merialdo, P. (2001). Roadrunner: towards automatic data extraction from large web sites. In Proceedings of the 27th VLDB conference (pp. 109–118). Crescenzi, V., Mecca, G., Merialdo, P. (2001). Roadrunner: towards automatic data extraction from large web sites. In Proceedings of the 27th VLDB conference (pp. 109–118).
Zurück zum Zitat Datta, R., Joshi, D., Li, J., Wang, J.Z. (2008). Image retrieval: ideas, influences, and trends of the new age. ACM Computing Surveys, 40(2), 1–60.CrossRef Datta, R., Joshi, D., Li, J., Wang, J.Z. (2008). Image retrieval: ideas, influences, and trends of the new age. ACM Computing Surveys, 40(2), 1–60.CrossRef
Zurück zum Zitat De Marneffe, M.C., Maccartney, B., Manning, C.D. (2006). Generating typed dependency parses from phrase structure parses. Proceedings of LREC, 6, 449–454. De Marneffe, M.C., Maccartney, B., Manning, C.D. (2006). Generating typed dependency parses from phrase structure parses. Proceedings of LREC, 6, 449–454.
Zurück zum Zitat Deschacht, K., & Moens, M.F. (2007). Text analysis for automatic image annotation. Annual Meeting-Association for Computational Linguistics, 45, 1000. Deschacht, K., & Moens, M.F. (2007). Text analysis for automatic image annotation. Annual Meeting-Association for Computational Linguistics, 45, 1000.
Zurück zum Zitat Deschacht, K., & Moens, M.F. (2008). Finding the best picture: cross-media retrieval of content. In Proceedings of the European conference on advances in information retrieval (pp. 539–546). Deschacht, K., & Moens, M.F. (2008). Finding the best picture: cross-media retrieval of content. In Proceedings of the European conference on advances in information retrieval (pp. 539–546).
Zurück zum Zitat Dumais, S.T. (1991). Improving the retrieval of information from external sources. Behavior Research Methods, Instruments & Computers, 23(2), 229–236.CrossRef Dumais, S.T. (1991). Improving the retrieval of information from external sources. Behavior Research Methods, Instruments & Computers, 23(2), 229–236.CrossRef
Zurück zum Zitat Eakins, J. (2002). Towards intelligent image retrieval. Pattern Recognition, 35(1), 3–14.CrossRefMATH Eakins, J. (2002). Towards intelligent image retrieval. Pattern Recognition, 35(1), 3–14.CrossRefMATH
Zurück zum Zitat Fauzi, F., & Belkhatir, M. (2010). A user study to investigate semantically relevant contextual information of WWW images. International Journal of Human Computer Studies, 68(5), 270–287.CrossRef Fauzi, F., & Belkhatir, M. (2010). A user study to investigate semantically relevant contextual information of WWW images. International Journal of Human Computer Studies, 68(5), 270–287.CrossRef
Zurück zum Zitat Feng, Y., & Lapata, M. (2008). Automatic image annotation using auxiliary text information. In Proceedings of the 46th annual meeting of the association for computational linguistics: human language technologies (pp. 272–280). Feng, Y., & Lapata, M. (2008). Automatic image annotation using auxiliary text information. In Proceedings of the 46th annual meeting of the association for computational linguistics: human language technologies (pp. 272–280).
Zurück zum Zitat Feng, Y., & Lapata, M. (2010). Topic models for image annotation and text illustration. In Proceedings of the 2010 annual conference of the North American chapter of the association for computational linguistics (pp. 831–839). Feng, Y., & Lapata, M. (2010). Topic models for image annotation and text illustration. In Proceedings of the 2010 annual conference of the North American chapter of the association for computational linguistics (pp. 831–839).
Zurück zum Zitat Feng, H., Shi, R., Chua, T.-S. (2004). A bootstrapping framework for annotating and retrieving WWW images. In Proceedings of the 12th annual ACM international conference on multimedia (p. 960). Feng, H., Shi, R., Chua, T.-S. (2004). A bootstrapping framework for annotating and retrieving WWW images. In Proceedings of the 12th annual ACM international conference on multimedia (p. 960).
Zurück zum Zitat Frankel, C., Swain, M. J., Athitsos, V. (1996). Webseer?: an image search engine for the World Wide Web. World Wide Web Internet and Web Information Systems, 1–24. Frankel, C., Swain, M. J., Athitsos, V. (1996). Webseer?: an image search engine for the World Wide Web. World Wide Web Internet and Web Information Systems, 1–24.
Zurück zum Zitat Gao, B., Liu, T. Y., Qin, T., Zheng, X., Cheng, Q.S., Ma, W.Y. (2005). Web image clustering by consistent utilization of visual features and surrounding texts. In Proceedings of the 13th annual ACM international conference on multimedia (pp. 112–121). Gao, B., Liu, T. Y., Qin, T., Zheng, X., Cheng, Q.S., Ma, W.Y. (2005). Web image clustering by consistent utilization of visual features and surrounding texts. In Proceedings of the 13th annual ACM international conference on multimedia (pp. 112–121).
Zurück zum Zitat Ghoshal, A., Ircing, P., Khudanpur, S. (2005). Hidden Markov models for automatic annotation and contentbased retrieval of images and video. In Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR ’05). (pp. 544–551) New York, NY: ACM. USA,. doi: 10.1145/1076034.1076127. Ghoshal, A., Ircing, P., Khudanpur, S. (2005). Hidden Markov models for automatic annotation and contentbased retrieval of images and video. In Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval (SIGIR ’05). (pp. 544–551) New York, NY: ACM. USA,. doi: 10.​1145/​1076034.​1076127.
Zurück zum Zitat Gong, Z.U., Hou, L.H., Cheang, C.W. (2006). Web image indexing by using associated texts. Knowledge and Information Systems, 10(2), 243–264.CrossRef Gong, Z.U., Hou, L.H., Cheang, C.W. (2006). Web image indexing by using associated texts. Knowledge and Information Systems, 10(2), 243–264.CrossRef
Zurück zum Zitat Goodrum, A., & Spink, A. (2001). Image searching on the excite web search engine. Information Processing & Management, 37(2), 295–311.CrossRefMATH Goodrum, A., & Spink, A. (2001). Image searching on the excite web search engine. Information Processing & Management, 37(2), 295–311.CrossRefMATH
Zurück zum Zitat Hammer, J., Garcia-Molina, H., Cho, J., Aranha, R., Crespo, A. (1997). Extracting semistructured information from the web. World Wide Web Internet and Web Information Systems, 1–8. Hammer, J., Garcia-Molina, H., Cho, J., Aranha, R., Crespo, A. (1997). Extracting semistructured information from the web. World Wide Web Internet and Web Information Systems, 1–8.
Zurück zum Zitat He, X., Cai, D., Wen, J.R., Ma, W.Y., Zhang, H.J. (2007). Clustering and searching WWW images using link and page layout analysis. ACM Transactions on Multimedia Computing, Communications, and Applications, 3(2), 10.CrossRef He, X., Cai, D., Wen, J.R., Ma, W.Y., Zhang, H.J. (2007). Clustering and searching WWW images using link and page layout analysis. ACM Transactions on Multimedia Computing, Communications, and Applications, 3(2), 10.CrossRef
Zurück zum Zitat Hearst, M.A. (2006). Clustering versus faceted categories for information exploration. Communications of the ACM, 49(4), 59–61.CrossRef Hearst, M.A. (2006). Clustering versus faceted categories for information exploration. Communications of the ACM, 49(4), 59–61.CrossRef
Zurück zum Zitat Hollink, L., Schreiber, A. T., Wielinga, B. J., Worring, M. (2004). Classification of user image descriptions. International Journal of Human-Computer Studies, 61(5), 601–626.CrossRef Hollink, L., Schreiber, A. T., Wielinga, B. J., Worring, M. (2004). Classification of user image descriptions. International Journal of Human-Computer Studies, 61(5), 601–626.CrossRef
Zurück zum Zitat Hong, J.L., Siew, E.-G., Egerton, S. (2010). Information extraction for search engines using fast heuristic techniques. Data & Knowledge Engineering, 69(2), 169–196.CrossRef Hong, J.L., Siew, E.-G., Egerton, S. (2010). Information extraction for search engines using fast heuristic techniques. Data & Knowledge Engineering, 69(2), 169–196.CrossRef
Zurück zum Zitat Hua, Z., Wang, X. J., Liu, Q., Lu, H. (2005). Semantic knowledge extraction and annotation for web images. In Proceedings of the 13th annual ACM international conference on multimedia (pp. 467–470). Hua, Z., Wang, X. J., Liu, Q., Lu, H. (2005). Semantic knowledge extraction and annotation for web images. In Proceedings of the 13th annual ACM international conference on multimedia (pp. 467–470).
Zurück zum Zitat Hughes, A., Wilkens, T., Wildemuth, B., Marchionini, G. (2003). Text or pictures? An eyetracking study of how people view digital video surrogates. In Proceedings of the international conference on image and video retrieval (pp. 271–280). Hughes, A., Wilkens, T., Wildemuth, B., Marchionini, G. (2003). Text or pictures? An eyetracking study of how people view digital video surrogates. In Proceedings of the international conference on image and video retrieval (pp. 271–280).
Zurück zum Zitat Inoue, M. (2004). On the need for annotation-based image retrieval. In Proceedings of the workshop on information retrieval in context (Irix), Sheffield, UK (pp. 44–46). Inoue, M. (2004). On the need for annotation-based image retrieval. In Proceedings of the workshop on information retrieval in context (Irix), Sheffield, UK (pp. 44–46).
Zurück zum Zitat Inoue, M. (2009). Image retrieval: research and use in the information explosion. Progress in Informatics, 6, 3.CrossRef Inoue, M. (2009). Image retrieval: research and use in the information explosion. Progress in Informatics, 6, 3.CrossRef
Zurück zum Zitat Jaimes, A., & Chang, S.F. (2000). A conceptual framework for indexing visual information at multiple levels. IS&T/SPIE Internet Imaging, 3964, 2–15.CrossRef Jaimes, A., & Chang, S.F. (2000). A conceptual framework for indexing visual information at multiple levels. IS&T/SPIE Internet Imaging, 3964, 2–15.CrossRef
Zurück zum Zitat Jin, Y., Khan, L., Wang, L., Awad, M. (2005). Image annotations by combining multiple evidence & wordnet. In Proceedings of the 13th annual ACM international conference on multimedia (pp. 706–715). Jin, Y., Khan, L., Wang, L., Awad, M. (2005). Image annotations by combining multiple evidence & wordnet. In Proceedings of the 13th annual ACM international conference on multimedia (pp. 706–715).
Zurück zum Zitat Jones, K.S. (1973). Index term weighting. Information Storage and Retrieval, 9, 619–633.CrossRef Jones, K.S. (1973). Index term weighting. Information Storage and Retrieval, 9, 619–633.CrossRef
Zurück zum Zitat Jörgensen, C., & Jörgensen, P. (2005). Image querying by image professionals. Journal of the American Society for Information Science and Technology, 56(12), 1346–1359.CrossRef Jörgensen, C., & Jörgensen, P. (2005). Image querying by image professionals. Journal of the American Society for Information Science and Technology, 56(12), 1346–1359.CrossRef
Zurück zum Zitat Joshi, P. M., & Liu, S. (2009). Web document text and images extraction using DOM analysis and natural language processing. In Proceedings of the 9th ACM symposium on document engineering (p. 218). Joshi, P. M., & Liu, S. (2009). Web document text and images extraction using DOM analysis and natural language processing. In Proceedings of the 9th ACM symposium on document engineering (p. 218).
Zurück zum Zitat Kang, J., Yang, J., Choi, J. (2010). Repetition-based web page segmentation by detecting tag patterns for small-screen devices. IEEE Transactions on Consumer Electronics, 56(2), 980–986.CrossRef Kang, J., Yang, J., Choi, J. (2010). Repetition-based web page segmentation by detecting tag patterns for small-screen devices. IEEE Transactions on Consumer Electronics, 56(2), 980–986.CrossRef
Zurück zum Zitat Kao, H.-Y., Ho, J.-M., Chen, M.-S. (2005). WISDOM?: Web intra-page informative structure mining based on document object model. IEEE Transactions on Knowledge and Data Engineering, 17(5), 614–627.CrossRef Kao, H.-Y., Ho, J.-M., Chen, M.-S. (2005). WISDOM?: Web intra-page informative structure mining based on document object model. IEEE Transactions on Knowledge and Data Engineering, 17(5), 614–627.CrossRef
Zurück zum Zitat Katz, G., & Giesbrecht, E. (2006). Automatic identification of non-compositional multi-word expressions using latent semantic analysis. In Proceedings of the workshop on multiword expressions: identifying and exploiting underlying properties (pp. 12–19). Katz, G., & Giesbrecht, E. (2006). Automatic identification of non-compositional multi-word expressions using latent semantic analysis. In Proceedings of the workshop on multiword expressions: identifying and exploiting underlying properties (pp. 12–19).
Zurück zum Zitat Kennedy, L.S., & Naaman, M. (2008). Generating diverse and representative image search results for landmarks. In Proceedings of the 17th international conference on World Wide Web (pp. 297–306). Kennedy, L.S., & Naaman, M. (2008). Generating diverse and representative image search results for landmarks. In Proceedings of the 17th international conference on World Wide Web (pp. 297–306).
Zurück zum Zitat Kherfi, M.L., Ziou, D., Bernardi, A. (2004). Image retrieval from the World Wide Web: issues, techniques, and systems. ACM Computing Surveys (CSUR), 36(1), 35–67.CrossRef Kherfi, M.L., Ziou, D., Bernardi, A. (2004). Image retrieval from the World Wide Web: issues, techniques, and systems. ACM Computing Surveys (CSUR), 36(1), 35–67.CrossRef
Zurück zum Zitat Kohlschutter, C., & Nejdl, W. (2008). A densitometric approach to web page segmentation. In Proceeding of the 17th ACM conference on information and knowledge management. Kohlschutter, C., & Nejdl, W. (2008). A densitometric approach to web page segmentation. In Proceeding of the 17th ACM conference on information and knowledge management.
Zurück zum Zitat La Cascia, M., Sethi, S., Sclaroff, S. (1998). Combining textual and visual cues for content-based image retrieval on the World Wide Web. In Proceedings of IEEE workshop on content-based access of image and video libraries, 1998 (pp. 24–28). La Cascia, M., Sethi, S., Sclaroff, S. (1998). Combining textual and visual cues for content-based image retrieval on the World Wide Web. In Proceedings of IEEE workshop on content-based access of image and video libraries, 1998 (pp. 24–28).
Zurück zum Zitat Larson, M., Kofler, C., Hanjalic, A. (2011). Reading between the tags to predict real-world size-class for visually depicted objects in images. In Proceedings of ACM multimedia. Larson, M., Kofler, C., Hanjalic, A. (2011). Reading between the tags to predict real-world size-class for visually depicted objects in images. In Proceedings of ACM multimedia.
Zurück zum Zitat Leong, C.W., & Mihalcea, R. (2009). Explorations in automatic image annotation using textual features. In Proceedings of the third linguistic annotation workshop on - ACL-IJCNLP ’09 (pp. 56–59). Leong, C.W., & Mihalcea, R. (2009). Explorations in automatic image annotation using textual features. In Proceedings of the third linguistic annotation workshop on - ACL-IJCNLP ’09 (pp. 56–59).
Zurück zum Zitat Leong, C.W., Mihalcea, R., Hassan, S. (2010). Text mining for automatic image tagging. In Proceedings of the 23rd international conference on computational linguistics (pp. 647–655). Leong, C.W., Mihalcea, R., Hassan, S. (2010). Text mining for automatic image tagging. In Proceedings of the 23rd international conference on computational linguistics (pp. 647–655).
Zurück zum Zitat Lew, M. S. (2000). Next-generation web searches for visual content. IEEE Computer, 33(11), 46–53.CrossRef Lew, M. S. (2000). Next-generation web searches for visual content. IEEE Computer, 33(11), 46–53.CrossRef
Zurück zum Zitat Li, J., Liu, T., Wang, W., Gao, W. (2006). A broadcast model for web image annotation. In Proceedings of the 7th pacific rim conference on multimedia. Li, J., Liu, T., Wang, W., Gao, W. (2006). A broadcast model for web image annotation. In Proceedings of the 7th pacific rim conference on multimedia.
Zurück zum Zitat Lin, D. (1999). Automatic identification of non-compositional phrases. In Proceedings of the 37th annual meeting of the association for computational linguistics on computational linguistics (pp. 317–324). Lin, D. (1999). Automatic identification of non-compositional phrases. In Proceedings of the 37th annual meeting of the association for computational linguistics on computational linguistics (pp. 317–324).
Zurück zum Zitat Liu, Y., Zhang, D., Lu, G., Ma, W.-Y. (2007). A survey of content-based image retrieval with high-level semantics. Pattern Recognition, 40(1), 262–282.CrossRefMATH Liu, Y., Zhang, D., Lu, G., Ma, W.-Y. (2007). A survey of content-based image retrieval with high-level semantics. Pattern Recognition, 40(1), 262–282.CrossRefMATH
Zurück zum Zitat Liu, J., Li, M., Liu, Q., Lu, H., Ma, S. (2009). Image annotation via graph learning. Pattern Recognition, 42(2), 218–228.CrossRefMATH Liu, J., Li, M., Liu, Q., Lu, H., Ma, S. (2009). Image annotation via graph learning. Pattern Recognition, 42(2), 218–228.CrossRefMATH
Zurück zum Zitat Liu, W., Meng, X., Meng, W. (2010). Vide: a vision-based approach for deep web data extraction. IEEE Transactions on Knowledge and Data Engineering, 22(3), 447–460.CrossRef Liu, W., Meng, X., Meng, W. (2010). Vide: a vision-based approach for deep web data extraction. IEEE Transactions on Knowledge and Data Engineering, 22(3), 447–460.CrossRef
Zurück zum Zitat Lu, Y., Hu, C., Zhu, X., Zhang, H.J., Yang, Q. (2000). A unified framework for semantics and feature based relevance feedback in image retrieval systems. In Proceedings of the 8th annual ACM international conference on multimedia (pp. 31–37). Lu, Y., Hu, C., Zhu, X., Zhang, H.J., Yang, Q. (2000). A unified framework for semantics and feature based relevance feedback in image retrieval systems. In Proceedings of the 8th annual ACM international conference on multimedia (pp. 31–37).
Zurück zum Zitat Luo, J., Yu, J., Joshi, D., Hao, W. (2008). Event recognition: viewing the world with a third eye. In Proceedings of the 16th ACM international conference on multimedia (pp. 1071–1080). Luo, J., Yu, J., Joshi, D., Hao, W. (2008). Event recognition: viewing the world with a third eye. In Proceedings of the 16th ACM international conference on multimedia (pp. 1071–1080).
Zurück zum Zitat Meghini, C., Sebastiani, F., Straccia, U. (2001). A model for multimedia information retrieval. Journal of the ACM (JACM), 48(5). Meghini, C., Sebastiani, F., Straccia, U. (2001). A model for multimedia information retrieval. Journal of the ACM (JACM), 48(5).
Zurück zum Zitat Mukherjea, S., & Hirata, K. (1999). Amore: a World Wide Web image retrieval engine. World Wide Web, 2, 115–132.CrossRef Mukherjea, S., & Hirata, K. (1999). Amore: a World Wide Web image retrieval engine. World Wide Web, 2, 115–132.CrossRef
Zurück zum Zitat Olivares, X., Ciaramita, M., Van Zwol, R. (2008). Boosting image retrieval through aggregating search results based on visual annotations. In Proceedings of ACM Multimedia. Olivares, X., Ciaramita, M., Van Zwol, R. (2008). Boosting image retrieval through aggregating search results based on visual annotations. In Proceedings of ACM Multimedia.
Zurück zum Zitat Ortega-Binderberger, M., Mehrotra, S., Chakrabarti, K., Porkaew, K. (2000). Webmars: a multimedia search engine for full document retrieval and cross media browsing. In Proceedings of the sixth international workshop on advances in multimedia information systems (pp. 72–81). Ortega-Binderberger, M., Mehrotra, S., Chakrabarti, K., Porkaew, K. (2000). Webmars: a multimedia search engine for full document retrieval and cross media browsing. In Proceedings of the sixth international workshop on advances in multimedia information systems (pp. 72–81).
Zurück zum Zitat Panofsky, E. (1962). Studies in iconology. New York: Harper & Row. Panofsky, E. (1962). Studies in iconology. New York: Harper & Row.
Zurück zum Zitat Pedersen, T., & Kolhatkar, V. (2009). Wordnet:: Senserelate:: Allwords: a broad coverage word sense tagger that maximizes semantic relatedness. In Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the association for computational linguistics, companion volume: demonstration session (pp. 17–20). Pedersen, T., & Kolhatkar, V. (2009). Wordnet:: Senserelate:: Allwords: a broad coverage word sense tagger that maximizes semantic relatedness. In Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the association for computational linguistics, companion volume: demonstration session (pp. 17–20).
Pu, H.-T. (2008). An analysis of failed queries for web image retrieval. Journal of Information Science, 34(3), 275–289.
Quack, T., Leibe, B., Van Gool, L. (2008). World-scale mining of objects and events from community photo collections. In Proceedings of the international conference on content-based image and video retrieval (pp. 47–56).
Rege, M., Dong, M., Hua, J. (2008). Graph theoretical framework for simultaneously integrating visual and textual features for efficient web image clustering. In Proceedings of the 17th international conference on World Wide Web (p. 317).
Rorissa, A. (2008). User-generated descriptions of individual images versus labels of groups of images: a comparison using basic level theory. Information Processing & Management, 44(5), 1741–1753.
Rorissa, A. (2010). A comparative study of Flickr tags and index terms in a general image collection. Journal of the American Society for Information Science and Technology, 61(11), 2230–2242.
Sahuguet, A., & Azavant, F. (2001). Building intelligent web applications using lightweight wrappers. Data & Knowledge Engineering, 36(3), 283–316.
Sclaroff, S., Cascia, M.L., Sethi, S. (1999). Unifying textual and visual cues for content-based image retrieval on the World Wide Web. Computer Vision and Image Understanding, 75, 86–98.
Shatford, S. (1986). Analyzing the subject of a picture: a theoretical approach. Cataloging & Classification Quarterly, 6(3), 39–62.
Shen, H.T., Ooi, B.C., Tan, K.-L. (2000). Giving meanings to WWW images. In Proceedings of the 8th annual ACM international conference on multimedia (pp. 39–47).
Simon, I., Snavely, N., Seitz, S.M. (2007). Scene summarization for online image collections. In Proceedings of IEEE 11th international conference on computer vision (pp. 1–8).
Smeulders, A.W.M., Worring, M., Santini, S., Gupta, A., Jain, R. (2000). Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1349–1380.
Smith, J.R., & Chang, S.F. (1997). An image and video search engine for the world-wide web. In Symposium on electronic imaging: science and technology-storage & retrieval for image and video databases V.
Spengler, A., & Gallinari, P. (2009). Learning to extract content from news web pages. In Proceedings of the 2009 international conference on advanced information networking and applications workshop (pp. 709–714).
Tang, J., Yan, S., Hong, R., Qi, G. (2009). Inferring semantic concepts from community-contributed images and noisy tags. In Proceedings of the 17th ACM international conference on multimedia (p. 223).
Tian, G., Guan, G., Wang, Z., Feng, D. (2012). What is happening: annotating images with verbs. In Proceedings of the 20th ACM international conference on multimedia-MULTIMEDIA 2012 (pp. 1077–1080).
Toyama, K., Logan, R., Roseway, A. (2003). Geographic location tags on digital images. In Proceedings of the 11th ACM international conference on multimedia-MULTIMEDIA 2003 (pp. 156–166).
Tryfou, G., & Tsapatsoulis, N. (2012). Extraction of web image information: semantic or visual cues? In Proceedings of the 8th artificial intelligence applications and innovations conference-AIAI 2012 (pp. 368–373).
Wang, J., & Lochovsky, F.H. (2003). Data extraction and label assignment for web databases. In Proceedings of the 12th international conference on World Wide Web (pp. 187–196).
Wang, X.-J., Ma, W.-Y., Xue, G.-R., Li, X. (2004). Multi-model similarity propagation and its application for web image retrieval. In Proceedings of the 12th annual ACM international conference on multimedia. New York. doi:10.1145/1027527.1027746.
Wang, X. J., Ma, W. Y., Zhang, L., Li, X. (2005). Iteratively clustering web images based on link and attribute reinforcements. In Proceedings of the ACM international conference on multimedia (pp. 122–131).
Wang, C., Zhang, L., Zhang, H.-J. (2008). Learning to reduce the semantic gap in web image retrieval and annotation. In Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval - SIGIR 2008 (p. 355).
Westerveld, T. (2000). Image retrieval: content versus context. In Content-based multimedia information access, RIAO.
Yang, K. et al. (2011). Tag tagging: towards more descriptive keywords of image content. IEEE Transactions on Multimedia, 13(4), 662–673.
Yee, K. P., Swearingen, K., Li, K., Hearst, M. (2003). Faceted metadata for image search and browsing. In Proceedings of SIGCHI (pp. 401–408).
Zhai, Y., & Liu, B. (2005). Web data extraction based on partial tree alignment. In Proceedings of the 14th international conference on World Wide Web (pp. 76–85).
Zheng, Y.T., Zhao, M., Song, Y., Adam, H., Buddemeier, U., Bissacco, A., Brucher, F., et al. (2009). Tour the world: building a web-scale landmark recognition engine. In Proceedings of IEEE conference on computer vision and pattern recognition (pp. 1085–1092).
Metadata
Title
Image understanding and the web: a state-of-the-art review
Authors
Fariza Fauzi
Mohammed Belkhatir
Publication date
1 October 2014
Publisher
Springer US
Published in
Journal of Intelligent Information Systems / Issue 2/2014
Print ISSN: 0925-9902
Electronic ISSN: 1573-7675
DOI
https://doi.org/10.1007/s10844-014-0323-6
