DOI: 10.1145/1242572.1242675

Measuring semantic similarity between words using web search engines

Published: 08 May 2007

Published in

WWW '07: Proceedings of the 16th International Conference on World Wide Web
May 2007
1382 pages
ISBN: 9781595936547
DOI: 10.1145/1242572

Copyright © 2007 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions, 23%
