ABSTRACT
Discovering users' specific and implicit geographic intent in web search can greatly help satisfy their information needs. We build a geo-intent analysis system that learns a model from large volumes of web search logs with minimal supervision. At its core is a city language model: a probabilistic representation of the language surrounding mentions of a city in web queries. We use several features derived from these language models to (1) identify users' implicit geo intent and pinpoint the city corresponding to that intent, (2) determine whether the geo intent is localized around the user's current geographic location, and (3) predict cities for queries that mention an entity located in a specific place. Experimental results demonstrate the effectiveness of features derived from the city language model. We find that (1) the system achieves over 90% precision and more than 74% accuracy in detecting users' implicit city-level geo intent; (2) the system achieves more than 96% accuracy in determining whether implicit geo queries are local geo queries, neighbor-region geo queries, or neither; and (3) the city language model retrieves cities in location-specific queries with high precision (88%) and recall (74%), and human evaluation shows that it predicts city labels for location-specific queries with high accuracy (84.5%).
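To make the idea of a city language model concrete, the following is a minimal sketch, not the system described above. It assumes a unigram model per city, estimated from the words that co-occur with an explicit city mention in logged queries, and Jelinek-Mercer interpolation with a background model; the smoothing choice, the weight `lam`, the function names, and the toy query log are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def build_city_models(queries, cities):
    """Count the words that co-occur with each city name in the query log."""
    models = defaultdict(Counter)
    for query in queries:
        text = query.lower()
        for city in cities:
            # Naive substring match; a real system would need proper
            # place-name tagging and disambiguation (assumption).
            if city in text:
                context = text.replace(city, " ").split()
                models[city].update(context)
    return models


def log_prob(query, model, background, lam=0.8):
    """Smoothed log P(query | city) under a unigram model.

    Interpolates the city model with an add-one background model
    (Jelinek-Mercer style); lam is an assumed, untuned weight.
    """
    city_total = sum(model.values())
    bg_total = sum(background.values())
    vocab = len(background)
    score = 0.0
    for word in query.lower().split():
        p_city = model[word] / city_total if city_total else 0.0
        p_bg = (background[word] + 1) / (bg_total + vocab + 1)
        score += math.log(lam * p_city + (1 - lam) * p_bg)
    return score


def predict_city(query, models, background):
    """Return the city whose language model best explains the query."""
    return max(models, key=lambda c: log_prob(query, models[c], background))


# Toy query log with explicit city mentions (hypothetical data).
toy_queries = [
    "space needle tickets seattle",
    "seattle pike place market hours",
    "golden gate bridge san francisco",
    "san francisco alcatraz tours",
]
toy_cities = ["seattle", "san francisco"]

city_models = build_city_models(toy_queries, toy_cities)
background_model = Counter()
for counts in city_models.values():
    background_model.update(counts)

# An implicit geo query: no city is named, but the wording points to one.
predicted = predict_city("space needle hours", city_models, background_model)
```

In this sketch the per-city log-likelihood plays the role of one feature; the system described in the abstract derives several features from the language models and feeds them to learned classifiers, which is not reproduced here.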