ABSTRACT
Understanding, representing, and reasoning about Point of Interest (POI) types such as Auto Repair, Body Shop, Gas Station, or Planetarium is a key aspect of geographic information retrieval, recommender systems, and geographic knowledge graphs, as well as of studying urban spaces in general, e.g., for extracting functional or vague cognitive regions from user-generated content. One prerequisite for these tasks is the ability to capture the similarity and relatedness between POI types. Intuitively, a spatial search that returns body shops or even gas stations in the absence of auto repair places is still likely to satisfy some user needs, while returning planetariums will not. Place hierarchies are frequently used for query expansion, but most existing hierarchies are relatively shallow and structured from a single perspective, thereby putting POI types that may be closely related with respect to some characteristics far apart from one another. This leads to the question of how to learn POI type representations from data. Models such as Word2Vec, which produce word embeddings from linguistic contexts, are a novel and promising approach, as they come with an intuitive notion of similarity. However, the structure of geographic space, e.g., the interactions between POI types, differs substantially from that of language. In this work, we present a novel method to augment the spatial contexts of POI types using a distance-binned, information-theoretic approach to generate embeddings. We demonstrate that our approach outperforms Word2Vec and other models on three different evaluation tasks and that its results correlate strongly with human assessments of POI type similarity. We have published the resulting embeddings for 570 place types, as well as a collection of human similarity assessments, online for others to use.
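To make the idea of distance-binned context augmentation more concrete, the following is a minimal sketch, not the method of the paper. It assumes that nearby POIs are simply repeated more often as context than distant ones. The toy coordinates, the bin edges, the per-bin repetition counts, and the function name `augmented_pairs` are all hypothetical, and the information-theoretic weighting mentioned in the abstract is not modeled here.

```python
import math
from collections import Counter

# Hypothetical toy POIs: (type, x, y) in metres on a local planar grid.
POIS = [
    ("Auto Repair", 0.0, 0.0),
    ("Body Shop", 30.0, 40.0),       # 50 m from Auto Repair
    ("Gas Station", 120.0, 160.0),   # 200 m from Auto Repair
    ("Planetarium", 600.0, 800.0),   # 1000 m from Auto Repair
]

# Assumed upper bin edges (metres) and repetition counts; illustrative only.
BIN_EDGES = [100.0, 300.0]   # bins: [0, 100) and [100, 300)
BIN_WEIGHTS = [3, 1]         # closer bins contribute more context pairs


def augmented_pairs(pois, bin_edges=BIN_EDGES, bin_weights=BIN_WEIGHTS):
    """Return (center_type, context_type) pairs, repeated per distance bin."""
    pairs = []
    for i, (center_type, cx, cy) in enumerate(pois):
        for j, (context_type, nx, ny) in enumerate(pois):
            if i == j:
                continue
            dist = math.hypot(nx - cx, ny - cy)
            # Use the first bin whose upper edge exceeds the distance;
            # POIs beyond the last edge are not used as context at all.
            for edge, weight in zip(bin_edges, bin_weights):
                if dist < edge:
                    pairs.extend([(center_type, context_type)] * weight)
                    break
    return pairs


pair_counts = Counter(augmented_pairs(POIS))
```

Pairs produced this way could then be passed to a skip-gram style trainer in place of word and context pairs, so that POI types that frequently occur close to each other end up with similar vectors.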
Index Terms
- From ITDL to Place2Vec: Reasoning About Place Type Similarity and Relatedness by Learning Embeddings From Augmented Spatial Contexts