Abstract
One potential problem with volunteered geographic information (VGI) is ensuring its quality. Innocent mistakes and intentional falsehoods can reduce not only the quality of the information but also people’s confidence in VGI as a legitimate source of data. We present a VGI case study that addresses the quality problem by aggregating input from many different people. Specifically, we present a technique for maintaining a comprehensive list of points of interest (POI) for digital maps. Maintaining such a list is traditionally difficult because new POI are continually created, some POI are known only locally, and some POI have multiple names. We address this problem by exploiting map annotations contributed by ordinary users of online maps. Our institution’s mapping Web site allows users to create arbitrary collections of geographically anchored pushpins annotated with text. Our data mining solution finds geometric clusters of these pushpins and examines the pushpins’ text and other features for likely POI names. For instance, if a given text phrase is mentioned frequently in a cluster but infrequently elsewhere, our confidence that the phrase names a POI increases. We tested the quality of our results by asking 100 local residents whether the POI we found were correct, and this user study indicated that we were generally successful. We also show how the same user-annotated pushpins can be used to assess the popularity of existing POI, which helps decide which ones to display on a map.
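The phrase-frequency intuition in the abstract (frequent inside a cluster, infrequent elsewhere) resembles TF-IDF term weighting. The following is a minimal sketch of that idea only, not the authors' actual scoring method, which the abstract does not specify. The function names `ngrams` and `score_phrases`, the use of word n-grams up to length 3, the logarithmic weighting, and the sample annotations are all assumptions made for illustration; the pushpins are assumed to have been grouped into geometric clusters already.

```python
import math
from collections import Counter


def ngrams(text, max_n=3):
    """Return the set of word n-grams (n = 1..max_n) in a lowercased annotation."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    }


def score_phrases(clusters, max_n=3):
    """Score candidate phrases per cluster with a TF-IDF-style weight.

    clusters: list of clusters, each a list of pushpin annotation strings.
    Returns one dict per cluster mapping phrase -> score.
    """
    # Term frequency: number of pushpins in the cluster that mention the phrase.
    tf_per_cluster = []
    for pins in clusters:
        tf = Counter()
        for text in pins:
            tf.update(ngrams(text, max_n))
        tf_per_cluster.append(tf)

    # Document frequency: number of clusters in which the phrase appears at all.
    df = Counter()
    for tf in tf_per_cluster:
        df.update(tf.keys())

    # A phrase that appears in every cluster gets log(1) = 0, so it scores zero.
    n_clusters = len(clusters)
    return [
        {p: count * math.log(n_clusters / df[p]) for p, count in tf.items()}
        for tf in tf_per_cluster
    ]


# Hypothetical annotations, already grouped into three geometric clusters.
clusters = [
    ["lunch at space needle", "space needle view", "parking near space needle"],
    ["pike place market", "coffee near pike place market", "lunch spot"],
    ["lunch downtown", "parking garage", "coffee shop"],
]
scores = score_phrases(clusters)
```

In this toy data, "space needle" occurs in all three pushpins of the first cluster and in no other cluster, so it scores well above "lunch", which occurs in every cluster and therefore scores zero. A real system would also need to break ties between a phrase and its sub-phrases (for example "space" versus "space needle") and combine this score with the other pushpin features the abstract mentions.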
Cite this article
Mummidi, L.N., Krumm, J. Discovering points of interest from users’ map annotations. GeoJournal 72, 215–227 (2008). https://doi.org/10.1007/s10708-008-9181-5