Research article, DOI: 10.1145/2675133.2675235

There's No Such Thing as the Perfect Map: Quantifying Bias in Spatial Crowd-sourcing Datasets

Published: 28 February 2015

ABSTRACT

Crowd-sourcing has become a popular form of computer-mediated collaborative work, and OpenStreetMap is one of the most successful crowd-sourcing systems: its goal of building and maintaining an accurate global map of the world is being accomplished through contributions made by over 1.2M citizens. However, within this apparently large crowd, a tiny group of highly active users is responsible for mapping almost all the content. One may thus wonder to what extent the information being mapped is biased towards the interests and agenda of this group of users. In this paper, we present a method to quantitatively measure content bias in crowd-sourced geographic information. We then apply the method to quantify content bias across a three-year period of OpenStreetMap mapping in 40 countries. We find almost no content bias in terms of what is being mapped, but significant geographic bias; furthermore, we find that bias in terms of meticulousness varies with culture.


        • Published in

          CSCW '15: Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing
          February 2015
          1956 pages
          ISBN:9781450329224
          DOI:10.1145/2675133

          Copyright © 2015 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          CSCW '15 Paper Acceptance Rate: 161 of 575 submissions, 28%. Overall Acceptance Rate: 2,235 of 8,521 submissions, 26%.
