DOI: 10.1145/2817946.2817968
research-article

"I don't have a photograph, but you can have my footprints.": Revealing the Demographics of Location Data

Published: 02 November 2015

ABSTRACT

Location data are routinely available to a plethora of mobile apps and third-party web services. The resulting datasets are increasingly available to advertisers for targeting and are also requested by governmental agencies for law enforcement purposes. While the re-identification risk of such data has been widely reported, the discriminative power of mobility has received much less attention. In this study we fill this void with an open and reproducible method. We explore how the growing number of geotagged footprints left behind by social network users in photo-sharing services can be used to infer demographic information from mobility patterns. Chief among these, we provide the first detailed analysis of ethnic mobility patterns in two metropolitan areas. This analysis allows us to examine questions pertaining to spatial segregation and the extent to which ethnicity can be inferred using only location data. Our results reveal that even a few location records at a coarse grain can be sufficient for simple algorithms to draw an accurate inference. Our method generalizes to other features, such as gender, offering for the first time a general approach to evaluating the discriminative risks associated with location-enabled personalization.


Published in

COSN '15: Proceedings of the 2015 ACM on Conference on Online Social Networks
November 2015
280 pages
ISBN: 9781450339513
DOI: 10.1145/2817946

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

COSN '15 paper acceptance rate: 22 of 82 submissions, 27%
Overall acceptance rate: 69 of 307 submissions, 22%
