DOI: 10.1145/2623330.2623681
research-article

Modeling human location data with mixtures of kernel densities

Published: 24 August 2014

ABSTRACT

Location-based data is increasingly prevalent with the rapid increase and adoption of mobile devices. In this paper we address the problem of learning spatial density models, focusing specifically on individual-level data. Modeling and predicting a spatial distribution for an individual is a challenging problem given both (a) the typical sparsity of data at the individual level and (b) the heterogeneity of spatial mobility patterns across individuals. We investigate the application of kernel density estimation (KDE) to this problem using a mixture model approach that can interpolate between an individual's data and broader patterns in the population as a whole. The mixture-KDE approach is evaluated on two large geolocation/check-in data sets, from Twitter and Gowalla, with comparisons to non-KDE baselines, using both log-likelihood and detection of simulated identity theft as evaluation metrics. Our experimental results indicate that the mixture-KDE method provides a useful and accurate methodology for capturing and predicting individual-level spatial patterns in the presence of noisy and sparse data.
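
The interpolation idea in the abstract can be illustrated with a minimal sketch. It assumes isotropic Gaussian kernels, fixed bandwidths, a single fixed mixing weight, and planar coordinates in arbitrary units; none of these choices is stated in the abstract, and the function names (`kde_density`, `mixture_density`, `mean_log_likelihood`) and parameters (`alpha`, `h_ind`, `h_pop`) are hypothetical. It is not the paper's estimator, only one simple way a two-component mixture of kernel densities could be evaluated and scored by log-likelihood.

```python
import math


def kde_density(x, y, points, bandwidth):
    """Evaluate a 2-D isotropic Gaussian KDE at location (x, y)."""
    # Normalising constant of a 2-D isotropic Gaussian with std = bandwidth.
    norm = 2.0 * math.pi * bandwidth ** 2
    total = 0.0
    for px, py in points:
        sq_dist = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-0.5 * sq_dist / bandwidth ** 2) / norm
    return total / len(points)


def mixture_density(x, y, individual_points, population_points,
                    alpha=0.7, h_ind=0.5, h_pop=2.0):
    """Blend an individual's KDE with a population-level KDE.

    alpha, h_ind and h_pop are illustrative assumptions: alpha near 1
    trusts the individual's own (possibly sparse) events, alpha near 0
    falls back on the population pattern.
    """
    p_ind = kde_density(x, y, individual_points, h_ind)
    p_pop = kde_density(x, y, population_points, h_pop)
    return alpha * p_ind + (1.0 - alpha) * p_pop


def mean_log_likelihood(test_points, individual_points, population_points,
                        alpha=0.7, h_ind=0.5, h_pop=2.0):
    """Average log-density of held-out events under the mixture."""
    # Note: a test point very far from all data can underflow to density 0,
    # which would make math.log fail; this sketch does not guard against it.
    logs = [
        math.log(mixture_density(x, y, individual_points, population_points,
                                 alpha, h_ind, h_pop))
        for x, y in test_points
    ]
    return sum(logs) / len(logs)


# Toy data: a sparse individual history and a broader population sample.
individual = [(0.0, 0.0), (0.2, -0.1), (5.0, 5.0)]
population = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.5), (5.0, 5.0), (-3.0, 2.0)]
held_out = [(0.1, 0.0), (4.8, 5.1)]

score = mean_log_likelihood(held_out, individual, population)
```

Higher `score` values mean the held-out events were better predicted. Comparing scores across several values of `alpha` is one simple way to see how much the population component helps when an individual has few events.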

Supplemental Material

p35-sidebyside.mp4 (mp4, 291.4 MB)


Published in

KDD '14: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2014
2028 pages
ISBN: 9781450329569
DOI: 10.1145/2623330

Copyright © 2014 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

KDD '14 paper acceptance rate: 151 of 1,036 submissions (15%)
Overall acceptance rate: 1,133 of 8,635 submissions (13%)
