WWW Conference Proceedings, DOI: 10.1145/2488388.2488428
research-article

Exploiting innocuous activity for correlating users across sites

Published: 13 May 2013

ABSTRACT

We study how potential attackers can identify accounts on different social network sites that all belong to the same user, exploiting only innocuous activity that inherently comes with posted content. We examine three specific features on Yelp, Flickr, and Twitter: the geo-location attached to a user's posts, the timestamp of posts, and the user's writing style as captured by language models. We show that among these three features the location of posts is the most powerful feature to identify accounts that belong to the same user in different sites. When we combine all three features, the accuracy of identifying Twitter accounts that belong to a set of Flickr users is comparable to that of existing attacks that exploit usernames. Our attack can identify 37% more accounts than using usernames when we instead correlate Yelp and Twitter. Our results have significant privacy implications as they present a novel class of attacks that exploit users' tendency to assume that, if they maintain different personas with different names, the accounts cannot be linked together; whereas we show that the posts themselves can provide enough information to correlate the accounts.
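
To make the idea of location-based correlation concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not say how location profiles are built or compared, so the grid bucketing, the cosine similarity measure, and the function names `location_profile`, `cosine_similarity`, and `best_match` are all assumptions chosen for illustration. The sketch only shows how geo-tags attached to posts could, in principle, link an account on one site to a candidate account on another.

```python
from collections import Counter
from math import floor, sqrt


def location_profile(posts, cell_deg=0.1):
    """Build a histogram of coarse grid cells from (lat, lon) pairs.

    The 0.1-degree cell size is an illustrative assumption, not a value
    taken from the paper.
    """
    return Counter(
        (floor(lat / cell_deg), floor(lon / cell_deg)) for lat, lon in posts
    )


def cosine_similarity(p, q):
    """Cosine similarity between two sparse histograms (Counters)."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def best_match(target_posts, candidates):
    """Return the candidate account whose location profile is most similar.

    `candidates` maps a (hypothetical) account name to its list of
    (lat, lon) posts on the other site.
    """
    target = location_profile(target_posts)
    return max(
        candidates,
        key=lambda name: cosine_similarity(
            target, location_profile(candidates[name])
        ),
    )


# Hypothetical data: one account on site A, two candidates on site B.
site_a_posts = [(37.775, -122.415), (37.776, -122.416)]
site_b_accounts = {
    "candidate_1": [(37.775, -122.415), (37.776, -122.416), (40.715, -74.005)],
    "candidate_2": [(51.505, -0.125)],
}
print(best_match(site_a_posts, site_b_accounts))
```

A real attack along the lines the abstract describes would also have to combine this score with the timestamp and writing-style features and decide when no candidate matches at all; the sketch leaves both out.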
