research-article

General Graph Data De-Anonymization: From Mobility Traces to Social Networks

Authors:
Shouling Ji

Georgia Institute of Technology, Atlanta, GA, USA

Georgia Institute of Technology, Atlanta, GA, USA
View Profile

,
Weiqing Li

Georgia Institute of Technology, Atlanta, GA, USA

Georgia Institute of Technology, Atlanta, GA, USA
View Profile

,
Mudhakar Srivatsa

IBM T. J. Watson Research Center, NY, USA

IBM T. J. Watson Research Center, NY, USA
View Profile

,
Jing Selena He

Kennesaw State University, GA, USA

Kennesaw State University, GA, USA
View Profile

,
Raheem Beyah

Georgia Institute of Technology, Atlanta, GA, USA

Georgia Institute of Technology, Atlanta, GA, USA
View Profile

ACM Transactions on Information and System Security Volume 18 Issue 4Article No.: 12pp 1–29https://doi.org/10.1145/2894760

Published:21 April 2016Publication History

ACM Transactions on Information and System Security

Abstract

When people utilize social applications and services, their privacy suffers a potential serious threat. In this article, we present a novel, robust, and effective de-anonymization attack to mobility trace data and social data. First, we design a Unified Similarity (US) measurement, which takes account of local and global structural characteristics of data, information obtained from auxiliary data, and knowledge inherited from ongoing de-anonymization results. By analyzing the measurement on real datasets, we find that some data can potentially be de-anonymized accurately and the other can be de-anonymized in a coarse granularity. Utilizing this property, we present a US-based De-Anonymization (DA) framework, which iteratively de-anonymizes data with accuracy guarantee. Then, to de-anonymize large-scale data without knowledge of the overlap size between the anonymized data and the auxiliary data, we generalize DA to an Adaptive De-Anonymization (ADA) framework. By smartly working on two core matching subgraphs, ADA achieves high de-anonymization accuracy and reduces computational overhead. Finally, we examine the presented de-anonymization attack on three well-known mobility traces: St Andrews, Infocom06, and Smallblue, and three social datasets: ArnetMiner, Google+, and Facebook. The experimental results demonstrate that the presented de-anonymization framework is very effective and robust to noise.

The source code and employed datasets are now publicly available at SecGraph [2015].

References

L. Alvisi, A. Clement, A. Epasto, S. Lattanzi, and A. Panconesi. 2013. SoK: The evolution of Sybil defense via social networks. In S&P. Google ScholarDigital Library
L. Backstrom, C. Dwork, and J. Kleinberg. 2007. Wherefore art thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. In WWW. Google ScholarDigital Library
S. C. Basak, V. R. Magnuson, G. J. Niemi, and R. R. Regal. 1988. Determining structural similarity of chemicals using graph-theoretic indices. Discrete Appl. Math. 19 (1988), 17--44. Google ScholarDigital Library
A. Bavelas. 1950. Communication patterns in task-oriented groups. J. Acoust. Soc. Am 22 (1950), 725--730.Google ScholarCross Ref
G. Bigwood, D. Rehunathan, M. Bateman, T. Henderson, and S. Bhatti. 2011. CRAWDAD dataset st_andrews/sassy. Retrieved from http:/crawdad.cs.dartmouth.edu/st_andrews/sassy.Google Scholar
P. Boldi and S. Vigna. 2014. Axioms for centrality. Internet Math. 10, 3--4 (2014), 222--262.Google ScholarCross Ref
U. Brandes and J. Lerner. 2004. Structural similarity in graphs: A relaxation approach for role assignment. LNCS 3341 (2004), 184--195. Google ScholarDigital Library
A. Campan and T. M. Truta. 2008. A clustering approach for data and structural anonymity in social networks. In PinKDD.Google Scholar
Centrality. 2015. Centrality - Wikipedia, the free encyclopedia. Retrieved from https://en.wikipedia.org/wiki/Centrality.Google Scholar
CRAWDAD. 2014. Retrieved from http://crawdad.cs.dartmouth.edu/.Google Scholar
M. Egele, C. Kruegel, E. Kirda, and G. Vigna. 2011. PiOS: Detecting privacy leaks in iOS applications. NDSS (2011).Google Scholar
L. C. Freeeman. 1978. Centrality in social networks: Conceptual clarification. Social Netw. 1 (1978), 215--239.Google ScholarCross Ref
M. Garg. 2009. Axiomatic foundations of centrality in networks. Social Sci. Res. Netw. (2009), 1--37.Google Scholar
N. Z. Gong, A. Talwalkar, L. Mackey, L. Huang, E. C. R. Shin, E. Stefanov, E. Shi, and D. Song. 2012a. Jointly predicting links and inferring attributes using a social-attribute network (SAN). In SNA-KDD.Google Scholar
N. Z. Gong, W. Xu, L. Huang, P. Mittal, E. Stefanov, V. Sekar, and D. Song. 2012b. Evolution of social-attribute networks: Measurements, modeling, and implications using Google+. In IMC. Google ScholarDigital Library
R. Hanneman and M. Riddle. 2005. Introduction to Social Network Methods: Table of Contents. Retrieved from http://faculty.ucr.edu/&sim;hanneman/nettext/.Google Scholar
M. Hay, G. Miklau, D. Jensen, D. Towsley, and P. Weis. 2008. Resisting structural re-identification in anonymized social networks. In VLDB. Google ScholarDigital Library
P. Hornyack, S. Han, J. Jung, S. Schechter, and D. Wetherall. 2011. These aren’t the droids you’re looking for: Retrofitting android to protect data from imperious applications. In CCS. Google ScholarDigital Library
S. Ji, W. Li, P. Mittal, X. Hu, and R. Beyah. 2014. Structural data de-anonymization: Quantification, practice, and implications. In CCS. Google ScholarDigital Library
S. Ji, W. Li, P. Mittal, X. Hu, and R. Beyah. 2015a. On your social network de-anonymizablity: Quantification and large scale evaluation with seed knowledge. In NDSS.Google Scholar
S. Ji, W. Li, P. Mittal, X. Hu, and R. Beyah. 2015b. SecGraph: A uniform and open-source evaluation system for graph data anonymization and de-anonymization. In USENIX Security. Google ScholarDigital Library
J. Kruskal and M. Wish. 1978. Multidimensional Scaling. Sage Publications.Google Scholar
K. Liu and E. Terzi. 2008. Towards identity anonymization on graphs. SIGMOD (2008). Google ScholarDigital Library
J. McAuley and J. Leskovec. 2012. Learning to discover social circles in ego networks. In NIPS.Google Scholar
A. Narayanan and V. Shmatikov. 2008. Robust de-anonymization of large sparse datasets (de-anonymizing the Netflix prize dataset). In S&P. Google ScholarDigital Library
A. Narayanan and V. Shmatikov. 2009. De-anonymizing social networks. In S&P. Google ScholarDigital Library
M. E. J. Newman. 2010. Networks: An Introduction. Oxford University Press. Google ScholarCross Ref
S. Nilizadeh, A. Kapadia, and Y.-Y. Ahn. 2014. Community-enhanced de-anonymization of online social networks. In CCS. Google ScholarDigital Library
T. Opsahl. 2010. Closeness centrality in networks with disconnected components | Tore Opsahl. Retrieved from http://toreopsahl.com/2010/03/20/closeness-centrality-in-networks-with-disconnected-components/.Google Scholar
T. Opsahl, F. Agneessens, and J. Skvoretz. 2010. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Netw. 32 (2010), 245--251.Google ScholarCross Ref
Y. Rochat. 2009. Closeness centrality extended to unconnected graphs: The harmonic centrality index. In ASNA. 1--14.Google Scholar
J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau. 2009. CRAWDAD dataset cambridge/haggle. Retrieved from http://crawdad.cs.dartmouth.edu/cambridge/haggle.Google Scholar
SecGraph. 2015. Retrieved from http://www.ece.gatech.edu/cap/secgraph/.Google Scholar
Similarity. 2015. Similarity (network science) - Wikipedia, the free encyclopedia. Retrieved from https://en.wikipedia.org/wiki/Similarity\_(network_science).Google Scholar
K. Singh, S. Bhola, and W. Lee. 2009. xBook: Redesigning privacy control in social networking platforms. In USENIX. Google ScholarDigital Library
Smallblue. 2009. SmallBlue Research Projects. Retrieved from http://domino.research.ibm.com/comm/research_projects.nsf/pages/smallblue.index.html.Google Scholar
SNAP. 2014. Stanford Large Network Dataset Collection. Retrieved from http://snap.stanford.edu/data/.Google Scholar
M. Srivatsa and M. Hicks. 2012. Deanonymizing mobility traces: Using social networks as a side-channel. In CCS. Google ScholarDigital Library
J. Tang, J. Zhang, L. Yao, J. Li, L. Zhang, and Z. Su. 2008. ArnetMiner: Extraction and mining of academic social networks. In KDD. Google ScholarDigital Library
B. Viswanath, A. N. Mislove, M. Cha, and K. P. Gummadi. 2009. On the evolution of user interaction in facebook. In WOSN. Google ScholarDigital Library
H. Yu, P. B. Gibbons, M. Kaminsky, and F. Xiao. 2008a. SybilLimit: A near-optimal social network defense against Sybil attacks. In S&P. Google ScholarDigital Library
H. Yu, M. Kaminsky, P. B. Gibbons, and A. D. Flaxman. 2008b. SybilGuard: Defending against Sybil attacks via social networks. IEEE/ACM Trans. Netw. 16, 3 (2008), 576--589. Google ScholarDigital Library
H. Yu, C. Shi, M. Kaminsky, P. B. Gibbons, and F. Xiao. 2009. DSybil: Optimal Sybil-resistance for recommendation systems. In S&P. Google ScholarDigital Library
E. Zheleva and L. Getoor. 2007. Preserving the privacy of sensitive relationships in graph data. In PinKDD. Google ScholarDigital Library
Y. Zhou, H. Cheng, and J. X. Yu. 2009. Graph clustering based on structural/attribute similarities. In VLDB. Google ScholarDigital Library

Index Terms

General Graph Data De-Anonymization: From Mobility Traces to Social Networks
1. Security and privacy
  1. Database and storage security
    1. Data anonymization and sanitization
  2. Security services
    1. Pseudonymity, anonymity and untraceability

Recommendations

An Automated Social Graph De-anonymization Technique
WPES '14: Proceedings of the 13th Workshop on Privacy in the Electronic Society

We present a generic and automated approach to re-identifying nodes in anonymized social networks which enables novel anonymization techniques to be quickly evaluated. It uses machine learning (decision forests) to matching pairs of nodes in disparate ...
Read More
Structural Data De-anonymization: Quantification, Practice, and Implications
CCS '14: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security

In this paper, we study the quantification, practice, and implications of structural data (e.g., social data, mobility traces) De-Anonymization (DA). First, we address several open problems in structural data DA by quantifying perfect and (1-ε)-perfect ...
Read More
POSTER: Evaluating Privacy Metrics for Graph Anonymization and De-anonymization
ASIACCS '18: Proceedings of the 2018 on Asia Conference on Computer and Communications Security

Many modern communication systems generate graph data, for example social networks and email networks. Such graph data can be used for recommender systems and data mining. However, because graph data contains sensitive information about individuals, ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Information and System Security Volume 18, Issue 4
May 2016
88 pages
ISSN:1094-9224
EISSN:1557-7406
DOI:10.1145/2928292
Editor:
David Basin
ETH Zurich, Switzerland
Issue’s Table of Contents
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 21 April 2016
- Accepted: 1 February 2016
- Revised: 1 November 2015
- Received: 1 January 2015
Published in tissec Volume 18, Issue 4

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Graph de-anonymization
mobility traces
social networks
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 22
  Total Citations
  View Citations
- 1,151
  Total Downloads
- Downloads (Last 12 months)33
- Downloads (Last 6 weeks)5
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

General Graph Data De-Anonymization: From Mobility Traces to Social Networks

ACM Transactions on Information and System Security

Abstract

References

Cited By

Index Terms

Recommendations

An Automated Social Graph De-anonymization Technique

Structural Data De-anonymization: Quantification, Practice, and Implications

POSTER: Evaluating Privacy Metrics for Graph Anonymization and De-anonymization