research-article

DPLink: User Identity Linkage via Deep Neural Network From Heterogeneous Mobility Data

Authors:
Jie Feng (Tsinghua University, China)
Mingyang Zhang (Tsinghua University, China)
Huandong Wang (Tsinghua University, China)
Zeyu Yang (Tsinghua University, China)
Chao Zhang (University of Illinois at Urbana-Champaign, USA)
Yong Li (Tsinghua University, China)
Depeng Jin (Tsinghua University, China)

WWW '19: The World Wide Web Conference, May 2019, Pages 459–469, https://doi.org/10.1145/3308558.3313424

Published: 13 May 2019

ABSTRACT

Online services play critical roles in almost all aspects of users' lives. Users usually have multiple online identities (IDs) across different online services. To fuse the separated user data from multiple services for better business intelligence, service providers need to link the online IDs that belong to the same user. The popularity of mobile networks and GPS-equipped smart devices has provided a generic way to link IDs, namely by using the mobility traces of IDs. However, linking IDs based on their mobility traces remains challenging because mobility data across services is highly heterogeneous, incomplete, and noisy.

In this paper, we propose DPLink, an end-to-end deep learning framework for user identity linkage over heterogeneous mobility data collected from different services with different properties. DPLink consists of a feature extractor, which includes a location encoder and a trajectory encoder that extract representative features from a trajectory, and a comparator that compares two trajectories and decides whether to link them as the same user. In particular, we propose a pre-training strategy with a simple task to overcome the training difficulties introduced by the highly heterogeneous nature of mobility data from different sources. In addition, we introduce a multi-modal embedding network and a co-attention mechanism in DPLink to cope with the low quality of mobility data. Through extensive experiments on two real-life ground-truth mobility datasets against eight baselines, we demonstrate that DPLink outperforms state-of-the-art solutions by more than 15% in terms of hit-precision. Moreover, it can be extended with external geographical context data and works stably with heterogeneous, noisy mobility traces. Our code is publicly available.
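
The data flow described above (location encoder, then trajectory encoder, then comparator) can be illustrated with a minimal structural sketch. This is not the DPLink implementation: every function name and constant below is hypothetical, the learned components are replaced by simple non-learned stand-ins (a fixed pseudo-random vector per location, mean pooling instead of a recurrent encoder with co-attention, and cosine similarity with a fixed threshold instead of a trained comparator), and trajectories are assumed to be plain lists of integer location IDs.

```python
import math
import random
from typing import List, Tuple

# Hypothetical names and values; none are taken from the DPLink code release.
EMBED_DIM = 8


def encode_location(location_id: int) -> List[float]:
    """Stand-in location encoder: a fixed pseudo-random vector per location ID.

    A learned model would use a trainable embedding here instead.
    """
    rng = random.Random(location_id)  # the same ID always yields the same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


def encode_trajectory(trajectory: List[int]) -> List[float]:
    """Stand-in trajectory encoder: mean of the location vectors.

    This averaging step is only a placeholder for a learned sequence encoder.
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one location")
    vectors = [encode_location(loc) for loc in trajectory]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def compare(traj_a: List[int], traj_b: List[int],
            threshold: float = 0.8) -> Tuple[float, bool]:
    """Stand-in comparator: cosine similarity plus an assumed fixed threshold.

    Returns the similarity score and the link / no-link decision.
    """
    a = encode_trajectory(traj_a)
    b = encode_trajectory(traj_b)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    score = dot / norm if norm else 0.0
    return score, score >= threshold


if __name__ == "__main__":
    # Two traces over the same places should score close to 1.0.
    print(compare([101, 102, 103], [101, 102, 103]))
    # A trace over different places gets whatever score the stand-ins produce.
    print(compare([101, 102, 103], [900, 901, 902]))
```

The sketch only mirrors the shape of the pipeline: two trajectories are encoded independently by a shared feature extractor and a separate comparator produces the linkage decision. The pre-training strategy, multi-modal embedding, and co-attention mentioned in the abstract are not modeled here.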

Published in
WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN: 9781450366748
DOI: 10.1145/3308558
Editors: Ling Liu (Georgia Tech, USA), Ryen White (Microsoft Research, USA)
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep learning
mobility trajectory
user identity linkage
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%