research-article

New perspectives and methods in link prediction

Authors:
Ryan N. Lichtenwalter

The University of Notre Dame, South Bend, IN, USA

The University of Notre Dame, South Bend, IN, USA
View Profile

,
Jake T. Lussier

The University of Notre Dame, South Bend, IN, USA

The University of Notre Dame, South Bend, IN, USA
View Profile

,
Nitesh V. Chawla

The University of Notre Dame, South Bend, IN, USA

The University of Notre Dame, South Bend, IN, USA
View Profile

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data miningJuly 2010Pages 243–252https://doi.org/10.1145/1835804.1835837

Published:25 July 2010Publication History

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 243–252

ABSTRACT

This paper examines important factors for link prediction in networks and provides a general, high-performance framework for the prediction task. Link prediction in sparse networks presents a significant challenge due to the inherent disproportion of links that can form to links that do form. Previous research has typically approached this as an unsupervised problem. While this is not the first work to explore supervised learning, many factors significant in influencing and guiding classification remain unexplored. In this paper, we consider these factors by first motivating the use of a supervised framework through a careful investigation of issues such as network observational period, generality of existing methods, variance reduction, topological causes and degrees of imbalance, and sampling approaches. We also present an effective flow-based predicting algorithm, offer formal bounds on imbalance in sparse network link prediction, and employ an evaluation method appropriate for the observed imbalance. Our careful consideration of the above issues ultimately leads to a completely general framework that outperforms unsupervised link prediction methods by more than 30% AUC.

Supplemental Material

kdd2010_lichtenwalter_npml_01.mov

mov

72 MB

Download

References

L. Adamic and E. Adar. Friends and neighbors on the web. Social Networks, 25:211--230, 2001.Google ScholarCross Ref
M. Al Hasan, V. Chaoji, S. Salem, and M. Zaki. Link prediction using supervised learning. In Workshop on Link Discovery: Issues, Approaches and Apps., 2005.Google Scholar
A.-L. Barab´asi, H. Jeong, Z. N´eda, E. Ravasz, A. Schubert, and T. Vicsek. Evolution of the social network of scientific collaboration. Physica A, 311(3-4):590--614, 2002.Google ScholarCross Ref
L. Breiman. Bagging predictors. Machine Learning, 24(2):123--140, 1996. Google ScholarCross Ref
L. Breiman. Random forests. Machine Learning, 45(1):5--32, 2001. Google ScholarDigital Library
N. V. Chawla, K. W. Bowyer, L. O. Hall, and P. W. Kegelmeyer. Smote: Synthetic minority over-sampling technique. Journal of A.I. Research, 16:341--378, 2002. Google ScholarDigital Library
D. A. Cieslak and N. V. Chawla. Learning decision trees for unbalanced data. In Proc. of the ECML. Springer, 2008.Google Scholar
L. Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):39--43, 1953.Google ScholarCross Ref
H. Kautz, B. Selman, and M. Shah. Referral web: combining social networks and collaborative filtering. Communications of the ACM, 40(3):63, 1997. Google ScholarDigital Library
V. E. Krebs. Mapping networks of terrorist cells. Connections, 24(3):43--52, 2002.Google Scholar
D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7):1019--1031, 2007. Google ScholarCross Ref
M. E. J. Newman. Clustering and preferential attachment in growing networks. Physical Review Letters E, 64, 2001.Google Scholar
F. Provost and T. Fawcett. Robust classification for imprecise environments. Machine Learning, 42(3):203--231, 2001. Google ScholarDigital Library
J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993. Google ScholarDigital Library
M. J. Rattigan and D. Jensen. The case for anomalous link discovery. SIGKDD Explorations Newsletter, 7(2):41--47, 2005. Google ScholarDigital Library
M. P. Stumpf, C. Wiuf, and R. M. May. Subnets of scale-free networks are not scale-free: sampling properties of networks. Proc. of the Nat Acad. of Sci., 102(12):4221--4224, 2005.Google ScholarCross Ref
C. Wang, V. Satuluri, and S. Parthasarathy. Local probabilistic models for link prediction. In Proc. of the 2007 7th IEEE ICDM, pages 322--331, Washington, D.C., USA, 2007. IEEE Computer Society. Google ScholarDigital Library
I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, San Francisco, California, USA, second edition, 2005.. Google ScholarDigital Library

Index Terms

New perspectives and methods in link prediction
1. Information systems
  1. Information systems applications
    1. Data mining

Recommendations

Evaluating link prediction methods

Link prediction is a popular research area with important applications in a variety of disciplines, including biology, social science, security, and medicine. The fundamental requirement of link prediction is the accurate and effective prediction of new ...
Read More
Noise-Enhanced Unsupervised Link Prediction
Advances in Knowledge Discovery and Data Mining
Abstract
Link prediction has attracted attention from multiple research areas. Although several – mostly unsupervised – link prediction methods have been proposed, improving them is still under study. In several fields of science, noise is used as an ...
Read More
Link Prediction: Fair and Effective Evaluation
ASONAM '12: Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)

Link prediction is a popular area for publication. Papers appear in virtually every conference on data mining or network science with new methods. We argue that the practical performance potential of these methods is generally unknown because of ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010
1240 pages
ISBN:9781450300551
DOI:10.1145/1835804
General Chairs:
Bharat Rao
Siemens
,
Balaji Krishnapuram
Siemens
,
Program Chairs:
Andrew Tomkins
Google Inc.
,
Qiang Yang
Hong Kong University of Science and Technology
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 25 July 2010
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
class imbalance
link prediction
machine learning
networks
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate1,133of8,635submissions,13%
Upcoming Conference
KDD '24

Sponsor:

sigkdd

sigkdd

The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 25 - 29, 2024

Barcelona , Spain
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 470
  Total Citations
  View Citations
- 3,879
  Total Downloads
- Downloads (Last 12 months)238
- Downloads (Last 6 weeks)32
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

New perspectives and methods in link prediction

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining

ABSTRACT

Supplemental Material

References

Cited By

Index Terms

Recommendations

Evaluating link prediction methods

Noise-Enhanced Unsupervised Link Prediction

Link Prediction: Fair and Effective Evaluation