research-article

Refining Social Graph Connectivity via Shortcut Edge Addition

Author:
Manos Papagelis

University of Toronto, Toronto ON, Canada

University of Toronto, Toronto ON, Canada
View Profile

ACM Transactions on Knowledge Discovery from Data Volume 10 Issue 2Article No.: 12pp 1–35https://doi.org/10.1145/2757281

Published:12 October 2015Publication History

ACM Transactions on Knowledge Discovery from Data

Abstract

Small changes on the structure of a graph can have a dramatic effect on its connectivity. While in the traditional graph theory, the focus is on well-defined properties of graph connectivity, such as biconnectivity, in the context of a social graph, connectivity is typically manifested by its ability to carry on social processes. In this paper, we consider the problem of adding a small set of nonexisting edges (shortcuts) in a social graph with the main objective of minimizing its characteristic path length. This property determines the average distance between pairs of vertices and essentially controls how broadly information can propagate through a network. We formally define the problem of interest, characterize its hardness and propose a novel method, path screening, which quickly identifies important shortcuts to guide the augmentation of the graph. We devise a sampling-based variant of our method that can scale up the computation in larger graphs. The claims of our methods are formally validated. Through experiments on real and synthetic data, we demonstrate that our methods are a multitude of times faster than standard approaches, their accuracy outperforms sensible baselines and they can ease the spread of information in a network, for a varying range of conditions.

References

Noga Alon. 1986. Eigenvalues and expanders. Combinatorica 6, 2, 83--96. Google ScholarDigital Library
B. Awerbuch. 1985. A new distributed depth-first-search algorithm. Inform. Process. Lett. 20, 3, 147--150.Google ScholarCross Ref
Baruch Awerbuch and Robert G. Gallager. 1985. Distributed BFS algorithms. In FOCS, 250--256. Google ScholarDigital Library
Lars Backstrom and Jure Leskovec. 2011. Supervised random walks: predicting and recommending links in social networks. In WSDM, 635--644. Google ScholarDigital Library
Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. Science 286, 5439, 509--512.Google Scholar
D. Bu, Y. Zhao, L. Cai, H. Xue, X. Zhu, H. Lu, J. Zhang, S. Sun, L. Ling, N. Zhang, G. Li, and R. Chen. 2003. Topological structure analysis of the protein-protein interaction network in budding yeast. Nucl. Acids Res. 31, 9, 2443--2450.Google ScholarCross Ref
K. Mani Chandy and Jayadev Misra. 1982. Distributed computation on graphs: shortest path algorithms. ACM Commun. 25, 11, 833--837. Google ScholarDigital Library
Vineet Chaoji, Sayan Ranu, Rajeev Rastogi, and Rushi Bhatt. 2012. Recommendations to boost content spread in social networks. In WWW. 529--538. Google ScholarDigital Library
Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: simplified data processing on large clusters. In OSDI. 107--113. Google ScholarDigital Library
Erik D. Demaine and Morteza Zadimoghaddam. 2010. Minimizing the diameter of a network using shortcut edges. In SWAT. 420--431. Google ScholarDigital Library
E.W. Dijkstra. 1959. A note on two problems in connexion with graphs. Numerische mathematik 1, 1, 269--271. Google ScholarDigital Library
D. Easley and J. Kleinberg. 2010. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, Cambridge, U.K. Google ScholarDigital Library
Kapali P. Eswaran and Robert Endre Tarjan. 1976. Augmentation problems. SIAM J. Comput. 5, 4, 653--665.Google ScholarDigital Library
Uriel Feige. 1998. A threshold of ln n for approximating set cover. J. ACM 45, 4, 634--652. Google ScholarDigital Library
Yuval Filmus and Justin Ward. 2012. A tight combinatorial algorithm for submodular maximization subject to a matroid constraint. In FOCS. 659--668. Google ScholarDigital Library
Greg N. Frederickson and Joseph JáJá. 1981. Approximation algorithms for several graph augmentation problems. SIAM J. Comput. 10, 2, 270--283.Google ScholarDigital Library
Michael L. Fredman and Robert Endre Tarjan. 1987. Fibonacci heaps and their uses in improved network optimization algorithms. J. ACM 34, 3, 596--615. Google ScholarDigital Library
M. Girvan and M. E. J. Newman. 2002. Community structure in social and biological networks. Proc. Nat. Acad. Sci. 99, 12, 7821--7826.Google ScholarCross Ref
Andrew V. Goldberg and Chris Harrelson. 2005. Computing the shortest path: A search meets graph theory. In SODA. 156--165. Google ScholarDigital Library
Mark Jerrum and Alistair Sinclair. 1989. Approximating the permanent. SIAM J. 18, 6, 1149--1178. Google ScholarDigital Library
Donald B. Johnson. 1977. Efficient algorithms for shortest paths in sparse networks. J. ACM 24, 1, 1--13. Google ScholarDigital Library
Srikanth Kandula and Ratul Mahajan. 2009. Sampling biases in network path measurements and what to do about it. In IMC. 156--169. Google ScholarDigital Library
Michael Kapralov, Ian Post, and Jan Vondrak. 2013. Online submodular welfare maximization: Greedy is optimal. In SODA. 1216--1225. Google ScholarDigital Library
George Karypis and Vipin Kumar. 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20, 1, 359--392. Google ScholarDigital Library
David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In KDD. 137--147. Google ScholarDigital Library
Samir Khuller and Ramakrishna Thurimella. 1993. Approximation algorithms for graph augmentation. J. Algorithms 14, 2, 214--225. Google ScholarDigital Library
Jamie King. 2003. Conductance and rapidly mixing Markov chains. Technical report, University of Waterloo, 2003.Google Scholar
Jon Kleinberg. 2007. Cascading behavior in networks: Algorithmic and economic issues. In Algorithmic Game Theory, Noam Nisan, Tim Roughgarden, Éva Tardos, and Vijay Vazirani (Eds.). Cambridge University Press, Cambridge, U.K.Google Scholar
J. Kleinberg, A. Slivkins, and T. Wexler. 2004. Triangulation and embedding using small sets of beacons. In FOCS. 444--453. Google ScholarDigital Library
Nikolaos Laoutaris, Laura J. Poplawski, Rajmohan Rajaraman, Ravi Sundaram, and Shang-Hua Teng. 2008. Bounded budget connection (BBC) games or how to make friends and influence people, on a budget. In PODC. 165--174. Google ScholarDigital Library
Konstantina Lazaridou, Konstantinos Semertzidis, Evaggelia Pitoura, and Panayiotis Tsaparas. 2015. Identifying converging pairs of nodes on a budget. EDBT. 313--324.Google Scholar
Jure Leskovec and Christos Faloutsos. 2006. Sampling from large graphs. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 631--636. Google ScholarDigital Library
Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance. 2007. Cost-effective outbreak detection in networks. In KDD. 420--429. Google ScholarDigital Library
Jure Leskovec and Andrej Krevl. 2014. SNAP Datasets: Stanford Large Network Dataset Collection. (June 2014). Retrieved from http://snap.stanford.edu/data.Google Scholar
Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. 2008. Statistical properties of community structure in large social and information networks. In WWW. 695--704. Google ScholarDigital Library
David Liben-Nowell and Jon Kleinberg. 2003. The link prediction problem for social networks. In CIKM. 1019--1031. Google ScholarDigital Library
László Lovász. 1993. Random walks on graphs: a survey. Combinatorics, Paul erdos is eighty 2, 1 (1993), 1--46.Google Scholar
David Lusseau, Karsten Schneider, Oliver J. Boisseau, Patti Haase, Elisabeth Slooten, and Steve M Dawson. 2003. The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Beh. Ecol. Sociobiol. 54, 4, 91--125.Google ScholarCross Ref
Karl Menger. 1927. Zur allgemeinen kurventheorie. Fundamenta Mathematicae 10, 1, 96--115.Google ScholarCross Ref
Adam Meyerson and Brian Tagiku. 2009. Minimizing average shortest path distances via shortcut edge addition. In APPROX-RANDOM. Google ScholarDigital Library
Abedelaziz Mohaisen, Aaram Yun, and Yongdae Kim. 2010. Measuring the mixing time of social graphs. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement. 383--389. Google ScholarDigital Library
Stephen Morris. 2000. Contagion. Rev. Economic Stud. 67, 1, 57--78.Google ScholarCross Ref
G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. 1978. An analysis of approximations for maximizing submodular set functions-I. Math. Program. 14, 1, 265--294.Google ScholarDigital Library
M. E. J. Newman. 2005. A measure of betweenness centrality based on random walks. Soc. Netw. 27, 1, 39--54.Google ScholarCross Ref
M. E. J. Newman. 2006. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 3, 036104.Google ScholarCross Ref
Zeev Nutov. 2005. Approximating connectivity augmentation problems. In SODA. 176--185. Google ScholarDigital Library
Christos H. Papadimitriou and Kenneth Steiglitz. 1998. Combinatorial Optimization: Algorithms and Complexity. Courier Corporation, Mineola, New York.Google ScholarDigital Library
Manos Papagelis, Francesco Bonchi, and Aristides Gionis. 2011. Suggesting ghost edges for a smaller world. In CIKM. 2305--2308. Google ScholarDigital Library
Manos Papagelis, Gautam Das, and Nick Koudas. 2013. Sampling online social networks. IEEE Trans. Knowl. Data Eng. 25, 3, 662--676. Google ScholarDigital Library
N. Parotisidis, Evaggelia Pitoura, and Panayiotis Tsaparas. 2015. Selecting shortcuts for a smaller world. In SIAM International Conference on Data Mining (SDM). 28--36.Google ScholarCross Ref
Alexandrin Popescul, Rin Popescul, and Lyle H. Ungar. 2003. Statistical relational learning for link prediction. In IJCAI03 Workshop on Learning Statistical Models from Relational Data. 109--115.Google Scholar
Miao Qiao, Hong Cheng, Lijun Chang, and Jeffrey Xu Yu. 2012. Approximate shortest distance computing: A query-dependent local landmark scheme. In ICDE. 55--68. Google ScholarDigital Library
Matthew Richardson and Pedro Domingos. 2002. Mining knowledge-sharing sites for viral marketing. In KDD. 61--70. Google ScholarDigital Library
K. Shvachko, H. Kuang, S. Radia, and R. Chansler. 2010. The Hadoop distributed file system. In MSST. 1--10. Google ScholarDigital Library
Alistair Sinclair. 1992. Improved Bounds for Mixing Rates of Markov Chains and Multicommodity Flow. Springer, New York.Google ScholarDigital Library
Yuan Tian, Qi He, Qiankun Zhao, Xingjie Liu, and Wang-chien Lee. 2010. Boosting social network connectivity with link revival. In CIKM. 589--598. Google ScholarDigital Library
Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad, Michalis Faloutsos, and Christos Faloutsos. 2012. Gelling, and melting, large graphs by edge manipulation. In CIKM. 245--254. Google ScholarDigital Library
Jan Vondrak. 2008. Optimal approximation for the submodular welfare problem in the value oracle model. In STOC. 67--74. Google ScholarDigital Library
Zhuojie Zhou, Nan Zhang, Zhiguo Gong, and Gautam Das. 2013. Faster random walks by rewiring online social networks on-the-fly. In ICDE. 769--780. Google ScholarDigital Library

Index Terms

Refining Social Graph Connectivity via Shortcut Edge Addition
1. Applied computing
  1. Law, social and behavioral sciences
2. Information systems
  1. Information storage systems
    1. Storage architectures
      1. Storage network architectures
  2. Information systems applications
    1. Data mining

Recommendations

Extra edge connectivity and isoperimetric edge connectivity

An edge set S of a connected graph G is a k-extra edge cut, if G-S is no longer connected, and each component of G-S has at least k vertices. The cardinality of a minimum k-extra edge cut, denoted by @l"k(G), is the k-extra edge connectivity of G. The ...
Read More
Graph connectivity and its augmentation: applications of MA orderings

This paper surveys how the maximum adjacency (MA) ordering of the vertices in a graph can be used to solve various graph problems. We first explain that the minimum cut problem can be solved efficiently by utilizing the MA ordering. The idea is then ...
Read More
Constrained Edge-Splitting Problems

Splitting off two edges su,sv in a graph G means deleting su,sv and adding a new edge uv. Let G=(V+s,E) be k-edge-connected in V ($k\geq 2$) and let d(s) be even. Lovász proved that the edges incident to s can be split off in pairs in a such a way that ...
Read More

Reviews

Reviewer: Salvatore F. Pileggi

Small changes to the structure of graphs imply a huge impact on their connectivity. While in a purely theoretical approach, a topological change affects just formal properties, graph data structures associated with well-defined semantics (for example, a social graph) can significantly alter the reality they are representing (such as social processes in a social graph). In this paper, the characteristic path length (average distance between pairs of vertices) is minimized by applying a graph augmentation technique: a small set of nonexisting edges is added to the original structure in order to define shortcuts. Even though contextual properties (for example, influence of people, relevance of content) play an important role in the propagation of the information in practice, shorter paths are a guarantee of improved performance. This work is definitely a solid contribution, including a clear introduction to the problem, an appreciated section on motivating applications, and a validation/evaluation of the model with synthetic data. As a very minor remark, I would have liked a deeper discussion on the effects of augmentation techniques in social graphs, also including negative ones. Online Computing Reviews Service

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Transactions on Knowledge Discovery from Data Volume 10, Issue 2
October 2015
291 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/2835206
Editor:
Philip S. Yu
University of Illinois at Chicago, USA
Issue’s Table of Contents
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 12 October 2015
- Accepted: 1 April 2015
- Revised: 1 February 2015
- Received: 1 May 2014
Published in tkdd Volume 10, Issue 2

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Graph augmentation
conductance
mixing time
network engineering
propagation
shortcuts
social networks
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 32
  Total Citations
  View Citations
- 494
  Total Downloads
- Downloads (Last 12 months)23
- Downloads (Last 6 weeks)4
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Refining Social Graph Connectivity via Shortcut Edge Addition

ACM Transactions on Knowledge Discovery from Data

Abstract

References

Cited By

Index Terms

Recommendations

Extra edge connectivity and isoperimetric edge connectivity

Graph connectivity and its augmentation: applications of MA orderings

Constrained Edge-Splitting Problems

Reviews

Access critical reviews of Computing literature here

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Refining Social Graph Connectivity via Shortcut Edge Addition

ACM Transactions on Knowledge Discovery from Data

Abstract

References

Cited By

Index Terms

Recommendations

Extra edge connectivity and isoperimetric edge connectivity

Graph connectivity and its augmentation: applications of MA orderings

Constrained Edge-Splitting Problems

Reviews

Access critical reviews of Computing literature here

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media