research-article

On the bias of traceroute sampling: Or, power-law degree distributions in regular graphs

Authors:
Dimitris Achlioptas

University of California, Santa Cruz, CA

University of California, Santa Cruz, CA
View Profile

,
Aaron Clauset

University of New Mexico, Albuquerque, and the Santa Fe Institute, New Mexico

University of New Mexico, Albuquerque, and the Santa Fe Institute, New Mexico
View Profile

,
David Kempe

University of Southern California, Los Angeles, CA

University of Southern California, Los Angeles, CA
View Profile

,
Cristopher Moore

University of New Mexico, Albuquerque, and the Santa Fe Institute, New Mexico

University of New Mexico, Albuquerque, and the Santa Fe Institute, New Mexico
View Profile

Authors Info & Claims

Journal of the ACM Volume 56 Issue 4Article No.: 21pp 1–28https://doi.org/10.1145/1538902.1538905

Published:02 July 2009Publication History

Journal of the ACM

Abstract

Understanding the graph structure of the Internet is a crucial step for building accurate network models and designing efficient algorithms for Internet applications. Yet, obtaining this graph structure can be a surprisingly difficult task, as edges cannot be explicitly queried. For instance, empirical studies of the network of Internet Protocol (IP) addresses typically rely on indirect methods like traceroute to build what are approximately single-source, all-destinations, shortest-path trees. These trees only sample a fraction of the network's edges, and a paper by Lakhina et al. [2003] found empirically that the resulting sample is intrinsically biased. Further, in simulations, they observed that the degree distribution under traceroute sampling exhibits a power law even when the underlying degree distribution is Poisson.

In this article, we study the bias of traceroute sampling mathematically and, for a very general class of underlying degree distributions, explicitly calculate the distribution that will be observed. As example applications of our machinery, we prove that traceroute sampling finds power-law degree distributions in both δ-regular and Poisson-distributed random graphs. Thus, our work puts the observations of Lakhina et al. on a rigorous footing, and extends them to nearly arbitrary degree distributions.

References

Alderson, D., Li, L., Willinger, W., and Doyle, J. 2005. Understanding Internet topology: Principles, models, and validation. IEEE/ACM Trans. Netw. 13, 1205--1218. Google ScholarDigital Library
Amini, L., Shaikh, A., and Schulzrinne, H. 2002. Issues with inferring Internet topological attributes. In SPIE IT-Com.Google Scholar
Barford, P., Bestavros, A., Byers, J., and Crovella, M. 2001. On the marginal utility of network topology measurements. In SIGCOMM Internet Measurement Workshop. Google ScholarDigital Library
Blondel, V., Guillaume, J.-L., Hendrickx, J., and Jungers, R. 2007. Distance distribution in random graphs and application to network exploration. Phys. Rev. E 76, 066101.Google ScholarCross Ref
Bollobás, B. 2001. Random Graphs, 2nd ed. Cambridge University Press, Cambridge.Google Scholar
Bollobás, B., and Chung, F. 1988. The diameter of a cycle plus a random matching. SIAM J. Disc. Math. 1, 328--333. Google ScholarDigital Library
Chen, Q., Chang, H., Govindan, R., Jamin, S., Shenker, S., and Willinger, W. 2002. The origin of power laws in Internet topologies revisited. In Proceedings of the 21st ACM SIGCOMM Conference. ACM, New York.Google Scholar
Clauset, A., and Moore, C. 2005. Accuracy and scaling phenomena in Internet mapping. Phys. Rev. Lett. 94, 018701.Google ScholarCross Ref
Cohen, R., Gonen, M., and Wool, A. 2007. Bounding the bias of tree-like sampling in ip topologies. In Proceedings of the European Conference on Complex Systems (ECCS).Google Scholar
Dall'Asta, L. 2007. Dynamic exploration of networks: From general principles to the traceroute process. Europ. Phys. J. B 60, 123--133.Google ScholarCross Ref
Dall'Asta, L., Alvarez-Hamelin, I., Barrat, A., Vázquez, A., and Vespignani, A. 2006. Exploring networks with traceroute-like probes: Theory and simulations. Theoret. Comput. Sci. 355, 6--24. Google ScholarDigital Library
Erdős, P., and Rényi, A. 1960. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5, 17--61.Google Scholar
Fabrikant, A., Koutsoupias, E., and Papadimitriou, C. 2002. Heuristically optimized trade-offs: A new paradigm for power laws in the internet. In Proceedings of the 29th International Colloguium on Automata, Languages and Programming. 110--122. Google ScholarDigital Library
Faloutsos, M., Faloutsos, P., and Faloutsos, C. 1999. On power-law relationships of the Internet topology. In Proceedings of the 18th ACM SIGCOMM Conference. ACM, New York. 251--262. Google ScholarDigital Library
Flaxman, A., and Vera, J. 2007. Bias reduction in traceroute sampling: Towards a more accurate map of the internet. In Proceedings of the 5th Workshop on Algorithms and Models for the Web-Graph (WAW2007). Google ScholarDigital Library
Govindan, R., and Tangmunarunkit, H. 2000. Heuristics for Internet map discovery. In Proceedings of the 19th ACM SIGCOMM Conference. ACM, New York.Google Scholar
Guillaume, J.-L., Latapy, M., and Magoni, D. 2006. Relevance of massively distributed explorations of the internet topology: Qualitative results. Comput. Netw. 50, 3197--3224. Google ScholarDigital Library
Hoeffding, W. 1963. Probability inequalities for sums of bounded random variables. J. ASA 58, 13--30.Google Scholar
Kim, J. H. 2006. Poisson cloning model for random graphs. Preprint.Google Scholar
Kleinberg, J., Kumar, R., Raghavan, P., Rajagopalan, S., and Tomkins, A. 1999. The web as a graph: Measurements, models and methods. In Proceedings of the International Conference on Combinatorics and Computing. Google ScholarDigital Library
Lakhina, A., Byers, J., Crovella, M., and Xie, P. 2003. Sampling biases in IP topology measurements. In Proceedings of the 22nd IEEE INFOCOM Conference. IEEE Computer Society Press, Los Alamitos.Google Scholar
Leguay, J., Latapy, M., Friedman, T., and Salamatian, K. 2007. Describing and simulating Internet routes. Comput. Netw. 51, 2067--2087. Google ScholarDigital Library
McDiarmid, C. 1998. Concentration. In Probabilistic Methods for Algorithmic Discrete Mathematics, M. Habib, C. McDiarmid, J. Ramirez-Alfonsin, and B. Reed, Eds. Springer-Verlag, Berlin, Germany, 195--247.Google Scholar
Mihail, M., Papadimitriou, C., and Saberi, A. 2003. On certain connectivity properties of the Internet topology. In Proceedings of the 35th ACM Symposium on Theory of Computing. ACM, New York.Google Scholar
Molloy, M., and Reed, B. 1995. A critical point for random graphs with a given degree sequence. Rand. Struct. Algor. 6, 161--180. Google ScholarDigital Library
Molloy, M., and Reed, B. 1998. The size of the largest component of a random graph on a fixed degree sequence. Combin. Probab. Comput. 7, 295--306. Google ScholarDigital Library
Motwani, R., and Raghavan, P. 1990. Randomized Algorithms. Cambridge University Press. Cambridge. Google ScholarDigital Library
Pansiot, J.-J., and Grad, D. 1998. On routers and multicast trees in the Internet. ACM SIGCOMM Commun. Rev. 28, 41--50. Google ScholarDigital Library
Petermann, T., and De Los Rios, P. 2004. Exploration of scale-free networks: Do we measure the real exponents&quest; Euro. Phys. J. B 38, 201--204.Google ScholarCross Ref
Seaborn, J. 1991. Hypergeometric Functions and Their Applications. Springer-Verlag, Berlin, Germany. Google ScholarDigital Library
Shavitt, Y., and Shir, E. 2005. Dimes: Let the Internet measure itself. ACM SIGCOMM Comput. Commun. Rev. 35, 71--74. Google ScholarDigital Library
Spanier, J., and Oldham, K. 1987. The exponential integral ei(x) and related functions. In An Atlas of Functions. Hemisphere, Chapter 37, 351--360.Google Scholar
Spring, N., Mahajan, R., Wetherall, D., and Anderson, T. 2004. Measuring ISP topologies with rocketfuel. IEEE/ACM Trans. Netw. 12, 2--16. Google ScholarDigital Library
Tangmunarunkit, H., Govindan, R., Shenker, S., and Estrin, D. 2001. The impact of policy on Internet paths. In Proceedings of the 20th IEEE INFOCOM Conference. IEEE Computer Society Press, Los Alamitos.Google Scholar
Viger, F., Augustin, B., Cuvellier, X., Magnien, C., Latapy, M., Friedman, T., and Teixeira, R. 2006. Detection, understanding, and prevention of traceroute measurement artifacts. In Proceedings of the 6th Internet Measurement Conference (IMC'06).Google Scholar
Wilf, H. 1994. Generatingfunctionology. Academic Press, Orlando, FL. Google ScholarDigital Library
Wormald, N. 1999. Models of random regular graphs. In London Math. Soc. Lecture Note Series, J. Lamb and D. Preece, Eds. Vol. 276. Cambridge University Press, Cambridge. 239--298.Google Scholar

Index Terms

On the bias of traceroute sampling: Or, power-law degree distributions in regular graphs
1. Mathematics of computing
  1. Probability and statistics
    1. Stochastic processes
2. Networks

Recommendations

The Study of Sampling Bias in Traceroute Topology Measurement
IMCCC '11: Proceedings of the 2011 First International Conference on Instrumentation, Measurement, Computer, Communication and Control

The current study of Internet topology measurement is mainly based on trace route which brings the problem of sampling bias. This paper makes a model of the sampling process, and analyzes the bias of node degree in sampling probe. We find that the probe ...
Read More
On the bias of traceroute sampling: or, power-law degree distributions in regular graphs
STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computing

Understanding the structure of the Internet graph is a crucial step for building accurate network models and designing efficient algorithms for Internet applications. Yet, obtaining its graph structure is a surprisingly difficult task, as edges cannot ...
Read More
Measuring the Impacts of Sampling Bias on Internet AS-Level Topology Inference
CMC '09: Proceedings of the 2009 WRI International Conference on Communications and Mobile Computing - Volume 03

In this paper, we carried out a systematic qualitative study of the impacts of sampling bias over Internet AS-level topology inference. Based on the relatively complete Chinese Internet AS graph derived from our large-scale measurement effort, we ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

Journal of the ACM Volume 56, Issue 4
June 2009
195 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/1538902
Issue’s Table of Contents

Copyright © 2009 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 2 July 2009
- Accepted: 1 February 2008
- Revised: 1 January 2008
- Received: 1 July 2007
Published in jacm Volume 56, Issue 4

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Internet topology
sampling bias
traceroute
Qualifiers
- research-article
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 57
  Total Citations
  View Citations
- 816
  Total Downloads
- Downloads (Last 12 months)19
- Downloads (Last 6 weeks)2
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

On the bias of traceroute sampling: Or, power-law degree distributions in regular graphs

Journal of the ACM

Abstract

References

Cited By

Index Terms

Recommendations

The Study of Sampling Bias in Traceroute Topology Measurement

On the bias of traceroute sampling: or, power-law degree distributions in regular graphs

Measuring the Impacts of Sampling Bias on Internet AS-Level Topology Inference

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

On the bias of traceroute sampling: Or, power-law degree distributions in regular graphs

Journal of the ACM

Abstract

References

Cited By

Index Terms

Recommendations

The Study of Sampling Bias in Traceroute Topology Measurement

On the bias of traceroute sampling: or, power-law degree distributions in regular graphs

Measuring the Impacts of Sampling Bias on Internet AS-Level Topology Inference

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media