Published in: Annals of Telecommunications 1-2/2011

1 February 2011

A new statistical approach to estimate global file populations from local observations in the eDonkey P2P file sharing system

Authors: Patrick Brown, Sanja Petrovic



Abstract

In this paper, we propose a new statistical approach to estimate global population statistics from local observations. The approach is based on what is known in biology as capture–recapture methods. Evaluating population sizes in P2P systems has received much attention lately, as these sizes may be useful to set system parameters, to derive other system statistics, or to predict system performance. Because these systems are very large, encompassing several million users, and highly distributed, estimating population sizes is a challenging task. More precisely, we are interested in estimating the number of file replicas in the system, i.e., the size of the population of users possessing given files. To this end, we propose a capture–recapture method that is both computationally efficient and accurate. The proposed method derives global population statistics from local, time-limited observations. We apply the method to a measurement data set covering several days on a residential network, and we compare the results obtained from direct counting procedures with those derived with the proposed methodology.
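
To make the capture–recapture idea concrete, here is a minimal Python sketch of the classical two-sample estimators from the ecology literature (Lincoln–Petersen and its Chapman correction). It is not the estimator developed in the paper, whose details are not given in this abstract. The function names, the simulated population, and the sampling probabilities are all assumptions made for illustration, and the sketch assumes a closed population observed through two independent samples with equal capture probability per peer.

```python
import random


def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Classical two-sample estimate of population size.

    n1: peers seen in the first sample
    n2: peers seen in the second sample
    m:  peers seen in both samples (the "recaptures")
    """
    if m <= 0:
        raise ValueError("at least one recapture is needed")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's variant: less biased for small samples, defined even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def simulate(true_size: int, p1: float, p2: float, seed: int = 0) -> float:
    """Hypothetical experiment: two independent local observations of the
    peers holding one file, each seeing a peer with probability p1 or p2."""
    rng = random.Random(seed)
    sample1 = {peer for peer in range(true_size) if rng.random() < p1}
    sample2 = {peer for peer in range(true_size) if rng.random() < p2}
    overlap = len(sample1 & sample2)
    return chapman(len(sample1), len(sample2), overlap)


if __name__ == "__main__":
    # 200 peers in sample 1, 150 in sample 2, 30 seen in both.
    print(lincoln_petersen(200, 150, 30))
    print(chapman(200, 150, 30))
    # Simulated population of 10,000 replica holders, 10% visible per sample.
    print(simulate(10_000, 0.1, 0.1))
```

The key point is that neither sample needs to see the whole population: the fraction of the second sample already seen in the first gives an estimate of how much of the population the first sample covered.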


Footnotes
1
Assumption \(({\cal H}1)\) is crucial in traditional population estimation, as the samples are taken in successive time periods. In the current context, we apply analogous methods to samples taken over identical time periods. If the population varies during the observation and assumption \(({\cal H}2)\) is valid, we estimate the total number of peers that belonged to the population at any time during the observation instead of the average population. Both values are close if the average time in the system is long compared to the observation period.
 
Metadata
Title: A new statistical approach to estimate global file populations from local observations in the eDonkey P2P file sharing system
Authors: Patrick Brown, Sanja Petrovic
Publication date: 1 February 2011
Publisher: Springer-Verlag
Published in: Annals of Telecommunications, Issue 1-2/2011
Print ISSN: 0003-4347
Electronic ISSN: 1958-9395
DOI: https://doi.org/10.1007/s12243-010-0202-2
