01.02.2011 | Issue 1-2/2011

Annals of Telecommunications 1-2/2011

A new statistical approach to estimate global file populations from local observations in the eDonkey P2P file sharing system

Patrick Brown, Sanja Petrovic
This paper is an extended version of a paper presented at the 21st International Teletraffic Congress, Paris, 15–17 September 2009 [4].


In this paper, we propose a new statistical approach for estimating global population statistics from local observations, based on what is known in biology as capture–recapture methods. Evaluating population sizes in P2P systems has received much attention lately, as these sizes may be useful for setting system parameters, deriving other system statistics, or predicting system performance. Because these systems are very large, encompassing several million users, and highly distributed, estimating population sizes is a challenging task. More precisely, we are interested in estimating the number of file replicas in the system, i.e., the size of the population of users possessing given files. To this end, we propose a capture–recapture method that is both computationally efficient and accurate. The proposed method allows global population statistics to be derived from local and time-limited observations. We apply the method to a measurement data set collected over several days on a residential network, and we compare the results obtained from direct counting procedures with those derived with the proposed methodology.
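To illustrate the general idea behind capture–recapture estimation, the sketch below implements the classical two-sample Lincoln–Petersen estimator and Chapman's bias-corrected variant. This is the textbook technique from ecology, not necessarily the estimator developed in the paper; the function names and the peer counts in the usage lines are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n_first: int, n_second: int, n_both: int) -> float:
    """Classical two-sample capture-recapture estimate of a population size.

    n_first  -- distinct individuals seen in the first observation window
    n_second -- distinct individuals seen in the second observation window
    n_both   -- individuals seen in both windows (the "recaptures")
    """
    if n_both <= 0:
        raise ValueError("no recaptures: the estimate is undefined")
    return n_first * n_second / n_both


def chapman(n_first: int, n_second: int, n_both: int) -> float:
    """Chapman's variant, which stays finite even when n_both is zero."""
    return (n_first + 1) * (n_second + 1) / (n_both + 1) - 1


# Hypothetical counts: peers seen holding a given file in two windows.
estimate = lincoln_petersen(n_first=120, n_second=150, n_both=45)
corrected = chapman(n_first=120, n_second=150, n_both=45)
print(estimate)   # 400.0
print(round(corrected, 2))
```

The intuition is that the fraction of second-window individuals already seen in the first window approximates the fraction of the whole population covered by the first window. This simple form assumes a closed population and equal observation probability for every individual, assumptions that may not hold in a P2P system where users join and leave.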

