Breakdown of Benford’s law for birth data
Introduction
Newcomb [1] and later Benford [2] observed that the occurrence of significant digits in many data sets is not uniform but tends to follow a logarithmic distribution, such that the smaller digits appear as the first significant digit more frequently than the larger ones, i.e.,

$$N_d = N \log_{10}\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \ldots, 9,$$

where $N$ is the total number of considered 1st digits for checking the law, in short, the number of data points, and $N_d$ is the number of occurrences of the observed integer $d$. Usually, it seems that Benford’s law breaks down when there is human manipulation or control (in various ways) of the data.
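As a concrete illustration of the first-digit law, the commonly stated form $N_d = N \log_{10}(1 + 1/d)$ can be evaluated directly. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes that every one of the 35 429 days mentioned in this Introduction contributes exactly one leading digit.

```python
import math


def benford_first_digit_probs():
    """Return {d: log10(1 + 1/d)} for the first digits d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def expected_first_digit_counts(n_points):
    """Expected count N_d = N * log10(1 + 1/d) for N data points."""
    return {d: n_points * p for d, p in benford_first_digit_probs().items()}


# Assumption: one leading digit per day over the 35 429 days quoted in the text.
N_DAYS = 35429
expected = expected_first_digit_counts(N_DAYS)

for d, count in expected.items():
    print(d, round(count, 1))
```

Under this law, the digit 1 is expected to lead about 30% of the entries and the digit 9 fewer than 5%, which is the benchmark against which any observed first-digit histogram is compared.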
The literature on the subject is enormous [3], [4], and not all papers can be quoted here. A few socio-econo-statistical physics papers of interest are pointed out in Section 2.
In this paper, our goal is to investigate whether Benford’s law holds on long birth time series, distinguishing between the religious affiliation (Eastern Orthodox or not) of families in Romania, for a time interval extending from Jan. 01, 1905 to Dec. 31, 2001, i.e., over 97 years or 35 429 days. The results point to a drastic breakdown of Benford’s law. Some interpretation is proposed, based on the statistical aspects due to population sizes, rather than on human thought constraints.
In Section 3, the data acquisition is recalled. It leads to a set of time series. The data of interest are displayed through histograms and discussed following a statistical analysis in Section 3.2. All Benford’s law tests are found in Section 4, in particular a test of Benford’s law for the 1st and 2nd digits of the time series of the daily number of births in Romania, distinguishing between Orthodox and Non-Orthodox families.
Since the results point to a drastic breakdown of Benford’s law, a discussion of the findings, followed by an explanation, is found in Section 5.
Section 6 serves as a conclusion emphasizing (i) the interest of such a data study along Benford’s law concepts, and (ii) the complexity of studying a community, and its religiosity, through its birth history.
Benford’s law: a short state of the art literature review
The applications of Benford’s law are too numerous to be all quoted here [3], [4]. Nevertheless, to shed some light on the subject, we point to applications of Benford’s law to the detection of data anomalies in actuarial and financial cases [5], [6], [7], [8], [9], [10], and also in political cases [11], [12] and surveys [13].
Besides these fields of application, Benford’s law has been applied to less explored subjects, e.g., when discussing the appearance of numbers on the internet [14], or
Data
The data were obtained from the 1992 and 2002 censuses by the Romanian National Institute of Statistics (NIS). The data of interest pertain to the record of the total number of births in Romania for persons still alive at the 1992 and 2002 census reference points. In this respect, the data might rather be called “survival occurrence from birth date”. The fact that the true daily birth data are not known to us is irrelevant for the present considerations. Thereafter, we will use, for conciseness,
Benford’s laws tests
Due to the range in the number of births, it seems interesting to test Benford’s law, not only for the first reported digits, but also for the second digit. In practice, applications of Benford’s law for fraud detection routinely use more than the first digit [5].
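A second-digit test can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the standard second-digit probabilities $P(d_2) = \sum_{d_1=1}^{9} \log_{10}\left(1 + \frac{1}{10 d_1 + d_2}\right)$ for $d_2 = 0, \ldots, 9$ and a plain Pearson chi-square statistic; the function names and the short list of daily counts are hypothetical placeholders, not the census data.

```python
import math
from collections import Counter


def second_digit_probs():
    """Benford probabilities of the second digit d2 = 0..9."""
    return {
        d2: sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))
        for d2 in range(10)
    }


def second_digit(n):
    """Second significant digit of a positive integer, or None if n < 10."""
    s = str(abs(int(n)))
    return int(s[1]) if len(s) > 1 else None


def chi_square(observed, probs):
    """Pearson chi-square of observed digit counts against probabilities."""
    total = sum(observed.get(d, 0) for d in probs)
    return sum(
        (observed.get(d, 0) - total * p) ** 2 / (total * p)
        for d, p in probs.items()
    )


# Hypothetical daily birth counts, for illustration only (not the paper's data).
counts = [97, 103, 110, 88, 121, 95, 134, 9, 102, 117]
observed = Counter(d for d in map(second_digit, counts) if d is not None)
stat = chi_square(observed, second_digit_probs())
print(stat)
```

With ten second-digit categories, the statistic would be compared with a chi-square distribution having 9 degrees of freedom. Single-digit counts carry no second digit and are skipped here, which is one possible convention among several.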
Discussion
The above analysis shows a large difference between observation and expectation for the first digit. For the second digit, the survey data are in rather close agreement with the theoretical distribution. The former observation needs some interpretation.
It seems that Benford’s law breakdown can be more easily understood starting from the Non-Orthodox number of births cases, when observing the histogram data in Fig. 4. The peak count occurs in the bins 90–110. This fact suggests that the most
Conclusions
The universal validity of Benford’s law has always been questioned. It has been explained and/or justified along various mathematical hypotheses on number occurrences. However, when it is valid in physics, and more generally in science, is still an open question. Many cases have been discussed, as seen in the short list of references given in the bibliography. There are cases, thus data, in which the validity or breakdown can be fully proved. Nevertheless, the causes or origins are debatable. Benford
Acknowledgment
This paper is part of MA scientific activities in COST Action TD1210 “Analyzing the dynamics of information and knowledge landscapes”.
This work by CH was co-financed by the European Social Fund through the Sectorial Operational Programme Human Resources Development 2007–2013, project number POSDRU/1.5/S/59184 Performance and excellence in postdoctoral research in Romanian economics science domain.
References (46)
Frequency of occurrence of numbers in the World Wide Web. Physica A (2006).
Use of Benford’s law in drug discovery data. Drug Discovery Today (2010).
et al., Empirical mantissa distributions of pulsars. Astropart. Phys. (2010).
et al., Benford’s law and Theil transform of financial data. Physica A (2012).
The law of the leading digits and the world religions. Physica A (2012).
The Benford law behavior of the religious activity data. Physica A (2014).
Theory of analogous force on number sets. Physica A (2003).
et al., Statistical dynamics of religions. Physica A (2009).
Note on the frequency of use of the different digits in natural numbers. Amer. J. Math. (1881).
The law of anomalous numbers. Proc. Amer. Philos. Soc. (1938).
A taxpayer compliance application of Benford’s law. J. Amer. Tax Ass.
The use of Benford’s law as an aid in analytical procedures. Auditing J. Pract. Th.
Digital Analysis Using Benford’s Law: Tests Statistics for Auditors.
Reducing false alarms in the detection of human influence on data. J. Account. Audit. Finance.
The effective use of Benford’s law to assist in detecting fraud in accounting data. J. Forensic Acc.
Fact and fiction in EU-governmental economic data. Germ. Econ. Rev.
The wrong man is president! Overvotes in the 2000 presidential election in Florida. Perspect. Polit.
Detecting problems in survey data using Benford’s law. J. Hum. Resour.
Benford’s law and articles of scientific journals: comparison of JCR and Scopus data. Scientometrics.
On roundoff error distributions in floating point and logarithmic arithmetic. Computing.