Minimax estimation of the integral of a power of a density
Introduction
Suppose that we observe an i.i.d. sample $X_1,\ldots,X_n$ from a density $f$ on the $d$-dimensional cube $[0,1]^d$. We wish to estimate the functional $\int f^k$, for a given, known integer $k$, when it is known that $f$ belongs to the Hölder space $C^\beta[0,1]^d$.
We concentrate on the case that the regularity $\beta$ is low relative to the dimension $d$:
$$\frac{\beta}{d} \le \frac14. \tag{1.1}$$
In this case the square minimax rate of estimation over the unit ball of $C^\beta[0,1]^d$ is known (see Birgé and Massart (1995)) to be not faster than $n^{-8\beta/(4\beta+d)}$. For values of $\beta$ below the cut-off (1.1) this rate is slower than the standard parametric rate $n^{-1}$, and for large $d$ it can be slow even for high regularity levels $\beta$. In this paper we show that the lower bound is sharp by constructing an estimator with mean square error of order $n^{-8\beta/(4\beta+d)}$.
This problem was previously considered by many authors, including Bickel and Ritov (1988), Birgé and Massart (1995), Laurent and Massart (2000) and Emery et al. (2000), as a canonical example of a nonlinear functional. A simple construction of a minimax estimator for the case $k=2$ is given in Laurent (1996, 1997), and the case that $k=3$ and $d=1$ is covered by Kerkyacharian and Picard (1996). The authors of the latter paper also indicate that their construction extends to general smooth functionals (Kerkyacharian and Picard, 1996, Section 5).
As in the constructions by these authors, our estimator is based on approximations of $f$ in a basis and an analysis of the bias of these approximations. The final estimator is a $k$th-order $U$-statistic with a kernel determined by three approximation levels, but otherwise given by a simple and direct formula. Unlike that of Kerkyacharian and Picard (1996), our construction allows for general approximation schemes, not limited to the Haar basis. In fact, the Haar basis cannot be used in the case $\beta>1$, as it gives suboptimal approximation in this case. The case $\beta>1$ can arise under (1.1) if $d>4$. Not using the Haar basis leads to additional bias terms, which need to be estimated, whence our estimator is different from the one in Kerkyacharian and Picard (1996), both for small and large $\beta$. However, the use of a general basis has actually led us to simpler formulas.
The paper is organized as follows. In Section 2 we introduce the projection kernels used for the construction of our estimator. In Section 3 we very briefly mention the estimator in the quadratic case. In Section 4 we present our estimator for $k \ge 3$ and state the main result of the paper. Section 5 contains the proof of the main result.
Throughout the paper we use the following notation. Given a function $K$ of $r$ arguments, the $U$-statistic with kernel $K$ is written
$$U_n K = \frac{(n-r)!}{n!} \sum_{i_1 \ne \cdots \ne i_r} K(X_{i_1},\ldots,X_{i_r}).$$
The “kernel” need not be permutation symmetric in this definition. As an alternative notation for this $U$-statistic we use $U_n K(X_{i_1},\ldots,X_{i_r})$, where the unspecified indices $i_1,\ldots,i_r$ serve as a reminder of the arguments involved in the $U$-statistic. The notation $a \lesssim b$ means that $a \le Cb$ for a constant $C$ that is fixed within the context. Undelimited integrals are silently understood to be over the sample space $[0,1]^d$.
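The following is a minimal brute-force sketch of a $U$-statistic whose kernel need not be permutation symmetric. It assumes the usual convention of averaging the kernel over all ordered tuples of distinct sample indices; the function name `u_statistic` and its arguments are hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import perm


def u_statistic(kernel, sample, order):
    """Average `kernel` over all ordered tuples of `order` distinct sample points.

    Assumed convention: the sum runs over ordered tuples of distinct indices
    and is normalised by n! / (n - order)!, so a non-symmetric kernel is allowed.
    """
    n = len(sample)
    if order < 1 or order > n:
        raise ValueError("order must be between 1 and the sample size")
    total = 0.0
    # permutations(range(n), order) enumerates ordered tuples of distinct indices
    for idx in permutations(range(n), order):
        total += kernel(*(sample[i] for i in idx))
    return total / perm(n, order)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print(u_statistic(lambda x, y: x * y, data, 2))
```

The enumeration costs on the order of $n^r$ kernel evaluations, so this form is only practical for small samples; it is meant to make the definition concrete rather than to be efficient.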
Projections
Our estimator can be viewed as an unbiased estimator of an approximation to the functional $\int f^k$, constructed from approximations to $f$. In the case $k \ge 3$ we need to combine three such approximations, each taken as a projection onto a finite-dimensional space, with a different dimension for each.
We use a fixed orthogonal projection given by a kernel operator, with the kernel denoted by the same symbol as the operator: . The projection property of the
Quadratic functional
To estimate the quadratic functional $\int f^2$ we estimate its projection approximation unbiasedly by the second-order $U$-statistic that has the projection kernel as its kernel. For appropriate projections this is precisely the estimator considered by Laurent (1996, 1997) and Kerkyacharian and Picard (1996), and used in Robins and van der Vaart (2006) to construct adaptive confidence sets. Thanks to (2.3) the bias is of the order By standard computations on $U$-statistics (e.g. van der Vaart
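As an illustration of the second-order $U$-statistic for the quadratic functional, here is a hedged sketch for the special case $d=1$ with a histogram (piecewise constant) projection on $[0,1]$. The projection choice, the function name `estimate_integral_f_squared` and the parameter `num_bins` are assumptions made for this example; the text allows general projection kernels and notes that Haar-type projections are suboptimal for smoother densities.

```python
import numpy as np


def estimate_integral_f_squared(sample, num_bins):
    """Second-order U-statistic estimate of the integral of f^2 on [0, 1].

    Assumed kernel: K(x, y) = num_bins * 1{x and y fall in the same bin},
    the orthogonal projection kernel onto piecewise constant functions
    on `num_bins` equal-width bins.
    """
    x = np.asarray(sample, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two observations")
    # Bin index of each observation; the point x = 1.0 goes to the last bin.
    bins = np.minimum((x * num_bins).astype(int), num_bins - 1)
    counts = np.bincount(bins, minlength=num_bins)
    # Number of ordered pairs (i, j), i != j, landing in the same bin.
    same_bin_pairs = np.sum(counts * (counts - 1))
    return float(num_bins * same_bin_pairs / (n * (n - 1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # For the uniform density on [0, 1] the target value is 1.
    print(estimate_integral_f_squared(rng.uniform(size=2000), num_bins=16))
```

Counting pairs per bin avoids the explicit double sum over observations, so the cost is linear in the sample size. The number of bins plays the role of the approximation level: more bins reduce the bias but increase the variance.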
Main result
The estimation of $\int f^k$ for $k \ge 3$ necessitates a more elaborate approximation scheme. We use three levels of truncation and define a $k$th-order kernel by
Theorem 4.1 Let be a projection kernel satisfying (2.2), (2.3). Fix . For sequences , and such that
Proof
In this section we present two lemmas followed by the proof of the main theorem. Throughout the section we assume the conditions of Theorem 4.1. The upper bounds are uniform in $f$ ranging over a fixed multiple of the unit ball in $C^\beta[0,1]^d$.
Lemma 5.1 For any positive integers and a bounded function ,
Proof The expected value is an integral relative to the density of . By bounding this density by , the integral is turned into an integral
References
- et al. Wavelets on the interval and fast wavelet transforms. Appl. Comput. Harmon. Anal. (1993)
- Bickel and Ritov. Estimating integrated squared density derivatives: sharp best order of convergence estimates. Sankhyā Ser. A (1988)
- Birgé and Massart. Estimation of integral functionals of a density. Ann. Statist. (1995)