2018 | Book

Nonparametric Statistics

3rd ISNPS, Avignon, France, June 2016

Edited by: Prof. Dr. Patrice Bertail, Prof. Dr. Delphine Blanke, Dr. Pierre-André Cornillon, Prof. Dr. Eric Matzner-Løber

Publisher: Springer International Publishing

Book series: Springer Proceedings in Mathematics & Statistics

About this book

This volume presents the latest advances and trends in nonparametric statistics, and gathers selected and peer-reviewed contributions from the 3rd Conference of the International Society for Nonparametric Statistics (ISNPS), held in Avignon, France on June 11-16, 2016. It covers a broad range of nonparametric statistical methods, from density estimation, survey sampling, resampling methods, kernel methods and extreme values, to statistical learning and classification, both in the standard i.i.d. case and for dependent data, including big data.

The International Society for Nonparametric Statistics is uniquely global, and its international conferences are intended to foster the exchange of ideas and the latest advances among researchers from around the world, in cooperation with established statistical societies such as the Institute of Mathematical Statistics, the Bernoulli Society and the International Statistical Institute. The 3rd ISNPS conference in Avignon attracted more than 400 researchers from around the globe, and contributed to the further development and dissemination of nonparametric statistics knowledge.

Table of contents

Frontmatter
Symmetrizing k-nn and Mutual k-nn Smoothers
Abstract
In light of Cohen (Ann Math Stat 37:458–463, 1966) and Rao (Ann Stat 4:1023–1037, 1976), who provide necessary and sufficient conditions for admissibility of linear smoothers, one realizes that many of the well-known linear nonparametric regression smoothers are inadmissible because either the smoothing matrix is asymmetric or the spectrum of the smoothing matrix lies outside the unit interval [0, 1]. The question answered in this chapter is how an inadmissible smoother can be transformed into an admissible one. Specifically, this contribution investigates the spectrum of various matrix symmetrization schemes for k-nearest neighbor-type smoothers. This is not an easy task, as the spectrum of many traditional symmetrization schemes fails to lie in the unit interval. The contribution of this study is to present a symmetrization scheme for smoothing matrices that makes the associated estimator admissible. For k-nearest neighbor smoothers, the result of the transformation has a natural interpretation in terms of graph theory.
P.-A. Cornillon, A. Gribinski, N. Hengartner, T. Kerdreux, E. Matzner-Løber
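The two obstacles named in this abstract can be seen on a toy example. The sketch below is only an illustration, not the chapter's symmetrization scheme: the data `x`, the choice `k=2`, the convention that each point counts among its own neighbors, and the helper name `knn_smoother_matrix` are all assumptions. It builds a k-nn smoothing matrix, checks that it is asymmetric, and shows that the naive average of the matrix with its transpose has eigenvalues outside [0, 1] for this data.

```python
import numpy as np

def knn_smoother_matrix(x, k):
    """Row i puts weight 1/k on the k points nearest to x[i] (x[i] itself included)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dist = np.abs(x[:, None] - x[None, :])
    S = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(dist[i], kind="stable")[:k]
        S[i, nearest] = 1.0 / k
    return S

# Hypothetical one-dimensional design with unevenly spaced points
x = np.array([0.0, 1.0, 3.0, 7.0])
S = knn_smoother_matrix(x, k=2)

row_sums = S.sum(axis=1)               # each row averages k responses
is_symmetric = np.allclose(S, S.T)     # neighbor relations are not mutual in general

# Naive symmetrization: average the matrix with its transpose
naive = (S + S.T) / 2
eigs = np.linalg.eigvalsh(naive)

# Mutual k-nn graph: i and j are linked only if each is among the other's k nearest
mutual = (S > 0) & (S.T > 0)

print("symmetric:", is_symmetric)
print("eigenvalues of naive symmetrization:", np.round(eigs, 4))
```

Here the point 3.0 has 1.0 among its two nearest neighbors, but not the other way round, which is what makes the matrix asymmetric and the mutual k-nn graph sparser than the k-nn graph.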
Nonparametric PU Learning of State Estimation in Markov Switching Model
Abstract
In this contribution, we develop methods of nonlinear filtering and prediction of an unobservable Markov chain which controls the states of an observable stochastic process. This process is a mixture of two subsidiary stochastic processes, the switching of which is controlled by the Markov chain. Each of these subsidiary processes is described by a conditional distribution density (cdd). The distinguishing feature of the problem is that the cdd's and the transition probability matrix of the Markov chain are unknown, but a (positive labeled) training sample from one of the two subsidiary processes and an (unlabeled) training sample from the mixture process are available. In machine learning, constructing a binary classifier from positive and unlabeled samples is called PU learning. To solve this problem for stochastic processes, nonparametric kernel estimators based on weakly dependent observations are applied. We examine the performance of the novel method on simulated data and compare it with that of the optimal Bayesian solution with known cdd's and transition matrix of the Markov chain. The simulations show close results for the optimal solution and the PU learning approach, even when the conditional densities of the subsidiary processes overlap strongly.
A. Dobrovidov, V. Vasilyev
Multiplicative Bias Corrected Nonparametric Smoothers
Abstract
This contribution presents a general multiplicative bias reduction strategy for nonparametric regression. The approach is most effective when applied to an oversmoothed pilot estimator, for which the bias dominates the standard error. The practical usefulness of the method was demonstrated in Burr et al. (IEEE Trans Nucl Sci 57:2831–2840, 2010) in the context of estimating energy spectra. For such data sets, it was observed that the method could significantly decrease the bias with only a negligible increase in variance. This chapter presents the theoretical analysis of that estimator. In particular, we study the asymptotic properties of the bias corrected local linear regression smoother, and prove that it has zero asymptotic bias and the same asymptotic variance as the local linear smoother with a suitably adjusted bandwidth. Simulations show that our asymptotic results apply at modest sample sizes.
N. Hengartner, E. Matzner-Løber, L. Rouvière, T. Burr
Efficiency of the V-Fold Model Selection for Localized Bases
Abstract
Many interesting functional bases, such as piecewise polynomials or wavelets, are examples of localized bases. We investigate the optimality of V-fold cross-validation and a variant called V-fold penalization in the context of the selection of linear models generated by localized bases in a heteroscedastic framework. It appears that while V-fold cross-validation is not asymptotically optimal when V is fixed, the V-fold penalization procedure is optimal. Simulation studies are also presented.
F. Navarro, A. Saumard
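As a rough illustration of the setting in this abstract, the sketch below runs plain V-fold cross-validation to choose among piecewise-constant models (one of the simplest localized bases) under heteroscedastic noise. It does not implement the V-fold penalization variant, whose form is not given here. The simulated data, the candidate list, `V = 5` and all function names are assumptions made for the example.

```python
import numpy as np

def fit_regressogram(x, y, n_bins):
    """Least-squares fit over piecewise-constant functions on n_bins equal cells of [0, 1]."""
    cells = np.minimum((x * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(cells, weights=y, minlength=n_bins)
    counts = np.bincount(cells, minlength=n_bins)
    # Empty cells fall back to the overall mean (an arbitrary choice for this sketch)
    return np.where(counts > 0, sums / np.maximum(counts, 1), y.mean())

def predict_regressogram(means, x):
    n_bins = len(means)
    return means[np.minimum((x * n_bins).astype(int), n_bins - 1)]

def v_fold_cv_risk(x, y, n_bins, folds):
    """Average held-out squared error over the given folds."""
    all_idx = np.arange(len(x))
    errors = []
    for held_out in folds:
        train = np.setdiff1d(all_idx, held_out)
        means = fit_regressogram(x[train], y[train], n_bins)
        pred = predict_regressogram(means, x[held_out])
        errors.append(np.mean((y[held_out] - pred) ** 2))
    return float(np.mean(errors))

# Simulated heteroscedastic regression data (hypothetical example)
rng = np.random.default_rng(0)
n, V = 400, 5
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + (0.2 + 0.5 * x) * rng.standard_normal(n)

folds = np.array_split(rng.permutation(n), V)   # same folds for every candidate model
candidates = [1, 2, 4, 8, 16, 32, 64]
cv_risk = [v_fold_cv_risk(x, y, m, folds) for m in candidates]
selected = candidates[int(np.argmin(cv_risk))]
print("selected number of cells:", selected)
```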
Non-parametric Lower Bounds and Information Functions
Abstract
We argue that common features of non-parametric estimation appear in parametric cases as well if there is a deviation from the classical regularity condition. Namely, in many non-parametric estimation problems (as well as some parametric cases) unbiased finite-variance estimators do not exist; no estimator converges locally uniformly with the optimal rate; there are no asymptotically unbiased estimators with the optimal rate; etc.
We argue that these features naturally arise in particular parametric subfamilies of non-parametric classes of distributions. We generalize the notion of regularity of a family of distributions and present a general regularity condition, which leads to the notions of the information index and the information function.
We argue that the typical structure of a continuity modulus explains why unbiased finite-variance estimators cannot exist if the information index is larger than two, while in typical non-parametric situations no estimator converges locally uniformly with the optimal rate. We present a new result on the impossibility of locally uniform convergence with the optimal rate.
S. Y. Novak
Modification of Moment-Based Tail Index Estimator: Sums Versus Maxima
Abstract
In this contribution, we continue the investigation of the SRCEN estimator of the extreme-value index γ (or the tail index α = 1∕γ) proposed in McElroy and Politis (J Statist Plan Infer 137:1389–1406, 2007) for γ > 1∕2. We propose a new estimator based on the local maximum. This, in fact, is a modification of the SRCEN estimator to the case γ > 0. We establish the consistency and asymptotic normality of the newly proposed estimator for i.i.d. data. Additionally, a short discussion on the comparison of the estimators is included.
N. Markovich, M. Vaičiulis
Constructing Confidence Sets for the Matrix Completion Problem
Abstract
In the present contribution we consider the problem of constructing honest and adaptive confidence sets for the matrix completion problem. For the Bernoulli model with known variance of the noise we provide a method with polynomial time complexity for constructing confidence sets that adapt to the unknown rank of the true matrix.
A. Carpentier, O. Klopp, M. Löffler
A Nonparametric Classification Algorithm Based on Optimized Templates
Abstract
This contribution is devoted to the problem of classification into two groups. A novel algorithm is proposed, which is based on the distance of each observation from the centroid (prototype, template) of one of the groups. The general procedure is illustrated on the particular task of mouth localization in facial images, where the centroid has the form of a mouth template. While templates are most commonly constructed as simple averages of positive examples, the novel optimization criterion makes it possible to improve the separation between observations of one group (images of mouths) and observations of the other group (images of non-mouths). The separation is measured by means of the weighted Pearson product-moment correlation coefficient. On the whole, the new classification method can be described as conceptually simple and at the same time powerful.
J. Kalina
PAC-Bayesian Aggregation of Affine Estimators
Abstract
Aggregating estimators using exponential weights depending on their risk appears optimal in expectation but not in probability. We use here a slight overpenalization to obtain an oracle inequality in probability for such an explicit aggregation procedure. We focus on the fixed design regression framework and the aggregation of linear estimators, and obtain results for a large family of linear estimators under a sub-Gaussian noise assumption that does not require independence.
L. Montuelle, E. Le Pennec
Light- and Heavy-Tailed Density Estimation by Gamma-Weibull Kernel
Abstract
In our previous papers we focused on gamma kernel estimators of a density and its derivatives on the positive semi-axis from dependent data, for univariate and multivariate samples. We introduced the gamma product kernel estimators for the multivariate joint probability density function (pdf) with nonnegative support and for its partial derivatives, based on multivariate dependent data with strong mixing. The asymptotic behavior of the estimates and the optimal bandwidths in the sense of minimal mean integrated squared error (MISE) were obtained. However, it is impossible to fit the tail of a heavy-tailed density accurately with a pure gamma kernel. Therefore, we construct a new kernel estimator as a combination of the asymmetric gamma and Weibull kernels, i.e., the Gamma-Weibull kernel. The gamma kernel is nonnegative, changes its shape depending on the position on the semi-axis, and possesses good boundary properties for a wide class of densities. Thus, we use it to estimate the pdf near the zero boundary. The Weibull kernel is based on the Weibull distribution, which can be heavy-tailed, and hence we use it to estimate the tail of the unknown pdf. The theoretical asymptotic properties of the proposed density estimator, such as the bias and the variance, are derived. We obtain the optimal bandwidth for the estimate as the minimizer of the MISE. The optimal rate of convergence of the MISE for the density is found.
L. Markovich
Adaptive Estimation of Heavy Tail Distributions with Application to Hall Model
Abstract
The problem of tail index estimation for the Hall distribution is considered. We propose estimators of the tail index using the truncated estimation method developed for ratio type functionals. It is shown that the truncated estimator constructed on a sample of fixed size has a guaranteed accuracy in the sense of the L_{2m}-norm, m ≥ 1. The asymptotic properties of the estimators are also investigated. These properties make it possible to find the rates of decrease, in the almost sure sense, of the χ² divergence between the distribution and its adaptive estimator. Simulations confirm the theoretical results.
D. N. Politis, V. A. Vasiliev, S. E. Vorobeychikov
Extremal Index for a Class of Heavy-Tailed Stochastic Processes in Risk Theory
Abstract
Extreme values for dependent data corresponding to high threshold exceedances may occur in clusters, i.e., in groups of observations of different sizes. In the context of stationary sequences, the so-called extremal index measures the strength of the dependence and may be useful to estimate the average length of such clusters. This is of particular interest in risk theory where public institutions would like to predict the replications of rare events, in other words, to understand the dependence structure of extreme values. In this contribution, we characterize the extremal index for a class of stochastic processes that naturally appear in risk theory under the assumption of heavy-tailed jumps. We focus on shot-noise-type processes and we weaken the usual assumptions required on the shot functions. More precisely, they may be random, their support need not be compact, and we make no assumption regarding monotonicity. We highlight the applicability of the result to a Kinetic Dietary Exposure Model used in modeling the pharmacokinetics of contaminants.
C. Tillier
Subsampling for Big Data: Some Recent Advances
Abstract
The goal of this contribution is to develop subsampling methods in the framework of big data and to show their feasibility in a simulation study. We argue that using different subsampling distributions with different subsampling sizes yields a great deal of information on the behavior of statistical procedures: subsampling makes it possible to estimate the rate of convergence of different procedures and to construct confidence intervals for general parameters, including the generalization error of an algorithm in machine learning.
P. Bertail, O. Jelassi, J. Tressou, M. Zetlaoui
Probability Bounds for Active Learning in the Regression Problem
Abstract
In this contribution we consider the problem of active learning in the regression setting, that is, choosing an optimal sampling scheme for the regression problem simultaneously with that of model selection. We consider a batch type approach and an on-line approach, adapting algorithms developed for the classification problem. Our main tools are concentration-type inequalities which allow us to bound the supremum of the deviations of the sampling scheme corrected by an appropriate weight function.
A.-K. Fermin, C. Ludeña
Elemental Estimates, Influence, and Algorithmic Leveraging
Abstract
It is well-known (Subrahmanyam, Sankhya Ser B 34:355–356, 1972; Mayo and Gray, Am Stat 51:122–129, 1997) that the ordinary least squares estimate can be expressed as a weighted sum of so-called elemental estimates based on subsets of p observations, where p is the dimension of the parameter vector. The weights can be viewed as a probability distribution on subsets of size p of the predictors {x_i : i = 1, …, n}. In this contribution, we derive the lower dimensional distributions of this p-dimensional distribution and define a measure of potential influence for subsets of observations analogous to the diagonal elements of the “hat” matrix for single observations. This theory is then applied to algorithmic leveraging, which is a method for approximating the ordinary least squares estimates using a particular form of biased subsampling.
K. Knight
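The representation mentioned at the start of this abstract can be checked numerically on a small example. The sketch below assumes the classical form of the weights, in which each subset of p rows is weighted in proportion to the squared determinant of the corresponding p × p block of the design matrix. The abstract itself does not spell out the weights, and the data, dimensions and variable names here are invented for illustration.

```python
from itertools import combinations

import numpy as np

# Small simulated regression problem (hypothetical data)
rng = np.random.default_rng(1)
n, p = 7, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)

# Ordinary least squares estimate on all n observations
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Elemental estimates: exact fits through each subset of p observations
subsets = list(combinations(range(n), p))
dets_sq = np.array([np.linalg.det(X[list(s)]) ** 2 for s in subsets])
weights = dets_sq / dets_sq.sum()   # a probability distribution on subsets of size p

beta_weighted = np.zeros(p)
for s, w in zip(subsets, weights):
    if w > 0:  # singular subsets carry zero weight and are skipped
        beta_weighted += w * np.linalg.solve(X[list(s)], y[list(s)])

print("OLS:                    ", beta_ols)
print("weighted elemental sum: ", beta_weighted)
```

Enumerating all subsets is only feasible for tiny n and p; the example is meant as a check of the identity, not as a way to compute least squares.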
Bootstrapping Nonparametric M-Smoothers with Independent Error Terms
Abstract
On the one hand, nonparametric regression approaches are flexible modeling tools in modern statistics. On the other hand, the lack of any parameters makes statistical inference in these models more challenging. This is crucial especially in situations when one needs to perform statistical tests or to construct confidence sets. In such cases, it is common to use a bootstrap approximation instead. It is an effective alternative to more straightforward but rather slow plug-in techniques. In this contribution, we introduce a proper bootstrap algorithm for a robustified version of the nonparametric estimates, the so-called M-smoothers or M-estimates, respectively. We distinguish situations for homoscedastic and heteroscedastic independent error terms, and we prove the consistency of the bootstrap approximation under both scenarios. Technical proofs are provided and the finite sample properties are investigated via a simulation study.
Matúš Maciak
Extension Sampling Designs for Big Networks: Application to Twitter
Abstract
With the rise of big data, more and more attention is paid to statistical network analysis. However, exact computation of many statistics of interest is of prohibitive cost for big graphs. Statistical estimators can thus be preferable. Model-based estimators for networks have some drawbacks. We study design-based estimates relying on sampling methods that were developed specifically for use on graph populations. In this contribution, we test some sampling designs that can be described as “extension” sampling designs. Unit selection happens in two phases: in the first phase, simple designs such as Bernoulli sampling are used, and in the second phase, some units are selected among those that are somehow linked to the units in the first-phase sample. We test these methods on Twitter data, because the size and structure of the Twitter graph are typical of big social networks for which such methods would be very useful.
A. Rebecq
Wavelet Whittle Estimation in Multivariate Time Series Models: Application to fMRI Data
Abstract
In many applications such as finance, geophysics or neuroscience, data are multivariate time series. Identification of correlation between time series is an important feature. Yet differences in memory properties of the processes can induce phase-shifts in estimation of correlations. A semiparametric model for multivariate long-range dependent time series is considered. The coupling between time series is characterized by the long-run covariance matrix. The multivariate wavelet-based Whittle estimation is consistent for the estimation of both the long-range dependence and the covariance matrix. Finally an application to the estimation of a human brain functional network based on fMRI test-retest data sets is described. Our study highlights the benefit of the multivariate analysis, namely improved efficiency of estimation of dependence parameters.
S. Achard, I. Gannaz
On Kernel Smoothing with Gaussian Subordinated Spatial Data
Abstract
We address estimation of a deterministic function μ, that is, the mean of a spatial process y(s) in a nonparametric regression context. Here s denotes a spatial coordinate in \(\mathbb{R}_+^2\). Given \(k = n^2\) observations, the aim is to estimate μ assuming that y has finite variance, and that the regression errors \(\epsilon(\mathbf{s}) = y(\mathbf{s}) - E\{y(\mathbf{s})\}\) are Gaussian subordinated.
S. Ghosh
Strong Separability in Circulant SSA
Abstract
Circulant singular spectrum analysis (CSSA) is an automated variant of singular spectrum analysis (SSA) developed for signal extraction. CSSA makes it possible to identify the association between the extracted components and the frequencies they represent without the intervention of the analyst. Another relevant characteristic is that CSSA produces strongly separable components, meaning that the resulting estimated signals are uncorrelated. In this contribution we take a closer look at the strong separability of CSSA and compare it to SSA by means of a detailed example. Finally, we apply CSSA to UK and US quarterly GDP to check that it produces reliable cycle estimators and strongly separable components. We also test the absence of any seasonality in the seasonally adjusted time series estimated by CSSA.
J. Bógalo, P. Poncela, E. Senra
Selection of Window Length in Singular Spectrum Analysis of a Time Series
Abstract
Singular Spectrum Analysis (SSA) is a promising non-parametric time series modelling technique that has proved to be successful in data preprocessing in diverse application fields. It is a window length-based method and the appropriate selection of the window length plays a crucial role in the accuracy of SSA. However, no specific methods for its selection are described in the literature. In this study, the method of SSA in time series analysis is presented in detail and a sensitivity analysis of window length is carried out based on an observed daily rainfall time series.
P. Unnikrishnan, V. Jothiprakash
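To make the role of the window length concrete, here is a minimal sketch of basic SSA (embedding, singular value decomposition, diagonal averaging), followed by a small loop over window lengths. It is a generic textbook version, not the chapter's procedure. The synthetic trend-plus-cycle series stands in for the rainfall data, and the function name `ssa_components` and the candidate windows are assumptions.

```python
import numpy as np

def ssa_components(series, window):
    """Decompose a series into elementary SSA components for a given window length."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    k = n - window + 1
    # Embedding: column j of the trajectory matrix is the lagged window series[j : j + window]
    traj = np.column_stack([series[j:j + window] for j in range(k)])
    U, s, Vt = np.linalg.svd(traj, full_matrices=False)
    comps = []
    for i in range(len(s)):
        elem = s[i] * np.outer(U[:, i], Vt[i])
        # Diagonal averaging: entries with row + column == t all correspond to time t
        comp = np.array([elem[::-1].diagonal(t - window + 1).mean() for t in range(n)])
        comps.append(comp)
    return np.array(comps), s

# Synthetic series with a linear trend and a period-12 cycle (stand-in for real data)
t = np.arange(60)
series = 0.05 * t + np.sin(2 * np.pi * t / 12)

comps, s = ssa_components(series, window=12)

# Crude sensitivity check: share of variance captured by the two leading components
for window in (6, 12, 24):
    _, sv = ssa_components(series, window)
    share = sv[:2] ** 2 / np.sum(sv ** 2)
    print(window, np.round(share, 3))
```

Summing all elementary components returns the original series, so the choice of window affects only how the series is split into components, not what is recovered in total.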
Fourier-Type Monitoring Procedures for Strict Stationarity
Abstract
We consider model-free monitoring procedures for strict stationarity of a given time series. The new criteria are formulated as L2-type statistics incorporating the empirical characteristic function. Monte Carlo results as well as an application to financial data are presented.
S. Lee, S. G. Meintanis, C. Pretorius
Nonparametric and Parametric Methods for Change-Point Detection in Parametric Models
Abstract
We consider a posteriori and a priori change-point models. The parametric regression function of each phase can be nonlinear or linear and, in the linear case, the number of explanatory variables may be large. Theoretical results and simulations are presented for each model. For the a posteriori nonlinear change-point model, results obtained by two estimation techniques, the quantile method and the nonparametric empirical likelihood method, are given for the case where the number of change-points is known. If the number of change-points is unknown, a consistent criterion is proposed. When the change-point model is linear with a large number of explanatory variables, automatic variable selection is needed; the adaptive LASSO quantile method is then proposed and studied. In addition, we propose a nonparametric test based on the empirical likelihood in order to test whether the model changes. For detecting a change in the model in real time, we consider two cases, nonlinear and linear models. For a nonlinear model, a hypothesis test based on the CUSUM of LS residuals is constructed. For a linear model with a large number of explanatory variables, we propose a CUSUM test statistic based on adaptive LASSO residuals.
G. Ciuperca
Variance Estimation Free Tests for Structural Changes in Regression
Abstract
A sequence of time-ordered observations possesses a trend, which is possibly subject to change at most once at some unknown time point. The aim is to test whether such an unknown change has occurred or not. The change point methods presented here rely on ratio type test statistics based on maxima of the cumulative sums. These detection procedures for the change in regression are also robustified by considering a general score function. The main advantage of the proposed approach is that the variance of the observations neither has to be known nor estimated. The asymptotic distribution of the test statistic under the no change null hypothesis is derived. Moreover, we prove the consistency of the test under alternatives. The results are illustrated through a simulation study, which demonstrates computational efficiency of the procedures. A practical application to real data is presented as well.
Barbora Peštová, Michal Pešta
Bootstrapping Harris Recurrent Markov Chains
Abstract
The main objective of this paper is to present a bootstrap uniform functional central limit theorem for Harris recurrent Markov chains over uniformly bounded classes of functions. We show that the result can also be generalized to the unbounded case. To avoid some complicated mixing conditions, we make use of the well-known regeneration properties of Markov chains. Regenerative properties of Markov chains can be applied in order to extend some concepts in robust statistics from the i.i.d. to a Markovian setting. It is possible to define an influence function and Fréchet differentiability on the torus, which makes it possible to extend the notion of robustness from single observations to blocks of data. We present bootstrap uniform central limit theorems for Fréchet differentiable functionals in a Markovian case.
Gabriela Ciołek
Backmatter
Metadata
Title
Nonparametric Statistics
Edited by
Prof. Dr. Patrice Bertail
Prof. Dr. Delphine Blanke
Dr. Pierre-André Cornillon
Prof. Dr. Eric Matzner-Løber
Copyright year
2018
Electronic ISBN
978-3-319-96941-1
Print ISBN
978-3-319-96940-4
DOI
https://doi.org/10.1007/978-3-319-96941-1