2016 | Book

Recent Advances in Robust Statistics: Theory and Applications

Edited by: Claudio Agostinelli, Ayanendranath Basu, Peter Filzmoser, Diganta Mukherjee

Publisher: Springer India

About this book

This book offers a collection of recent contributions and emerging ideas in the areas of robust statistics presented at the International Conference on Robust Statistics 2015 (ICORS 2015), held in Kolkata on 12–16 January 2015. The book explores the applicability of robust methods in non-traditional areas, including the use of newer techniques such as skew distributions and mixtures of skew distributions, scaled Bregman divergences, and multilevel functional data methods; the application areas include circular data models and the prediction of mortality and life expectancy. The contributions are both theoretical and applied in nature. Robust statistics is a relatively young branch of statistical sciences that is rapidly emerging as the bedrock of statistical analysis in the 21st century due to its flexible nature and wide scope. Robust statistics supports the application of parametric and other inference techniques over a broader domain than the strictly interpreted model scenarios employed in classical statistical methods.

The aim of the ICORS conference, which has been organized annually since 2001, is to bring together researchers interested in robust statistics, data analysis and related areas. The conference is meant for theoretical and applied statisticians, data analysts from other fields, leading experts, junior researchers and graduate students. The ICORS meetings offer a forum for discussing recent advances and emerging ideas in statistics with a focus on robustness, and encourage informal contacts and discussions among all the participants. They also play an important role in maintaining a cohesive group of international researchers interested in robust statistics and related topics, whose interactions transcend the meetings and endure year round.

Table of contents

Frontmatter
Flexible Distributions as an Approach to Robustness: The Skew-t Case
Abstract
The use of flexible distributions with adaptive tails as a route to robustness has a long tradition. Recent developments in distribution theory, especially of non-symmetric form, provide additional tools for this purpose. We discuss merits and limitations of this approach to robustness as compared with classical methodology. Operationally, we adopt the skew-t as the working family of distributions used to implement this line of thinking.
Adelchi Azzalini
Determining the Number of Clusters Using Multivariate Ranks
Abstract
Determining the number of clusters in multivariate data has become one of the most important issues in a wide range of scientific disciplines. The forward search algorithm is a graphical approach that helps with this task. The traditional forward search approach based on Mahalanobis distances was introduced by Hadi (1992) and Atkinson (1994), while Atkinson et al. (2004) used it as a clustering method. But like many other Mahalanobis distance-based methods, it cannot be correctly applied to asymmetric distributions and, more generally, to distributions which depart from the elliptical symmetry assumption. We propose a new forward search methodology based on spatial ranks, where clusters are grown sequentially, one data point at a time, using spatial ranks with respect to the points already in the subsample. The algorithm starts from a randomly chosen initial subsample. We illustrate with simulated data that the proposed algorithm is robust to the choice of initial subsample and that it performs well for different multivariate mixture distributions. We also propose a modified algorithm based on the volume of central rank regions. Our numerical examples show that it produces the best results under elliptic symmetry.
Mohammed Baragilly, Biman Chakraborty
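To make the notion of a spatial rank in the abstract above concrete, the following sketch computes the standard sample spatial rank of a point, that is, the average of the unit vectors pointing from each sample point to it. It illustrates only this building block, not the chapter's forward search algorithm; the function names and the toy sample are assumptions introduced here.

```python
import math


def spatial_rank(x, sample):
    """Average of the unit vectors pointing from each sample point to x."""
    d = len(x)
    total = [0.0] * d
    for p in sample:
        diff = [xi - pi for xi, pi in zip(x, p)]
        norm = math.sqrt(sum(c * c for c in diff))
        if norm == 0.0:
            continue  # a sample point equal to x contributes the zero vector
        for k in range(d):
            total[k] += diff[k] / norm
    return [t / len(sample) for t in total]


def rank_norm(x, sample):
    """Euclidean length of the spatial rank; values near 1 indicate outlyingness."""
    return math.sqrt(sum(c * c for c in spatial_rank(x, sample)))


# Toy sample (hypothetical): four points placed symmetrically around the origin.
sample = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
central = rank_norm((0.0, 0.0), sample)     # point at the centre of the sample
outlying = rank_norm((100.0, 0.0), sample)  # point far away from the sample
```

The rank norm is close to zero for a central point and approaches one for a point far from the sample, and it requires no covariance matrix, which is why rank-based growth of a cluster does not lean on elliptical symmetry.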
Robust and Sparse Estimation of the Inverse Covariance Matrix Using Rank Correlation Measures
Abstract
Spearman’s rank correlation is a robust alternative to the standard correlation coefficient. Because it uses ranks instead of the actual values of the observations, the impact of outliers remains limited. In this paper, we study an estimator based on this rank correlation measure for estimating covariance matrices and their inverses. The resulting estimator is robust and consistent at the normal distribution. By applying the graphical lasso, the inverse covariance matrix estimator is positive definite even if more variables than observations are available in the data set. Moreover, it will contain many zeros, and is therefore said to be sparse. Instead of Spearman’s rank correlation, one can use Kendall correlation, Quadrant correlation or Gaussian rank scores. A simulation study compares the different estimators. This type of estimator is particularly useful for estimating (inverse) covariance matrices in high dimensions, when the data may contain several outliers in many cells of the data matrix. More traditional robust estimators are not well defined or computable in this setting. An important feature of the proposed estimators is their simplicity and ease of computation using existing software.
Christophe Croux, Viktoria Öllerer
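As a rough illustration of why a rank correlation limits the impact of outliers, the sketch below computes Spearman's correlation from average ranks and applies the transformation 2 sin(π r / 6), which follows from the known relation between Spearman's and Pearson's correlation under bivariate normality. Whether the chapter uses exactly this correction is an assumption, and the graphical lasso step is not shown; all function names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Ordinary product-moment correlation."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_consistent(x, y):
    """Spearman correlation rescaled to estimate the normal-model correlation."""
    return 2.0 * math.sin(math.pi * spearman(x, y) / 6.0)


# Hypothetical data: the last y value is replaced by a gross outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_clean = [1.1, 1.9, 3.2, 3.9, 5.0]
y_outlier = [1.1, 1.9, 3.2, 3.9, 500.0]
```

In this toy example the outlier leaves the ranks, and hence the Spearman value, unchanged, while the ordinary correlation of `x` and `y_outlier` drops noticeably.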
Robust Mixture Regression Using Mixture of Different Distributions
Abstract
In this paper, we examine the mixture regression model based on mixtures of different types of distributions. In particular, we consider two-component mixtures of normal-t distributions and of skew t-skew normal distributions. We obtain the maximum likelihood (ML) estimators for the parameters of interest using the expectation maximization (EM) algorithm. We give a simulation study and real data examples to illustrate the performance of the proposed estimators.
Fatma Zehra Doğru, Olcay Arslan
Robust Statistical Engineering by Means of Scaled Bregman Distances
Abstract
We show how scaled Bregman distances can be used for the goal-oriented design of new outlier- and inlier-robust statistical inference tools. These extend several known distance-based robustness (respectively, stability) methods at once. Numerous special cases are illustrated, including 3D computer graphical comparison methods. For the discrete case, some universally applicable results on the asymptotics of the underlying scaled-Bregman-distance test statistics are derived as well.
Anna-Lena Kißlinger, Wolfgang Stummer
SB-Robustness of Estimators
Abstract
Examining the SB-robustness of estimators becomes important in situations where the underlying family of distributions has bounded support or bounded parameter space. Such situations occur routinely when dealing with circular data and statistical quality control. In this paper, we first discuss SB-robustness of estimators in the circular data set-up and review some of the recently obtained results in this regard. Later in the paper, we examine the SB-robustness of some of the commonly used performance measures of control charts which are widely used for control of manufacturing processes. It is shown that for a mean control chart, the False Alarm Probability (FAP), the Average Sample Number when the process is in-control (\(ASN_0\)), the No-Signal Probability (NSP), and the Average Sample Number when the process is out-of-control (\(ASN_1\)) are all SB-robust at the family of all normal distributions with bounded mean and standard deviation. We also show that the above-mentioned performance measures are not SB-robust at the larger family of normal distributions with unbounded mean and standard deviation.
Arnab Kumar Laha, A. C. Pravida Raja
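For readers unfamiliar with the performance measures named in the abstract above, the sketch below evaluates two of them for a textbook mean chart. It assumes a Shewhart-type chart with known in-control mean and standard deviation, normally distributed subgroup means, and control limits at k standard errors; the chapter's exact definitions may differ, and SB-robustness itself is not illustrated. Function and parameter names are hypothetical.

```python
from statistics import NormalDist


def false_alarm_probability(k=3.0):
    """P(subgroup mean falls outside +/- k standard errors | process in control)."""
    phi = NormalDist().cdf
    return 2.0 * (1.0 - phi(k))


def no_signal_probability(shift, n, k=3.0):
    """P(subgroup mean stays inside the limits) after a mean shift.

    shift: size of the mean shift in units of the process standard deviation
    n:     subgroup size
    """
    phi = NormalDist().cdf
    d = shift * n ** 0.5  # shift expressed in standard errors of the mean
    return phi(k - d) - phi(-k - d)


fap = false_alarm_probability(3.0)         # roughly 0.0027 for 3-sigma limits
nsp_shifted = no_signal_probability(1.0, 5)
```

With no shift, the no-signal probability is simply one minus the false alarm probability, and it decreases as the shift or the subgroup size grows.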
Combining Linear Dimension Reduction Subspaces
Abstract
Dimensionality is a major concern in the analysis of large data sets. There are various well-known dimension reduction methods with different strengths and weaknesses. In practical situations it is difficult to decide which method to use, as different methods emphasize different structures in the data. Like ensemble methods in statistical learning, several dimension reduction methods can be combined using an extension of the Crone and Crosby distance, a weighted distance between subspaces that allows subspaces of different dimensions to be combined. Some natural choices of weights are considered in detail. Based on the weighted distance, we discuss the concept of averages of subspaces and how to combine various dimension reduction methods. The performance of the weighted distances and the combining approach is illustrated via simulations and a real data example.
Eero Liski, Klaus Nordhausen, Hannu Oja, Anne Ruiz-Gazen
On the Computation of Symmetrized M-Estimators of Scatter
Abstract
This paper focuses on the computational aspects of symmetrized M-estimators of scatter, i.e., the multivariate M-estimators of scatter computed on the pairwise differences of the data. Such estimators do not require a location estimate and, more importantly, they possess the important block and joint independence properties. These properties are needed, for example, when solving the independent component analysis problem. Classical and recently developed algorithms for computing the M-estimators and the symmetrized M-estimators are discussed. The effect of parallelization is considered, as well as a new computational approach based on using only a subset of pairwise differences. Efficiencies and computation time comparisons are made using simulation studies under multivariate elliptically symmetric models and under independent component models.
Jari Miettinen, Klaus Nordhausen, Sara Taskinen, David E. Tyler
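The location-free idea in the abstract above can be seen in its simplest form with constant weights: averaging the outer products of all pairwise differences and halving the result reproduces the usual sample covariance matrix without ever computing a mean. The sketch below shows only this special case; the estimators discussed in the chapter apply a robust weight function iteratively, which is not attempted here. The function name and data are hypothetical.

```python
from itertools import combinations


def symmetrized_covariance(points):
    """Scatter matrix from pairwise differences; no location estimate is used."""
    d = len(points[0])
    scatter = [[0.0] * d for _ in range(d)]
    pairs = 0
    for p, q in combinations(points, 2):
        diff = [a - b for a, b in zip(p, q)]
        for r in range(d):
            for c in range(d):
                scatter[r][c] += diff[r] * diff[c]
        pairs += 1
    # Each difference has twice the covariance of a single observation.
    return [[v / (2.0 * pairs) for v in row] for row in scatter]


# Hypothetical data; shifting every point leaves the result unchanged.
points = [(1.0, 2.0), (3.0, 4.0), (5.0, 9.0)]
shifted = [(a + 100.0, b - 50.0) for a, b in points]
S = symmetrized_covariance(points)
S_shifted = symmetrized_covariance(shifted)
```

The number of pairs grows quadratically with the sample size, which is the computational burden that motivates working with only a subset of the pairwise differences.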
Mortality and Life Expectancy Forecasting for a Group of Populations in Developed Countries: A Robust Multilevel Functional Data Method
Abstract
A robust multilevel functional data method is proposed to forecast age-specific mortality rates and life expectancy for two or more populations in developed countries with high-quality vital registration systems. It uses a robust multilevel functional principal component analysis of aggregate and population-specific data to extract the common trend and population-specific residual trend among populations. This method is applied to age- and sex-specific mortality rates and life expectancy for the United Kingdom from 1922 to 2011, and its forecast accuracy is then compared with that of the standard multilevel functional data method. For forecasting both age-specific mortality and life expectancy, the robust multilevel functional data method produces more accurate point and interval forecasts than the standard multilevel functional data method in the presence of outliers.
Han Lin Shang
Asymptotically Stable Tests with Application to Robust Detection
Abstract
To design highly robust and efficient tests, a new method based on the so-called variational optimization approach for robust estimation proposed by Shurygin (1994a, b) is developed. A new indicator of robustness of tests, the test stability, is introduced. The optimal decision rules maximizing test efficiency under the guaranteed level of test stability are written out. The proposed stable tests are based on redescending M-estimators as the corresponding test statistics. For hypothesis testing of location, one of those tests, namely a radical test, outperforms the conventional robust linear bounded Huber’s and redescending Hampel’s tests under heavy-tailed distributions, although it is slightly inferior to Huber’s test under the Gaussian and moderately contaminated Gaussian distributions.
Georgy Shevlyakov
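The distinction drawn in the abstract above between a bounded and a redescending score function can be sketched with the two classical examples it names. The code below gives Huber's psi and Hampel's three-part psi in their textbook forms; the tuning constants are illustrative defaults chosen here, and the chapter's radical test is not reproduced.

```python
def huber_psi(x, c=1.345):
    """Huber's score: linear in the middle, clipped at +/- c."""
    return max(-c, min(c, x))


def hampel_psi(x, a=2.0, b=4.0, c=8.0):
    """Hampel's three-part redescending score: returns to zero beyond c."""
    ax = abs(x)
    sign = 1.0 if x >= 0 else -1.0
    if ax <= a:
        return x
    if ax <= b:
        return sign * a
    if ax <= c:
        return sign * a * (c - ax) / (c - b)
    return 0.0


# A gross outlier keeps a bounded influence under Huber's score
# and is rejected completely under Hampel's score.
huber_at_outlier = huber_psi(10.0)
hampel_at_outlier = hampel_psi(10.0)
```

Complete rejection of extreme observations is what tends to help redescending scores under heavy tails, at some cost in efficiency when the data are close to Gaussian.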
Backmatter
Metadata
Title
Recent Advances in Robust Statistics: Theory and Applications
Edited by
Claudio Agostinelli
Ayanendranath Basu
Peter Filzmoser
Diganta Mukherjee
Copyright year
2016
Publisher
Springer India
Electronic ISBN
978-81-322-3643-6
Print ISBN
978-81-322-3641-2
DOI
https://doi.org/10.1007/978-81-322-3643-6