
## About this book

This contributed book focuses on major aspects of statistical quality control, shares insights into important new developments in the field, and adapts established statistical quality control methods for use in e.g. big data, network analysis and medical applications. The content is divided into two parts, the first of which mainly addresses statistical process control, also known as statistical process monitoring. In turn, the second part explores selected topics in statistical quality control, including measurement uncertainty analysis and data quality.

The peer-reviewed contributions gathered here were originally presented at the 13th International Workshop on Intelligent Statistical Quality Control, ISQC 2019, held in Hong Kong on August 12-14, 2019. Taken together, they bridge the gap between theory and practice, making the book of interest to both practitioners and researchers in the field of statistical quality control.

## Table of Contents

### Use of Conditional False Alarm Metric in Statistical Process Monitoring

Abstract
The conditional false alarm rate (CFAR) at a particular time is the probability of a false alarm for an assumed in-control process at that time conditional on no previous false alarm. Only the Shewhart control chart designed with known in-control parameters, or conditioned on the estimated parameters, has a constant conditional false alarm rate. Other types of charts, however, can have their control limits determined in order to have any desired pattern of CFARs. The important advantage of the use of this CFAR metric is when sample sizes, population sizes or other covariate information affecting chart performance vary over time. In these cases, the control limit at a particular time can be obtained through control of the CFAR value after the corresponding covariate value is known. This allows one to control the in-control performance of the chart without the need to model or forecast the covariate values. The approach is illustrated using the risk-adjusted Bernoulli cumulative sum (CUSUM) chart.
Anne R. Driscoll, William H. Woodall, Changliang Zou

### Design Considerations and Trade-offs for Shewhart Control Charts

Abstract
When in-control parameters are unknown, they have to be estimated using a reference sample. The control chart performance in Phase II, which is generally measured in terms of the Average Run Length (ARL) or False Alarm Rate (FAR), will vary across practitioners due to the use of different reference samples in Phase I. This variation is especially large for small sample sizes. Although increasing the amount of Phase I data improves the control chart performance, others have shown that the amount required to achieve a desired in-control performance is often infeasibly high. This holds even when the actual distribution of the data is known. When the distribution of the data is unknown, it has to be estimated as well, along with its parameters. This yields even more uncertainty in control chart performance when parametric models are applied. With these issues in mind, choices have to be made in order to control the performance of control charts. We discuss several of these choices and their corresponding implications.
Rob Goedhart

### On the Calculation of the ARL for Beta EWMA Control Charts

Abstract
Accurate calculation of the Average Run Length (ARL) for exponentially weighted moving average (EWMA) charts might be a tedious task. The omnipresent Markov chain approach is a common and effective tool to perform these calculations — see Lucas and Saccucci (1990) and Saccucci and Lucas (1990) for its application in case of EWMA charts. However, Crowder (1987b) and Knoth (2005) provided more sophisticated methods from the rich literature of numerical analysis to solve the ARL integral equation. These algorithms lead to very fast implementations for determining the ARL with high accuracy such as Crowder (1987a), or the R package spc (Knoth 2019) with its functions xewma.arl() and sewma.arl(). Crowder (1987a) utilized the popular Nyström method (Nyström 1930) which fails for bounded random variables existing, for example, in the case of an EWMA chart monitoring the variance. For the latter, Knoth (2005) utilized the so-called collocation method. It turns out that the numerical problems are even more severe for beta distributed random variables, which are bounded from both sides, typically on (0, 1). We illustrate these subtleties and provide extensions from Knoth (2005) to achieve high accuracy in an efficient way.
Sven Knoth

### Flexible Monitoring Methods for High-yield Processes

Abstract
In recent years, advances in technology have brought a revolutionary change in manufacturing processes. As a result, manufacturing systems produce a large number of conforming items with a small number of non-conforming items. The resulting dataset usually contains a large number of zeros with a small number of count observations. It is claimed that the excess number of zeros may cause over-dispersion in the data (i.e., when the variance exceeds the mean), which is not entirely correct. Actually, an excess of zeros reduces the mean of a dataset, which inflates the dispersion. Hence, modeling and monitoring of the products from high-yield processes have become a challenging task for quality inspectors. From these highly efficient processes, produced items are mostly zero-defect and modeled based on zero-inflated distributions like the zero-inflated Poisson (ZIP) and zero-inflated Negative Binomial (ZINB) distributions. A control chart based on the ZIP distribution is used to monitor the zero-defect process. However, when additional over-dispersion exists in the zero-defect dataset, a control chart based on the ZINB distribution is a better alternative. Usually, it is difficult to determine whether data are over-dispersed or under-dispersed. Hence, a flexible distribution named the zero-inflated Conway–Maxwell–Poisson (ZICOM-Poisson) distribution is used to model over- or under-dispersed zero-defect datasets. In this study, CUSUM charts are designed based on the ZICOM-Poisson distribution. These provide a flexible monitoring method for quality practitioners. A simulation study is designed to assess the performance of the proposed monitoring methods and to compare them. Moreover, a real application is presented to highlight the importance of the stated proposal.
Tahir Mahmood, Ridwan A. Sanusi, Min Xie
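
The dispersion argument in the abstract above can be checked with the textbook moments of a zero-inflated Poisson variable. The following is a minimal sketch, not the chapter's method: the function name `zip_moments` and the parameter names `pi` (zero-inflation probability) and `lam` (Poisson rate) are assumptions chosen for illustration.

```python
def zip_moments(pi, lam):
    """Mean and variance of a zero-inflated Poisson (ZIP) variable.

    Assumed model: with probability pi the count is a structural zero,
    otherwise it is drawn from Poisson(lam).
    """
    mean = (1.0 - pi) * lam                    # extra zeros pull the mean down
    var = (1.0 - pi) * lam * (1.0 + pi * lam)  # standard ZIP variance
    return mean, var


def dispersion_index(pi, lam):
    """Variance-to-mean ratio; equals 1 + pi * lam under the ZIP model."""
    mean, var = zip_moments(pi, lam)
    return var / mean


if __name__ == "__main__":
    for pi in (0.0, 0.5, 0.9):
        mean, var = zip_moments(pi, 2.0)
        print(f"pi={pi}: mean={mean:.3f}, var={var:.3f}, "
              f"var/mean={dispersion_index(pi, 2.0):.3f}")
```

Under this model the variance-to-mean ratio is 1 + pi * lam, so it exceeds 1 whenever pi > 0, while the mean shrinks by the factor (1 - pi).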

### An Average Loss Control Chart Under a Skewed Process Distribution

Abstract
In the global market the quality of products is a crucial factor separating competitive companies within numerous industries. These firms may employ a loss function to measure the loss caused by a deviation of the quality variable from the target value. From the view of Taguchi’s philosophy, monitoring this deviation from the process target value is important, but in practice many quality data have distributions that are not normal but skewed. This paper thus develops an average loss control chart for monitoring quality loss variation under skewed distributions. We investigate the statistical properties of the proposed control chart and measure the out-of-control process detection performance of the proposed loss control charts by using the average run length. The average loss control chart shows the best performance in detecting out-of-control loss location for a left-skewed process distribution and performs better than the existing median loss control chart.
Su-Fen Yang, Shan-Wen Lu

### ARL-Unbiased CUSUM Schemes to Monitor Binomial Counts

Abstract
Counted output, such as the number of defective items per sample, is often assumed to have a marginal binomial distribution. The integer and asymmetrical nature of this distribution and the value of its target mean hinder the quality control practitioner from dealing with a chart for the process mean with a pre-stipulated in-control average run length (ARL) and the ability to swiftly detect not only increases but also decreases in the process mean. In this paper we propose ARL-unbiased cumulative sum (CUSUM) schemes to rapidly detect both increases and decreases in the mean of independent and identically distributed as well as first-order autoregressive (AR(1)) binomial counts. Any shift is detected more quickly than a false alarm is generated by these schemes, and their in-control ARLs coincide with the pre-specified in-control ARL. We use the R statistical software to provide compelling illustrations of all these CUSUM schemes.
Manuel Cabral Morais, Sven Knoth, Camila Jeppesen Cruz, Christian H. Weiß

### Statistical Aspects of Target Setting for Attribute Data Monitoring

Abstract
We consider early warning systems (EWS) for monitoring multi-stage data, in which downstream variables undergo changes associated with upstream process stages. In such applications, the EWS monitoring arm acts as a search engine that analyses a number of data streams for each monitored variable, as the problems of change detection and identification of the change-causing stage are handled jointly. Given massive amounts of data involved in analysis, it is important to achieve an acceptable balance between false alarms and sensitivity requirements, by focusing on changes of practical significance. The role of the target-setting arm of EWS is to ensure and maintain this balance via suitable selection of control scheme parameters. In this paper, we discuss principles of developing and managing targets, with examples from a supply chain operation.
Emmanuel Yashchin, Aaron Civil, Jeff Komatsu, Paul Zulpa

### MAV Control Charts for Monitoring Two-State Processes Using Indirectly Observed Binary Data

Abstract
Processes described by indirectly observed data naturally arise in applications, such as telehealth systems. The available data can be used to predict the characteristics of interest, which form a process to be monitored. Its randomness is largely related to the classification (diagnosis) errors. To minimize them, one can use ensembles of predictors or try to benefit from the availability of heterogeneous sources of data. However, these techniques require certain modifications to the control charts, which we discuss in this paper. We consider three methods of classification: classical—based on the full set of attributes, and two combined—based on the number of positive evaluations yielded by an ensemble of inter-correlated classifiers. For monitoring the results of classification, we use a moving average control chart for serially dependent binary data. The application of the proposed procedure is illustrated with a real example of the monitoring of patients suffering from bipolar disorder. This monitoring procedure aims to detect a possible change in a patient’s state of health.
Olgierd Hryniewicz, Katarzyna Kaczmarek-Majer, Karol R. Opara

### Monitoring Image Processes: Overview and Comparison Study

Abstract
In this paper, an overview of recent developments on monitoring image processes is presented. We consider a relatively general model wherein, in the in-control state, spatially correlated pixels are monitored. The control charts described are based on non-overlapping regions of interest. This leads to a dimension reduction but, nevertheless, we still face a high-dimensional data set. We consider residual charts and charts based on the generalized likelihood ratio (GLR) approach. For the calculation of the control statistic of the latter chart, the inverse of the covariance matrix of the process must be determined. However, in a high-dimensional setting, this is time consuming and moreover, the empirical covariance matrix does not behave well in such a case. This is the reason why two further control charts are considered which can be regarded as modifications of the GLR statistic. Within an extensive simulation study, the presented control charts are compared with each other using the median run length as a performance criterion.
Yarema Okhrin, Wolfgang Schmid, Ivan Semeniuk

### Parallelized Monitoring of Dependent Spatiotemporal Processes

Abstract
With the growing availability of high-resolution spatial data, such as high-definition images, three-dimensional point clouds of light detection and ranging (LIDAR) scanners, or communication and sensor networks, it might become challenging to detect changes and simultaneously account for spatial interactions in a timely manner. To detect local changes in the mean of isotropic spatiotemporal processes with locally constrained dependence structures, we have proposed a monitoring procedure that can be completely run on parallel processors. This allows for fast detection of local changes (i.e., in the case that only a few spatial locations are affected by the change). Due to parallel computation, high-frequency data could also be monitored. Hence, we additionally focused on the processing time required to compute the control statistics. Finally, the performance of the charts has been analyzed using a series of Monte Carlo simulation studies.
Philipp Otto

### Product’s Warranty Claim Monitoring Under Variable Intensity Rates

Abstract
Product manufacturers have paid great attention to monitoring the number of warranty claims for sold products, as high claims trigger improvement opportunities and/or incur excessive operational costs. The Poisson distribution has been widely used to model the claim number, with the pooled Poisson intensity rate being referred to as the nominal failure intensity rate. Since products used by different customers are heterogeneous, failure intensity rates vary from product to product. The counts of warranty claims are often skewed and over-dispersed. The negative binomial (NB) distribution, which arises as a Poisson-gamma mixture distribution, has been widely used to model over-dispersed count data. However, the use of the NB distribution may trigger more signals than expected when the intensity rates are not randomized from time to time. In this paper, the impact of time-varying intensity rates is investigated. We show that conventional control limits based on the NB distribution-based Shewhart chart should be lowered to accommodate the reduced variation of counts when products' intensity rates become constant from time to time.
Wenpo Huang, Wei Jiang, Chengyou Shi

### A Statistical (Process Monitoring) Perspective on Human Performance Modeling in the Age of Cyber-Physical Systems

Abstract
With the continued technological advancements in mobile computing, sensors, and artificial intelligence methodologies, computer acquisition of human and physical data, often called cyber-physical convergence, is becoming more pervasive. Consequently, personal device data can be used as a proxy for human operators, creating a digital signature of their typical usage. Examples of such data sources include wearable sensors, motion capture devices, and sensors embedded in work stations. Our motivation behind this paper is to encourage the quality community to investigate relevant research problems that pertain to human operators. To frame our discussion, we examine three application areas (with distinct data sources and characteristics) for human performance modeling: (a) identification of physical human fatigue using wearable sensors/accelerometers; (b) capturing changes in a driver’s safety performance based on fusing on-board sensor data with online API data; and (c) human authentication for cybersecurity applications. Through three case studies, we identify opportunities for applying industrial statistics methodologies and present directions for future work. To encourage future examination by the quality community, we host our data, code, and analysis on an online repository.
Fadel M. Megahed, L. Allison Jones-Farmer, Miao Cai, Steven E. Rigdon, Manar Mohamed

### Monitoring Performance of Surgeons Using a New Risk-Adjusted Exponentially Weighted Moving Average Control Chart

Abstract
Risk-adjusted charting procedures have been developed in the literature. One important class of risk-adjusted procedures is based on the likelihood ratio statistic obtained by testing the odds ratio of mortality. The likelihood ratio statistic essentially converts the binary surgical outcomes of death and survival into penalty and reward scores, respectively, that are dependent on the predicted risk of death of a patient. For cardiac operations, the risk distribution is highly right skewed resulting in penalty and reward scores in a narrow range for a majority of the patients. This means effectively there is little risk adjustment for the majority of the patients. We propose a risk-adjusted statistic which is the ratio of surgical outcome to the estimated probability of death as the monitoring statistic. The main characteristic of this statistic is that the resulting penalty score is substantially higher if a patient with low risk dies, and the penalty score decreases sharply as the risk increases. We compare our chart with the original risk-adjusted cumulative sum chart in terms of average run length. Finally, we will perform a retrospective study using data from two surgeons.
Fah F. Gan, Wei L. Koh, Janice J. Ang

### Exploring the Usefulness of Functional Data Analysis for Health Surveillance

Abstract
Health surveillance is the process of ongoing systematic collection, analysis, interpretation, and dissemination of health data for the purpose of preventing and controlling disease, injury, and other health problems. Health surveillance data are often recorded continuously over a selected time interval or intermittently at several discrete time points. These can often be treated as functional data, and hence functional data analysis (FDA) can be applied to model and analyze these types of health data. One objective in health surveillance is early event detection. Statistical process monitoring tools are often used for online event detection. In this paper, we explore the usefulness of FDA for prospective health surveillance and propose two strategies for monitoring using control charts. We apply these strategies to monthly ovitrap index data. These vector data are used in Hong Kong as part of its dengue control plan.
Zezhong Wang, Inez Maria Zwetsloot

### Rapid Detection of Hot-Spot by Tensor Decomposition with Application to Weekly Gonorrhea Data

Abstract
In many bio-surveillance and healthcare applications, data sources are measured from many spatial locations repeatedly over time, say, daily/weekly/monthly. In these applications, we are typically interested in detecting hot-spots, which are defined as some structured outliers that are sparse over the spatial domain but persistent over time. In this paper, we propose a tensor decomposition method to detect when and where the hot-spots occur. Our proposed method represents the observed raw data as a three-dimensional tensor including a circular time dimension for daily/weekly/monthly patterns, and then decomposes the tensor into three components: smooth global trend, local hot-spots, and residuals. A combination of LASSO and fused LASSO is used to estimate the model parameters, and a CUSUM procedure is applied to detect when and where the hot-spots might occur. The usefulness of our proposed methodology is validated through numerical simulation and a real-world dataset of the weekly number of gonorrhea cases from 2006 to 2018 for 50 states in the United States.
Yujie Zhao, Hao Yan, Sarah E. Holte, Roxanne P. Kerani, Yajun Mei

### An Approach to Monitoring Time Between Events When Events Are Frequent

Abstract
This paper focuses on monitoring plans aimed at the early detection of an increase in the frequency of events. The literature recommends either monitoring the Time Between Events (TBE), if events are rare, or counting the number of events per unit non-overlapping time intervals, if events are not rare. Recent monitoring work has suggested that monitoring counts in preference to TBE is not recommended even when counts are low (less than 10). Monitoring TBE is the real-time option for outbreak detection, because outbreak information is accumulated when an event occurs. This is preferred to waiting for the end of a period to count events if outbreaks are large and occur in a short time frame. If the TBE reduces significantly, then the incidence of these events increases significantly. This paper explores monitoring TBE when the daily counts are quite high. We consider the case when TBEs are Weibull distributed.
Ross Sparks, Aditya Joshi, Cecile Paris, Sarvnaz Karimi

### Analysis of Measurement Precision Experiment with Ordinal Categorical Variables

Abstract
Many collaborative studies are run to evaluate the precision of measurement methods. The main focus is on estimating repeatability and reproducibility, which are the variation within a laboratory and the overall variation of the measurement method, respectively. ISO 5725 describes how to design and analyze such precision experiments for quantitative cases where the measurement results follow a continuous distribution, namely a normal distribution. However, there are cases where the measurement results are qualitative, such as binary or categorical. In this paper, the cases with ordinal categorical variables are considered. Using methods that can be applied to qualitative data, an analysis of a measurement precision experiment with measurements involving ordinal categorical variables is investigated. The data analysed are from an actual precision experiment of intratracheal administration testing whose objectives were to study the precision of a standardized test method for evaluating nanomaterial pulmonary toxicity.
Tomomichi Suzuki, Jun-ichi Takeshita, Mayu Ogawa, Xiao-Nan Lu, Yoshikazu Ojima

### Assessing a Binary Measurement System with Operator and Random Part Effects

Abstract
Consider the assessment of a binary measurement system with multiple operators when a gold standard measurement system is also available (for the assessment study). Data are collected as in a gauge repeatability and reproducibility plan for a continuous measurement system and each operator in the study measures a number of parts multiple times. We characterize the performance of the measurement system by estimating the probabilities of accepting a non-conforming part and of rejecting a conforming part. To model the data, we assume that some parts are more difficult to correctly classify than others and so choose to use random part effects. We consider two cases, modeling the operator effects as fixed or random. For each, we study a conditional and marginal model and their corresponding estimates of the parameters of interest. We also provide guidance on the planning of the assessment study in terms of the number of parts, number of operators and number of repeated measurements.
Stefan H. Steiner, R. Jock MacKay, Kevin Fan

### Concepts, Methods, and Tools Enabling Measurement Quality

Abstract
This contribution provides an overview, illustrated with examples, of applications of statistical methods that support measurement quality and guarantee the intercomparability of measurements made worldwide, in all fields of commerce, industry, science, and technology, including medicine. These methods enable a rigorous definition of measurement uncertainty, and provide the means to evaluate it quantitatively, both for qualitative measurands (for example, the sequence of nucleobases in a DNA strand) and for quantitative measurands (for example, the mass fraction of arsenic in rice). Measurement quality is its trustworthiness and comprises several attributes: reliable calibration involving standards; traceability to the international system of units or to other generally recognized standards; measurement uncertainty that realistically captures contributions from all significant sources of uncertainty; and fitness for purpose of the measurement results, which comprise measured values and evaluations of associated uncertainties. Statistical methods play key roles in the quality system that validates the measurement services (reference materials, calibrations, reference data, and reference instruments) provided by the National Institute of Standards and Technology (NIST). And these services in turn support measurement quality in laboratories, factories, farms, hospitals, transportation, utilities, and weather and environmental monitoring stations throughout the world, contributing to ensure food safety, to manufacture reliable products, and to monitor industrial and natural processes accurately. The NIST Uncertainty Machine (NUM) and the NIST Consensus Builder (NICOB) are web-based tools freely available to metrologists everywhere, that help maintain measurement quality. The NUM serves to evaluate measurement uncertainty and the NICOB builds consensus values from measurement results obtained independently for the same measurand.
Antonio Possolo

### Assessing Laboratory Effects in Key Comparisons with Two Transfer Standards Measured in Two Petals: A Bayesian Approach

Abstract
We propose a new statistical method for analyzing data from a key comparison when two transfer standards are measured in two petals. The approach is based on a generalization of the classical random effects model, a popular procedure in metrology. Bayesian treatment of the model parameters, as well as of the random effects is suggested. The latter can be viewed as potential laboratory effects which are assessed through the proposed analysis. While the prior for the laboratory effects naturally is assigned as a Gaussian distribution, the Berger and Bernardo reference prior is taken for the remaining model parameters. The results are presented in terms of the posterior distributions derived for the laboratory effects. From these distributions, posterior means and credible intervals are calculated. The proposed method paves the way for applying the established random effects model also for data arising from the measurement of several transfer standards in several petals. Finally, the new approach is illustrated for measurements of two 500 mg transfer standards carried out in key comparison CCM.M-K7.
Olha Bodnar, Clemens Elster

### Quality Control Activities Are a Challenge for Reducing Variability

Abstract
It is well known that reducing variability is the basis of quality control activities. The production process can be regarded roughly as a value chain, which is composed of customer voice, product planning, product design, and manufacturing the product. In the outcome of the value chain, three kinds of variability, which are the variability before shipping to market, the variability after shipping to market, and the variability of satisfaction of market, can be considered. Quality control activities can be regarded as thinking about what can be done to reduce the three variabilities and taking actions, then ensuring quality for customers by implementing them. In the value chain, many proposals and improvements have been implemented to reduce the variabilities. In this paper, a structure of the three variabilities above is shown; then activities to reduce the variabilities are discussed. As a result, the activities can be classified into four approaches and they can be systematized as the four approaches to reduce the three kinds of variability.
Ken Nishina

### Is the Benford Law Useful for Data Quality Assessment?

Abstract
Data quality and data fraud are of increasing concern in the digital world. Benford’s Law is used worldwide for detecting non-conformance or data fraud of numerical data. It says that the first non-zero digit $$D_1$$, of a data item from a universe, is not uniformly distributed. The shape is roughly logarithmically decaying starting with $$P(D_1=1)\cong 0.3$$. It is self-evident that Benford’s Law should not be applied for detecting manipulated or faked data before having examined the goodness of fit of the probability model while the business process is free of manipulations, i.e. ‘under control’. In this paper, we are concerned with the goodness-of-fit phase, not with fraud detection itself. We selected five empirical numerical data sets of various sample sizes being publicly accessible as a kind of benchmark, and evaluated the performance of three statistical tests. The tests include the chi-square goodness-of-fit test, which is used in businesses as a standard test, the Kolmogorov–Smirnov test, and the MAD test as originated by Nigrini (1992). We are analyzing further whether the invariance properties of Benford’s Law might improve the tests or not.
Wolfgang Kössler, Hans-J. Lenz, Xing D. Wang
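
For orientation, the first-digit distribution referred to in the abstract above is commonly written as P(D_1 = d) = log10(1 + 1/d) for d = 1, ..., 9. The following is a minimal sketch of that formula and of a chi-square goodness-of-fit statistic against it. It is not the chapter's implementation, and the helper names `benford_first_digit_probs`, `first_digit`, and `chi_square_statistic` are assumptions made for illustration.

```python
import math


def benford_first_digit_probs():
    """Benford probabilities P(D_1 = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1.0 + 1.0 / d) for d in range(1, 10)}


def first_digit(x):
    """First non-zero digit of a non-zero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def chi_square_statistic(values):
    """Pearson chi-square statistic of observed first digits against Benford."""
    probs = benford_first_digit_probs()
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        if v != 0:  # zeros have no first non-zero digit and are skipped
            counts[first_digit(v)] += 1
    n = sum(counts.values())
    return sum((counts[d] - n * probs[d]) ** 2 / (n * probs[d])
               for d in range(1, 10))


if __name__ == "__main__":
    for d, p in benford_first_digit_probs().items():
        print(d, round(p, 4))
    # Powers of 2 are a classic sequence whose first digits follow Benford closely.
    print(chi_square_statistic([2.0 ** k for k in range(1, 200)]))
```

The statistic would be compared with a chi-square distribution with 8 degrees of freedom. Whether that test is adequate for a given data set is exactly the question the chapter examines.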