Elsevier

Signal Processing

Volume 83, Issue 12, December 2003, Pages 2481-2497
Novelty detection: a review—part 1: statistical approaches

https://doi.org/10.1016/j.sigpro.2003.07.018

Abstract

Novelty detection is the identification of new or unknown data or signals that a machine learning system was not aware of during training. It is one of the fundamental requirements of a good classification or identification system, since the test data sometimes contains information about objects that were not known when the model was trained. In this paper we provide a state-of-the-art review of novelty detection based on statistical approaches. The second part of this paper details novelty detection using neural networks. There is a multitude of applications where novelty detection is extremely important, including signal processing, computer vision, pattern recognition, data mining, and robotics.

Introduction

Detecting novel events is an important ability of any signal classification scheme. Since we can never train a machine learning system on all possible object classes whose data the system is likely to encounter, it becomes important that the system can differentiate between known and unknown object information during testing. Several studies have found in practice that novelty detection is an extremely challenging task. For this reason there exist several models of novelty detection, each of which has been shown to perform well on different data. There is no single best model for novelty detection, and success depends not only on the type of method used but also on the statistical properties of the data handled.

Several applications require the classifier to act as a detector rather than as a classifier; that is, the requirement is to detect whether an input belongs to the data that the classifier was trained on or is in fact unknown. This technique is useful in applications such as fault detection [11], [14], [30], [53], radar target detection [7], detection of masses in mammograms [54], handwritten digit recognition [55], Internet and e-commerce [34], statistical process control [22], and several others. Recently, interest in novelty detection has increased as a number of research articles have appeared on autonomous systems based on adaptive machine learning. However, only very few surveys have appeared, e.g. [40]. Much of the earlier work and interest in novelty detection sprang from the study of control systems. High-integrity systems could not use traditional classification methods for a number of reasons: abnormalities are very rare, or there may be no data that describes the fault conditions. Novelty detection offered a solution to this problem by modelling normal data and using a distance measure and a threshold to determine abnormality. In recent years novelty detection has been used in a number of other applications, especially signal processing and image analysis (e.g. biometrics). In these applications the problem becomes more complicated, with multiple classes, high dimensionality, noisy features and, quite often, not enough samples. Novelty detection methods have therefore tried to keep up with these problems and offer solutions that can be used in the real world. In this paper we review some of the currently used statistical approaches to novelty detection.
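The idea of modelling normal data and applying a distance measure with a threshold can be illustrated with a minimal Python sketch. It is not a method taken from the surveyed studies: the per-feature mean and standard deviation model, the standardised Euclidean distance, the 95% acceptance fraction, the synthetic data and all function names are assumptions chosen only for illustration.

```python
import math
import random
import statistics


def fit_normal_model(train):
    """Estimate per-feature mean and standard deviation from normal data only."""
    columns = list(zip(*train))
    means = [statistics.fmean(c) for c in columns]
    # Guard against a zero spread so the distance below never divides by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stds


def distance(x, model):
    """Standardised Euclidean distance of x from the centre of the normal data."""
    means, stds = model
    return math.sqrt(sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, means, stds)))


def choose_threshold(train, model, accept_fraction=0.95):
    """Pick the distance below which roughly `accept_fraction` of training samples fall."""
    dists = sorted(distance(x, model) for x in train)
    index = min(len(dists) - 1, int(accept_fraction * len(dists)))
    return dists[index]


def is_novel(x, model, threshold):
    """Flag a sample as novel when it lies further from the model than the threshold."""
    return distance(x, model) > threshold


# Synthetic "normal" data (an assumption): two independent Gaussian features.
rng = random.Random(0)
train = [(rng.gauss(0.0, 1.0), rng.gauss(5.0, 2.0)) for _ in range(500)]

model = fit_normal_model(train)
threshold = choose_threshold(train, model)

print(is_novel((0.0, 5.0), model, threshold))    # sample near the centre of the normal data
print(is_novel((10.0, 30.0), model, threshold))  # sample far from anything seen in training
```

Only normal data is needed to fit the model, which matches the situation described above where fault data is rare or unavailable; the acceptance fraction is the single user-set parameter in this sketch.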

There are several important issues related to novelty detection. We can summarise them in terms of the following principles.

  • (a) Principle of robustness and trade-off: a novelty detection method must be capable of robust performance on test data, maximising the exclusion of novel samples while minimising the exclusion of known samples. This trade-off should be, to a limited extent, predictable and under experimental control.

  • (b) Principle of uniform data scaling: to assist novelty detection, all test data and training data should lie within the same range after normalisation [49].

  • (c) Principle of parameter minimisation: a novelty detection method should aim to minimise the number of parameters that are set by the user.

  • (d) Principle of generalisation: the system should be able to generalise without mistaking generalised information for novel information [55].

  • (e) Principle of independence: the novelty detection method should be independent of the number of features and classes available, and it should show reasonable performance in the presence of imbalanced data sets, a low number of samples, and noise.

  • (f) Principle of adaptability: a system that recognises novel samples during testing should be able to use this information for retraining [47].

  • (g) Principle of computational complexity: a number of novelty detection applications are online, and therefore the computational complexity of a novelty detection mechanism should be as low as possible.
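The trade-off in principle (a) can be made concrete by sweeping a decision threshold over novelty scores and recording how many known and novel samples are excluded at each setting. The Python sketch below is illustrative only: the scores are made-up numbers, and the function names are hypothetical rather than drawn from any surveyed method.

```python
def rejection_rates(known_scores, novel_scores, threshold):
    """Return (fraction of known samples excluded, fraction of novel samples excluded).

    A sample is excluded (flagged as novel) when its score exceeds the threshold.
    """
    known_rejected = sum(s > threshold for s in known_scores) / len(known_scores)
    novel_rejected = sum(s > threshold for s in novel_scores) / len(novel_scores)
    return known_rejected, novel_rejected


def sweep(known_scores, novel_scores, thresholds):
    """Evaluate the trade-off at each candidate threshold."""
    return [(t, *rejection_rates(known_scores, novel_scores, t)) for t in thresholds]


# Made-up novelty scores for a validation set (higher means more unusual).
known_scores = [0.2, 0.5, 0.9, 1.1, 1.4, 2.6]
novel_scores = [1.2, 2.8, 3.1, 3.5, 4.0]

for t, known_rej, novel_rej in sweep(known_scores, novel_scores, [1.0, 2.0, 3.0]):
    print(f"threshold={t:.1f}  known excluded={known_rej:.2f}  novel excluded={novel_rej:.2f}")
```

Raising the threshold excludes fewer known samples but also lets more novel samples through, so in this sketch the threshold is the experimental control that the principle asks for.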

In this survey, we study a number of approaches to novelty detection and remark on how well these studies address the above principles. Each approach comprises a number of different methods, and we detail some of the important studies in these areas.

Statistical approaches

Statistical approaches mostly model data on the basis of its statistical properties and use this information to estimate whether a test sample comes from the same distribution or not. The techniques used vary in complexity [40]. The simplest approach constructs a density function for data of a known class and then, assuming that the data is normal, computes the probability that a test sample belongs to that class. The probability estimate can be

Conclusion

In this paper we have presented a survey of novelty detection using statistical approaches. Most of this research is driven by modelling data distributions and then estimating the probability that test data belongs to such distributions. In such model-based approaches, one needs to specify, or make assumptions about, the nature of the training data. In addition, the amount and quality of training data become very important for the robust determination of the parameters of the training data distribution.

References (64)

  • T. Brotherton, T. Johnson, G. Chadderdon, Classification and novelty detection using linear models and a class...
  • C. Campbell et al., A linear programming approach to novelty detection (2001)
  • G.A. Carpenter, M.A. Rubin, W.W. Streilein, ARTMAP-FD: familiarity discrimination applied to radar target recognition,...
  • C.K. Chow, On optimum recognition error and reject tradeoff, IEEE Trans. Inform. Theory (January 1970)
  • L.P. Cordella et al., A method for improving classification reliability of multilayer perceptrons, IEEE Trans. Neural Networks (1995)
  • T. Cover et al., Nearest neighbor pattern classification, IEEE Trans. Inform. Theory (1967)
  • D. Dasgupta et al., Novelty-detection in time series data using ideas from immunology, Proceedings of the International Conference on Intelligent Systems (1996)
  • D. Dasgupta, F.A. Gonzalez, An immunogenetic approach to intrusion detection, Division of Computer Science, University...
  • D. Dasgupta, N.S. Majumdar, Anomaly detection in multidimensional data using negative selection algorithm, Proceedings...
  • D. Dasgupta, F. Nino, A comparison of negative and positive selection algorithms in novel pattern detection,...
  • M.J. Desforges, P.J. Jacob, J.E. Cooper, Applications of probability density estimation to the detection of abnormal...
  • R.O. Duda et al., Pattern Classification (2001)
  • R.A. Fisher et al., Limiting forms of the frequency distribution of the largest and smallest member of a sample, Proc. Camb. Philos. Soc. (1928)
  • S. Forrest, A.S. Perelson, L. Allen, R. Cherukuri, Self-non-self discrimination in a computer, Proceedings of the IEEE...
  • S.E. Guttormsson, R.J. Marks II, M.A. El-Sharkawi, Elliptical novelty grouping for on-line short-turn detection of...
  • L.K. Hansen et al., The error-reject tradeoff, Open Systems Inform. Dynamics (1997)
  • L.K. Hansen, S. Sigurdsson, T. Kolenda, F.A. Nielson, U. Kjems, J. Larsen, Modelling text with generalizable Gaussian...
  • M.E. Hellman, The nearest neighbour classification with a reject option, IEEE Trans. Systems Sci. Cybernet. (July 1970)
  • S.J. Hickinbotham, J. Austin, Neural networks for novelty detection in airframe strain data, Proceedings of IEEE IJCNN,...
  • N. Japkowicz, C. Myers, M. Gluck, A novelty detection approach to classification, Proceedings of the 14th IJCAI...
  • S.P. King, D.M. King, P. Anuzis, K. Astley, L. Tarassenko, P. Hayton, S. Utete, The use of novelty detection techniques...
  • E.M. Knorr et al., Distance-based outliers: algorithms and applications, VLDB J. (2000)