
2017 | Book

Anomaly Detection Principles and Algorithms


About this book

This book provides a readable and elegant presentation of the principles of anomaly detection, offering an easy introduction for newcomers to the field. A large number of algorithms are succinctly described, along with a presentation of their strengths and weaknesses.

The authors also cover algorithms that address different kinds of problems of interest with single and multiple time series data and multi-dimensional data. New ensemble anomaly detection algorithms are described, utilizing the benefits provided by diverse algorithms, each of which works well on some kinds of data.

With advancements in technology and the extensive use of the internet as a medium for communications and commerce, there has been a tremendous increase in the threats faced by individuals and organizations from attackers and criminal entities. Variations in the observable behaviors of individuals (from others and from their own past behaviors) have been found to be useful in predicting potential problems of various kinds. Hence computer scientists and statisticians have been conducting research on automatically identifying anomalies in large datasets.

This book will primarily target practitioners and researchers who are newcomers to the area of modern anomaly detection techniques. Advanced-level students in computer science will also find this book helpful with their studies.

Table of Contents

Frontmatter

Principles

Frontmatter
Chapter 1. Introduction
Abstract
Incidents of fraud have increased at a rapid pace in recent years, perhaps because very simple technology (such as email) is sufficient to help miscreants commit fraud. Losses may not be directly financial, e.g., an email purportedly from a family member may pretend to communicate a photograph, clicking on whose icon really results in malware coming to reside on your machine. As is the case with health and other unpreventable problems faced by humanity, early detection is essential to facilitate recovery. The automated detection and alerting of abnormal data and behaviors, implemented using computationally efficient software, are critical in this context. These considerations motivate the development and application of the anomaly detection principles and algorithms discussed in this book.
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang
Chapter 2. Anomaly Detection
Abstract
As discussed in the preceding chapter, anomaly detection problems arise in multiple applications, such as financial fraud, cyber intrusion, video surveillance, and medical image analysis. This chapter discusses the basic ideas of anomaly detection and sets up a framework within which various algorithms can be analyzed and compared.
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang
Chapter 3. Distance-Based Anomaly Detection Approaches
Abstract
In this chapter we consider anomaly detection based on distance (similarity) measures. Our approach is to explore various possible scenarios in which an anomaly may arise. To keep things simple, in most of the chapter we illustrate basic concepts using one-dimensional observations. Distance-based algorithms proposed by researchers are presented in Chap. 6.
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang
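As a rough illustration of the distance-based idea on one-dimensional observations, the sketch below scores each point by the distance to its k-th nearest neighbor. This is one common distance-based score, not necessarily the formulation used in the chapter; the function name `knn_distance_scores`, the choice `k=2`, and the sample data are assumptions made for the example.

```python
def knn_distance_scores(values, k=2):
    """Score each 1-D observation by the distance to its k-th nearest neighbor.

    Larger scores mean the point is farther from its neighbors.
    Assumes len(values) > k.
    """
    scores = []
    for i, x in enumerate(values):
        # Distances from x to every other observation, smallest first.
        dists = sorted(abs(x - y) for j, y in enumerate(values) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical data: four nearby readings and one distant reading.
data = [1.0, 1.2, 0.9, 1.1, 5.0]
scores = knn_distance_scores(data, k=2)
most_anomalous = data[scores.index(max(scores))]
print(most_anomalous)
```

In this toy data the distant reading receives by far the largest score; how large a score must be before a point is reported as an anomaly is a separate threshold choice that the sketch leaves open.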
Chapter 4. Clustering-Based Anomaly Detection Approaches
Abstract
This chapter explores anomaly detection approaches based on explicit identification of clusters in a data set. Points that are not within a cluster become candidates to be considered anomalies. Algorithms differ in how they evaluate the relative anomalousness of points that are near (but not inside) a cluster, and of points at the periphery of a cluster.
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang
Chapter 5. Model-Based Anomaly Detection Approaches
Abstract
Many data sets are described by models that may capture the underlying processes that generate the data, describing a presumed functional or relational dependence among the relevant variables. Such models permit comprehension and concise description of the data sets, facilitating identification of data points that are not consistent with such a description.
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang

Algorithms

Frontmatter
Chapter 6. Distance and Density Based Approaches
Abstract
In Chap. 3, we discussed distance-based approaches for anomaly detection; the focus there was to illustrate, through simple examples, how distances can be measured and how a minor perturbation in the chosen distance can change the outcome. In this chapter we consider anomaly detection techniques that depend on distances and densities. The densities can be global or local to the point of concern.
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang
Chapter 7. Rank Based Approaches
Abstract
The density-based methodology that exploits the k-neighborhood of a data point has many good features. For instance, it is independent of the distribution of the data and is capable of detecting isolated objects. However, it has some shortcomings:
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang
Chapter 8. Ensemble Methods
Abstract
In the previous chapters, we have described various anomaly detection algorithms, whose relative performance varies with the dataset and the application being considered.
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang
Chapter 9. Algorithms for Time Series Data
Abstract
Many practical problems involve data that arrive over time, and are hence in a strict temporal sequence. As discussed in Chap. 5, treating the data as a set, while ignoring the time-stamp, loses information essential to the problem. Treating the time-stamp as just another dimension (on par with other relevant dimensions such as dollar amounts) can only confuse the matter: the occurrence of other attribute values at a specific time instant can mean something quite different from the same attribute values occurring at another time, depending on the immediately preceding values. Such dependencies necessitate considering time as a special aspect of the data for explicit modeling, and treating the data as a sequence rather than a set. Hence anomaly detection for time-sequenced data requires algorithms that are substantially different from those discussed in the previous chapters.
Kishan G. Mehrotra, Chilukuri K. Mohan, HuaMing Huang
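The point that the same attribute value can be normal at one time and anomalous at another can be made concrete with a small sketch. It scores each observation by its deviation from the mean of the immediately preceding values, which is a deliberately simple stand-in and not one of the book's algorithms; the function name `context_scores`, the window length of 3, and the sample series are assumptions made for the example.

```python
from statistics import mean


def context_scores(series, window=3):
    """Score each value by its deviation from the mean of the preceding window.

    The first `window` positions have no full history and are scored as None.
    """
    scores = []
    for t, x in enumerate(series):
        if t < window:
            scores.append(None)
        else:
            # Compare the value with what the recent past suggests.
            scores.append(abs(x - mean(series[t - window:t])))
    return scores


# Hypothetical series: a low regime followed by a high regime.
series = [10, 11, 10, 11, 10, 11, 30, 31, 30, 31, 30, 10, 30]
scores = context_scores(series)

# The value 10 occurs at positions 4 and 11 with very different scores.
print(series[4], scores[4])
print(series[11], scores[11])
```

Treated as a set, the two occurrences of 10 are indistinguishable; treated as a sequence, the second one stands out because the values just before it are near 30. The start of the high regime also scores highly under this simple rule, so it does not separate isolated anomalies from level shifts.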
Backmatter
Metadata
Title
Anomaly Detection Principles and Algorithms
Written by
Kishan G. Mehrotra
Chilukuri K. Mohan
HuaMing Huang
Copyright year
2017
Electronic ISBN
978-3-319-67526-8
Print ISBN
978-3-319-67524-4
DOI
https://doi.org/10.1007/978-3-319-67526-8