2007 | OriginalPaper | Book chapter
An Efficient Histogram Method for Outlier Detection
Authors: Matthew Gebski, Raymond K. Wong
Published in: Advances in Databases: Concepts, Systems and Applications
Publisher: Springer Berlin Heidelberg
An important problem in database and data mining systems is the detection of outlying points. Data observations that exhibit atypical properties are often of more interest than those that fit common patterns. Anomaly and outlier detection have received considerable attention from the statistics community, but those approaches focus primarily on data sets containing relatively few, univariate observations. Valuable approaches have recently been proposed to support multidimensional analysis of larger data sets. Unfortunately, these approaches are often expensive, requiring numerous comparisons between each point and the remainder of the data.
We propose an approach that uses histograms for outlier detection. Sparse regions of the data are recognised and used to identify points that are likely to be outliers. An extensive experimental evaluation on real-world and synthetic data sets demonstrates the efficiency of our approach across a range of parameter settings.
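To make the general idea concrete, the following is a minimal sketch of histogram-based outlier flagging. It is not the authors' algorithm, whose details the abstract does not give. It assumes an equal-width grid over the data's bounding box and flags every point whose grid cell holds at most a threshold number of points. The function name `histogram_outliers` and the parameters `bins` and `max_count` are hypothetical, chosen for illustration only.

```python
from collections import Counter


def histogram_outliers(points, bins=4, max_count=1):
    """Return the points that fall in sparsely populated grid cells.

    points    -- list of equal-length numeric tuples
    bins      -- number of equal-width bins per dimension (assumed parameter)
    max_count -- a cell with at most this many points counts as sparse
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]

    def cell(p):
        # Map a point to the index tuple of the grid cell that contains it.
        idx = []
        for d in range(dims):
            width = (highs[d] - lows[d]) / bins
            if width == 0:
                # All values are identical in this dimension: one bin only.
                idx.append(0)
            else:
                # Clamp so that the maximum value lands in the last bin.
                idx.append(min(int((p[d] - lows[d]) / width), bins - 1))
        return tuple(idx)

    cells = [cell(p) for p in points]
    counts = Counter(cells)  # one pass to build the histogram
    return [p for p, c in zip(points, cells) if counts[c] <= max_count]


sample = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (0.2, 0.8), (0.9, 0.1), (10.0, 10.0)]
print(histogram_outliers(sample))
```

Building the histogram takes one pass over the data and flagging takes a second, so no point is compared against any other point directly. The result depends heavily on the bin width and on the sparsity threshold, both of which are fixed by hand in this sketch.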