2013 | OriginalPaper | Book chapter
Exploratory Data Analysis through the Inspection of the Probability Density Function of the Number of Neighbors
Authors: Antonio Neme, Antonio Nido
Published in: Advances in Intelligent Data Analysis XII
Publisher: Springer Berlin Heidelberg
Exploratory data analysis is a fundamental stage in the data mining of high-dimensional datasets. Several algorithms have been proposed to give a general picture of the geometry and patterns present in high-dimensional data. Here, we present a methodology based on the distance matrix of the input data. The algorithm is based on the number of points considered to be neighbors of each input vector. Neighborhood is defined in terms of a hypersphere of varying radius centered on each vector, and the probability density function of the number of neighbor vectors is computed from the distance matrix. We show that, when the radius of the hypersphere is systematically increased, a detailed analysis of the probability density function of the number of neighbors reveals relevant aspects of the overall features that describe the high-dimensional data. The algorithm is tested on several datasets, and we show its pertinence as an exploratory data analysis tool.
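The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `neighbor_count_distribution`, the choice of Euclidean distance, the inclusive boundary (`<=`), and the exclusion of each point from its own neighborhood are all assumptions made here for illustration. Because neighbor counts are integers, the sketch estimates the distribution as a normalized histogram of counts, one per radius.

```python
import numpy as np


def neighbor_count_distribution(X, radii):
    """Empirical distribution of the number of neighbors, per radius.

    Hypothetical helper (not from the chapter). Assumes Euclidean
    distance and that a point is not counted as its own neighbor.
    Returns a dict mapping each radius to an array p of length n,
    where p[k] is the fraction of points with exactly k neighbors.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Full pairwise distance matrix via broadcasting: shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    distributions = {}
    for r in radii:
        # Points inside the hypersphere of radius r around each vector;
        # subtract 1 to drop the point itself (its distance is 0).
        counts = (D <= r).sum(axis=1) - 1
        # Normalized histogram over the possible counts 0 .. n-1.
        distributions[float(r)] = np.bincount(counts, minlength=n) / n
    return distributions


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two synthetic clusters in 5 dimensions (illustrative data only).
    X = np.vstack([rng.normal(0.0, 1.0, (50, 5)),
                   rng.normal(6.0, 1.0, (50, 5))])
    for r, p in neighbor_count_distribution(X, [1.0, 3.0, 6.0]).items():
        print(f"radius={r}: most frequent neighbor count = {p.argmax()}")
```

Sweeping the radius from small to large and comparing the resulting distributions is the exploratory step the abstract describes. The dense distance matrix costs O(n²) memory, so this sketch only suits datasets of moderate size.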