
2003 | OriginalPaper | Chapter

Generalized k-Medians Clustering for Strings

Authors: Carlos D. Martínez-Hinarejos, Alfons Juan, Francisco Casacuberta

Published in: Pattern Recognition and Image Analysis

Publisher: Springer Berlin Heidelberg


Clustering methods are used in pattern recognition to obtain natural groups from a data set in the framework of unsupervised learning, as well as to obtain clusters of data from a known class. For sets of strings, the concept of the set median string can be extended to the (set) k-medians problem. The solution of the k-medians problem can be viewed as a clustering method, where each cluster is generated by one of the k strings of that solution. A concept related to the set median string is the (generalized) median string, whose computation is an NP-hard problem. However, different algorithms have been proposed to find approximations to the (generalized) median string. We propose extending the (generalized) median string problem to k strings, resulting in the generalized k-medians problem, which can also be viewed as a clustering technique. This new technique is applied to a corpus of chromosomes represented by strings and compared to the conventional k-medians technique.
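
To make the (set) k-medians idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes unit-cost edit (Levenshtein) distance between strings and a simple alternating scheme: assign each string to its nearest median, then recompute each cluster's set median. The abstract specifies neither choice. The function names (`edit_distance`, `set_median`, `assign`, `set_k_medians`) and the toy strings are hypothetical. The sketch covers only the set variant, where every median is picked from the data; the generalized variant would instead search for medians among all possible strings, which is why approximate algorithms are needed there.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (assumed metric), rolling-row DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute or match
        prev = curr
    return prev[-1]


def set_median(strings: Sequence[str]) -> str:
    """Member of the set with the smallest summed distance to all members."""
    return min(strings, key=lambda s: sum(edit_distance(s, t) for t in strings))


def assign(strings: Sequence[str], medians: Sequence[str]) -> List[List[str]]:
    """Put each string in the cluster of its nearest median."""
    clusters: List[List[str]] = [[] for _ in medians]
    for s in strings:
        nearest = min(range(len(medians)),
                      key=lambda i: edit_distance(s, medians[i]))
        clusters[nearest].append(s)
    return clusters


def set_k_medians(strings: Sequence[str], initial: Sequence[str],
                  max_iter: int = 100) -> Tuple[List[str], List[List[str]]]:
    """Alternate assignment and set-median updates until medians stop changing."""
    medians = list(initial)
    for _ in range(max_iter):
        clusters = assign(strings, medians)
        # An empty cluster keeps its previous median.
        updated = [set_median(c) if c else m for c, m in zip(clusters, medians)]
        if updated == medians:
            break
        medians = updated
    return medians, assign(strings, medians)


if __name__ == "__main__":
    data = ["aaaa", "aaab", "aaba", "zzzz", "zzzy", "zyzz"]  # toy data
    medians, clusters = set_k_medians(data, initial=["aaab", "zzzy"])
    print(medians)
    print(clusters)
```

The result of such a scheme depends on the initial medians, so in practice it would typically be run from several initializations and the solution with the lowest total distance kept.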

Metadata
Title
Generalized k-Medians Clustering for Strings
Authors
Carlos D. Martínez-Hinarejos
Alfons Juan
Francisco Casacuberta
Copyright Year
2003
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-540-44871-6_59
