ABSTRACT
Numerous domains, ranging from distributed data acquisition to knowledge reuse, need to solve the cluster ensemble problem: combining multiple clusterings into a single unified clustering. Unfortunately, current non-agent-based cluster combining methods do not work in a distributed environment, are not robust to corrupted clusterings, and require centralized access to all of the original clusterings. Overcoming these issues will allow cluster ensembles to be used in fundamentally distributed and failure-prone domains, such as data acquisition from satellite constellations, as well as in domains demanding confidentiality, such as combining clusterings of user profiles. This paper proposes an efficient, distributed, agent-based cluster ensemble method that addresses these issues. In this approach, each agent is assigned a small subset of the data and votes on which final cluster its data points should belong to. The final clustering is then evaluated by a global utility, computed in a distributed way, and by an agent-specific utility that is shown to be easier for the agents to maximize. Results show that agents using the agent-specific utility can achieve better performance than traditional non-agent-based methods and remain effective even when up to 50% of the agents fail.
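The core combining step described above can be sketched as a plurality vote: each agent casts a vote for the final cluster of each point in its assigned subset, and the final clustering takes the most-voted label per point. This is a minimal illustration, not the paper's implementation; the function name and the assumption that agent votes already share a common label space (in practice labels must first be aligned, e.g. by a matching step) are mine.

```python
from collections import Counter

def combine_by_voting(agent_votes, n_points):
    """Plurality-vote combiner for a cluster ensemble.

    agent_votes: one list per agent, of length n_points, whose entry
    for point p is that agent's voted final-cluster label for p, or
    None if p lies outside the agent's assigned subset.
    Returns the final clustering as a list of labels.
    """
    final = []
    for p in range(n_points):
        votes = Counter(v[p] for v in agent_votes if v[p] is not None)
        # Most common label wins; ties break arbitrarily by count order.
        final.append(votes.most_common(1)[0][0])
    return final

# Three agents, each voting only on its own subset of four points.
votes = [
    [0, 0, 1, None],
    [0, None, 1, 1],
    [None, 0, 1, 1],
]
print(combine_by_voting(votes, 4))  # → [0, 0, 1, 1]
```

Because each agent only needs the labels of its own subset, this vote can be tallied without centralizing the original clusterings, which is the property the distributed setting requires.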