ABSTRACT
Many data mining applications involve the task of building a model for predictive classification. The goal of such a model is to classify examples (records or data instances) into classes or categories of the same type. The use of variables (attributes) not related to the classes can reduce the accuracy and reliability of a classification or prediction model. Superfluous variables can also increase the cost of building a model, particularly on large data sets. We propose a discrete Particle Swarm Optimization (PSO) algorithm designed for attribute selection. The proposed algorithm deals with discrete variables, and its population of candidate solutions contains particles of different sizes. The performance of this algorithm is compared with that of a standard binary PSO algorithm on the task of selecting attributes in a bioinformatics data set. The criteria used for comparison are: (1) maximizing predictive accuracy; and (2) finding the smallest subset of attributes.
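To make the baseline concrete, the following is a minimal sketch of one update step of a standard binary PSO particle applied to attribute selection, where each bit marks whether an attribute is selected. It is not the proposed discrete algorithm, whose details the abstract does not give. The function names, the parameter values (`w`, `c1`, `c2`, `v_max`), and the use of an inertia weight are assumptions made for illustration. The `better` helper is also an assumption: it ranks candidates by accuracy first and breaks ties by subset size, which is one possible way to combine the two comparison criteria.

```python
import math
import random


def sigmoid(v):
    """Squash a velocity into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def binary_pso_step(position, velocity, personal_best, global_best, rng,
                    w=0.8, c1=2.0, c2=2.0, v_max=4.0):
    """One update of a binary PSO particle.

    position, personal_best, global_best: lists of 0/1 (1 = attribute selected).
    velocity: list of floats of the same length.
    w, c1, c2, v_max are illustrative values, not taken from the paper.
    Returns (new_position, new_velocity).
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Velocity is pulled toward the particle's own best and the swarm's best.
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        # Clamp so the sigmoid never saturates completely.
        v = max(-v_max, min(v_max, v))
        # The velocity sets the probability that the bit becomes 1.
        bit = 1 if rng.random() < sigmoid(v) else 0
        new_velocity.append(v)
        new_position.append(bit)
    return new_position, new_velocity


def better(candidate, incumbent):
    """Compare (accuracy, subset_size) pairs.

    Assumed rule: higher accuracy wins; equal accuracy goes to fewer attributes.
    """
    acc_c, size_c = candidate
    acc_i, size_i = incumbent
    if acc_c != acc_i:
        return acc_c > acc_i
    return size_c < size_i
```

In a full run, each particle's bit vector would be scored by training a classifier on the selected attributes, and `better` would decide whether the personal and global bests are replaced.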
Index Terms
- A new discrete particle swarm algorithm applied to attribute selection in a bioinformatics data set