DOI: 10.1145/1143997.1144003
Article

A new discrete particle swarm algorithm applied to attribute selection in a bioinformatics data set

Published: 08 July 2006

ABSTRACT

Many data mining applications involve the task of building a model for predictive classification. The goal of such a model is to classify examples (records or data instances) into classes or categories of the same type. The use of variables (attributes) not related to the classes can reduce the accuracy and reliability of a classification or prediction model. Superfluous variables can also increase the cost of building a model, particularly on large data sets. We propose a discrete Particle Swarm Optimization (PSO) algorithm designed for attribute selection. The proposed algorithm deals with discrete variables, and its population of candidate solutions contains particles of different sizes. The performance of this algorithm is compared with that of a standard binary PSO algorithm on the task of selecting attributes in a bioinformatics data set. The criteria used for comparison are: (1) maximizing predictive accuracy; and (2) finding the smallest subset of attributes.
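For readers unfamiliar with the baseline, the following is a minimal sketch of a binary PSO in the style of Kennedy and Eberhart, applied to attribute selection. It is not the discrete PSO proposed in the paper. Each particle is a bit mask over the attributes, and each bit is set to 1 with probability given by the sigmoid of its velocity. The inertia weight `w`, the coefficients `c1` and `c2`, the clamp `v_max`, and the `toy_fitness` function are all assumptions made for illustration; in a real wrapper setting the fitness would be the predictive accuracy of a classifier trained on the selected attributes.

```python
import math
import random

# Hypothetical ground truth for the toy problem: only these attributes matter.
RELEVANT = {0, 3, 5}

def toy_fitness(mask):
    """Reward selected relevant attributes, penalise subset size (illustrative only)."""
    hits = sum(1 for d, bit in enumerate(mask) if bit and d in RELEVANT)
    return hits - 0.1 * sum(mask)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_pso(fitness, n_attrs, n_particles=20, n_iters=50,
               w=0.8, c1=2.0, c2=2.0, v_max=4.0, seed=0):
    """Maximise `fitness` over bit masks of length `n_attrs`."""
    rng = random.Random(seed)
    # Random initial positions (bit masks) and zero velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_attrs)] for _ in range(n_particles)]
    vel = [[0.0] * n_attrs for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_attrs):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid never saturates completely.
                vel[i][d] = max(-v_max, min(v_max, v))
                # The velocity sets the probability that this bit is 1.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

if __name__ == "__main__":
    best_mask, best_fit = binary_pso(toy_fitness, n_attrs=8)
    print(best_mask, best_fit)
```

The size penalty in `toy_fitness` is one simple way to fold the two comparison criteria (accuracy and subset size) into a single score; the paper may combine or report them differently.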

    • Published in

      GECCO '06: Proceedings of the 8th annual conference on Genetic and evolutionary computation
      July 2006
      2004 pages
      ISBN:1595931864
      DOI:10.1145/1143997

      Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      GECCO '06 Paper Acceptance Rate: 205 of 446 submissions, 46%
      Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
