research-article

Evolutionary biclustering of gene expressions

Published: 01 October 2006

Abstract

With the advent of microarray technology, it has become possible to measure the expression values of thousands of genes in a single experiment. Biclustering, the simultaneous clustering of both genes and conditions, is a challenging problem, particularly in the analysis of high-dimensional gene expression data in information retrieval, knowledge discovery, and data mining. The objective is to find sub-matrices, i.e., maximal subgroups of genes and subgroups of conditions in which the genes exhibit highly correlated activity over a range of conditions, while simultaneously maximizing the volume of each sub-matrix. Because these two objectives conflict, the problem is a suitable candidate for multi-objective modeling. In this study, we describe some recent literature on biclustering and present a multi-objective evolutionary biclustering framework for gene expression data, along with experimental results.
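To make the two conflicting objectives concrete, the following is a minimal sketch, not the framework described in the article. It assumes that coherence is measured by a mean squared residue score in the style of Cheng and Church (the abstract does not name the measure) and that volume is the number of cells in the sub-matrix. The function names `mean_squared_residue`, `volume`, and `dominates`, and the toy matrix, are hypothetical and chosen only for illustration.

```python
import numpy as np


def mean_squared_residue(data, rows, cols):
    """Coherence score of the sub-matrix data[rows, cols] (lower is better).

    Assumption: a Cheng-and-Church-style residue; the abstract does not
    specify which coherence measure the framework uses.
    """
    sub = data[np.ix_(rows, cols)]
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    overall_mean = sub.mean()
    residue = sub - row_means - col_means + overall_mean
    return float(np.mean(residue ** 2))


def volume(rows, cols):
    """Size of the bicluster: number of genes times number of conditions."""
    return len(rows) * len(cols)


def dominates(a, b):
    """Pareto dominance for (residue, volume) pairs.

    a dominates b if it is no worse in both objectives (lower residue,
    higher volume) and strictly better in at least one.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


# Toy expression matrix (genes x conditions); the top-left 3x3 block
# follows a perfectly additive pattern, the rest does not.
data = np.array([
    [0.0, 2.0, 4.0, 9.0],
    [1.0, 3.0, 5.0, 0.0],
    [2.0, 4.0, 6.0, 7.0],
    [5.0, 0.0, 1.0, 3.0],
])

small = ([0, 1, 2], [0, 1, 2])        # coherent but small
large = ([0, 1, 2, 3], [0, 1, 2, 3])  # large but incoherent

small_obj = (mean_squared_residue(data, *small), volume(*small))
large_obj = (mean_squared_residue(data, *large), volume(*large))
```

In this toy case neither candidate dominates the other: the small bicluster is more coherent, while the large one has greater volume. Trade-offs of this kind are what motivate treating the problem as multi-objective rather than collapsing both goals into a single score.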

Published in

Ubiquity, Volume 2006, Issue October
October 2006
49 pages
EISSN: 1530-2180
DOI: 10.1145/1183081

Copyright © 2006. Copyright is held by the owner/author(s).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States
