Abstract
Microarray technology has made it possible to measure the expression values of thousands of genes in a single experiment. Biclustering, the simultaneous clustering of both genes and conditions, is challenging, particularly for the analysis of high-dimensional gene expression data in information retrieval, knowledge discovery, and data mining. The objective is to find sub-matrices, i.e., maximal subgroups of genes and subgroups of conditions in which the genes exhibit highly correlated activities over a range of conditions, while simultaneously maximizing the volume of each sub-matrix. Since these two objectives conflict with each other, the problem is a suitable candidate for multi-objective modeling. In this study, we survey recent literature on biclustering and describe a multi-objective evolutionary biclustering framework for gene expression data, along with experimental results.
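The two conflicting objectives can be illustrated with a minimal Python sketch. The abstract does not name a homogeneity measure, so the sketch assumes the mean squared residue of Cheng and Church as the coherence score (lower is better) and takes volume to mean the number of cells in the sub-matrix (higher is better). The function names `mean_squared_residue`, `volume`, and `dominates` are hypothetical and chosen only for illustration; this is not the framework described in the study.

```python
import numpy as np

def mean_squared_residue(data, rows, cols):
    """Coherence score of the bicluster data[rows, cols] (assumed measure).

    A value of 0 means every gene follows the same additive pattern
    across the selected conditions.
    """
    sub = data[np.ix_(rows, cols)]
    row_means = sub.mean(axis=1, keepdims=True)
    col_means = sub.mean(axis=0, keepdims=True)
    overall_mean = sub.mean()
    residue = sub - row_means - col_means + overall_mean
    return float(np.mean(residue ** 2))

def volume(rows, cols):
    """Number of cells in the bicluster (genes times conditions)."""
    return len(rows) * len(cols)

def dominates(a, b):
    """Pareto dominance for (residue, volume) pairs.

    Residue is minimized and volume is maximized: a dominates b if it is
    no worse in both objectives and strictly better in at least one.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

if __name__ == "__main__":
    # Toy expression matrix: 4 genes x 3 conditions with an additive pattern,
    # then one perturbed entry in the last gene.
    data = np.add.outer([0.0, 1.0, 2.0, 5.0], [0.0, 10.0, 20.0])
    data[3, 2] += 4.0

    small = ([0, 1, 2], [0, 1, 2])      # coherent but smaller
    large = ([0, 1, 2, 3], [0, 1, 2])   # larger but less coherent
    for rows, cols in (small, large):
        print(mean_squared_residue(data, rows, cols), volume(rows, cols))
```

In the toy matrix the smaller bicluster is perfectly coherent while the larger one covers more cells at the cost of a higher residue, so neither dominates the other. That trade-off is what motivates treating the problem as multi-objective rather than collapsing both goals into a single score.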
Index Terms
- Evolutionary biclustering of gene expressions