
2011 | OriginalPaper | Chapter

17. Hidden Markov Random Field Models for Network-Based Analysis of Genomic Data

Author: Hongzhe Li

Published in: Handbook of Statistical Bioinformatics

Publisher: Springer Berlin Heidelberg


Abstract

Graphs and networks are common ways of depicting biological information. Many biological processes are represented by graphs, such as regulatory networks, metabolic pathways and protein-protein interaction networks. This kind of a priori information encoded in graphs is a useful supplement to standard numerical data such as microarray gene expression data and single nucleotide polymorphism (SNP) data. How to incorporate such prior network information into the analysis of numerical data raises interesting statistical problems. Representing genetic networks as undirected graphs, we have developed several approaches for identifying differentially expressed genes, and genes or SNPs associated with diseases, in a unified framework of hidden Markov random field (HMRF) models. Unlike the traditional empirical Bayes approaches to the analysis of gene expression data, the HMRF-based models account for the prior dependency among the genes on the network and therefore use the prior network information effectively in identifying the subnetworks of genes that are perturbed by experimental conditions. In this paper, we briefly review the basic setup of the HMRF models and the emission probability functions for some problems often encountered in the analysis of microarray gene expression and SNP data. We also present some interesting areas that require further research.
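To make the idea of a network-dependent prior concrete, the following is a minimal sketch in Python. It is not the chapter's model: it assumes a two-state auto-logistic (Ising-type) prior on an undirected graph, with hypothetical parameters gamma (baseline log-odds) and beta (neighbor coupling), takes the emission densities f0 and f1 of each gene's data as given toy numbers, and uses iterated conditional modes (ICM), a standard technique for Markov random fields, to update the hidden states.

```python
import math

def conditional_prob_de(gene, states, neighbors, f0, f1, gamma=-0.5, beta=1.0):
    """P(state of gene = 1 | neighbor states, data) under an assumed
    auto-logistic prior; gamma and beta are illustrative parameters."""
    n1 = sum(states[j] for j in neighbors[gene])   # neighbors in state 1
    n0 = len(neighbors[gene]) - n1                 # neighbors in state 0
    # Log-odds = prior field from the network + emission log-likelihood ratio.
    logit = gamma + beta * (n1 - n0) + math.log(f1[gene]) - math.log(f0[gene])
    return 1.0 / (1.0 + math.exp(-logit))

def icm(neighbors, f0, f1, gamma=-0.5, beta=1.0, max_sweeps=20):
    """Iterated conditional modes: update each hidden state to its most
    probable value given its neighbors, until no state changes."""
    # Initialise from the emission densities alone (ignoring the network).
    states = {g: int(f1[g] > f0[g]) for g in neighbors}
    for _ in range(max_sweeps):
        changed = False
        for g in neighbors:
            p = conditional_prob_de(g, states, neighbors, f0, f1, gamma, beta)
            new_state = int(p > 0.5)
            if new_state != states[g]:
                states[g] = new_state
                changed = True
        if not changed:
            break
    return states

# Toy example (all values made up): B and E carry the same weak evidence,
# but B sits between two strongly supported genes while E is isolated.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "E": []}
f0 = {"A": 0.01, "B": 0.30, "C": 0.01, "E": 0.30}  # density under "not perturbed"
f1 = {"A": 0.40, "B": 0.25, "C": 0.40, "E": 0.25}  # density under "perturbed"

states = icm(neighbors, f0, f1)
print(states)
```

In this toy setting, gene B is labeled as perturbed because both of its neighbors are, whereas gene E, with identical data but no neighbors, is not. This is the sense in which a network prior lets neighboring genes borrow strength from one another; a gene-by-gene empirical Bayes rule would treat B and E identically.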


Metadata
Title
Hidden Markov Random Field Models for Network-Based Analysis of Genomic Data
Author
Hongzhe Li
Copyright Year
2011
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-16345-6_17
