Using Semantic Similarities and csbl.go for Analyzing Microarray Data

  • Protocol
Microarray Data Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1375))

Abstract

Cellular phenotypes result from the combined effect of multiple genes, and high-throughput techniques such as DNA microarrays and deep sequencing allow monitoring this genomic complexity. The large scale of the resulting data, however, creates challenges for interpreting results, as primary analysis often yields hundreds of genes. Gene Ontology (GO), a controlled vocabulary for gene products, enables semantic analysis of such gene sets. GO can be used to define semantic similarity between genes, which enables semantic clustering to reduce the complexity of a result set. Here, we describe how to compute semantic similarities and perform GO-based gene clustering using csbl.go, an R package for GO semantic similarity. We demonstrate the approach with expression profiles from breast cancer.

Acknowledgements

I thank Tiia Pelkonen for proofreading.

Author information

Corresponding author

Correspondence to Kristian Ovaska.

Copyright information

© 2015 Springer Science+Business Media New York

About this protocol

Cite this protocol

Ovaska, K. (2015). Using Semantic Similarities and csbl.go for Analyzing Microarray Data. In: Guzzi, P. (eds) Microarray Data Analysis. Methods in Molecular Biology, vol 1375. Humana Press, New York, NY. https://doi.org/10.1007/7651_2015_241

  • DOI: https://doi.org/10.1007/7651_2015_241

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3172-9

  • Online ISBN: 978-1-4939-3173-6

  • eBook Packages: Springer Protocols
