Abstract
With the advent of high-throughput systems for experimentally determining the three-dimensional (3-D) structure of proteins, molecular biologists urgently need systems that automatically store, maintain and explore the vast structural databases being created. We have designed and implemented the Capri/MR system, which makes it possible to identify families of protein structures contained in such very large 3-D protein structure databases. Our system automatically indexes and searches a database of proteins by three-dimensional shape and by structural and/or physicochemical properties. For each of these protein structure representations, we create a compact, rotation- and translation-invariant index (or signature), which is stored in a database for future querying. A similarity search algorithm performs an exhaustive search against the entire database, taking advantage of the compact signatures to rapidly find protein structures that are similar in 3-D shape and/or two-dimensional (2-D) properties. As a result, queries in Capri/MR run within a fraction of a second, and we are able to group protein structures into the correct families with very high precision and recall. In addition, our system dynamically processes new protein structures as they become available. We demonstrate the power of Capri/MR against the Protein Data Bank, which contains all known, experimentally determined 3-D protein structures (48,000 as of January 2008). The main applications of Capri/MR lie in structural proteomics, protein evolution and mutation, and drug design, in particular for studying the docking problem and the computer-aided design of non-toxic drugs.
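The abstract does not say how the signatures are computed, so the following is only an illustrative sketch of the general idea, not the Capri/MR method. It swaps in a deliberately simple descriptor: a histogram of atom distances from the centroid, which is unchanged by rotation and translation. All names (`signature`, `search`, `rotate_z_translate`), the bin count, the `max_radius` value and the toy coordinates are assumptions made for this example.

```python
import math

def signature(coords, bins=8, max_radius=8.0):
    """Histogram of distances from the centroid (hypothetical descriptor).

    Distances to the centroid do not change under rotation or translation,
    so the histogram is invariant to both. Distances beyond max_radius
    fall into the last bin.
    """
    n = len(coords)
    centroid = tuple(sum(p[k] for p in coords) / n for k in range(3))
    hist = [0.0] * bins
    for p in coords:
        r = math.dist(p, centroid)
        idx = min(int(r / max_radius * bins), bins - 1)
        hist[idx] += 1.0 / n
    return hist

def l1(a, b):
    """L1 distance between two signatures."""
    return sum(abs(x - y) for x, y in zip(a, b))

def search(index, query_sig, k=3):
    """Exhaustive scan: compare the query with every stored signature."""
    ranked = sorted(index.items(), key=lambda item: l1(item[1], query_sig))
    return [(name, l1(sig, query_sig)) for name, sig in ranked[:k]]

def rotate_z_translate(coords, angle, shift):
    """Rotate about the z axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]

# Toy "structures" (made-up coordinates, not real protein data).
compact = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3), (4, 4, 4)]
elongated = [(0.0, 0, 0), (2.7, 0, 0), (5.4, 0, 0), (8.1, 0, 0), (10.8, 0, 0)]

# Build the index once; each entry is a short, fixed-length vector.
index = {"compact": signature(compact), "elongated": signature(elongated)}

# Query with a rotated and shifted copy of the compact structure.
query = rotate_z_translate(compact, 0.7, (10.0, -5.0, 2.0))
results = search(index, signature(query), k=1)
print(results)
```

Because each signature is a short fixed-length vector, scanning every entry stays cheap even for a large database, which is the property the abstract relies on for fast exhaustive search. A hard-binned histogram like this one can shift by one bin when a distance sits exactly on a bin edge, so a practical descriptor would need more care than this sketch takes.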
Index Terms
- Capri/MR: exploring protein databases from a structural and physicochemical point of view