ABSTRACT

We describe here an energy based computer software suite for narrowing down the search space of tertiary structures of small globular proteins. The protocol comprises eight different computational modules that form an automated pipeline. It combines physics based potentials with biophysical filters to arrive at 10 plausible candidate structures starting from sequence and secondary structure information. The methodology has been validated here on 50 small globular proteins consisting of 2–3 helices and strands with known tertiary structures. For each of these proteins, a structure within 3–6 Å RMSD (root mean square deviation) of the native has been obtained in the 10 lowest energy structures. The protocol has been web enabled and is accessible at http://www.scfbio-iitd.res.in/bhageerath.

INTRODUCTION

The tertiary structure prediction of a protein using amino acid sequence information alone is one of the fundamental unsolved problems in computational biology/molecular biophysics (1). The spontaneous folding of protein molecules, which have a large number of degrees of freedom, into a unique three-dimensional (3-D) structure is of intrinsic scientific interest and is also important for its application in structure based drug design. The cost and time involved in experimental techniques call for an early in silico solution to the protein folding problem (2). The ultimate goal is to use computer algorithms to identify amino acid sequences that not only adopt particular 3-D structures but also perform specific functions, i.e. to propose designer proteins (3).

Contemporary approaches for protein structure prediction can be broadly classified under two categories, viz. (i) comparative modeling, which includes homology modeling and threading (4–7), and (ii) de novo folding (8–12). The first category of methods utilizes the structures of already solved proteins as templates (either locally or globally, at the sequence level or at the sub-structure level). With large amounts of genome and proteome data accumulating via sequencing projects, comparative modeling has become the method of choice to characterize sequences where related representatives of a family exist in structural databases (13–18). There are several web servers based on comparative modeling approaches, such as Swiss Model (4), CPHmodels (19), FAMS (20) and ModWeb (21). The assessors for comparative modeling at CASP6 (Critical Assessment of protein Structure Prediction methods) have noted small improvements in model quality despite the increase in available structures, but only marginal improvement in alignment accuracy, when compared to CASP5 (22). A natural limit for these approaches is the quantity of information available in the structural databases. This highlights the importance of de novo techniques for protein folding.

Significant progress has been made in recent years towards physics-based computation of protein structure from a knowledge of the amino acid sequence. This approach, commonly referred to as an ab initio method (23–25), is based on the thermodynamic hypothesis formulated by Anfinsen (1973), according to which the native structure of a protein corresponds to the global minimum of its free energy under given conditions (26). Protein structure prediction using an ab initio method is accomplished by a search for a conformation corresponding to the global minimum of an appropriate potential energy function without the use of secondary structure prediction, homology modeling, threading etc. (27). In contrast, methods characterized as de novo use the ab initio strategies in part, along with database information, directly or indirectly. Table 1 summarizes different known web servers/groups for protein structure prediction and the function(s) therein. Tertiary structure prediction starting from sequence has been successfully demonstrated on protein sequences <85 residues in length by Baker's group (28,29) using a fragment assembly methodology. The ProtInfo web server by Samudrala et al. (30) predicts protein tertiary structure for sequences <100 amino acids using a de novo methodology, whereby structures are generated using a simulated annealing search phase that minimizes a target scoring function. The Scratch web server by Baldi et al. (31) predicts the protein tertiary structure as well as structural features starting from the sequence information alone. Astro-fold (32), an ab initio structure prediction framework by Klepeis and Floudas, employs local interactions and hydrophobicity for the identification of helices and beta-sheets, respectively, followed by global optimization, stochastic optimization and torsion angle dynamics. De novo structure prediction by the simfold energy function with multi-canonical ensemble fragment assembly has been developed by Fujitsuka et al. (33). The function has been tested on 38 proteins along with the fragment assembly simulations and predicts structures within 6.5 Å RMSD (root mean square deviation) of the native in 12 of the cases. Arriving at structures between 3 and 6 Å RMSD of the native expeditiously using ab initio or de novo methodologies remains a formidable challenge.

Table 1

Some de novo/ab initio servers for protein folding

| Sl. No. | Name of the Web Server/Group | Description |
|---|---|---|
| 1 | ROBETTA (28,29) (http://robetta.bakerlab.org) | De novo automated structure prediction analysis tool used to infer protein structural information from protein sequence data |
| 2 | PROTINFO (30) (http://protinfo.compbio.washington.edu) | De novo protein structure prediction web server utilizing simulated annealing for generation and different scoring functions for selection of final five conformers |
| 3 | SCRATCH (31) (http://www.igb.uci.edu/servers/psss.html) | Protein structure and structural features prediction server which utilizes recursive neural networks, evolutionary information, fragment libraries and energy |
| 4 | ASTRO-FOLD (32) | Astro-fold: first principles tertiary structure prediction based on overall deterministic framework coupled with mixed integer optimization |
| 5 | ROKKY (33) (http://www.proteinsilico.org/rokky/rokky-p/) | De novo structure prediction by the simfold energy function with the multi-canonical ensemble fragment assembly |
| 6 | BHAGEERATH (www.scfbio-iitd.res.in/bhageerath) | Energy based methodology for narrowing down the search space of small globular proteins |

We have developed a computationally viable de novo strategy for tertiary structure prediction, processing and evaluation. The web server, christened Bhageerath, takes as input the amino acid sequence and secondary structure information for a query protein and returns 10 candidate structures for the native. In this article, we report the validation and testing of the protein structure prediction web suite Bhageerath with application to 50 small globular proteins. The programs are written in standard C++, comprise roughly 8000 lines of code in total and are easily portable to any POSIX (UNIX, LINUX, IRIX and AIX) compliant system.

MATERIALS AND METHODS

The Bhageerath (www.scfbio-iitd.res.in/bhageerath) software suite for protein tertiary structure prediction narrows down the search space to generate probable candidate structures for the native. The flow chart of Bhageerath is depicted in Figure 1.

Figure 1

The flow of information in Bhageerath web server, starting with the input from the user to the final 10 predictions made available to the user.

The first module involves the formation of a 3-D structure from the amino acid sequence with the secondary structural elements in place. The second module generates a large number of trial structures by systematically sampling the conformational space of the loop dihedrals. The number of trial structures generated is 128^(n−1), where n is the number of secondary structural elements. These structures are generated by choosing seven dihedrals from each loop (three at each end and one from the middle of the loop) and sampling two conformations for each dihedral, which gives 2^7 = 128 combinations for each of the n−1 loops connecting the elements. The values assigned to the dihedrals Φ and Ψ of each amino acid during structure generation are given in the supplementary information (Supplementary Table S1). The trial structures generated via dihedral sampling are screened in the third module through persistence length and radius of gyration filters (34), developed for the purpose of reducing the number of improbable candidates. The resultant structures are refined in the fourth module by Monte Carlo sampling in dihedral space to remove steric clashes and overlaps involving atoms of the main chain and side chains. In module five, the structures are energy minimized to further optimize the side chains. The energy minimization is carried out in vacuum with a distance dependent dielectric for 200 steps (75 steps of steepest descent + 125 steps of conjugate gradient). Module six ranks the structures using an all atom energy based empirical scoring function (35) and selects the 100 lowest energy structures. Module seven reduces the probable candidates using the protein regularity index (ProRegIn) of the Φ and Ψ dihedral values, with threshold values of 1.5 for Φ and 4.0 for Ψ (Thukral et al., manuscript accepted in J. Biosci.). Module eight further reduces the structures selected in the previous module to 10 using a topological equivalence criterion and the accessible surface area [calculated using NACCESS (36)]. The above eight modules are configured to work as a pipeline.
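The size of the sampling in the second module can be checked with a short calculation. The sketch below is illustrative only and is not taken from the Bhageerath C++ code; it reads the trial structure count as 128 raised to the power n−1, and the function names `trial_structure_count` and `loop_states` are hypothetical. It represents each trial structure as a tuple of binary choices, one per sampled dihedral, without assigning actual Φ, Ψ values.

```python
from itertools import product

DIHEDRALS_PER_LOOP = 7          # three at each end of a loop plus one in the middle
CONFORMATIONS_PER_DIHEDRAL = 2  # two sampled values per dihedral


def trial_structure_count(n_elements):
    """Return the number of trial structures for n secondary structural elements."""
    if n_elements < 2:
        raise ValueError("at least two secondary structural elements are required")
    per_loop = CONFORMATIONS_PER_DIHEDRAL ** DIHEDRALS_PER_LOOP  # 2**7 = 128
    return per_loop ** (n_elements - 1)  # one loop between consecutive elements


def loop_states(n_elements):
    """Return an iterator over all trial structures as tuples of 0/1 choices.

    Each tuple holds 7 * (n - 1) entries; entry k picks the first (0) or
    second (1) candidate value for the k-th sampled dihedral.
    """
    n_dihedrals = DIHEDRALS_PER_LOOP * (n_elements - 1)
    return product(range(CONFORMATIONS_PER_DIHEDRAL), repeat=n_dihedrals)


print(trial_structure_count(2))  # 128
print(trial_structure_count(3))  # 16384
```

The two printed values agree with the largest structure counts reported in Table 2 for proteins with two and three secondary structural elements, respectively.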

Overview of the organization of the suite

Bhageerath is a fully automated, web enabled protein structure prediction software suite, made available through a convenient user interface, which returns 10 predictions for a given protein query sequence. Clicking on the Bhageerath server link opens a window in which the user can paste a query protein sequence in FASTA format. The current version supports continuous sequences of up to 100 amino acids. The user is then prompted for the amino acid ranges of the secondary structural elements. Upon submission, the user receives a unique job ID for the submitted sequence. The user has the option to provide an email ID to receive an output link to the 10 lowest energy candidate structures.
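The input requirements described above (a FASTA sequence of at most 100 residues plus residue ranges for the secondary structural elements) can be expressed as a small validation routine. This is a minimal sketch under stated assumptions: the server's actual input handling is not described in the text, and the `(kind, start, end)` tuple format, the function names and the restriction to the 20 standard residues are all hypothetical.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids
MAX_LENGTH = 100  # sequence limit stated for the current version


def parse_fasta(text):
    """Return the upper-case sequence from a single-record FASTA string."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if lines and lines[0].startswith(">"):
        lines = lines[1:]  # drop the header line
    return "".join(lines).upper()


def validate_query(fasta_text, elements):
    """Validate a query and return its sequence.

    `elements` is a list of (kind, start, end) tuples, where kind is "H"
    (helix) or "E" (strand) and start/end are 1-based inclusive positions.
    This tuple format is an assumption made for illustration.
    """
    sequence = parse_fasta(fasta_text)
    if not sequence:
        raise ValueError("empty sequence")
    if len(sequence) > MAX_LENGTH:
        raise ValueError(f"sequence has {len(sequence)} residues; limit is {MAX_LENGTH}")
    unknown = set(sequence) - VALID_RESIDUES
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    previous_end = 0
    for kind, start, end in elements:
        if kind not in ("H", "E"):
            raise ValueError(f"unknown element type: {kind!r}")
        # Ranges must be in sequence order, non-overlapping and within bounds.
        if not previous_end < start <= end <= len(sequence):
            raise ValueError(f"range {start}-{end} is out of order or out of bounds")
        previous_end = end
    return sequence
```

A call such as `validate_query(">query\nACDEFGHIKLMNPQRSTVWY", [("H", 2, 8), ("E", 12, 18)])` returns the cleaned sequence, while an out-of-bounds range raises `ValueError`.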

RESULTS

We present here a performance appraisal of the protein tertiary structure prediction software suite on 50 globular proteins with known structures. All the proteins have been extracted from the Protein Data Bank (PDB) (37) and are functionally diverse. We first extracted ∼8000 unique proteins from the PDB at 50% sequence similarity or less. From these ∼8000 unique proteins, we obtained 329 proteins satisfying the criteria that the number of residues is <100 and the number of secondary structural elements is two or three. We selected our test set of 50 proteins randomly from these 329 proteins. The length of the polypeptide chain varies from 17 to 70 residues and the total number of helices and strands is either two or three.
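The two selection criteria and the random draw can be summarized in a few lines. This is a sketch of the stated procedure, not the script used by the authors; the record fields `n_residues` and `n_elements`, the function names and the fixed seed are assumptions introduced for illustration.

```python
import random


def is_eligible(protein):
    """Apply the two stated criteria: <100 residues and two or three elements."""
    return protein["n_residues"] < 100 and 2 <= protein["n_elements"] <= 3


def pick_test_set(proteins, size=50, seed=0):
    """Randomly draw `size` eligible proteins without replacement."""
    pool = [p for p in proteins if is_eligible(p)]
    if len(pool) < size:
        raise ValueError(f"only {len(pool)} eligible proteins, need {size}")
    # A seeded generator keeps the draw reproducible (the seed is arbitrary).
    return random.Random(seed).sample(pool, size)
```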

The results obtained for the 50 globular proteins with the web server are shown in Table 2. The table gives the PDB ID, the number of amino acids in the sequence, and the number and type of secondary structural elements present in each protein in columns (i)–(iii). The number of structures obtained after the persistence length and radius of gyration filters is given in column (iv). The lowest RMSD obtained in the 100 structures and its energy rank are provided in the next two columns, (v) and (vi). This is followed by the number of structures selected by the ProRegIn filter in column (vii); the number in parentheses in column (vii) indicates how many of the selected structures have an RMSD <6 Å. The lowest RMSD and the corresponding energy rank after selection with the ProRegIn filter are reported in columns (viii) and (ix). The number of structures selected after the topology filter is reported in column (x), and the number in parentheses indicates how many of the final 10 structures have an RMSD <6 Å. The last two columns of Table 2 [columns (xi) and (xii)] show the lowest RMSD with respect to the native obtained from amongst the 10 predicted structures, along with the energy rank of that structure. For all 50 test proteins, irrespective of the nature of the secondary structural elements and the length of the intervening loops, a few topologically correct structures within an RMSD of 3–6 Å from the native structure are obtained in the final 10 predicted structures. Thus, the ‘needle in a haystack’ problem can be reduced to finding a solution in the best 10 structures, at least for small proteins.
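The quantities reported per stage (number of structures kept, number below 6 Å, lowest RMSD and its energy rank) can be derived from a list of scored candidates as sketched below. This is a simplified illustration: it keeps the lowest energy structures only, whereas the actual pipeline also applies the ProRegIn, topology and accessible surface area filters. The `(energy, rmsd)` tuple layout and the function name `summarize` are hypothetical.

```python
def summarize(candidates, keep=10, cutoff=6.0):
    """Summarize the `keep` lowest-energy candidates.

    `candidates` is a list of (energy, rmsd_to_native) tuples. Returns the
    number of structures kept, how many lie below `cutoff` (in Å), the lowest
    RMSD among them and the energy rank (1 = lowest energy) of that structure.
    """
    if not candidates:
        raise ValueError("no candidate structures given")
    ranked = sorted(candidates, key=lambda c: c[0])[:keep]
    # Pair each kept structure with its 1-based energy rank, then pick the
    # one with the smallest RMSD.
    best_rank, best = min(enumerate(ranked, start=1), key=lambda item: item[1][1])
    below = sum(1 for _, rmsd in ranked if rmsd < cutoff)
    return {
        "selected": len(ranked),
        "below_cutoff": below,
        "lowest_rmsd": best[1],
        "energy_rank": best_rank,
    }
```

Calling the function with `keep=100` and then `keep=10` on the same candidate list mirrors the way the table reports the best structure among the 100 lowest energy structures and among the final 10.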

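Table 2

A performance appraisal of Bhageerath web server for 50 small globular proteins

| Sl. No. | PDB ID (i) | Number of amino acids (ii) | Number of secondary structure elements (iii) | Number of structures accepted after persistence length and radius of gyration filters (iv) | Lowest RMSD in the final 100 structures (Å) (v) | Energy rank of the lowest RMSD structure in 100 structures (vi) | After ProRegIn filter: number of structures selected (number of structures <6 Å) (vii) | After ProRegIn filter: lowest RMSD (Å) (viii) | After ProRegIn filter: energy rank of the lowest RMSD structure in 100 structures (ix) | After topology and accessible surface area filter: number of structures selected (number of structures <6 Å) (x) | After topology and accessible surface area filter: lowest RMSD (Å) (xi) | After topology and accessible surface area filter: energy rank of the lowest RMSD structure in 10 structures (xii) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1E0Q | 17 | 2E | 128 | 2.5 | 2 | 100 (29) | 2.5 | 2 | 10 (10) | 2.5 | 2 |
| 2 | 1B03 | 18 | 2E | 64 | 4.4 | 2 | 64 (5) | 4.4 | 2 | 10 (5) | 4.4 | 2 |
| 3 | 1WQC | 26 | 2H | 128 | 2.5 | 6 | 100 (53) | 2.5 | 6 | 10 (10) | 2.5 | 3 |
| 4 | 1RJU | 36 | 2H | 64 | 4.6 | 48 | 64 (3) | 4.6 | 48 | 10 (2) | 5.9 | 6 |
| 5 | 1EDM | 39 | 2E | 128 | 2.9 | 100 | 100 (59) | 2.9 | 100 | 10 (10) | 3.5 | 2 |
| 6 | 1AB1 | 46 | 2H | 128 | 2.4 | 10 | 100 (82) | 2.4 | 10 | 10 (10) | 2.9 | 6 |
| 7 | 1BX7 | 51 | 2E | 128 | 2.2 | 71 | 100 (85) | 2.2 | 71 | 10 (10) | 3.1 | 8 |
| 8 | 1B6Q | 56 | 2H | 128 | 3.1 | 27 | 100 (8) | 3.1 | 27 | 10 (5) | 3.1 | 10 |
| 9 | 1ROP | 56 | 2H | 128 | 4.3 | 2 | 100 (6) | 4.3 | 2 | 10 (2) | 4.3 | 2 |
| 10 | 1NKD | 59 | 2H | 128 | 3.8 | 8 | 100 (4) | 3.8 | 8 | 10 (4) | 3.8 | 6 |
| 11 | 1RPO | 61 | 2H | 128 | 3.8 | 2 | 100 (6) | 3.8 | 2 | 10 (4) | 3.8 | 2 |
| 12 | 1QR8 | 68 | 2H | 128 | 4.4 | 80 | 100 (3) | 4.4 | 80 | 10 (2) | 4.4 | 10 |
| 13 | 1FME | 28 | 1H,2E | 15 592 | 2.9 | 52 | 100 (90) | 2.9 | 52 | 10 (8) | 3.7 | 5 |
| 14 | 1ACW | 29 | 1H,2E | 15 726 | 3.9 | 97 | 100 (45) | 3.9 | 97 | 10 (5) | 5.1 | 8 |
| 15 | 1DFN | 30 | 3E | 13 174 | 4.4 | 77 | 98 (11) | 4.4 | 77 | 10 (4) | 5.0 | 1 |
| 16 | 1Q2K | 31 | 1H,2E | 16 020 | 4.2 | 46 | 100 (20) | 4.2 | 46 | 10 (4) | 4.2 | 9 |
| 17 | 1SCY | 31 | 1H,2E | 15 423 | 3.1 | 10 | 100 (40) | 3.1 | 10 | 10 (4) | 3.1 | 5 |
| 18 | 1XRX | 34 | 1E,2H | 14 630 | 3.9 | 28 | 100 (19) | 3.9 | 28 | 10 (1) | 5.6 | 1 |
| 19 | 1ROO | 35 | 3H | 1071 | 2.5 | 14 | 100 (100) | 2.5 | 14 | 10 (10) | 2.8 | 5 |
| 20 | 1YRF | 35 | 3H | 15 180 | 3.8 | 16 | 100 (62) | 3.8 | 16 | 10 (9) | 4.8 | 4 |
| 21 | 1YRI | 35 | 3H | 15 180 | 2.8 | 81 | 100 (70) | 2.8 | 81 | 10 (8) | 3.8 | 6 |
| 22 | 1VII | 36 | 3H | 16 380 | 3.7 | 7 | 100 (50) | 3.7 | 7 | 10 (6) | 3.7 | 2 |
| 23 | 1BGK | 37 | 3H | 14 139 | 3.8 | 33 | 100 (56) | 3.8 | 33 | 10 (8) | 4.1 | 3 |
| 24 | 1BHI | 38 | 1H,2E | 14 923 | 5.3 | 2 | 100 (5) | 5.3 | 2 | 10 (2) | 5.3 | 2 |
| 25 | 1OVX | 38 | 1H,2E | 12 074 | 3.2 | 8 | 100 (76) | 3.2 | 8 | 10 (5) | 4.0 | 1 |
| 26 | 1I6C | 39 | 3E | 2927 | 4.1 | 31 | 100 (32) | 4.1 | 31 | 10 (3) | 5.1 | 2 |
| 27 | 2ERL | 40 | 3H | 16 268 | 3.1 | 18 | 100 (32) | 3.1 | 18 | 10 (2) | 3.2 | 6 |
| 28 | 1RES | 43 | 3H | 16 135 | 4.0 | 30 | 100 (40) | 4.0 | 30 | 10 (7) | 4.2 | 2 |
| 29 | 2CPG | 43 | 1E,2H | 10 905 | 3.6 | 20 | 100 (18) | 3.6 | 20 | 10 (1) | 5.3 | 2 |
| 30 | 1DV0 | 45 | 3H | 14 488 | 4.0 | 20 | 100 (21) | 4.0 | 20 | 10 (1) | 5.1 | 4 |
| 31 | 1IRQ | 48 | 1E,2H | 11 592 | 3.5 | 74 | 100 (18) | 3.5 | 74 | 10 (1) | 5.3 | 9 |
| 32 | 1GUU | 50 | 3H | 13 410 | 4.5 | 74 | 100 (42) | 4.5 | 74 | 10 (7) | 4.6 | 6 |
| 33 | 1GV5 | 52 | 3H | 11 109 | 3.5 | 33 | 99 (24) | 3.5 | 33 | 10 (5) | 4.1 | 2 |
| 34 | 1GVD | 52 | 3H | 10 626 | 3.8 | 18 | 100 (35) | 3.8 | 18 | 10 (6) | 4.9 | 9 |
| 35 | 1MBH | 52 | 3H | 10 632 | 3.8 | 48 | 100 (24) | 3.8 | 48 | 10 (5) | 4.0 | 4 |
| 36 | 1GAB | 53 | 3H | 14 495 | 3.6 | 16 | 100 (12) | 3.6 | 16 | 10 (3) | 3.6 | 6 |
| 37 | 1MOF | 53 | 3H | 16 384 | 2.4 | 57 | 100 (96) | 2.4 | 57 | 10 (10) | 2.9 | 5 |
| 38 | 1ENH | 54 | 3H | 13 622 | 3.2 | 12 | 100 (23) | 3.2 | 12 | 10 (3) | 4.6 | 3 |
| 39 | 1IDY | 54 | 3H | 11 133 | 3.3 | 84 | 100 (52) | 3.3 | 84 | 10 (8) | 3.5 | 6 |
| 40 | 1PRV | 56 | 3H | 5468 | 4.4 | 55 | 99 (25) | 4.4 | 55 | 10 (7) | 4.9 | 9 |
| 41 | 1HDD | 57 | 3H | 12 849 | 3.2 | 74 | 100 (22) | 3.2 | 74 | 10 (2) | 4.8 | 8 |
| 42 | 1BDC | 60 | 3H | 11 255 | 4.2 | 44 | 100 (19) | 4.2 | 44 | 10 (2) | 4.8 | 5 |
| 43 | 1I5X | 61 | 3H | 16 384 | 2.6 | 29 | 99 (54) | 2.6 | 29 | 10 (10) | 2.6 | 6 |
| 44 | 1I5Y | 61 | 3H | 16 384 | 2.6 | 20 | 100 (48) | 2.6 | 20 | 10 (10) | 2.6 | 7 |
| 45 | 1KU3 | 61 | 3H | 5701 | 4.9 | 68 | 100 (14) | 4.9 | 68 | 10 (3) | 5.5 | 4 |
| 46 | 1YIB | 61 | 3H | 16 384 | 2.9 | 7 | 100 (75) | 2.9 | 7 | 10 (9) | 3.5 | 5 |
| 47 | 1AHO | 64 | 1H,2E | 2429 | 4.7 | 58 | 100 (15) | 4.7 | 58 | 10 (1) | 6.0 | 6 |
| 48 | 1DF5 | 68 | 3H | 16 384 | 3.1 | 10 | 100 (41) | 3.1 | 10 | 10 (6) | 3.1 | 8 |
| 49 | 1QR9 | 68 | 3H | 16 384 | 2.9 | 49 | 100 (33) | 2.9 | 49 | 10 (9) | 3.8 | 2 |
| 50 | 1AIL | 70 | 3H | 16 384 | 4.2 | 42 | 100 (5) | 4.2 | 42 | 10 (3) | 4.2 | 7 |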

Figure 2 shows a superimposition of the lowest RMSD structure on the respective native structure for each of the 50 globular test proteins.

Figure 2

The superimposed lowest RMSD structures for the 50 small globular test proteins used for the validation of the Bhageerath web server. The PDB IDs are shown underneath each structure. The predicted structure is shown in red and the native in blue.
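For reference, the RMSD between a predicted and a native structure is the square root of the mean squared distance between equivalent atoms. The sketch below computes it for two coordinate sets that are assumed to be already optimally superimposed and listed in the same atom order; it performs no superposition itself, and the text does not state which atoms were used for the reported values. The function name `rmsd` and the tuple-based coordinate format are illustrative choices.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (in the input units, e.g. Å) between two
    equally long lists of (x, y, z) tuples.

    Assumes the two structures are already superimposed and that atoms are
    paired by position in the lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```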

The structures obtained with the protein structure prediction web server presented here were compared with those from six freely available homology modeling servers: CPHmodels (19), Swiss Model (4), EsyPred3D (38), ModWeb (21), Geno3D (39) and 3Djigsaw (40). While SwissModel, EsyPred3D, Geno3D and 3Djigsaw provide an option for template selection, the other two servers are automatic. For the 50 test proteins, we first carried out sequence alignment using PSI-BLAST (41), and the templates were selected such that the sequence similarity of the template is >30% and the template is not from the same family. For most of the proteins there was very little sequence similarity with proteins of other families, and the templates were restricted to the same family. In such cases the quality of the model built is quite high and the RMSD with respect to the native is <1 Å in a few cases. The proteins for which the templates are selected from different families result in RMSDs comparable to those obtained with the Bhageerath web server. Table 3 shows the RMSD of the structures obtained by homology modeling from the respective web servers for all the 50 globular proteins. The template ID, percentage sequence similarity and alignment of the target-template sequence for each method and each structure therein are provided in the supplementary information (Supplementary Tables S2–S7). Thus, for new sequences with no known sequence homologues, the Bhageerath web server has the potential to predict a structure to within 3–6 Å RMSD of the native structure, with accuracies comparable to the homology modeling servers.

Table 3

A comparison of protein tertiary structure prediction accuracies with different homology modeling servers available in public domain

Sl. No.PDB IDCPHModels (19) RMSD (Å)SwissModel (4) RMSD (Å)EsyPred3D (38) RMSD (Å)ModWeb (21) RMSD (Å)Geno3D (39) RMSD (Å)3DJigSaw (40) RMSD (Å)Bhageerath RMSD (Å)
11E0Q(1–17)1.7 (1–17)1.5 (1–16)2.5
21B03(1–18)3.5 (2–18)4.4
31WQC(1–26)0.5 (1–26)0.4 (1–26)2.5
41RJU(1–36)2.0 (1–36)1.7 (1–36)2.1 (1–36)5.9
51EDM(1–39)1.5 (1–39)1.4 (1–39)0.8 (2–38)0.5 (1–39)1.8 (1–39)3.5
61AB1(1–46)0.6 (1–46)2.8 (1–46)0.4 (1–46)0.4 (1–46)0.7 (1–46) 0.7 (1–46)2.9
71BX7(1–51)0.6 (1–51)0.8 (1–51)2.2 (3–50)0.6 (1–51)2.6 (4–51)2.2 (3–50)3.1
81B6Q(1–56)4.7 (1–56)5.0 (1–56)2.7 (3–56)5.1 (1–56)0.7 (1–56) 1.4 (1–56)4.8 (1–56)3.1
91ROP(1–56)1.3 (1–56)0.6 (1–56)4.7 (3–56)0.7 (1–56)0.7 (1–56) 0.8 (1–56)1.3 (1–56)4.3
101NKD(1–59)0.5 (1–59)7.7 (1–50)0.6 (1–59)1.9 (1–59) 0.7 (1–59)1.3 (1–59) 1.2 (1–59)0.4 (1–59)3.8
111RPO(1–61)0.5 (1–61)7.7 (1–50)0.5 (1–59)1.9 (1–61) 0.7 (1–59)0.8 (1–61) 0.9 (1–61)0.4 (1–61)3.8
121QR8(1–68)0.5 (1–68)0.5 (1–68)1.1 (2–66)1.6 (1–68)0.9 (1–68) 1.2 (1–68)0.5 (1–68)4.4
131FME(1–28)0.7 (1–28)0.9 (1–28)3.7
141ACW(1–29)0.7 (1–29)0.4 (1–29)5.1
151DFN(1–30)0.8 (1–30)0.4 (1–30)1.3 (2–30)5.0
161Q2K(1–31)0.9 (1–31)0.5 (1–31)4.2
171SCY(1–31)0.6 (1–31)0.7 (1–31)3.1
181XRX(1–34)0.5 (1–34)0.3 (1–34)0.7 (1–34)3.1 (1–31)5.6
191ROO(1–35)0.8 (1–35)0.7 (1–35)2.8
201YRF(1–35)1.6 (1–35)0.5 (1–35)1.2 (1–35)1.3 (1–35)4.8
211YRI(1–35)1.7 (1–35)0.7 (1–35)1.4 (1–35)1.5 (1–35)3.8
221VII(1–36)2.4 (2–36)0.9 (1–36)2.2 (2–36)2.0 (2–36)3.7
231BGK(1–37)0.8 (1–37)0.5 (1–37)0.7 (1–37)4.1
241BHI(1–38)0.8 (1–38)0.4 (1–38)1.0 (1–38)1.1 (1–38)5.3
251OVX(1–38)0.9 (1–38)0.3 (1–38)1.0 (1–38)0.6 (1–38)0.3 (1–38)4.0
261I6C(1–39)4.2 (1–39)4.4 (1–39)4.5 (1–39)0.8 (1–39)3.1 (1–34)5.1
272ERL(1–40)1.3 (1–40)0.9 (1–40)0.4 (1–40)0.4 (1–40)1.2 (1–40) 1.1 (1–40)3.2
281RES(1–43)4.2 (1–43)4.1 (1–43)4.2 (1–43)0.8 (1–43)1.2 (1–43) 1.1 (1–43)4.2
292CPG(1–43)0.8 (1–43)0.6 (1–43)1.1 (1–43)0.9 (1–43)0.9 (1–43)0.6 (1–43)5.3
301DV0(1–45)4.2 (1–45)10.5 (1–35)2.0 (1–42)0.7 (1–45) 2.4 (1–44)0.9 (1–45) 1.1 (1–45)0.6 (1–45)5.1
311IRQ(1–48)0.6 (1–48)0.8 (1–48)1.3 (2–48)0.7 (1–48)1.2 (1–48)0.9 (1–48)5.3
321GUU(1–50)2.5 (1–50)2.6 (1–50)2.3 (38–50)5.7 (1–50) 0.5 (1–50) 2.2 (1–49)1.5 (1–48) 1.6 (1–48)1.6 (1–42)4.6
331GV5(1–52)1.4 (1–52)0.6 (1–52)1.3 (1–52)0.68 (1–52) 2.2 (3–52)2.1 (3–46) 2.0 (3–46)1.8 (3–45)4.1
341GVD(1–52)1.4 (1–52)4.2 (1–51)1.3 (1–52)5.5 (1–52)6.6 (1–44) 9.8 (1–44)6.4 (1–43)4.9
351MBH(1–52)1.8 (1–52)3.3 (1–51)1.8 (1–52)1.9 (1–52)1.6 (1–52) 2.1 (1–52)1.1 (6–45)4.0
361GAB(1–53)0.6 (1–53)1.6 (1–53)3.3 (1–53)3.3 (1–53)2.2 (1–53) 2.7 (1–53)0.5 (1–53)3.6
371MOF(1–53)0.6 (1–53)1.8 (1–53)1.9 (1–53)1.7 (1–53) 2.3 (3–51)3.4 (1–53) 3.4 (1–53)1.7 (1–53)2.9
381ENH(1–54)0.5 (1–54)0.8 (1–54)0.9 (3–53)1.7 (1–53) 1.0 (5–53) 2.5 (1–54) 3.0 (1–51)1.7 (1–54) 1.7 (1–54)0.5 (1–54)4.6
391IDY(1–54)4.0 (2–54)10.8 (1–50)3.8 (2–52)0.9 (1–54)10.0 (5–46) 10.0 (5–46)0.3 (1–54)3.5
401PRV(1–56)5.7 (2–56)2.1 (1–56)5.6 (3–56)1.6 (1–56)5.7 (2–56) 5.4 (2–56)5.6 (2–56)4.9
411HDD(1–57)13.2 (1–57)13.3 (1–57)1.2 (1–56)2.7 (1–57) 3.3 (1–51) 3.6 (1–57) 1.5 (9–55) 1.3 (1–56)2.1 (1–56) 2.7 (1–56)0.3 (1–57)4.8
421BDC(1–60)3.4 (1–60)2.7 (6–39) 3.1 (1–38)2.7 (6–39)2.7(5–59)2.1 (1–60) 1.8 (1–60)2.6 (5–37)4.8
431I5X(1–61)0.7 (1–61)1.1 (1–61)1.6 (1–61)1.8 (1–61)1.5 (1–61) 1.4 (1–61)0.9 (1–61)2.6
441I5Y(1–61)0.7 (1–61)1.1 (1–61)1.6 (1–61)1.4 (1–61)1.7 (1–61) 1.1 (1–61)0.9 (1–61)2.6
451KU3(1–61)1.3 (1–61)2.8 (4–61)1.5 (1–61)1.5 (1–61)1.9 (1–61) 1.7 (1–61)0.4 (1–61)5.5
461YIB(1–61)1.8 (2–61)1.7 (1–61)3.4 (1–61)2.8 (2–60)1.5 (1–61) 1.6 (1–61)1.9 (2–61)3.5
471AHO(1–64)0.6 (1–64)0.5 (1–64)1.3 (1–64)0.4 (1–64)1.8 (1–64)0.3 (1–64)6.0
481DF5(1–68)0.6 (1–68)1.5 (1–68)1.8 (2–66)2.2 (1–68)1.8 (1–68) 1.8 (1–68)1.6 (1–68)3.1
491QR9(1–68)0.5 (1–68)0.7 (1–68)1.4 (2–66)1.8 (1–68)1.7 (1–68) 1.8 (1–68)0.6 (1–68)3.8
501AIL(1–70)0.87 (1–70)0.73 (1–70)0.46 (1–70)0.6 (1–70)0.88 (1–70) 0.97 (1–70)0.9 (1–70)4.2

The numbers in parentheses indicate the length of the protein model obtained. Supplementary Tables S2–S7 contain the template ID, % sequence identity and alignment for each method and structure shown above.

The 10 structures obtained from Bhageerath were further compared with the five candidate structures returned by the ProtInfo web server (30) and with 10 structures obtained with the ROBETTA software (28) configured locally. The results in Table 4 indicate that the server described here predicts structures with RMSDs comparable to those obtained with the ProtInfo web server and the ROBETTA software. Supplementary Table S8 compares the GDT_TS scores, computed with the LGA server (42), for the structures obtained with Bhageerath, ProtInfo and ROBETTA. These scores are also found to be comparable across the three structure prediction methodologies.
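As an illustration of how the per-protein entries of Table 4 can be read, the following minimal Python sketch extracts the lowest RMSD among the candidate structures of each method for three of the 50 test proteins. The RMSD values are copied from Table 4; the names `table4_rows` and `best_rmsd` are hypothetical and are not part of Bhageerath, ProtInfo or ROBETTA.

```python
# RMSD values (in Å, without end loops) copied from three rows of Table 4.
# The dictionary layout and function name are illustrative assumptions.
table4_rows = {
    "1E0Q": {
        "Bhageerath": [4.5, 2.5, 3.0, 5.0, 3.4, 3.3, 3.2, 3.3, 5.9, 3.3],
        "ProtInfo": [4.0, 4.1, 3.7, 3.9, 4.2],
        "ROBETTA": [1.1],
    },
    "1MOF": {
        "Bhageerath": [5.7, 3.7, 3.9, 4.2, 2.9, 4.0, 4.9, 4.3, 4.0, 4.9],
        "ProtInfo": [12.7, 13.6, 12.5, 12.7, 13.5],
        "ROBETTA": [13.7, 11.8, 11.2, 12.6, 12.6, 12.0, 12.2, 12.9, 12.8, 11.2],
    },
    "1AHO": {
        "Bhageerath": [7.8, 7.6, 9.1, 8.7, 6.6, 6.0, 7.2, 7.7, 9.2, 7.7],
        "ProtInfo": [8.1, 6.6, 4.1, 5.2, 6.0],
        "ROBETTA": [0.6, 1.1, 0.6, 1.2, 1.0, 0.4, 0.8, 1.4, 1.2, 0.8],
    },
}


def best_rmsd(rows):
    """Return, for each protein, the lowest RMSD among each method's candidates."""
    return {
        pdb_id: {method: min(values) for method, values in methods.items()}
        for pdb_id, methods in rows.items()
    }


best = best_rmsd(table4_rows)
for pdb_id, per_method in best.items():
    print(pdb_id, per_method)
```

The three rows were chosen only to show that the lowest-RMSD method varies from protein to protein; they are not a summary of the full set of 50 proteins.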

Table 4

A comparison of protein tertiary structure prediction accuracy with ProtInfo web server and ROBETTA software available in the public domain for 50 test proteins

| Sl. No. | PDB ID | RMSD without end loops (Å) (Bhageerath) | RMSD without end loops (Å) (ProtInfo)^a (30) | RMSD without end loops (Å) (ROBETTA)^a (28) |
|---|---|---|---|---|
| 1 | 1E0Q | 4.5, 2.5, 3.0, 5.0, 3.4, 3.3, 3.2, 3.3, 5.9, 3.3 | 4.0, 4.1, 3.7, 3.9, 4.2 | 1.1^b |
| 2 | 1B03 | 10.3, 4.4, 5.9, 5.5, 6.7, 5.4, 4.5, 6.1, 6.9, 7.5 | 4.0, 4.7, 4.1, 4.5, 4.4 | 2.7, 3.0 |
| 3 | 1WQC | 4.0, 4.5, 2.5, 3.8, 2.9, 5.1, 4.2, 5.7, 3.8, 4.7 | 2.1, 1.8, 1.8, 2.0, 2.1 | 2.3, 3.4 |
| 4 | 1RJU | 6.1, 6.3, 6.6, 5.9, 6.6, 5.9, 6.6, 7.0, 6.7, 7.4 | 3.4, 4.9, 3.3, 4.8, 6.0 | 3.4, 4.0, 2.5, 3.2, 3.0, 3.6, 4.8, 2.9, 3.0, 3.1 |
| 5 | 1EDM | 3.9, 3.5, 3.8, 4.0, 3.6, 5.2, 5.4, 4.1, 3.9, 4.7 | 3.4, 4.0, 3.7, 3.3, 3.1 | 0.4, 0.5, 0.4, 0.5, 0.6, 0.4, 0.7, 0.7, 1.1, 0.4 |
| 6 | 1AB1 | 4.8, 4.5, 4.3, 5.2, 4.2, 2.9, 4.5, 3.8, 5.8, 3.3 | 3.3, 5.1, 6.3, 3.6, 4.9 | 2.2, 2.8, 2.9, 2.4, 2.9, 2.7, 3.7, 3.5, 2.2, 3.3 |
| 7 | 1BX7 | 3.3, 4.0, 5.0, 3.2, 4.5, 3.8, 4.8, 3.1, 4.0, 3.5 | 2.6, 4.2, 3.7, 4.5, 2.1 | 0.9, 1.5, 1.0, 1.6, 1.5, 1.6, 1.4, 1.0, 2.0, 1.5 |
| 8 | 1B6Q | 6.1, 8.4, 4.0, 4.4, 3.8, 10.1, 5.3, 9.7, 10.7, 3.1 | 10.2, 10.0, 10.0, 10.4, 10.5 | 10.0, 9.6, 8.5, 7.6, 12.0, 8.3, 8.2, 7.0, 10.2, 9.0 |
| 9 | 1ROP | 5.3, 4.3, 9.2, 7.3, 7.5, 11.0, 14.2, 11.5, 8.7, 6.2 | 10.8, 11.5, 11.5, 10.1, 12.4 | 5.8, 10.3, 10.0, 11.7, 8.6, 7.0, 8.3, 7.7, 11.2, 13.6 |
| 10 | 1NKD | 3.9, 16.2, 10.1, 7.0, 10.6, 3.8, 4.8, 4.9, 7.9, 14.7 | 13.5, 13.5, 13.3, 13.4, 11.7 | 8.9, 8.9, 10.6, 11.0, 12.6, 10.7, 12.2, 10.1, 11.0, 9.1 |
| 11 | 1RPO | 9.9, 3.8, 4.0, 7.5, 14.4, 4.8, 6.0, 13.5, 3.8, 7.5 | 10.8, 10.4, 10.4, 10.9, 11.2 | 10.3, 8.7, 6.9, 6.0, 12.4, 7.7, 10.1, 7.2, 10.0, 7.7 |
| 12 | 1QR8 | 9.0, 11.1, 8.2, 7.1, 9.7, 14.0, 8.1, 10.9, 5.4, 4.4 | 10.1, 9.5, 10.0, 10.4, 12.2 | 11.3, 9.3, 9.0, 7.6, 9.5, 12.2, 10.5, 7.1, 11.3, 8.5 |
| 13 | 1FME | 4.9, 5.0, 4.8, 6.5, 3.7, 4.5, 4.2, 6.2, 4.3, 4.1 | 2.2, 2.3, 2.5, 2.7, 1.6 | 3.8, 2.8, 3.3, 4.5, 3.6, 3.1, 2.7, 3.9, 4.4, 3.7 |
| 14 | 1ACW | 5.5, 7.0, 5.3, 6.0, 7.4, 5.7, 7.0, 5.1, 7.2, 5.6 | 5.8, 5.8, 6.0, 6.2, 7.1 | 1.3, 1.7 |
| 15 | 1DFN | 5.0, 5.9, 6.5, 5.8, 6.8, 6.0, 7.1, 6.1, 6.5, 7.4 | 5.6, 6.8, 6.4, 6.6, 6.4 | 1.7, 5.3, 6.0, 5.5, 4.0, 6.3, 5.2, 6.5, 5.2, 6.6 |
| 16 | 1Q2K | 7.4, 7.4, 7.2, 4.8, 5.8, 6.5, 5.7, 6.2, 4.2, 7.3 | 5.9, 6.0, 5.8, 6.4, 9.1 | 1.7, 3.0, 3.3, 1.6, 4.7 |
| 17 | 1SCY | 6.1, 4.8, 6.6, 7.2, 3.1, 5.0, 6.5, 6.9, 7.2, 5.6 | 5.5, 5.6, 6.5, 6.4, 6.2 | 2.2, 2.7, 3.3 |
| 18 | 1XRX | 5.6, 8.8, 7.6, 7.7, 9.6, 8.4, 9.0, 6.2, 8.4, 8.2 | 8.6, 8.8, 7.8, 8.8, 4.0 | 5.2, 9.1, 7.1, 6.2, 4.4, 9.4, 6.6, 4.5, 5.4, 8.2 |
| 19 | 1ROO | 3.9, 3.4, 3.3, 3.8, 2.8, 4.1, 3.5, 3.2, 3.2, 3.3 | 2.8, 2.7, 2.7, 3.0, 2.7 | 1.8, 2.1, 1.9, 2.9, 2.5, 1.2, 2.5, 1.9, 2.8, 2.2 |
| 20 | 1YRF | 5.9, 5.7, 5.7, 4.8, 4.9, 5.0, 4.9, 5.0, 6.2, 5.8 | 4.3, 4.1, 3.3, 3.3, 4.3 | 1.7, 3.1, 4.3 |
| 21 | 1YRI | 5.9, 5.5, 4.6, 6.0, 5.5, 3.8, 5.5, 5.4, 5.5, 6.1 | 4.2, 4.0, 3.2, 3.2, 4.2 | 1.7, 3.9, 2.8 |
| 22 | 1VII | 5.5, 3.7, 6.6, 5.9, 6.1, 5.7, 5.6, 6.0, 6.3, 5.7 | 4.4, 4.7, 4.5, 4.3, 3.7 | 2.4, 3.3, 1.8, 5.6, 4.3, 3.0, 3.2, 3.7, 4.8, 1.9 |
| 23 | 1BGK | 5.8, 5.9, 4.1, 6.1, 5.8, 5.5, 5.5, 4.9, 5.2, 6.1 | 6.2, 6.0, 6.4, 6.4, 6.2 | 6.5, 4.1, 4.6, 2.5, 3.8, 5.9, 3.5, 3.3, 3.5, 6.1 |
| 24 | 1BHI | 7.9, 5.3, 6.7, 7.2, 5.4, 8.9, 6.3, 6.6, 6.2, 7.1 | 3.7, 3.8, 4.5, 4.5, 5.0 | 2.4, 2.4, 1.7, 1.1, 2.8, 1.7, 2.6, 2.2, 2.3, 1.9 |
| 25 | 1OVX | 4.0, 6.4, 6.3, 4.3, 6.1, 5.4, 5.3, 5.9, 7.7, 6.1 | 4.6, 4.9, 4.4, 5.6, 5.2 | 3.2, 1.5, 3.1, 2.6, 4.2, 4.4, 2.3, 2.6, 1.9, 5.0 |
| 26 | 1I6C | 7.5, 5.1, 5.4, 6.2, 5.4, 6.2, 8.0, 6.2, 6.7, 7.6 | 5.6, 5.7, 5.6, 7.3, 6.9 | 3.0, 3.0, 2.2, 3.2, 2.1 |
| 27 | 2ERL | 6.7, 8.6, 7.1, 8.4, 7.2, 3.2, 4.1, 6.2, 6.8, 8.1 | 7.0, 7.4, 7.1, 7.2, 8.3 | 1.3, 7.1 |
| 28 | 1RES | 6.1, 4.2, 5.2, 7.7, 4.8, 4.8, 4.3, 7.0, 5.6, 5.5 | 7.6, 7.1, 7.0, 7.3, 5.1 | 3.5, 3.0, 2.8, 4.3, 4.2, 2.3, 2.0 |
| 29 | 2CPG | 10.1, 5.3, 10.0, 8.5, 9.4, 10.6, 7.8, 9.4, 7.4, 7.5 | 4.2, 4.5, 5.3, 5.1, 11.0 | 8.0, 4.3, 8.5, 8.4, 6.5, 10.0, 4.8, 8.6, 5.5, 7.6 |
| 30 | 1DV0 | 7.7, 7.1, 8.0, 5.1, 8.3, 6.0, 7.8, 8.7, 8.4, 8.5 | 3.2, 4.4, 4.0, 2.8, 6.2 | 1.6, 1.5, 1.6, 2.0, 1.5, 4.5, 2.4, 2.0, 2.3, 4.2 |
| 31 | 1IRQ | 6.8, 6.9, 6.4, 6.7, 10.2, 8.4, 9.8, 9.0, 5.3, 8.2 | 8.2, 8.9, 9.1, 9.0, 8.5 | 6.1, 4.3, 6.0, 5.0, 6.6, 6.0, 7.4, 5.2, 6.4, 7.5 |
| 32 | 1GUU | 5.5, 5.3, 7.7, 4.6, 5.0, 4.6, 5.1, 5.7, 8.9, 9.1 | 10.1, 10.1, 9.8, 9.3, 10.1 | 2.9, 4.2, 2.9, 7.0, 3.2, 3.7, 2.4, 6.5, 5.6 |
| 33 | 1GV5 | 4.9, 4.1, 4.8, 4.8, 9.0, 9.4, 4.6, 9.2, 9.3, 8.9 | 9.4, 9.1, 9.5, 8.9, 3.3 | 8.5, 3.7, 9.1, 4.5, 4.7, 5.3, 4.2, 9.1, 3.1, 3.5 |
| 34 | 1GVD | 5.7, 6.4, 8.0, 5.1, 6.0, 4.9, 4.9, 6.9, 4.9, 5.5 | 9.4, 9.4, 8.8, 9.1, 3.9 | 8.5, 3.5, 2.7, 3.0, 4.7, 4.4, 4.3, 2.3, 6.7, 8.9 |
| 35 | 1MBH | 9.1, 9.2, 9.2, 4.0, 9.5, 8.4, 5.5, 5.5, 5.0, 5.3 | 4.3, 4.1, 5.7, 3.5, 9.5 | 8.3, 8.1, 4.2, 2.8, 8.9, 2.4, 7.9, 3.5, 7.7, 7.7 |
| 36 | 1GAB | 4.9, 9.2, 6.2, 6.0, 6.8, 3.6, 8.5, 9.7, 8.8, 6.3 | 5.5, 5.6, 6.4, 5.4, 5.9 | 2.3, 8.8, 2.7, 7.9, 2.8, 8.1, 2.7, 2.3, 2.2, 7.7 |
| 37 | 1MOF | 5.7, 3.7, 3.9, 4.2, 2.9, 4.0, 4.9, 4.3, 4.0, 4.9 | 12.7, 13.6, 12.5, 12.7, 13.5 | 13.7, 11.8, 11.2, 12.6, 12.6, 12.0, 12.2, 12.9, 12.8, 11.2 |
| 38 | 1ENH | 6.3, 9.9, 4.6, 9.1, 9.7, 5.8, 5.7, 9.5, 6.2, 6.4 | 5.0, 4.6, 4.3, 8.7, 4.2 | 2.2, 1.7, 1.8, 5.1, 2.3, 4.6, 3.0, 5.2, 3.1, 3.2 |
| 39 | 1IDY | 4.6, 4.9, 8.7, 4.0, 3.6, 3.5, 5.3, 3.7, 6.0, 9.3 | 8.7, 8.3, 8.3, 8.8, 4.6 | 2.7, 2.5, 3.0, 8.5, 2.1, 2.0, 2.1, 6.8, 2.6, 2.9 |
| 40 | 1PRV | 6.9, 5.1, 6.9, 5.8, 5.0, 5.6, 5.6, 9.5, 4.9, 4.9 | 2.3, 2.6, 3.0, 3.2, 5.4 | 2.5, 2.1, 3.4, 2.9, 3.7, 4.9, 2.9, 2.4, 4.2, 6.8 |
| 41 | 1HDD | 10.2, 6.3, 10.2, 5.5, 11.1, 6.2, 9.8, 4.8, 7.0, 6.7 | 4.4, 4.7, 5.8, 4.6, 9.7 | 2.3, 2.5, 2.2, 3.3, 3.6, 4.4, 3.4, 3.0, 4.2, 4.2 |
| 42 | 1BDC | 7.7, 6.1, 6.6, 8.3, 4.8, 7.0, 7.5, 5.0, 6.7, 6.6 | 3.1, 3.0, 3.5, 2.8, 5.1 | 2.5, 2.5, 3.7, 3.2, 7.7, 4.0, 3.7, 7.9, 2.6, 7.8 |
| 43 | 1I5X | 5.5, 5.9, 3.6, 5.4, 5.8, 2.6, 4.3, 6.0, 3.9, 5.1 | 11.4, 11.0, 11.0, 11.5, 9.2 | 10.8, 6.8, 8.6, 12.5, 4.5, 9.8, 7.1, 13.1, 9.0, 7.0 |
| 44 | 1I5Y | 5.8, 5.1, 4.3, 4.3, 3.4, 4.9, 2.6, 3.7, 3.2, 4.0 | 9.8, 8.9, 8.4, 11.8, 9.1 | 9.6, 7.8, 10.2, 9.1, 8.2, 5.0, 12.5, 11.3, 8.4, 8.1 |
| 45 | 1KU3 | 6.6, 7.4, 6.4, 5.5, 7.2, 5.6, 6.3, 6.2, 5.6, 8.3 | 5.6, 5.4, 4.9, 5.4, 9.6 | 4.7, 4.4, 5.8, 4.5, 5.3, 5.3, 5.5, 6.2, 4.7, 2.9 |
| 46 | 1YIB | 6.7, 5.3, 5.5, 5.8, 3.5, 4.8, 5.1, 4.5, 5.2, 4.6 | 17.5, 17.6, 18.3, 17.3, 17.4 | 17.8, 17.5, 17.1, 17.1, 17.3, 17.5, 18.5, 16.3 |
| 47 | 1DF5 | 3.4, 5.3, 6.0, 6.1, 7.0, 3.8, 3.4, 3.1, 8.1, 3.4 | 9.3, 10.3, 8.7, 9.3, 11.7 | 9.9, 8.2, 5.7, 5.6, 9.9, 8.5, 8.6, 11.1, 6.3, 7.0 |
| 48 | 1AHO | 7.8, 7.6, 9.1, 8.7, 6.6, 6.0, 7.2, 7.7, 9.2, 7.7 | 8.1, 6.6, 4.1, 5.2, 6.0 | 0.6, 1.1, 0.6, 1.2, 1.0, 0.4, 0.8, 1.4, 1.2, 0.8 |
| 49 | 1QR9 | 4.3, 3.8, 4.9, 5.1, 10.9, 6.0, 4.0, 4.0, 4.2, 4.6 | 11.0, 11.1, 9.6, 11.2, 12.9 | 6.3, 8.5, 4.3, 9.9, 8.6, 6.5, 8.7, 11.7, 12.1, 10.7 |
| 50 | 1AIL | 10.8, 6.6, 4.4, 6.4, 7.2, 8.9, 4.2, 8.5, 6.0, 4.2 | 9.0, 8.9, 8.4, 7.6, 10.3 | 3.2, 4.4, 4.5, 5.3, 7.2, 5.4, 6.4 |
a Secondary structure information from the native structure was used along with the sequence information for both Bhageerath and ROBETTA (the Rosetta++ software suite was obtained from UW TechTransfer Digital Ventures). We generated 10000 decoys starting from sequence and secondary structure information; the top 2000 scoring decoys were selected and the top 10 cluster centers were extracted. The ProtInfo (http://protinfo.compbio.washington.edu) predictions were obtained from the sequence information alone.

b For the system 1E0Q, it took ∼12 days on a dedicated processor to generate 1000 decoys.

(a) For both Bhageerath and ROBETTA, secondary structure information taken from the native structure was used along with the sequence information (the Rosetta++ software suite was obtained from UW TechTransfer Digital Ventures). We generated 10 000 decoys starting from sequence and secondary structure information; the top 2000 scoring decoys were selected and the top 10 cluster centers were extracted. The ProtInfo (http://protinfo.compbio.washington.edu) predictions were obtained from the sequence information alone.

(b) For the system 1E0Q, it took ∼12 days on a dedicated processor to generate 1000 decoys.

DISCUSSION

We describe here Bhageerath, an energy based web server for automated prediction of candidate tertiary structures. The web server permits predictive folding with moderate computational resources. Validation of the computational protocol on 50 small globular proteins has shown that, for each protein, the 10 lowest energy structures include one or more candidates within an RMSD of 3–6 Å of the native. The results presented are for proteins with 2–3 secondary structural elements belonging to the α, β and α/β classes, and were obtained solely from the amino acid sequence and secondary structure information (without the aid of multiple sequence alignment or fold recognition). They provide a benchmark for the level of model accuracy one can expect from this web server.
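As a small illustration of this success criterion, the following Python sketch takes the ten Bhageerath RMSD values (without end loops) reported in the comparison table for three of the test proteins and checks whether the best of the ten candidates falls within a chosen cutoff. The function name `best_within` and the 6 Å default are illustrative choices made here, not part of the server.

```python
# RMSD values (Å, without end loops) of the 10 lowest energy Bhageerath
# structures, copied from the comparison table for three test proteins.
BHAGEERATH_RMSD = {
    "1XRX": [5.6, 8.8, 7.6, 7.7, 9.6, 8.4, 9.0, 6.2, 8.4, 8.2],
    "1ROO": [3.9, 3.4, 3.3, 3.8, 2.8, 4.1, 3.5, 3.2, 3.2, 3.3],
    "1YRF": [5.9, 5.7, 5.7, 4.8, 4.9, 5.0, 4.9, 5.0, 6.2, 5.8],
}


def best_within(rmsds, cutoff=6.0):
    """Return (best RMSD, True if at least one candidate is within cutoff)."""
    best = min(rmsds)
    return best, best <= cutoff


for pdb_id, rmsds in BHAGEERATH_RMSD.items():
    best, ok = best_within(rmsds)
    print(f"{pdb_id}: best of 10 = {best:.1f} Å, within 6 Å: {ok}")
```

The same check can be applied to the ProtInfo and ROBETTA columns by substituting their lists of values.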

All eight modules are currently executed on a cluster of 32 dedicated UltraSparc III 900 MHz processors. Comparative modeling servers typically return results within 1 to 10 min. With the Bhageerath web server, the expected prediction time is 4–5 min for two helix systems and ∼2–3 h for three helix systems. The actual time depends on the length of the sequence, the number of secondary structure elements and the number of structures that pass the biophysical filters and must then be processed for energetics at the atomic level. The server can currently process ∼4–5 normally sized jobs per day on 32 processors.

The current version of the web server requires the user to supply secondary structure information. For new sequences where this information is not available, web based secondary structure prediction tools can be employed. We have characterized the results obtained for the 50 test proteins from five freely available secondary structure prediction servers (43–47). The predictions are provided in the supplementary information (Supplementary Table S9). We plan to introduce a secondary structure predictor in module one shortly.

For larger systems, i.e. those containing more than 100 amino acid residues and those with more than three secondary structural elements, we plan to introduce loop filters to control the combinatorial explosion in the number of trial structures. Two biophysical filters are presently used in module three for trial structure selection, and we plan to add a few more, such as hydrophobicity and packing fraction, at later stages. Constraints on strands for sheet formation, constraints on metal ions to cluster residues, and disulphide bridges could also be profitably employed as filters for reducing the number of trial structures.

The all atom empirical energy function utilized in module six was tested previously and was seen to separate the native from the decoy structures for 67 of 69 protein sequences, over a total of 61 640 decoys studied (35). The scoring function calculates the non-bonded energy of each trial structure as a sum of electrostatics, van der Waals and hydrophobicity terms. There is scope for improvement in the scoring function, particularly in describing the hydrophobicity component. Work along these lines, and on a flexible Monte Carlo simulation strategy to bring the structures to within 3 Å RMSD of the native, is in progress.
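The additive form of the score, together with the selection of the 10 lowest energy structures, can be sketched as follows. This is a minimal illustration only: the class `TrialStructure`, its field names and the numerical values are hypothetical, and the three energy terms are assumed to have been computed elsewhere. The functional forms used in module six are not reproduced here.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialStructure:
    """Hypothetical container for precomputed non-bonded energy terms."""
    name: str
    electrostatics: float
    van_der_waals: float
    hydrophobicity: float

    @property
    def energy(self) -> float:
        # Total score is the plain sum of the three non-bonded components.
        return self.electrostatics + self.van_der_waals + self.hydrophobicity


def lowest_energy(structures, n=10):
    """Return the n structures with the lowest total energy, best first."""
    return heapq.nsmallest(n, structures, key=lambda s: s.energy)


# Made-up energies in arbitrary units, for illustration only.
trials = [
    TrialStructure("t1", -120.0, -45.0, -10.0),
    TrialStructure("t2", -95.0, -60.0, -5.0),
    TrialStructure("t3", -130.0, -30.0, -20.0),
]

for s in lowest_energy(trials, n=2):
    print(s.name, s.energy)
```

Keeping the three terms as separate fields makes it easy to swap in an improved hydrophobicity term without touching the ranking step.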

The individual modules of Bhageerath are web enabled for free access. These include the four biophysical filters (persistence length, radius of gyration, hydrophobicity ratio and packing fraction), a protein structure optimizer, an all-atom empirical energy based scoring function and the ProRegIn utility. They are listed in Table 5 along with their corresponding URLs.

Table 5

A list of modules of Bhageerath converted to independent web utilities with their respective URLs

| Sl. No. | Name of the utility | Description |
| --- | --- | --- |
| 1 | Persistence length filter (http://www.scfbio-iitd.res.in/software/proteomics/perlen.jsp) | A filter based on the maximum uninterrupted length of the polypeptide chain persisting in a particular direction |
| 2 | Radius of gyration filter (http://www.scfbio-iitd.res.in/software/proteomics/rg.jsp) | A filter based on the radius of the molecule and defined as the root mean square distance of the collection of atoms from their common centre of gravity |
| 3 | Hydrophobicity ratio filter (http://www.scfbio-iitd.res.in/software/proteomics/hyphb.jsp) | A filter based on hydrophobicity ratio, which is defined as the ratio of loss in accessible surface area (ASA) per atom of non-polar atoms to the loss in accessible surface area per atom of the polar atoms |
| 4 | Packing fraction filter (http://www.scfbio-iitd.res.in/software/proteomics/pf.jsp) | A filter based on packing density, which utilizes the observation that proteins are known to exhibit packing fractions ∼0.7 |
| 5 | Protein structure optimizer (http://www.scfbio-iitd.res.in/software/proteomics/promin.jsp) | A utility that minimizes the energy of the protein structure using a combination of steepest descent and conjugate gradient minimization algorithms |
| 6 | Scoring function for protein structure evaluation (http://www.scfbio-iitd.res.in/utility/proteomics/energy.jsp) | An all-atom empirical energy based scoring function which combines second generation force field parameters with a hydrophobicity function |
| 7 | Protein regularity index (http://www.scfbio-iitd.res.in/software/proregin.jsp) | A utility based on the regularity seen in the main chain loop dihedral angles of proteins |
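
The radius of gyration entry in Table 5 can be made concrete with a short Python sketch. It follows the definition given there, the root mean square distance of the atoms from their common centre, but assumes equal atomic weights (an unweighted centroid rather than a mass-weighted centre of gravity). The function name and the test coordinates are illustrative, and no filter thresholds from the server are implied.

```python
import math


def radius_of_gyration(coords):
    """RMS distance of points (x, y, z) from their unweighted centroid."""
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one atom")
    # Centroid of the coordinates (all atoms weighted equally: an assumption).
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    # Mean squared distance from the centroid, then its square root.
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
             for x, y, z in coords)
    return math.sqrt(sq / n)


# Four points at the corners of a square of side 2 Å centred on the origin;
# each lies sqrt(2) Å from the centroid.
square = [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
print(round(radius_of_gyration(square), 3))
```

A filter built on this quantity would compare the computed value against an expected range for a compact globular protein of the given length; that range is not specified here.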

SUPPLEMENTARY DATA

Supplementary Data are available at NAR online.

ACKNOWLEDGEMENTS

Funding from the Department of Biotechnology is gratefully acknowledged. Ms Kumkum Bhushan is a recipient of the Senior Research Fellow award from the Council of Scientific & Industrial Research (CSIR), India. Help received from Ms Lipi Thukral and Mr Shailesh Tripathi is gratefully acknowledged. The Open Access publication charges for this article were waived by Oxford University Press.

Conflict of interest statement. None declared.

REFERENCES

1. Liwo A., Khalili M., Scheraga H.A. (2005) Ab initio simulation of protein-folding pathways by molecular dynamics with united residue model of polypeptide chains. Proc. Natl Acad. Sci. USA, 102, 2362–2367.

2. Baker D. (2000) A surprising simplicity to protein folding. Nature, 405, 39–42.

3. Klepeis J.L. and Floudas C.A. (2004) In silico protein design: a combinatorial and global optimization approach. SIAM News, 37, 1.

4. Guex N. and Peitsch M.C. (1997) SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 18, 2714–2723.

5. Sánchez R. and Šali A. (1997) Evaluation of comparative protein structure modeling by MODELLER-3. Proteins, 29, 50–58.

6. Panchenko A.R., Marchler-Bauer A.E., Bryant S.H. (2000) Combination of threading potentials and sequence profiles improves fold recognition. J. Mol. Biol., 296, 1319–1331.

7. Skolnick J.E. and Kihara D. (2001) Defrosting the frozen approximation: PROSPECTOR-a new approach to threading. Proteins, 42, 319–331.

8. Aszodi A., Gradwell M.J., Taylor W.R. (1995) Global fold determination from a small number of distance restraints. J. Mol. Biol., 251, 308–326.

9. Kolinski A., Jaroszewski L., Rotkiewicz P., Skolnick J. (1998) An efficient Monte Carlo model of protein chains. Modeling the short-range correlations between side group centers of mass. J. Phys. Chem., 102, 4628–4637.

10. Ortiz A.R., Kolinski A., Skolnick J. (1998) Fold assembly of small proteins using Monte Carlo simulations driven by restraints derived from multiple sequence alignments. J. Mol. Biol., 277, 419–448.

11. Huang E.S., Samudrala R., Ponder J.W. (1999) Ab initio fold prediction of small helical proteins using distance geometry and knowledge-based scoring functions. J. Mol. Biol., 290, 267–281.

12. Simons K.T., Strauss C., Baker D. (2001) Prospects for ab initio protein structural genomics. J. Mol. Biol., 306, 1191–1199.

13. Rost B. and Sander C. (1996) Bridging the protein sequence-structure gap by structure predictions. Annu. Rev. Biophys. Biomol. Struct., 25, 113–136.

14. Guex N., Diemand A., Peitsch M.C. (1999) Protein modeling for all. Trends Biochem. Sci., 24, 364–367.

15. Moult J. (1999) Predicting protein three-dimensional structure. Curr. Opin. Biotechnol., 10, 583–588.

16. Al-Lazikani B., Jung J., Xiang Z., Honig B. (2001) Protein structure prediction. Curr. Opin. Struct. Biol., 5, 51–56.

17. Venclovas C. (2001) Comparative modeling of CASP4 target proteins: Combining results of sequence search with three-dimensional structure assessment. Proteins, 45, 47–54.

18. Tramontano A. and Morea V. (2003) Assessment of homology based predictions in CASP5. Proteins, 53, 352–368.

19. Lund O., Nielsen M., Lundegaard C., Worning P. (2002) X3M a computer program to extract 3D models. Abstract at the CASP5 conference, A102.

20. Ogata K. and Umeyama H. (2000) An automatic homology modeling method consisting of database searches and simulated annealing. J. Mol. Graph. Model., 18, 305–306.

21. Sali A. and Blundell T. (1993) Comparative protein modeling by satisfaction of spatial restraints. J. Mol. Biol., 234, 779–815.

22. Tress M., Ezkurdia I., Graña O., Lopez G., Valencia A. (2005) Assessment of predictions submitted for the CASP6 comparative modeling category. Proteins, 61, 27–45.

23. Scheraga H.A. (1992) Some approaches to the multiple-minima problem in the calculation of polypeptide and protein structures. Int. J. Quantum Chem., 42, 1529–1536.

24. Scheraga H.A. (1996) Recent developments in the theory of protein folding: searching for the global energy minimum. Biophys. Chem., 59, 329–339.

25. Vasquez M., Nemethy G., Scheraga H.A. (1994) Conformational energy calculations on polypeptides and proteins. Chem. Rev., 94, 2183.

26. Anfinsen C.B. (1973) Principles that govern the folding of protein chains. Science, 181, 223.

27. Pillardy J. (2001) Recent improvements in prediction of protein structure by global optimization of a potential energy function. Proc. Natl Acad. Sci. USA, 98, 2329–2333.

28. Kim D.E., Chivian D., Baker D. (2004) Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res., 32, W526–W531.

29. Bradley P., Misura K.M.S., Baker D. (2005) Towards high-resolution de novo structure prediction for small proteins. Science, 309, 1868–1871.

30. Hung L.-H., Ngan S.-C., Liu T., Samudrala R. (2005) PROTINFO: new algorithms for enhanced protein structure predictions. Nucleic Acids Res., 33, W77–W80.

31. Cheng J., Randall A.Z., Sweredoski M.J., Baldi P. (2005) SCRATCH: a protein structure and structural feature prediction server. Nucleic Acids Res., 33, W72–W76.

32. Klepeis J.L. and Floudas C.A. (2003) ASTRO_FOLD: A combinatorial and global optimization framework for ab initio prediction of three-dimensional structures of proteins from the amino acid sequence. Biophys. J., 85, 2119–2146.

33. Fujitsuka Y., Chikenji G., Takada S. (2005) SimFold energy function for de novo protein structure prediction: consensus with Rosetta. Proteins, 62, 381–398.

34. Narang P., Bhushan K., Bose S., Jayaram B. (2005) A computational pathway for bracketing native-like structures for small alpha helical globular proteins. Phys. Chem. Chem. Phys., 7, 2364–2375.

35. Narang P., Bhushan K., Bose S., Jayaram B. (2006) Protein structure evaluation using an all-atom energy based empirical scoring function. J. Biomol. Struct. Dyn., 23, 385–406.

36. Hubbard S.J. and Thornton J.M. (1993) ‘NACCESS’, Computer Program. Department of Biochemistry and Molecular Biology, University College London, UK.

37. Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., Bourne P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.

38. Lambert C., Leonard N., De Bolle X., Depiereux E. (2002) EsyPred3D: Prediction of proteins 3D structures. Bioinformatics, 18, 1250–1256.

39. Combet C., Jambon M., Deleage G., Geourjon C. (2002) Geno3D: Automatic comparative molecular modeling of protein. Bioinformatics, 18, 213–214.

40. Bates P.A., Kelley L.A., MacCallum R.M., Sternberg M.J.E. (2001) Enhancement of protein modeling by human intervention in applying the automatic programs 3D-JIGSAW and 3D-PSSM. Proteins, 45, 39–46.

41. Altschul S.F., Madden T.L., Schäffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

42. Zemla A. (2003) LGA - a method for finding 3D similarities in protein structures. Nucleic Acids Res., 31, 3370–3374.

43. Bryson K., McGuffin L.J., Marsden R.L., Ward J.J., Sodhi J.S., Jones D.T. (2005) Protein structure prediction servers at University College London. Nucleic Acids Res., 33, W36–W38.

44. Rost B., Yachdav G., Liu J. (2003) The PredictProtein server. Nucleic Acids Res., 32, W321–W326.

45. Cuff J.A., Clamp M.E., Siddiqui A.S., Finlay M., Barton G.J. (1998) Jpred: a consensus secondary structure prediction server. Bioinformatics, 14, 892–893.

46. Sen T.Z., Jernigan R.L., Garnier J., Kloczkowski A. (2005) GOR V server for protein secondary structure prediction. Bioinformatics, 21, 2787–2788.

47. Frishman D. and Argos P. (1996) Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence. Protein Eng., 9, 133–142.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
