Protein structure prediction is one of the most significant goals pursued by computational structural biology and theoretical chemistry. Its aim is to determine the three-dimensional structure of a protein from its amino acid sequence. In more formal terms, it is the prediction of protein tertiary structure from primary structure. Because known protein structures are valuable in tasks such as rational drug design, this is a highly active field of research.
Every two years, the performance of current methods is assessed in the CASP experiment.
The practical role of protein structure prediction is now more important than ever. Modern large-scale DNA sequencing efforts such as the Human Genome Project produce massive amounts of protein sequence data. Despite community-wide efforts in structural genomics, the output of experimentally determined protein structures (typically obtained by time-consuming and relatively expensive X-ray crystallography or NMR spectroscopy) lags far behind the output of protein sequences.
A number of factors exist that make protein structure prediction a very difficult task, including:
Despite the above hindrances, the many research groups working on the task are making considerable progress. Prediction of structures for small proteins is now a realistic goal, and a wide range of approaches is routinely applied to it. These approaches fall into two broad classes: ab initio modelling and comparative modelling.
Distributed computing projects that attempt to solve the protein structure prediction problem include Rosetta@home and Predictor@home.
Ab initio (or de novo) protein modelling methods seek to build three-dimensional protein models "from scratch", i.e., based on physical principles rather than (directly) on previously solved structures. There are many possible procedures; they either attempt to mimic protein folding or apply some stochastic method to search the possible solutions (i.e., global optimization of a suitable energy function). These procedures tend to require vast computational resources and have thus only been carried out for tiny proteins. Predicting protein structure de novo for larger proteins will require better algorithms and larger computational resources, such as those afforded by powerful supercomputers (such as Blue Gene) or by distributed computing (see Human Proteome Folding Project). Although these computational barriers are vast, the potential benefits of structural genomics (by predicted or experimental methods) make ab initio structure prediction an active research field.
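As a rough illustration of the stochastic search over an energy function described above, the following minimal sketch folds a chain on a two-dimensional square lattice with a Metropolis Monte Carlo search. Everything in it is an assumption made for illustration and is not taken from any particular prediction package: the two-letter H/P (hydrophobic/polar) alphabet, the toy energy of -1 per non-bonded H-H contact, the move set, and the function names.

```python
import math
import random

# Unit steps on a 2D square lattice (toy model, assumed for illustration).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def coords_from_moves(moves):
    """Return lattice coordinates of the chain, or None if it self-intersects."""
    pos = (0, 0)
    coords = [pos]
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        coords.append(pos)
    return coords if len(set(coords)) == len(coords) else None


def energy(sequence, coords):
    """Toy energy: -1 for each pair of H residues that are lattice
    neighbours but not adjacent along the chain."""
    index = {c: i for i, c in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each contact exactly once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                total -= 1
    return total


def fold(sequence, steps=20000, temperature=0.5, seed=0):
    """Metropolis Monte Carlo search for a low-energy conformation."""
    rng = random.Random(seed)
    moves = ["R"] * (len(sequence) - 1)  # start from a fully extended chain
    current = energy(sequence, coords_from_moves(moves))
    best_energy, best_moves = current, list(moves)
    for _ in range(steps):
        trial = list(moves)
        trial[rng.randrange(len(trial))] = rng.choice("UDLR")
        trial_coords = coords_from_moves(trial)
        if trial_coords is None:
            continue  # reject conformations where the chain overlaps itself
        e = energy(sequence, trial_coords)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann-like probability so the search can escape local minima.
        if e <= current or rng.random() < math.exp((current - e) / temperature):
            moves, current = trial, e
            if e < best_energy:
                best_energy, best_moves = e, list(trial)
    return best_energy, "".join(best_moves)
```

Calling `fold("HPPH")` returns the lowest energy found together with the move string that produced it. The number of conformations grows exponentially with chain length even in this toy model, which gives a sense of why such searches have only been feasible for tiny proteins.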
Comparative protein modelling uses previously solved structures as starting points, or templates. This is effective because it appears that although the number of actual proteins is vast, there is a limited set of tertiary structural motifs to which most proteins belong. It has been suggested that there are only around 2000 distinct protein folds in nature, though there are many millions of different proteins.
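One step in using a solved structure as a template is deciding which known structure to start from. The sketch below ranks candidate templates by percent sequence identity to the target. It assumes the sequences have already been aligned by some other tool (gaps written as "-"), and the function names, template names, and the use of identity as the sole criterion are illustrative assumptions rather than the method of any specific program.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def pick_template(alignments):
    """Pick the template with the highest identity to the target.

    `alignments` maps a (hypothetical) template name to a tuple
    (aligned_target, aligned_template).
    """
    scored = {name: percent_identity(target, template)
              for name, (target, template) in alignments.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Made-up sequences and template names, for illustration only.
example = {
    "template_A": ("ACDEFGHIK", "ACDEFGHLK"),
    "template_B": ("ACDEFGHIK", "AC-QFGTLK"),
}
best_name, best_identity = pick_template(example)
```

When no candidate shares clear sequence similarity with the target, identity alone is a poor guide, and other ways of matching a sequence to a known fold are needed.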
These methods may also be split into two groups:
A 2006 review of then-popular software for structure prediction can be found in Nayeem A, Sitkoff D, Krystek S Jr. (2006). A comparative study of available software for high-accuracy homology modeling: From sequence alignments to structural models. Protein Sci 15:808-824.
When the structures of two or more proteins in a complex are known or can be predicted with high accuracy, protein-protein docking methods can be used to predict the structure of the complex. Information on how mutations at specific sites affect the affinity of the complex helps in understanding its structure and in guiding docking methods.
This article is licensed under the GNU Free Documentation License.
It uses material from the
"Protein structure prediction".