Journal of Molecular Biology
Volume 342, Issue 1, 3 September 2004, Pages 345-353

A Comparative Study of the Relationship Between Protein Structure and β-Aggregation in Globular and Intrinsically Disordered Proteins

https://doi.org/10.1016/j.jmb.2004.06.088

A growing number of proteins are being identified that are biologically active though intrinsically disordered, in sharp contrast with the classic notion that proteins require a well-defined globular structure in order to be functional. At the same time, recent work has shown that aggregation and amyloidosis are initiated in amino acid sequences that have specific physico-chemical properties in terms of secondary structure propensities, hydrophobicity and charge. In intrinsically disordered proteins (IDPs) such sequences would be almost exclusively solvent-exposed and would therefore be expected to cause serious solubility problems. Further, some IDPs such as the human prion protein, synuclein and Tau protein are related to major protein conformational diseases. However, this scenario contrasts with the large number of unstructured proteins identified, especially in higher eukaryotes, and with the fact that the solubility of these proteins is often particularly good. We have used the algorithm TANGO to compare the β-aggregation tendency of a set of globular proteins derived from SCOP with that of a set of 296 experimentally verified, non-redundant IDPs, as well as with a set of IDPs predicted by the algorithms DisEMBL and GlobPlot. Our analysis shows that the β-aggregation propensity of all-α, all-β and mixed α/β globular proteins as well as membrane-associated proteins is fairly similar. This illustrates firstly that globular structures possess an appreciable amount of structural frustration and secondly that β-aggregation is not determined by hydrophobicity and β-sheet propensity alone. We also show that globular proteins contain almost three times as many aggregation nucleating regions as IDPs and that the formation of highly structured globular proteins comes at the cost of a higher β-aggregation propensity because both structure and aggregation obey very similar physico-chemical constraints. Finally, we discuss the fact that although IDPs have a much lower aggregation propensity than globular proteins, this does not necessarily mean that they have a lower potential for amyloidosis.

Introduction

Protein aggregation has long been thought of as an unspecific process caused by the formation of non-native contacts between protein folding intermediates. Recent work, however, shows that aggregation is often a much more specific process than previously expected and that, accordingly, it can be reliably correlated to a combination of simple physico-chemical parameters.1, 2, 3 In particular, several models for aggregation were postulated that all involve the formation of an intermolecular β-sheet initiated by amino acid sequences that act as nuclei for β-aggregation.4, 5, 6, 7, 8 According to these models, aggregation is initiated when amino acid segments having a high hydrophobicity, a good β-sheet propensity and a low net charge are solvent-exposed so that they can associate. As a result, one would expect aggregating protein segments to be buried in the folded state and not to be exposed to the solvent. This is confirmed by the experimental finding that in many globular proteins, aggregation occurs during refolding or under conditions in which denatured or partially folded states are significantly populated, i.e. at high concentration or as a result of destabilizing conditions or mutations.9 Based on these findings we recently developed the computer programme TANGO10 to predict β-aggregating stretches in proteins, based on a statistical mechanics algorithm that considers the physico-chemical parameters described above but also competition between different structural conformations: β-turn, α-helix, β-sheet aggregates and the folded state. The algorithm is based on the assumption that in the ordered β-aggregates the nucleating regions end up fully buried, paying maximal desolvation energy as well as entropy, while satisfying their H-bonding potential. The energy contributions are derived from the FOLD-X force field.11 In a blind test involving 174 peptides from over 20 proteins, TANGO achieved an accuracy of 95% in predicting aggregating sequences, as well as the effect of point mutations on the aggregation tendency of proteins.10

Many intrinsically disordered proteins (IDPs) have been discovered in all kingdoms of life, but especially in higher eukaryotes.12, 13, 14 These are proteins or domains that, in their native state, are either completely disordered or contain large disordered regions.15, 16 More than 180 such proteins are known to date, including prions, CREB, Tau, MAPs and p53.16 These polypeptides perform important regulatory functions and are widespread in eukaryotic cells and tissue. Some acquire structure upon binding to another protein or DNA, others act as structural anchors in large protein–protein and protein–RNA complexes, making use of extended interaction surfaces that are simply not available in more compact conformations.12 Furthermore, many globular proteins contain disordered segments acting as functional modules, e.g. post-translational modification sites and domain ligands. Importantly, many IDPs are involved in key cellular processes and some of them are related to major protein conformational diseases, e.g. prions (BSE), Tau (Alzheimer's disease), and synuclein (Parkinson's disease). The uniting factor associating the above proteins to their disease states is a high degree of aggregation or amyloidogenicity. Amyloidogenicity is not itself a direct result of β-aggregation but it is often found in association with, and can be strongly promoted by, β-aggregation.17 On the other hand, as mentioned above, it is often found experimentally that unstructured proteins are resistant to aggregation, even under harsh treatments such as incubation at high temperature.16 In fact, heat-exposure of cell-extracts is an effective protocol for purification of several recombinantly expressed unstructured proteins.16 It is therefore important to investigate the relationship between intrinsic disorder and aggregation to gain further insight into the potential of IDPs to be implicated in protein conformational diseases. The TANGO algorithm offers the opportunity to compare the aggregation propensities of IDPs and globular proteins, not only by considering average aggregation-related physico-chemical properties, but also by directly comparing the nature and frequency of aggregation-promoting nucleation stretches. This analysis should therefore allow us to test whether disorder does correlate with aggregation, as some cases of disease association suggest, or whether it anti-correlates with aggregation as residue compositional biases of IDPs suggest.
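The idea of competition between conformational states in a statistical mechanics treatment, as described for TANGO above, can be illustrated with a generic Boltzmann-weighting sketch. This is not the TANGO implementation: the function name `state_populations`, the state labels and the free-energy values are hypothetical and chosen only to show how a lower-energy state gains population at the expense of the others.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def state_populations(energies_kcal, temperature_k=298.0):
    """Return Boltzmann populations for a set of competing states.

    energies_kcal maps a state label to its free energy in kcal/mol.
    """
    rt = R_KCAL * temperature_k
    # Shift by the minimum energy so the exponentials stay well behaved.
    e_min = min(energies_kcal.values())
    weights = {
        state: math.exp(-(energy - e_min) / rt)
        for state, energy in energies_kcal.items()
    }
    partition_function = sum(weights.values())
    return {state: w / partition_function for state, w in weights.items()}


# Hypothetical free energies for one sequence segment (illustrative only).
example_energies = {
    "beta_turn": 1.2,
    "alpha_helix": 0.8,
    "beta_aggregate": -0.5,
    "folded": 0.0,
}
populations = state_populations(example_energies)
```

In such a scheme a segment is only predicted to aggregate when the aggregated state outcompetes the alternatives, which is why a high β-sheet propensity alone need not translate into a high aggregation score.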

In order to deal with this issue we have used TANGO to compare the aggregation tendency of a non-redundant set of globular proteins derived from the SCOP database (the ASTRAL40 set, see Materials and Methods),18 a set of proteins that were experimentally shown to be unstructured16, 19 as well as a set of predicted disordered protein sequences. Data sets of experimentally verified disordered proteins are scarce and rather error-prone, hence we have collected and curated a set of 296 experimentally verified and published IDP sequences. This is to our knowledge the largest dataset available to the community. The datasets of predicted disordered segments or proteins were predicted by the DisEMBL20 and GlobPlot21 algorithms and divided into sequences of low (∼50%) and average sequence complexity.

Our analysis clearly shows that aggregation-prone segments are much less frequent in IDPs than in globular proteins, thus accounting for their good solubility. Although more frequent in globular proteins, β-aggregating segments are generally part of the hydrophobic core. These observations show that the compositional bias observed in IDPs reduces secondary and tertiary structure as well as aggregation because both structure and aggregation rely on similar physico-chemical properties. As previously observed,12, 16 IDPs are not completely devoid of structure, as should be expected if some degree of functional specificity has to be obtained, but they perform their particular cellular functions by achieving a low degree of order, retaining only structural propensities that are devoid of aggregation-promoting features.
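Comparing the frequency of aggregation-prone segments between datasets requires some rule for turning per-residue scores into discrete segments. The sketch below shows one plausible way to do this, assuming a list of per-residue aggregation scores is already available. The function name `find_prone_segments`, the score threshold and the minimum segment length are illustrative assumptions, not values stated in this text.

```python
def find_prone_segments(scores, threshold=5.0, min_length=5):
    """Return (start, end) index pairs of aggregation-prone stretches.

    A stretch is a run of consecutive residues whose score exceeds
    `threshold` and that is at least `min_length` residues long.
    Indices are 0-based and `end` is exclusive.
    """
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        segments.append((start, len(scores)))
    return segments


# Hypothetical per-residue scores for a short sequence.
demo_scores = [0, 0, 10, 10, 10, 10, 10, 0, 20, 20, 0]
demo_segments = find_prone_segments(demo_scores)
```

Counting the returned segments per protein, or per residue of total sequence, gives a quantity that can be compared across datasets of different sizes.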

TANGO score for aggregation and accuracy of the TANGO algorithm

The TANGO algorithm was calibrated using data found in the scientific literature on the aggregation of 174 peptides corresponding to sequence fragments of 21 different proteins, studied by various research groups using circular dichroism (CD) or nuclear magnetic resonance (NMR). Of the peptides in our set, 70 were experimentally observed to aggregate in the concentration range between 100 μM and 1 mM, while the others remained soluble in this concentration range. A detailed description of our

Conclusions

TANGO is an algorithm to predict β-aggregation nucleating regions in proteins. Here, we used TANGO to compare the β-aggregation propensities of globular proteins and intrinsically unstructured proteins. In globular proteins we found similar amounts of β-aggregating nucleation regions in all-α, all-β and mixed α/β proteins. This demonstrates that globular proteins do display a certain degree of structural frustration and can at the same time display propensities for both α and β conformations

Datasets

Here we have used datasets that cover globular proteins and IDPs. Both predicted and experimentally verified datasets are described, and each dataset has been split into a low-complexity and a normal-complexity set.
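The text here does not say how sequence complexity was measured. As a rough illustration only, one common proxy is the Shannon entropy of the amino acid composition, which the sketch below uses to split sequences into two sets. The function names and the cutoff of 3.0 bits are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Shannon entropy (in bits) of the residue composition of a sequence."""
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def split_by_complexity(sequences, cutoff_bits=3.0):
    """Split sequences into (low_complexity, normal_complexity) lists.

    The cutoff is an assumed value chosen for illustration.
    """
    low, normal = [], []
    for seq in sequences:
        if shannon_entropy(seq) < cutoff_bits:
            low.append(seq)
        else:
            normal.append(seq)
    return low, normal


# Hypothetical sequences: a homopolymer and one with all 20 residue types.
low_set, normal_set = split_by_complexity(
    ["AAAAAA", "ACDEFGHIKLMNPQRSTVWY"]
)
```

A whole-sequence entropy is a coarse measure; windowed measures are often preferred when only part of a sequence is compositionally biased.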

Acknowledgements

This work was supported, in part, by EU grant QLRI-CT2-2002-00241. J.W.H.S. and F.R. were supported by International Prize Traveling Fellowships from the Wellcome Trust. Thanks to Sara Quirk for reading this manuscript. We are grateful to Lars Juhl Jensen for the human proteome set.

References (39)

  • A.R. Viguera et al. Conformational analysis of peptides corresponding to beta-hairpins and a beta-sheet that represent the entire sequence of the alpha-spectrin SH3 domain. J. Mol. Biol. (1996)
  • V. Munoz et al. Elucidating the folding problem of helical peptides using empirical parameters. II. Helix macrodipole effects and rational modification of the helical content of natural peptides. J. Mol. Biol. (1995)
  • C.M. Dobson. Protein-misfolding diseases: getting out of shape. Nature (2002)
  • F. Chiti et al. Kinetic partitioning of protein folding and aggregation. Nature Struct. Biol. (2002)
  • F. Chiti et al. Rationalization of the effects of mutations on peptide and protein aggregation rates. Nature (2003)
  • C.C. Blake et al. A molecular model of the amyloid fibril. Ciba Found. Symp. (1996)
  • L.C. Serpell et al. The molecular basis of amyloidosis. Cell Mol. Life Sci. (1997)
  • L.C. Serpell et al. Molecular structure of a fibrillar Alzheimer's A beta fragment. Biochemistry (2000)
  • M. Lopez De La Paz et al. De novo designed peptide-based amyloid fibrils. Proc. Natl Acad. Sci. USA (2002)

R.L., J.S. and F.R. contributed equally to this work.