PEDS Advance Access originally published online on August 24, 2007
Protein Engineering Design and Selection 2007 20(10):521-523; doi:10.1093/protein/gzm042
Short communication
The PASTA server for protein aggregation prediction
1 Department of Physics G. Galilei, University of Padova; 2 CNISM, Padova Unit; 3 INFN, Sezione di Padova; 4 Department of Biology and CRIBI Biotech Centre, University of Padova, Viale G. Colombo 3, 35131 Padova, Italy
5 To whom correspondence should be addressed. E-mail: silvio.tosatto{at}unipd.it
| Abstract |
|---|
Many different proteins aggregate into amyloid fibrils characterized by cross-β structure. β-strands contributed by distinct protein molecules are generally found in a parallel in-register alignment. Here, we describe the web server for a novel algorithm, prediction of amyloid structure aggregation (PASTA), to predict the most aggregation-prone portions and the corresponding β-strand inter-molecular pairing for a given input sequence. PASTA was previously shown to yield results in excellent agreement with available experimental observations, when tested on both natively unfolded and structured proteins. The web server and downloadable source code are freely accessible from the URL: http://protein.cribi.unipd.it/pasta/.
Keywords: amyloid fibrils/cross-beta structure/parallel in-register arrangement/protein aggregation
| Introduction |
|---|
Several neurodegenerative disorders in humans are associated with the conversion of peptides and proteins from their soluble functional forms into well-defined fibrillar aggregates, generally described as amyloid fibrils (Chiti and Dobson, 2006).
In spite of the great variability in both sequences and soluble-state structures of precursor proteins, the resulting fibrils exhibit common properties (Sunde and Blake, 1997). Experimental studies identified the regions of the sequence forming and stabilizing the cross-β core of the fibrils, and clarified the nature of the intermolecular contacts. Parallel in-register arrangements (PIRA) of β-strands in the fibril core occur quite frequently (Ferguson et al., 2006), but anti-parallel arrangements are also possible (Makin et al., 2005). Mutational studies of the amyloid aggregation kinetics revealed simple correlations between physico-chemical properties and aggregation propensities, allowing the development of different methods which successfully predict aggregation-prone regions [for a recent review see Caflisch (2006)]. All these approaches focus on predicting the β-aggregation propensity of a sequence stretch by itself. A new algorithm, prediction of amyloid structure aggregation (PASTA), was recently introduced, based on a pair-wise energy function for residues facing one another within a β-sheet (Trovato et al., 2006). Two different propensity sets were extracted, depending on the orientation (parallel or anti-parallel) of the neighboring strands, from a dataset of known native structures of globular proteins. PASTA associates energies to specific β-pairings of two sequence stretches of the same length, and further assumes that distinct protein molecules involved in fibril formation will adopt the minimum energy β-pairings in order to better stabilize the cross-β core. A novel feature of PASTA is the ability to predict the registry of the inter-molecular hydrogen bonds formed between amyloidogenic sequence stretches. In this way, the observed tendency of several proteins towards PIRA was rationalized on general grounds. PASTA can nevertheless also predict out-of-register alignments, precisely because it considers all possible matches between replicas of the same sequence. PASTA performed well when tested on both natively unfolded (Trovato et al., 2006) and structured proteins (Trovato et al., 2007).
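The pairing search described above can be illustrated with a minimal Python sketch. It slides two equal-length stretches over two copies of the same sequence, scores each pairing in both parallel and anti-parallel orientation, and ranks the pairings by energy. The function names, the length limit `min_len` and, above all, the scoring rule `toy_pair_energy` are hypothetical placeholders chosen for illustration; they are not the propensity sets of Trovato et al. (2006).

```python
# Residues treated as "hydrophobic" by the toy score (an assumption for illustration).
HYDROPHOBIC = set("AVILMFWY")


def toy_pair_energy(a, b, parallel):
    """Placeholder pair energy. NOT the PASTA propensities."""
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return -1.0 if parallel else -0.8
    return 0.2


def best_pairings(seq, min_len=4, top=3):
    """Rank pairings of two equal-length stretches from two copies of seq.

    Returns tuples (energy, start1, end1, start2, end2, orientation)
    with 1-based, inclusive residue numbers, lowest energy first.
    """
    n = len(seq)
    results = []
    for length in range(min_len, n + 1):
        for i in range(n - length + 1):
            # j >= i avoids counting the same pairing twice (the score is symmetric).
            for j in range(i, n - length + 1):
                # Parallel: residue i+k faces residue j+k (in register when i == j).
                e_par = sum(toy_pair_energy(seq[i + k], seq[j + k], True)
                            for k in range(length))
                results.append((e_par, i + 1, i + length, j + 1, j + length,
                                "parallel"))
                # Anti-parallel: residue i+k faces residue j+length-1-k.
                e_anti = sum(toy_pair_energy(seq[i + k], seq[j + length - 1 - k], False)
                             for k in range(length))
                results.append((e_anti, i + 1, i + length, j + 1, j + length,
                                "anti-parallel"))
    results.sort()
    return results[:top]
```

With this toy score, `best_pairings("GGGVIVIVGGG", top=1)` ranks the parallel in-register pairing of residues 4–8 with themselves first. The exhaustive loop is only meant to make the search space explicit and is practical for short sequences only.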
| Server description |
|---|
The PASTA server takes an amino acid sequence as input and predicts which portions of the sequence are more likely to stabilize the cross-β core of fibrillar aggregates. The input form requires an email address and, optionally, a title for the prediction job. The output is divided into three parts: top pairing energies, aggregation profile and pairing matrix.
The top pairing energies are shown in the central part of the output page. Each line contains a predicted high scoring pairing, complete with localization (i.e. residue numbers) and orientation (parallel or anti-parallel). The number of pairings to be output is set in the input form, with a default of 10 pairings. The PASTA energy is indicative of the aggregation propensity. Benchmarking performed on the dataset of 179 peptides derived from the literature (Fernandez-Escamilla et al., 2004) revealed close to 80% true positive predictions with a 20% false positive rate at a PASTA energy threshold of –4.0 (Fig. 1).
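As a rough illustration of how such a threshold translates into true and false positive rates, the following Python sketch classifies peptides by their best pairing energy. It assumes that an energy at or below the threshold counts as a predicted aggregator (lower energy meaning a more stable pairing); the function name, the inclusive comparison and the example values are hypothetical and are not taken from the benchmark dataset.

```python
def rates_at_threshold(energies, labels, threshold=-4.0):
    """Return (true positive rate, false positive rate).

    energies: best pairing energy per peptide.
    labels:   True if the peptide is experimentally known to aggregate.
    A peptide is predicted to aggregate when energy <= threshold
    (assumed convention for this sketch).
    """
    tp = fp = pos = neg = 0
    for energy, aggregates in zip(energies, labels):
        predicted = energy <= threshold
        if aggregates:
            pos += 1
            tp += predicted
        else:
            neg += 1
            fp += predicted
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / pos, fp / neg


# Invented example values, for illustration only.
example_energies = [-6.1, -4.5, -3.0, -5.2, -2.0]
example_labels = [True, True, True, False, False]
tpr, fpr = rates_at_threshold(example_energies, example_labels)
```

Sweeping `threshold` over a range of values and recording both rates at each step yields the kind of trade-off curve against which a single operating point such as –4.0 can be chosen.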
The aggregation profile and pairing matrix are provided through links to PDF files. The aggregation profile shows the normalized per-residue probability h(k) calculated from Eq. (5) in Trovato et al. (2006).
As an example, we show in Fig. 2 the PASTA output for the human amyloid β-peptide (Aβ1–40), a peptide known to be involved in Alzheimer's disease and other pathological conditions such as hereditary cerebral hemorrhage with amyloidosis and inclusion-body myositis (Chiti and Dobson, 2006). The two top-scoring pairings (residues 12–20 and 31–40, Fig. 2A) and the predicted PIRA alignment (Fig. 2C) are in very good agreement with experimental evidence (residues 12–24 and 30–40, varying somewhat between reports) (Petkova et al., 2002).
| Source code |
|---|
For those wishing to run PASTA on large sequence ensembles (e.g. entire genomes) or interested in extending the approach, we provide the source code as a downloadable TAR archive, reachable from the server homepage. The source code consists of an ANSI C program to compute the PASTA energies and profiles. Two R scripts are provided to generate the same PDF graphics used in the web server. A simple shell script guides the overall program flow, and a test sequence is also included. Details concerning installation and usage are explained in a README file.
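For runs over many sequences, a small helper that splits a multi-sequence file into individual records may be convenient. The Python sketch below assumes FASTA-formatted input, which is an assumption of this example and not something specified here; the function name `read_fasta` is hypothetical. The actual invocation of the PASTA program is documented in the README and is deliberately not reproduced.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            # Keep only the first word of the header as identifier.
            header, chunks = (parts[0] if parts else ""), []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


# Example with in-memory lines; a file object can be passed in the same way.
records = list(read_fasta([">p1 test peptide", "ACD", "EFG", "", ">p2", "kk"]))
```

Each record can then be written to its own input file and handed to the shell script shipped with the source archive, following the README.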
| Funding |
|---|
This work was supported by Programmi di Ricerca Scientifica di Rilevante Interesse Nazionale, grant 2005027330 in 2005. S.T. is funded by a Rientro dei cervelli grant from the Italian Ministry for Education, University and Research (MIUR).
| Footnotes |
|---|
Edited by Regina Murphy
| Acknowledgements |
|---|
The authors wish to thank Fabrizio Chiti and Amos Maritan for ongoing collaboration on protein aggregation and Micky Del Favero for expert system administration.
| References |
|---|
Caflisch A. Curr. Opin. Chem. Biol. (2006) 10:437–444.
Chiti F., Dobson C.M. Annu. Rev. Biochem. (2006) 75:333–366.
Chiti F., Stefani M., Taddei N., Ramponi G., Dobson C.M. Nature (2003) 424:805–808.
Ferguson N., et al. Proc. Natl Acad. Sci. USA (2006) 103:16248–16253.
Fernandez-Escamilla A.M., Rousseau F., Schymkowitz J., Serrano L. Nat. Biotechnol. (2004) 22:1302–1306.
Fowler D.M., Koulov A.V., Alory-Jost C., Marks M.S., Balch W.E., Kelly J.W. PLoS Biol. (2006) 4:e6.
Hoang T.X., Marsella L., Trovato A., Seno F., Banavar J.R., Maritan A. Proc. Natl Acad. Sci. USA (2006) 103:6883–6888.
Makin O.S., Atkins E., Sikorski P., Johansson J., Serpell L.C. Proc. Natl Acad. Sci. USA (2005) 102:315–320.
Petkova A.T., Ishii Y., Balbach J.J., Antzutkin O.N., Leapman R.D., Delaglio F., Tycko R. Proc. Natl Acad. Sci. USA (2002) 99:16742–16747.
Sunde M., Blake C. Adv. Protein Chem. (1997) 50:123–159.
Trovato A., Chiti F., Maritan A., Seno F. PLoS Comput. Biol. (2006) 2:1608–1618.
Trovato A., Maritan A., Seno F. J. Phys.: Condens. Matter (2007) 19:285221.
Received May 14, 2007; revised July 4, 2007; accepted July 6, 2007.