PEDS Advance Access published online on September 4, 2007
Protein Engineering Design and Selection, doi:10.1093/protein/gzm041
Article |
Binding specificities of the GYF domains from two Saccharomyces cerevisiae Paralogs
1Center for Biomembrane Research, Department of Biochemistry and Biophysics, Stockholm University, The Arrhenius Laboratories, 106 91 Stockholm, Sweden 2Research Group for Chemometrics, Biological Chemistry, Department of Chemistry, Umeå University, SE-901 87 Umeå, Sweden
3 To whom correspondence should be addressed. E-mail: ake{at}dbb.su.se
| Abstract |
|---|
|
|
|---|
We have used multivariate statistics and z-scales to represent peptide sequences in a PLS-QSAR model of previously studied binding affinities [Kofler, M., Motzny, K. and Freund, C. (2005b) Mol. Cell. Proteomics, 4, 1797–1811] of two GYF domains to an array of immobilized synthetic peptides. As a result, we established structural determinants of the binding specificities of the two proteins. Our model was used to define new sets of yeast proteins potentially interacting with Syh1 (YPL105C) and Smy2 (YBR172C). These sets were subsequently examined for co-occurrence of Gene Ontology terms, which suggested a group of likely interacting proteins with a common function in mRNA catabolism. Finally, the subcellular localization of GFP-fused Syh1 and Smy2 reinforced the possibility that these proteins reside in cytoplasmic sites of mRNA degradation, thereby providing experimental confirmation of the predictions from the model.
Keywords: GYF/PLS/QSAR
| Introduction |
|---|
|
|
|---|
In the post-genome era, the use of DNA microarrays has become widespread and the advent of whole proteome chips has been announced (Kung and Snyder, 2006).
GYF domains are adaptors that recognize proline-rich sequences and are found in most eukaryotes (Nishizawa et al., 1998; Freund et al., 1999). Two types of GYF domains, the CD2BP2 type and the SMY2 type, distinguished by subtle differences in structure, have been identified in different proteins (Kofler and Freund, 2006). Proteins carrying the first type of GYF domain were shown to have a role in mRNA splicing (Bialkowska and Kurlandzka, 2002; Kofler et al., 2004), while the functional importance of the SMY2 type GYF domains has not yet been demonstrated.
A tandem repeat of the sequence SHRPPPGHR was initially described as a ligand for the GYF domain (Nishizawa et al., 1998). It was later shown that even one copy of the recognition sequence was sufficient for binding, and that other similar proline-rich sequences were also able to interact with GYF domains (Kofler et al., 2005a,b). Finally, the affinities of several known GYF domains from different organisms were comprehensively studied and consensus sequences mediating strong binding were determined (Kofler et al., 2005a,b).
We were interested in the putative binding partners of two yeast proteins, Syh1 (systematic name YPL105C) and Smy2 (YBR172C), mediated by their GYF domains. To this end, we analysed the published binding strength of an array of peptide ligands to the two GYF domains (Kofler et al., 2005b). Using a partial least squares projection method, we built a model describing the quantitative structure–activity relationships (QSAR) between the peptide sequences and the signal intensities. This model was useful in establishing which sequence features of the peptide ligands drive binding to each of the two domains. With the model, we could define a possible preferential specialization of Smy2-GYF and Syh1-GYF to groups of proteins regulating the initial steps of protein translation and mRNA degradation. Finally, we determined the subcellular localization of Syh1 and Smy2 tagged with GFP and showed that it was consistent with potential accessory roles in mRNA degradation.
| Methods |
|---|
|
|
|---|
Data origin
We used published data to estimate the binding affinity of the various GYF domains to arrays of synthetic peptides (Kofler et al., 2005b). The primary data for the spot membranes (spot signal intensities measured in BLU, Boehringer luminescence units) were kindly provided by Dr Christian Freund (Free University and FMP Berlin, Germany).
Partial least squares analysis and validation
Partial least squares (PLS) is a multivariate statistical regression method capable of dealing with a large number of variables in data sets suffering from noise and missing values (Wold et al., 1984). The objective of the PLS algorithm is to model both the X and Y data matrices while maximizing the covariance between them, as shown in the following equations:
[Equations 1–3]
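For orientation, the PLS decomposition is commonly written in the notation of Wold and co-workers as sketched below. This is the textbook form, given here as an assumption about what Equations 1–3 express rather than a transcription of them. T denotes the X-scores, P the X-loadings, C the Y-weights, W* the X-weights, and E and F the residual matrices.

```latex
\begin{align}
X &= T P^{\top} + E \\
Y &= T C^{\top} + F \\
T &= X W^{*}
\end{align}
```

In this form, the columns of W* and C are the weights w* and c that are plotted together when the loadings of a PLS model are inspected.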
In the general case of PLS applied in this study, the Y variables are continuous; however, PLS can also be used to discriminate between classes by using discrete Y variables to describe class assignment. PLS is well described in the computational reference literature (Wold et al., 1999) and is implemented in numerous commercial software packages.
Cross-validation of the PLS model was achieved by leaving 10% of all observations out and building models based on the remaining 90% of the observations. The goodness of prediction Q2 was subsequently calculated from the squared differences between the predicted and actual values for the observations not used for building the model, so that a poor prediction would have a Q2 value closer to zero and a good one would approach unity, as shown below:
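$$Q^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} \qquad (4)$$
where y_i is the measured value for observation i, ŷ_i is the value predicted by a model built without that observation, and ȳ is the mean of the measured values.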
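The cross-validation procedure can be illustrated with a minimal Python sketch. It assumes a simple least-squares line as a stand-in for the PLS model (the study itself used SIMCA-P), synthetic data, and hypothetical function names; only the leave-10%-out logic and the Q2 calculation mirror the description above.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares line; a simple stand-in for the PLS model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_cross_validated(xs, ys, n_folds=10, seed=0):
    """Q2 = 1 - PRESS / SS, where every observation is predicted by a
    model built without the fold that contains it."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]  # ~10% of the data each
    press = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        press += sum((ys[i] - (slope * xs[i] + intercept)) ** 2
                     for i in held_out)
    mean_y = sum(ys) / len(ys)
    ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss


# Synthetic example: a linear signal with Gaussian noise.
rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]
q2 = q2_cross_validated(xs, ys)
```

As the noise grows relative to the signal, the prediction error sum of squares approaches the total sum of squares and Q2 falls towards zero.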
QSAR analyses are used to study the connections between chemical structure and the biological activity of molecules such as chemical compounds used as drugs or biological peptides (Wold et al., 1993). In combination with PLS modeling, QSAR analysis makes it possible to identify structural determinants which are of importance for the observed biological activity. Peptide QSAR is a specialized type of analysis where the structure of biological peptides is translated into a set of descriptor variables. Thus, the X matrix of predictor variables in the PLS model consists of numbers representing each amino acid in the set of peptides studied. The rows in this matrix correspond to the different peptide species, whereas the columns correspond to the position of the amino acid residue within the peptide. Since more than one numerical descriptor is used for each amino acid, the number of columns is proportional to the length of the peptide and to the number of descriptors per residue. Likewise, the Y matrix of dependent or response variables in the QSAR–PLS models is built from a measured property of the peptides, such as a biological activity, affinity or some physical or chemical property. Relating the X and Y matrices by PLS allows conclusions to be drawn about the relationships between the characteristics and relative position of the building blocks on the one hand, and the features of the entire peptide on the other.
To describe the sequences of the peptide ligands whose binding to GYF domains was assayed, we used the z-scales representing the three most important properties of amino acids derived by principal component analysis of 29 physico-chemical measurements for the 20 amino acids (Hellberg et al., 1987). Alternative numerical representations for amino acid properties have been used by others (Kidera et al., 1985; Opiyo and Moriyama, 2007); however, we chose the z-scales for their ease of interpretation, namely z1 as a measure of hydrophilicity/hydrophobicity, z2 as bulkiness and z3 as polarizability/charge.
Peptides of 16 residues, such as those used for the SPOT assay, are likely to form secondary structures. The presence of several prolines and a glycine, known as potent α-helix breakers, is an essential feature of GYF domain ligands. In order to account for structural preferences of the amino acids in the peptides, we included a fourth variable for each amino acid, α-helix propensity. Among several published α-helicity scales, we chose one (Levitt, 1978) which was least correlated with the three original z-scales and therefore provided the highest amount of independent information; it was rescaled to unit length and is hereafter referred to as z4. In addition, we considered a left-handed polyproline II (PPII) helix-formation propensity scale (Rucker et al., 2003), since the GYF domain ligands are in the PPII conformation; however, we did not use it for the final PLS model presented here because of missing values for Trp and Tyr. The complete set of scales used in this study is presented in Table I.
|
Variable selection
In order to represent the peptide sequences numerically, we aligned them using the PPG motif in the peptide as a guide, as shown in Supplementary data, Table S1. Subsequently, each amino acid was replaced by its four z-scale values, so that each peptide was represented by a 104-dimensional vector (26 alignment positions with four descriptors each). Variables corresponding to gaps at the beginning and end of the aligned sequences were left undefined.
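The encoding step can be sketched in Python as follows. The descriptor values below are placeholders chosen for illustration, not the published z-scales of Table I, and the function name and the handling of the anchor position are assumptions. The anchor places the Gly of the PPG motif at alignment position 15, so that the residue immediately following the motif falls on position 16.

```python
# Placeholder descriptors (z1, z2, z3, z4) for a few residues.
# These are NOT the published z-scale values; they only illustrate the layout.
DESCRIPTORS = {
    "P": (-1.0, 0.5, 2.0, -1.5),
    "G": (2.0, -4.0, 0.5, -1.0),
    "L": (-4.0, -1.0, -1.0, 1.0),
    "R": (3.0, 3.0, -3.0, 0.5),
    "S": (2.0, -1.0, 1.0, -0.5),
}


def encode(peptide, motif="PPG", n_positions=26, anchor=12):
    """Align a peptide on its first PPG motif and return a vector of
    4 * n_positions descriptors; gap positions stay undefined (None).

    `anchor` is the 0-based alignment index of the first Pro of the motif,
    which puts Gly at 1-based position 15 (an assumed convention).
    """
    start = peptide.find(motif)
    if start < 0:
        raise ValueError("peptide lacks the PPG motif")
    offset = anchor - start
    if offset < 0 or offset + len(peptide) > n_positions:
        raise ValueError("peptide does not fit into the alignment window")
    vector = [None] * (4 * n_positions)
    for i, residue in enumerate(peptide):
        column = 4 * (offset + i)
        vector[column:column + 4] = DESCRIPTORS[residue]
    return vector


vec = encode("SRPPGLR")  # hypothetical seven-residue ligand
```

Peptides containing more than one PPG motif would need an explicit rule for choosing the guide motif; this sketch simply takes the first occurrence.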
For the PLS model, 147 peptides that had shown affinity to at least one of the GYF domains were considered. Peptides lacking the conserved PPG motif were excluded from the model; these, without exception, showed no binding. Columns of variables with more than 80% missing values (corresponding to the first and last four amino acid positions), as well as those with no variation (corresponding to the PPG motif), were also excluded during model building.
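The column filter described above can be sketched as follows; the function name and the toy matrix are hypothetical, and missing values are represented by `None`.

```python
def keep_columns(rows, max_missing=0.8):
    """Return the indices of columns that have at most `max_missing`
    fraction of missing values (None) and show some variation."""
    n_rows = len(rows)
    kept = []
    for j in range(len(rows[0])):
        values = [row[j] for row in rows if row[j] is not None]
        n_missing = n_rows - len(values)
        if n_missing > max_missing * n_rows:
            continue  # too sparse, e.g. positions at the ends of the alignment
        if len(set(values)) < 2:
            continue  # constant column, e.g. the conserved PPG motif
        kept.append(j)
    return kept


# Toy matrix with 10 peptides and 4 descriptor columns:
# column 0 is 90% missing, column 1 is constant,
# column 2 is complete, column 3 is 70% missing but variable.
rows = [
    [None if i < 9 else 1.0, 2.0, float(i), None if i < 7 else float(i)]
    for i in range(10)
]
kept = keep_columns(rows)
```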
The signals from different regions of the same membrane filter obtained in SPOT binding assays can be affected by uneven staining of the membrane (Weiser et al., 2005). To account for this, we included as X variables the horizontal and vertical positions of the spots on the membrane, centered and normalized. The squared and cross terms were also included, since the dependence of binding on position is not linear (Weiser et al., 2005). This allowed us to use the repeated samples on the array and implicitly model the influence of position on signal intensity. The compiled dataset is provided as Supplementary data, Table S2.
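A minimal sketch of the position covariates is given below. It assumes that "normalized" means scaling to unit standard deviation, which the text does not specify, and the function names and the toy grid are hypothetical.

```python
def center_scale(values):
    """Center to zero mean and scale to unit sample standard deviation
    (an assumed reading of 'centered and normalized')."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


def position_terms(columns, rows):
    """For each spot return (x, y, x^2, y^2, x*y) from its membrane position."""
    x = center_scale(columns)
    y = center_scale(rows)
    return [(xi, yi, xi * xi, yi * yi, xi * yi) for xi, yi in zip(x, y)]


# Six spots on a 3 x 2 grid (column index, row index).
spot_columns = [0, 1, 2, 0, 1, 2]
spot_rows = [0, 0, 0, 1, 1, 1]
terms = position_terms(spot_columns, spot_rows)
```

Appending these five values to the descriptor vector of each spot lets the regression absorb smooth, position-dependent intensity trends alongside the sequence effects.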
The Saccharomyces cerevisiae BY4741 strain (MATa, his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) was used as recipient for the fluorescently tagged genes. GFP fusions were constructed by ligating a PCR-amplified DNA fragment of the corresponding gene into a pre-cleaved pAL-GFP (LEU2, pADH1, GFPuv, 2-µ) upstream of the GFP sequence, as follows: pAL-Syh1-GFP: Syh1 (residues 1–849); pAL-Smy2-GFP: Smy2 (residues 190–740); pAL-GYFSyh1-GFP: GYF domain from Syh1 (residues 1–205); pAL-GYFSmy2-GFP: GYF domain from Smy2 (residues 190–275). Yeast cells were transformed sequentially by the lithium-acetate method (Gietz et al., 1995), first with the plasmid pRP1186 (Dcp2-RFP, URA3, Cen) (Teixeira et al., 2005) and then with one of the GFP fusions listed above. Transformants were selected on standard yeast drop-out solid media containing 1.8% agar, 2% glucose, 0.67% yeast nitrogen base and a drop-out amino acid supplement lacking uracil (for pRP1186 transformation) or both uracil and leucine (for the second transformation). The resulting strains were grown logarithmically in 5 ml liquid drop-out media at 30°C; 1 ml aliquots were collected by centrifugation, washed briefly with distilled water, embedded in 1% low melting point agarose on microscopy slides and used for visualization of colocalization. A Zeiss Axioplan fluorescence microscope equipped with a Hamamatsu 1394 ORCA-ER CCD camera was used for image acquisition.
SIMCA-P 11.0 (Umetrics AB, Umeå, Sweden) was used for PLS modeling. The implementation of the Gene Ontology (GO) term finder (Boyle et al., 2004) at the Saccharomyces genome database was used for GO data mining. Scatter plot figures were generated with R 2.4.0 (R Development Core Team, 2006). Images were processed with Adobe Photoshop 6.0.
| Results |
|---|
|
|
|---|
Correlation between Syh1-GYF and Smy2-GYF binding
To get an overview of the correlation between the two binding affinities, we plotted the raw values of Syh1-GYF binding, as estimated previously (Kofler et al., 2005b), against those of Smy2-GYF binding (Fig. 1). There was a weak yet significant positive correlation between the binding of the two GYF domains to the array of peptides (r = 0.57, n = 198, P < 0.0001). However, it was also evident from the data that the GYF domains from Syh1 and Smy2 did not always bind equally well to a given ligand, and that the positive correlation was largely due to the high number of non-binding peptides. Therefore, we were interested in estimating whether the differences arose from the variation inherent in the detection and quantification method or, alternatively, could be attributed to different affinities of the GYF domains.
|
Noise estimation
We used spots which were repeated several times on the membrane in order to estimate the variance arising from noise during signal acquisition. Samples for three peptides from Msl5p were repeated six to seven times as positive controls, and we estimated the variation for each species separately. In addition, two more peptides, corresponding to highly repetitive genes, were represented by a total of 15 spots. The results are shown in Table II. We noted the high standard deviation in the binding response of peptides a and b to both Syh1-GYF and Smy2-GYF; the data for peptides c, d and e were more consistent across measurements. We calculated the replicate variance and compared it to the total variance. For both membranes, the replicate variance accounted for a substantial portion of the variance of the complete dataset, 36% for Syh1 and 29% for Smy2. Therefore, it was not obvious whether the apparent differences between binding of the two domains were caused by different affinities to the peptide ligands or by noise.
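One way to compare replicate and total variance is sketched below. It assumes that the replicate variance is the pooled within-peptide variance of the repeated spots; the text does not specify the estimator, and the numbers are toy values rather than data from Table II.

```python
def sample_variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def pooled_replicate_variance(groups):
    """Pooled within-group variance: the summed squared deviations from each
    group mean divided by the summed degrees of freedom."""
    sum_squares = 0.0
    degrees_of_freedom = 0
    for group in groups:
        mean = sum(group) / len(group)
        sum_squares += sum((v - mean) ** 2 for v in group)
        degrees_of_freedom += len(group) - 1
    return sum_squares / degrees_of_freedom


# Toy signals: two peptides, each spotted twice (hypothetical values).
replicate_groups = [[1.0, 3.0], [10.0, 14.0]]
all_signals = [v for group in replicate_groups for v in group]

replicate_var = pooled_replicate_variance(replicate_groups)
noise_fraction = replicate_var / sample_variance(all_signals)
```

In the study the total variance would be taken over every spot on the membrane, not only the replicated ones, so the toy denominator here is a simplification.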
|
PLS model parameters
As an instrument to objectively address the noise in the raw data, a PLS model was calculated. The peptide sequences, transformed as described in Methods, were used as X variables and the two sets of binding affinities derived for Syh1 and Smy2 as Y responses. The model had two principal components, together explaining 55% of the variance. Cross-validation of the model estimated its predictive power at Q2 = 0.40. While it is generally accepted that Q2 ≥ 0.5 is characteristic of a good model, this threshold is application specific (Eriksson et al., 2003). In the present case, the high replicate error was symptomatic of relatively high noise, which defined a realistic upper limit for the model parameters (Eriksson et al., 2003).
Qualitative discrimination analysis
We had modeled the binding strength of GYF domains as a continuous value. However, while the reliability of SPOT synthesis allows for distinction between weak and strong binders, it is commonly not sufficient to estimate binding parameters from single or a few spots (Weiser et al., 2005). In order to use the PLS model for discrimination analysis, we applied a cutoff threshold to the raw signal intensities and to the predicted values, separating strong binders on the one hand from weak binders or non-binders on the other. The threshold was selected from the ordered binding responses for both Syh1-GYF and Smy2-GYF between the third and fourth quartile. To validate the performance of this qualitative model, we calculated its specificity and sensitivity over the entire set of data points on the membrane after classifying them as weak or strong binders using this threshold. Model predictions were classified using the same threshold as the raw data, since they were calculated on the same scale. Sensitivity was defined as the fraction of true binders also identified by the model, whereas specificity was the fraction of weak binders which were correctly identified as such. The sensitivity was 78% for Smy2-GYF and 74% for Syh1-GYF, whereas the specificity was 88% and 76%, respectively, a generally good performance exceeding in all cases the recommended 70% limit (Eriksson et al., 2003). For the purpose of predicting binding, we used the thresholds 5 × 10^5 BLU for Smy2 and 10^5 BLU for Syh1, thus ensuring higher sensitivity while maintaining the specificity ≥ 70%.
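The two performance measures can be computed as in the sketch below. The function name and the example intensities are hypothetical, and treating a value equal to the threshold as a strong binder is an assumption.

```python
def sensitivity_specificity(observed, predicted, threshold):
    """Classify measured and predicted signals with one threshold and return
    (sensitivity, specificity) of the predictions."""
    tp = fn = tn = fp = 0
    for obs, pred in zip(observed, predicted):
        if obs >= threshold:      # measured strong binder
            if pred >= threshold:
                tp += 1
            else:
                fn += 1
        else:                     # measured weak binder or non-binder
            if pred < threshold:
                tn += 1
            else:
                fp += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical intensities in BLU, classified at 5 x 10^5 BLU.
observed = [6e5, 7e5, 1e5, 2e5, 9e5, 3e5]
predicted = [5.5e5, 4e5, 1.5e5, 6e5, 8e5, 2e5]
sens, spec = sensitivity_specificity(observed, predicted, threshold=5e5)
```

Lowering the threshold applied to the predictions trades specificity for sensitivity, which is the adjustment described for the prediction thresholds above.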
With a PLS model, it is possible to plot simultaneously the weights w* and c for the respective X and Y matrix of the model (cf. Methods). This plot shows which X variables correlate with the Y variables. In addition, variables with large absolute values of their weights (high loadings) plot away from the origin, and have higher importance for the model.
Loadings in the PLS model of the predictor X variables and the dependent (response) variables Syh1 and Smy2 are shown in Fig. 2. The most important variables for the first model dimension (Fig. 2, blue arrow) were z1(16), z4(16) and z2(9), where the number in parentheses is the alignment position. The first variable reflects the hydrophilicity/hydrophobicity at position 16 of the alignment, immediately following the PPG motif, and its importance has already been described (Kofler et al., 2005b). The second variable is the helix-forming propensity of the same residue. It excludes residues with low helix propensity such as Trp and Pro from favoring binding and, combined with the high hydrophobicity, narrows the preferred residues for this position in high affinity ligands to Leu, Met and Ala. The third important variable was z2(9), which is a measure of the bulkiness of the residue at position 9, possibly relating to steric matching of the ligand to the binding groove of the GYF domain. This dimension separated common Syh1 and Smy2 binders from non-binders.
|
The differences in binding preferences between Syh1-GYF and Smy2-GYF were illustrated by the second dimension of the model (Fig. 2, green arrow). Here, the variables z3(12), z3(18) and z3(19), representing polarizability/charge of the residues at positions 12, 18 and 19, had a positive loading for Smy2-GYF but a negative loading for Syh1-GYF (Fig. 2). This demonstrated that Smy2-GYF prefers more positively charged peptides than Syh1-GYF. The opposite tendency was observed for the helix-forming propensity of GYF ligands, as judged from z4(8), z4(9), z4(11) and z4(12), which had positive loadings towards Syh1 and negative loadings towards Smy2 (Fig. 2). The latter likely reflects the higher content of helix breakers among Smy2-GYF ligands, such as the generally longer proline stretches found in Vrp1, Msl5, Prp8, Eap1 and Mot2.
Analysis of the predictions from the PLS model allowed us to define sequences with potentially strong binding to Syh1-GYF and Smy2-GYF while removing the effects of noise and peptide length. We compiled a new non-redundant set of 149 sequences from the Saccharomyces genomic database for peptides carrying the loose consensus PPGJ, where J is any hydrophobic amino acid (Table III). All peptides in the set had a length of 20 unless present at the very C-terminus of the corresponding protein, and the conserved Gly residue was at position 15. This set overlapped with the one used before (Kofler et al., 2005b) but also contained peptides from 17 proteins which have not been directly assayed as GYF domain ligands. All peptides in the set were translated into the same set of z-scale descriptors as the training set. Using the PLS model and features of the software (see Methods/Software), binding was predicted for each peptide in arbitrary units corresponding to the measured BLU intensities of the training set. The peptides were plotted according to their predicted affinity for Syh1-GYF against that for Smy2-GYF (Fig. 3).
|
|
Analyzing the entire set of proteins carrying the core PPGJ motif could potentially identify connections between them; however, we took advantage of our knowledge about the peptides that had actually demonstrated affinity for the GYF domain. Subsets of these candidate interactors, with predicted binding to either Smy2-GYF or Syh1-GYF above the thresholds established earlier (see Qualitative discrimination analysis), were used to search for patterns of common function, process or localization in the cell using the GO term finder (Boyle et al., 2004).
GYF domain localization
Subcellular localization can be very informative for both the function and the interactions of proteins (Huh et al., 2003); therefore, we tagged the GYF domains alone as well as the entire Syh1 and Smy2 proteins with GFP and examined their distribution in living yeast cells. For both Syh1 and Smy2, we observed a peculiar punctate pattern (Fig. 4A and E). Prompted by the model predictions, we also labeled mRNA processing bodies with the decapping enzyme Dcp2-RFP (Fig. 4B and F). Dcp2 is the catalytic subunit of the Dcp1-Dcp2 decapping enzyme complex (Dunckley et al., 2001) and is a constituent of processing bodies (Sheth and Parker, 2003). In a significant number of cases, the spots for Syh1-GFP and Smy2-GFP colocalized with those for Dcp2-RFP (Fig. 4D and H), thereby demonstrating the presence of Syh1 in cytoplasmic mRNA processing bodies. Similar results were obtained for the GYF domains only (Fig. 4L and P), but not for GFP alone or for other regions of the Syh1 protein lacking the GYF domain (not shown).
|
| Discussion |
|---|
|
|
|---|
We have analyzed a previously published set of data (Kofler et al., 2005a
α-helicity in the proline-rich ligand for Syh1-GYF, as established by the PLS model and illustrated by the opposite to Syh1 loadings of the z3 and z4 variables at several positions (Fig. 3), may dictate the higher affinity of this domain to a different subset of peptides. Indeed, the ligand binding grooves of Smy2 and Syh1 GYF domains are quite similar in sequence and thus do not provide an obvious explanation for the observed differences. However, calculating the pI values of the two domains (for the sequence corresponding to the available NMR structure solution of the plant homolog) reveals a difference, namely pI(Syh1) = 4.94 and pI(Smy2) = 4.50. Moreover, building homology models of Syh1-GYF and Smy2-GYF and rendering their electrostatic surfaces showed the Smy2 GYF surface to be more electronegative (Supplementary data, Figure S1). While we cannot infer a specific site important for the higher affinity of Smy2-GYF to more positive ligands, we suspect that the overall charge bias of the domain may contribute to the extent of binding through electrostatic forces under the conditions of the SPOT assay. Similarly, the preference for a longer proline tract shown by Smy2-GYF may be related to the recently described alternative binding modes of proline-rich peptides to the GYF domain (Gu et al., 2005).
Substitution analysis of phage display derived GYF ligands did not show dependence of binding on residues outside of the core PPGJ motif, and at the same time, naturally occurring peptides with the same motif clearly showed different affinities for the GYF domain (Kofler et al., 2005a,b). A possible explanation for this apparent discrepancy is that residues outside PPGJ have a cumulative effect on binding. As a result, changing those residues one at a time within a strong binder identified by phage display could have no noticeable effect. On the other hand, comparing the more dissimilar natural sequences derived from 140 proteins, many of which do not interact with GYF in vivo and have not been subjected to selective pressure, could result in a larger variation in affinity.
Previously, the affinity of the GYF-domain containing proteins to several proteins participating in pre-mRNA splicing has been emphasized (Fromont-Racine et al., 1997; Bialkowska and Kurlandzka, 2002; Kofler et al., 2004). This affinity either supports specific interactions of functional importance, or simply reflects the presence of proline-rich sequences in the target proteins. An extended dataset representing 153 yeast proteins added even more functional protein groups to the GYF domain ligands (Kofler et al., 2005b). Considering the existing variation in function and localization of the possible ligands, it is plausible that specialization exists between the two structural types of GYF domains. Thus, functional interactions may be formed either with splicing machinery components, normally localized to the cell nucleus, or with cytoplasmic proteins.
We found that a large number of proteins with affinity for the Smy2 type GYF were connected to cytoplasmic processing of mRNA. In particular, Eap1 is a negative regulator of translation initiation (Cosentino et al., 2000), and therefore an inducer of mRNA degradation (Coller and Parker, 2005); Pat1 is an mRNA-decapping factor (Bonnerot et al., 2000); Kem1 is the exoribonuclease of processing bodies (Geerlings et al., 2000); Mot2, Ccr4 and Pop2 are part of the 3' to 5' mRNA deadenylation complex (Daugeron et al., 2001; Tucker et al., 2001; Tucker et al., 2002). In addition, Cdc39 and Not5, which are also components of the CCR4-NOT complex (Collart and Struhl, 1994; Chen et al., 2001), had low binding affinity (Cdc39) or were not included (Not5) in the original experiment; both of them were predicted by our PLS model to be potential binders of GYF. Some of the peptides had a high affinity towards both Smy2-GYF and Syh1-GYF (Eap1, Kem1, Mot2, Pop2), whereas others displayed a preference for Smy2-GYF (Pat1, Ccr4) or Syh1-GYF (Not5). In an independent study of protein complexes from yeast, Syh1 but not Smy2 was identified as a component of a complex together with several translation initiation factors (Krogan et al., 2006). Therefore, the differences between Smy2-GYF and Syh1-GYF found here may indicate that the functions of the two proteins have diverged since the duplication of the corresponding genes in S. cerevisiae (Wolfe and Shields, 1997).
A common feature of GYF domains appears to be their interaction with proteins directly involved in different aspects of RNA metabolism. This is the case for the U5 snRNP specific proteins interacting with Lin1p in yeast (Stevens et al., 2001; Bialkowska and Kurlandzka, 2002) and with CD2BP2 in human (Kofler et al., 2004; Laggerbauer et al., 2005). However, the GYF proteins have not been shown to interact directly with RNA or to be required for specific steps of RNA metabolism. Similarly, we discuss here the ability of Smy2-type GYF proteins to bind mRNA catabolism factors and show cellular localization of GYF proteins to sites of mRNA decay; nevertheless, at this point we lack data implicating GYF proteins in mRNA catabolism directly. Therefore, we favor the possibility that GYF-domain proteins are indirectly involved in RNA metabolism by regulating the activity of their interactors.
The proteins Eap1 and Msl5, which have been shown to bind GYF domains in vivo (Kofler et al., 2005a,b), both carry several repeats of the PPGJ cognate motif. Motif repetition within a single ligand has been demonstrated to enhance binding to the GYF domain (Freund et al., 2002). Similarly, the presence of the cognate motif in more than one constituent of a multi-protein complex (e.g. the Ccr4-Not complex) may be considered as an alternative means to stimulate interactions. This implies a potential role for GYF domain proteins in the spatial organization of their interactors.
Cytoplasmic localization has been reported for Syh1 (Huh et al., 2003), which is consistent with our observation of colocalization with cytoplasmic mRNA catabolism enzymes. The mRNA processing bodies are highly specialized structures, most prominent during conditions of stress, and had not been observed until recently (Sheth and Parker, 2003). Importantly, several of the proteins carrying proline-rich GYF ligands are known to localize to processing bodies or to be closely associated with their function. Our study is the first report on the connection of Syh1 and Smy2 to processing bodies. The finding that the isolated GYF domains from these proteins could also colocalize to P-bodies when fused to GFP suggests that the GYF domain alone may be responsible for targeting. In this context, it would further be interesting to see if interactions with a PRS ligand are involved or whether the opposite side of the Smy2-type GYF domain is capable of interactions similar to those recently reported for the U5 snRNP 52K protein CD2BP2 (Nielsen et al., 2007).
Very recently, the accumulation of the translation initiation factors eIF4E and eIF4G in P-bodies was reported (Brengues and Parker, 2007). Interestingly, eIF4G competes with Eap1 for binding to eIF4E (Cosentino et al., 2000), and the presence of both initiation factors may indicate that Eap1 is absent from the same complex. The interaction of Eap1 with the GYF domain proteins (Kofler et al., 2005a,b) will be interesting to address in this context.
It should be pointed out that the interactions of GYF with Eap1 and Ccr4 have been discussed earlier (Kofler et al., 2005b), although in the context of different processes and with emphasis on the participation of Ccr4 in transcription (Denis and Malvar, 1990). Therefore, the current work is the first to unify the interactions of the GYF domain with these proteins in a common framework related to the control of mRNA degradation. The demonstrated localization at specific subcellular sites of mRNA decay illustrates the relevance of these interactions. Together with the interactions with ribosomal components shown by others (Fleischer et al., 2006; Krogan et al., 2006), our findings suggest interesting regulatory functions for GYF domain proteins to be elucidated in upcoming experiments.
The importance of powerful methods for data analysis, such as PLS-QSAR, and for data mining, such as the GO term finder, is underlined by the rapid advances in genomics and proteomics in recent years. Combined quantitative/qualitative approaches such as the one described here are useful for data validation, for dealing with noise and for extracting additional information from raw data. In the current example of yeast GYF domain binding, the predictions from the modeling have led to important observations on the localization of the GYF-containing proteins, which will be studied in further detail.
| Supplementary data |
|---|
|
|
|---|
Supplementary data are available at PEDS online.
| Footnotes |
|---|
Edited by Richard Goldstein
| Acknowledgements |
|---|
|
|
|---|
We are grateful to Christian Freund (Free University and FMP Berlin, Germany) for making available the GYF binding data and to Roy Parker (Howard Hughes Medical Institute, Tucson, Arizona, USA) for providing DCP2-RFP. This work was supported by the Swedish Research Council.
| References |
|---|
|
|
|---|
Bialkowska A., Kurlandzka A. Yeast (2002) 19:1323–1333.
Bonnerot C., Boeck R., Lapeyre B. Mol. Cell. Biol. (2000) 20:5939–5946.
Boyle E.I., Weng S., Gollub J., Jin H., Botstein D., Cherry J.M., Sherlock G. Bioinformatics (2004) 20:3710–3715.
Brengues M., Parker R. Mol. Biol. Cell (2007) 18:2592–2602.
Chen J., Rappsilber J., Chiang Y.C., Russell P., Mann M., Denis C.L. J. Mol. Biol. (2001) 314:683–694.
Collart M.A., Struhl K. Genes Dev. (1994) 8:525–537.
Coller J., Parker R. Cell (2005) 122:875–886.
Cosentino G.P., Schmelzle T., Haghighat A., Helliwell S.B., Hall M.N., Sonenberg N. Mol. Cell. Biol. (2000) 20:4604–4613.
Daugeron M.C., Mauxion F., Seraphin B. Nucleic Acids Res. (2001) 29:2448–2455.
Denis C.L., Malvar T. Genetics (1990) 124:283–291.
Dunckley T., Tucker M., Parker R. Genetics (2001) 157:27–37.
Eriksson L., Jaworska J., Worth A.P., Cronin M.T., McDowell R.M., Gramatica P. Environ. Health Perspect. (2003) 111:1361–1375.
Fleischer T.C., Weaver C.M., McAfee K.J., Jennings J.L., Link A.J. Genes Dev. (2006) 20:1294–1307.
Freund C., Dotsch V., Nishizawa K., Reinherz E.L., Wagner G. Nat. Struct. Biol. (1999) 6:656–660.
Freund C., Kuhne R., Yang H., Park S., Reinherz E.L., Wagner G. EMBO J. (2002) 21:5985–5995.
Fromont-Racine M., Rain J.C., Legrain P. Nat. Genet. (1997) 16:277–282.
Geerlings T.H., Vos J.C., Raue H.A. RNA (2000) 6:1698–1703.
Gietz R.D., Schiestl R.H., Willems A.R., Woods R.A. Yeast (1995) 11:355–360.
Gu W., Kofler M., Antes I., Freund C., Helms V. Biochemistry (2005) 44:6404–6415.
Hellberg S., Sjostrom M., Skagerberg B., Wold S. J. Med. Chem. (1987) 30:1126–1135.
Huh W.K., Falvo J.V., Gerke L.C., Carroll A.S., Howson R.W., Weissman J.S., O'Shea E.K. Nature (2003) 425:686–691.
Kidera A., Konishi Y., Oka M., Ooi T., Scheraga H.A. Protein J. (1985) 4:23–55.
Kofler M., Heuer K., Zech T., Freund C. J. Biol. Chem. (2004) 279:28292–28297.
Kofler M., Motzny K., Beyermann M., Freund C. J. Biol. Chem. (2005a) 280:33397–33402.
Kofler M., Motzny K., Freund C. Mol. Cell. Proteomics (2005b) 4:1797–1811.
Kofler M.M., Freund C. FEBS J. (2006) 273:245–256.
Krogan N.J., et al. Nature (2006) 440:637–643.
Kung L.A., Snyder M. Nat. Rev. Mol. Cell. Biol. (2006) 7:617–622.
Laggerbauer B., Liu S., Makarov E., Vornlocher H.P., Makarova O., Ingelfinger D., Achsel T., Luhrmann R. RNA (2005) 11:598–608.
Levitt M. Biochemistry (1978) 17:4277–4285.
Nielsen T.K., Liu S., Luhrmann R., Ficner R. J. Mol. Biol. (2007) 369:902–908.
Nishizawa K., Freund C., Li J., Wagner G., Reinherz E.L. Proc. Natl Acad. Sci. USA (1998) 95:14897–14902.
Opiyo S.O., Moriyama E.N. J. Proteome Res. (2007) 6:846–853.
R Development Core Team. R: A Language and Environment for Statistical Computing (2006) Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0.
Rucker A.L., Pager C.T., Campbell M.N., Qualls J.E., Creamer T.P. Proteins (2003) 53:68–75.
Sheth U., Parker R. Science (2003) 300:805–808.
Stevens S.W., Barta I., Ge H.Y., Moore R.E., Young M.K., Lee T.D., Abelson J. RNA (2001) 7:1543–1553.
Teixeira D., Sheth U., Valencia-Sanchez M.A., Brengues M., Parker R. RNA (2005) 11:371–382.
Tucker M., Staples R.R., Valencia-Sanchez M.A., Muhlrad D., Parker R. EMBO J. (2002) 21:1427–1436.
Tucker M., Valencia-Sanchez M.A., Staples R.R., Chen J., Denis C.L., Parker R. Cell (2001) 104:377–386.
Weiser A.A., Or-Guil M., Tapia V., Leichsenring A., Schuchhardt J., Frommel C., Volkmer-Engert R. Anal. Biochem. (2005) 342:300–311.
Wold S., Eriksson L., Sjöström M. PLS in Chemistry. In: The Encyclopedia of Computational Chemistry, Schleyer P.v.R., Allinger N.L., Clark T., et al., eds. (1999) Chichester, UK: Wiley. 2006–2020.
Wold S., Johansson E., Cocchi M. In: 3D QSAR in Drug Design: Theory, Methods and Applications, Kubinyi H., ed. (1993) Leiden. 523–550.
Wold S., Ruhe A., Wold H., Dunn W.J. III. SIAM J. Sci. Stat. Comput. (1984) 5:735–743.
Wolfe K.H., Shields D.C. Nature (1997) 387:708–713.
Received March 9, 2007; revised June 25, 2007; accepted July 5, 2007.