Protein Engineering, Vol. 12, No. 11, 953-957, November 1999
© 1999 Oxford University Press
Physicochemical factors for discriminating between soluble and membrane proteins: hydrophobicity of helical segments and protein length
Tokyo University of Agriculture and Technology, Department of Biotechnology, Koganei, Tokyo 184-8588, Japan
| Abstract |
|---|
The average hydrophobicity of a polypeptide segment is considered to be the most important factor in the formation of transmembrane helices, and the partitioning of the most hydrophobic (MH) segment into one of two alternative nonpolar environments, a membrane or the hydrophobic core of a globular protein, may determine the type of protein produced. In order to elucidate the importance of the MH segment in determining which of the two types of protein results from a given amino acid sequence, we statistically studied the characteristics of MH helices, longer than 19 residues in length, in 97 membrane proteins whose three-dimensional structure or topology is known, as well as 397 soluble proteins selected from the Protein Data Bank. The average hydrophobicity of MH helices in membrane proteins had a characteristic relationship with the length of the protein. All MH helices in membrane proteins that were longer than 500 residues had a hydrophobicity greater than 1.75 (Kyte and Doolittle scale), while the MH helices in membrane proteins smaller than 100 residues could be as hydrophilic as 0.1. The possibility of developing a method to discriminate membrane proteins from soluble ones, based on the effect of size on the type of protein produced, is discussed.
Keywords: hydrophobicity/length of protein/membrane protein/protein folding/transmembrane helix
| Introduction |
|---|
Recent developments in the human genome project have increased the need for more information about the structure and function of proteins in total proteomes. However, many amino acid sequences in total proteomes show no sequence homology with other known proteins. For such proteins, the information has to be extracted from the sequences alone, which has proved difficult. A realistic approach to the problem of computational analysis of total proteomes is to classify amino acid sequences into several categories related to their function.
The location of proteins in the cell is closely related to their function, and the classification of proteins by their location is a prerequisite for more detailed structure prediction. The location of proteins in a cell is determined according to two processes during the early stage of protein biosynthesis. If an amino acid sequence has a signal peptide, the protein is translocated through the membrane, and transmembrane segments anchor the protein to the membrane. The present study focused on the problem of how to discriminate between membrane and soluble proteins. Specifically, our objective was to determine whether the hydrocarbon region of a membrane protein or the hydrophobic core of a soluble protein was essential for targeting the most hydrophobic (MH) segment into alternative nonpolar environments.
The importance of MH helices for the problem of discrimination between soluble and membrane proteins was first addressed by Klein et al. (1985), using the average hydrophobicity of MH helices together with some statistical parameters characterizing transmembrane helices. The accuracy of discrimination by this method was as good as 95%. However, better accuracy is necessary for the analyses of a total proteome, since membrane proteins make up only a minor fraction of a proteome (Arkin et al., 1997; Frishman and Mewes, 1997; Wallin and von Heijne, 1998). Even when soluble proteins are predicted with an error of only 5%, the fraction of false positives becomes much larger than this value because a total proteome contains far fewer membrane proteins than soluble proteins.
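The base-rate effect described here can be illustrated with a short calculation. The sketch below is only an illustration: the 20% membrane-protein fraction and the function name are assumptions chosen for the example, not values from this study, while the 95% sensitivity and the 5% error on soluble proteins mirror the figures quoted in the text.

```python
def false_positive_fraction(membrane_fraction, sensitivity, soluble_error):
    """Fraction of predicted membrane proteins that are actually soluble.

    membrane_fraction: share of membrane proteins in the proteome (assumed input)
    sensitivity:       share of membrane proteins correctly detected
    soluble_error:     share of soluble proteins wrongly called membrane
    """
    true_positives = membrane_fraction * sensitivity
    false_positives = (1.0 - membrane_fraction) * soluble_error
    return false_positives / (true_positives + false_positives)


# Hypothetical proteome in which 20% of the proteins are membrane proteins.
fraction = false_positive_fraction(0.20, 0.95, 0.05)
print(f"{fraction:.1%} of predicted membrane proteins would be false positives")
```

Under these assumed inputs the false positives make up roughly 17% of all predicted membrane proteins, more than three times the 5% per-protein error.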
Many previous methods of transmembrane helix prediction, including the method of Klein et al. (1985), implicitly assumed that transmembrane helices are influenced only by short term characteristics, such as hydrophobicity (Kyte and Doolittle, 1982; Steitz et al., 1982), helical periodicity (Eisenberg et al., 1984; Mitaku et al., 1984, 1985; Rees et al., 1989; Jähnig, 1990), propensity (Jones et al., 1994), positive charge (von Heijne, 1989) and alignment of amino acid sequences (Persson and Argos, 1994). The hydropathy plot, for example, uses an average hydropathy index for several residues, which means that the interaction within a region of only several residues is assumed to account for the stabilization of transmembrane helices. However, the hydrophobic core of a soluble protein is formed by many other parts of the same protein, suggesting the importance of protein size for targeting MH helices.
In the present work, we determined whether the length of the protein affects discrimination between soluble and membrane proteins. The results clearly indicate that in addition to the average hydrophobicity of MH helices, protein size was an important factor in determining protein type.
| Materials and methods |
|---|
The amino acid sequences of 397 soluble proteins and 97 membrane proteins were used for comparing sequence characteristics of soluble and membrane proteins. Amino acid sequences of soluble proteins were selected from 901 PDB_SELECT entries, which are based on a 25% sequence identity cut-off (Hobohm et al., 1992
We also used two datasets of amino acid sequences from membrane proteins: (i) 16 membrane proteins with 3D structures from PDB, containing the photosynthetic reaction center 1prc (H, L and M) (Deisenhofer et al., 1985
As shown in Table I, the number of real helices in soluble proteins is comparable with that of transmembrane helices, particularly for proteins smaller than 100 residues and larger than 500 residues. Since the purpose of the present study was to compare the characteristics of helical segments in soluble and membrane proteins, a sufficiently large number of proteins and helical segments is necessary in order to reach any significant conclusions. Therefore, merge helices were added to the dataset of real helices in soluble proteins. The total number of helices was 1234 in soluble proteins, including 745 merge helices. The fraction of membrane proteins in each size range was 27% for L < 100, 17% for 100 ≤ L < 500 and 33% for L ≥ 500. Because the characteristics of amino acid sequences from signal peptides are different from those of true transmembrane helices, we only used the amino acid sequences from mature proteins, which do not contain signal peptides. Therefore, we define the length of a protein as the number of amino acids present in the mature protein.
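The hydrophobicity of polypeptide segments was evaluated by the hydropathy index of Kyte and Doolittle (1982). The average hydrophobicity, ⟨H⟩, is calculated using the following equation:

$$\langle H \rangle = \frac{1}{n}\sum_{i=1}^{n} H_i$$

where $H_i$ is the hydropathy index of the $i$-th residue of the segment and $n$ is the number of residues in the segment.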
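As an illustration of the quantity defined in this section, the following minimal sketch takes the average hydrophobicity of a segment to be the arithmetic mean of the Kyte and Doolittle indices of its residues, which is an assumption about the exact form of the equation. It also swaps in a fixed-length sliding window as a stand-in for the helix boundaries used in the study: the 19-residue window and the function names are hypothetical choices, not the authors' procedure.

```python
# Kyte and Doolittle (1982) hydropathy index for the 20 standard amino acids.
KD_INDEX = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydrophobicity(segment):
    """Arithmetic mean of the hydropathy indices over a non-empty segment."""
    if not segment:
        raise ValueError("segment must contain at least one residue")
    return sum(KD_INDEX[residue] for residue in segment) / len(segment)


def most_hydrophobic_window(sequence, window=19):
    """Return (start, score) of the window with the highest average hydrophobicity.

    The fixed window length is an assumption made for this sketch; the study
    itself works with helical segments rather than fixed windows.
    """
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best_start = 0
    best_score = average_hydrophobicity(sequence[:window])
    for start in range(1, len(sequence) - window + 1):
        score = average_hydrophobicity(sequence[start:start + window])
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Toy sequence: a poly-leucine stretch flanked by charged residues.
start, score = most_hydrophobic_window("KKKK" + "L" * 19 + "DDDD")
print(start, round(score, 2))
```

For the toy sequence, the best window begins at the first leucine (index 4) and scores 3.8, the index of leucine itself.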
| Results |
|---|
The hydrophobicity of helices from soluble and membrane proteins was compared in order to elucidate the characteristics of both types of protein. The histograms in Figure 1
However, we did not have to predict all transmembrane helices to discriminate membrane proteins from soluble ones, as pointed out by Klein et al. (1985). Correct prediction of only one transmembrane segment in a polypeptide was enough for this purpose. Solid bars in Figure 1
In Figure 2, protein length is plotted as a function of the average hydrophobicity of MH transmembrane helices. A characteristic relationship was observed between the two parameters. The average hydrophobicity of MH transmembrane helices was higher than 1.75 for membrane proteins longer than 500 residues, whereas for a number of MH transmembrane helices in membrane proteins smaller than 100 residues, hydrophobicity values were as low as 0.1.
The histograms of protein length are shown in Figure 3a … region I (⟨H⟩ < 0), region II (0 ≤ ⟨H⟩ < 1.0), region III (1.0 ≤ ⟨H⟩ < 1.75) and region IV (⟨H⟩ ≥ 1.75) of Figure 2
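The four hydrophobicity regions of Figure 2 and the three protein-length ranges used in this study can be written as simple binning functions. This is a small sketch for illustration only: the function names are hypothetical, and the assignment of boundary values (for example, exactly 1.75 falling in region IV) is an assumption, since the inequality signs are not fully legible in the text.

```python
def hydrophobicity_region(avg_hydrophobicity):
    """Map an average hydrophobicity to region I-IV of Figure 2."""
    if avg_hydrophobicity < 0:
        return "I"
    if avg_hydrophobicity < 1.0:
        return "II"
    if avg_hydrophobicity < 1.75:
        return "III"
    return "IV"


def size_class(length):
    """Map a mature-protein length L (in residues) to its size range."""
    if length < 100:
        return "L < 100"
    if length < 500:
        return "100 <= L < 500"
    return "L >= 500"


# Example: a 320-residue protein whose MH helix averages 1.2.
print(hydrophobicity_region(1.2), "/", size_class(320))
```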
Figure 4a, … L ≥ 500; (b) 100 ≤ L < 500; (c) L < 100. The hydrophobicity of MH helices in membrane proteins longer than 500 residues was higher than 1.5, and the separation between soluble and membrane proteins was complete (Figure 4a … 100 ≤ L < 500) was in the range between 0 and 0.25 (Figure 4b
| Discussion |
|---|
The present results show that two parameters, protein size and the average hydrophobicity of the MH helix, may be used to develop a new method for the prediction of membrane proteins from amino acid sequences. The prediction of membrane proteins using the average hydrophobicity of MH helices was originally proposed by Klein et al. (1985) and the importance of this parameter was confirmed using a larger dataset in the present work. About 75% of membrane proteins could be distinguished from soluble proteins by this parameter. However, we identified a region of the average hydrophobicity, between 0 and 1.75, in which the two types of proteins coexist. Roughly 25% of membrane proteins were found in this region (Figure 1
In the present work, however, a completely different parameter, protein length, was found to be essential for targeting MH helices to membrane or soluble proteins. The average hydrophobicity of MH helices in membrane proteins longer than 500 residues was higher than 1.5. Because the average hydrophobicity of MH helices of soluble proteins was lower than 1.25, the type of protein could be well distinguished for the dataset of proteins longer than 500 residues. On the other hand, although many membrane proteins shorter than 100 residues were found in the region between 0 and 1.75, the distribution of the hydrophobicity of helices in soluble proteins correspondingly shifted to lower values. Thus, the discrimination between soluble and membrane proteins may be improved by using the relationship between protein length and the average hydrophobicity of the MH helices.
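To make the role of the two parameters concrete, the following sketch encodes only the dataset observations quoted in this Discussion as a three-way rule. It is not the SOSUI algorithm, whose details are not given here. The function name and the "undetermined" outcome are hypothetical, and treating values below 0 as soluble for shorter proteins is an assumption read off the statement that the two types coexist only between 0 and 1.75.

```python
def classify_by_length_and_hydrophobicity(length, mh_hydrophobicity):
    """Return 'membrane', 'soluble' or 'undetermined' for one protein.

    length:            number of residues in the mature protein
    mh_hydrophobicity: average hydrophobicity of the MH helix
                       (Kyte and Doolittle scale)
    """
    if length >= 500:
        # Long proteins: MH helices of membrane proteins were above 1.5 and
        # those of soluble proteins below 1.25 in the dataset described.
        if mh_hydrophobicity > 1.5:
            return "membrane"
        if mh_hydrophobicity < 1.25:
            return "soluble"
        return "undetermined"
    # Shorter proteins: the two types coexist between 0 and 1.75, so only
    # values outside that range are assigned (an assumption of this sketch).
    if mh_hydrophobicity >= 1.75:
        return "membrane"
    if mh_hydrophobicity < 0:
        return "soluble"
    return "undetermined"


print(classify_by_length_and_hydrophobicity(620, 1.8))
print(classify_by_length_and_hydrophobicity(80, 0.5))
```

The "undetermined" outcome marks the overlap that the two parameters alone do not resolve.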
However, these two parameters are not enough for complete discrimination, as seen from the overlapping regions in Figures 3 and 4. This suggests that other factors must exist which stabilize MH helices in the membrane or in the hydrophobic core of soluble proteins. Recently, we made public a system for membrane protein discrimination and transmembrane helix prediction (SOSUI) (http://www.tuat.ac.jp/~mitaku/adv_sosui/) (Hirokawa et al., 1998), in which the two parameters discussed in this work were incorporated and the problems mentioned above were partly solved. The details of the algorithm of the discrimination in the SOSUI system will be described elsewhere.
The discrimination between soluble and membrane proteins is generally related to the problem of the early stage of protein folding. The present work showed that the length of a protein is correlated with the fate of its MH segment. When the MH segment has intermediate hydrophobicity, a short polypeptide tends to become a membrane protein, while a long polypeptide is folded to form a soluble protein. This correlation between protein size and type seems reasonable from the physicochemical viewpoint. A hydrophobic segment is energetically unfavorable in water and tries to find a nonpolar environment. However, a short polypeptide cannot make a sufficiently nonpolar environment to cover the hydrophobic segment. Therefore, a hydrophobic segment in a short protein tends to be partitioned into a membrane.
However, the penetration of a polypeptide into a membrane is mostly driven by the translocation machinery of the cell (Sakaguchi et al., 1992; Rapoport et al., 1996). Thus, the physicochemical consideration is not enough for understanding the correlation between protein size and type. The translocation machinery has to recognize transmembrane segments by some local sequence patterns, and the relationship between such sequence patterns and the length of membrane proteins is still unclear. More theoretical and experimental research is necessary to elucidate the physical mechanism of the size effect on the discrimination between soluble and membrane proteins.
| Acknowledgments |
|---|
This work was partly supported by a Grant-in-Aid for basic research and priority area research of 'Genome Science' from Monbusho (Ministry of Education, Science, Sports and Culture) of Japan.
| Notes |
|---|
1 To whom correspondence should be addressed; email: mitaku{at}cc.tuat.ac.jp
| References |
|---|
Allen,J.P., Feher,G., Yeates,T.O., Rees,D.C., Deisenhofer,J., Michel,H. and Huber,R. (1986) Proc. Natl Acad. Sci. USA, 83, 8589–8593.
Arkin,I.T., Brunger,A.T. and Engelman,D.M. (1997) Proteins, 28, 465–466.
Deisenhofer,J., Epp,O., Miki,K., Huber,R. and Michel,H. (1985) Nature, 318, 618–624.
Eisenberg,D., Weiss,R.M. and Terwilliger,T.C. (1984) Proc. Natl Acad. Sci. USA, 81, 140–144.
Fariselli,P. and Casadio,R. (1996) CABIOS, 12, 41–48.
Frishman,D. and Mewes,H.W. (1997) Nature Struct. Biol., 4, 626–628.
Grigorieff,N., Ceska,T.A., Downing,K.H., Baldwin,J.M. and Henderson,R. (1996) J. Mol. Biol., 259, 393–421.
Hirokawa,T., Boon-Chieng,S. and Mitaku,S. (1998) Bioinformatics, 14, 378–379.
Hobohm,U. and Sander,C. (1994) Protein Sci., 3, 522–524.
Hobohm,U., Scharf,M., Schneider,R. and Sander,C. (1992) Protein Sci., 1, 409–417.
Jähnig,F. (1990) TIBS, 15, 93–95.
Jones,D.T., Taylor,W.R. and Thornton,J.M. (1994) Biochemistry, 33, 3038–3049.
Klein,P., Kanehisa,M. and DeLisi,C. (1985) Biochim. Biophys. Acta, 815, 468–476.
Kyte,J. and Doolittle,R.F. (1982) J. Mol. Biol., 157, 105–132.
Mitaku,S., Hoshi,S., Abe,T. and Kataoka,R. (1984) J. Phys. Soc. Jpn, 53, 4083–4090.
Mitaku,S., Hoshi,S. and Kataoka,R. (1985) J. Phys. Soc. Jpn, 54, 2047–2054.
Persson,B. and Argos,P. (1994) J. Mol. Biol., 237, 182–192.
Prince,S.M., Papiz,M.Z., Freer,A.A., McDermott,G., Hawthornthwaite-Lawless,A.M., Cogdell,R.J. and Isaacs,N.M. (1997) J. Mol. Biol., 268, 412–423.
Rapoport,T.A., Jungnickel,B. and Kutay,U. (1996) Annu. Rev. Biochem., 65, 271–303.
Rees,D.C., DeAntonio,L. and Eisenberg,D. (1989) Science, 245, 510–513.
Sakaguchi,M., Tomiyoshi,R., Kuroiwa,T., Mihara,K. and Omura,T. (1992) Proc. Natl Acad. Sci. USA, 89, 16–19.
Steitz,T.A., Goldman,A. and Engelman,D.M. (1982) Biophys. J., 37, 124–125.
Terwilliger,T.C. and Eisenberg,D. (1982) J. Biol. Chem., 257, 6010–6015.
Tsukihara,T., Aoyama,H., Yamashita,E., Tomizaki,T., Yamaguchi,H., Shinzawa-Itoh,K., Nakashima,R., Yaono,R. and Yoshikawa,S. (1995) Science, 269, 1069–1074.
von Heijne,G. (1989) Nature, 341, 456–458.
Wallin,E. and von Heijne,G. (1998) Protein Sci., 7, 1029–1038.
Received February 26, 1999; revised June 2, 1999; accepted July 12, 1999.