PEDS Advance Access published online on November 10, 2007
Protein Engineering Design and Selection, doi:10.1093/protein/gzm057
Nuc-PLoc: a new web-server for predicting protein subnuclear localization by fusing PseAA composition and PsePSSM
1 Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, 1954 Hua-Shan Road, Shanghai 200030, China 2 School of Information Engineering, Jiangnan University, Wuxi 214122 3 Gordon Life Science Institute, San Diego, CA 92130, USA
5 To whom correspondence should be addressed. E-mail: hbshen{at}crystal.harvard.edu
| Abstract |
|---|
The life processes of a eukaryotic cell are guided by its nucleus. In addition to the genetic material, the nucleus contains many proteins located in its different compartments, called subnuclear locations. Information on their localization within the nucleus is indispensable for the in-depth study of systems biology because, in addition to helping determine their functions, it can provide insight into how and in what kind of microenvironments these subnuclear proteins interact with each other and with other molecules. Facing the deluge of protein sequences generated in the post-genomic age, we are challenged to develop an automated method for quickly and effectively annotating the subnuclear locations of the many newly found nuclear protein sequences. In view of this, a new classifier, called Nuc-PLoc, has been developed that can be used to identify nuclear proteins among the following nine subnuclear locations: (1) chromatin, (2) heterochromatin, (3) nuclear envelope, (4) nuclear matrix, (5) nuclear pore complex, (6) nuclear speckle, (7) nucleolus, (8) nucleoplasm and (9) nuclear promyelocytic leukaemia (PML) body. Nuc-PLoc is an ensemble classifier formed by fusing the evolution information of a protein with its pseudo-amino acid composition. The overall jackknife cross-validation accuracy obtained by Nuc-PLoc is significantly higher than those of the existing methods on the same benchmark data set with the same testing procedure. As a user-friendly web-server, Nuc-PLoc is freely accessible to the public at http://chou.med.harvard.edu/bioinf/Nuc-PLoc.
Keywords: fusion/Nuc-PLoc/position-specific scoring matrix/pseudo-amino acid composition/subnuclear location
| Introduction |
|---|
The nucleus exists only in eukaryotic cells. Located at the center of a cell like its kernel, the nucleus is the most prominent and largest cellular organelle (Lodish et al., 1995), accounting for about 10% of the total volume of a typical animal cell (Alberts et al., 2002). Functioning as the brain of eukaryotic cells, the nucleus guides the life processes of the cells by directing their reproduction, controlling their differentiation and regulating their metabolic activities. In addition to the genetic material, a nucleus contains many proteins located in its different compartments, called subnuclear locations. Information on the subnuclear locations of these proteins is important because it not only provides useful clues about their functions but also helps us understand how and in what kind of microenvironments they interact with each other and with other molecules. It is hence indispensable for the in-depth study of systems biology at the level of the cell nucleus.
Although protein subnuclear localization can be determined by conducting various experiments, such as cell fractionation, electron microscopy and fluorescence microscopy (Murphy et al., 2000), it is both time-consuming and costly to acquire such information solely by experiments. With the deluge of protein sequences generated in the post-genomic age, it is highly desirable to develop an automated method for efficiently identifying the subnuclear location of a query protein according to its sequence. Many methods have been proposed for predicting protein subcellular localization (see, e.g., Nakai and Kanehisa, 1992; Nakashima and Nishikawa, 1994; Cedano et al., 1997; Chou and Elrod, 1999; Nakai and Horton, 1999; Yuan, 1999; Emanuelsson et al., 2000, 2007; Nakai, 2000; Feng, 2001, 2002; Feng and Zhang, 2001; Hua and Sun, 2001; Chou and Cai, 2002; Nair and Rost, 2002; Gardy et al., 2003; Pan et al., 2003; Park and Kanehisa, 2003; Zhou and Doctor, 2003; Huang and Li, 2004; Gao et al., 2005; Garg et al., 2005; Lei and Dai, 2005; Matsuda et al., 2005; Xiao et al., 2005; Guo et al., 2006; Hoglund et al., 2006; Lee et al., 2006; Pierleoni et al., 2006; Zhang et al., 2006b, 2006c; Chou and Shen, 2007a; Shen and Chou, 2007; Shi et al., 2007; and the references cited in a recent review (Chou and Shen, 2007b)). In contrast, far fewer methods (particularly with a web-server) have been reported for predicting protein subnuclear localization (Lei and Dai, 2005; Shen and Chou, 2005a). The present study was initiated in an attempt to enrich the latter by introducing a novel and powerful approach that fuses the pseudo-amino acid composition (Chou, 2001) and the position-specific scoring matrix (Altschul et al., 1997), in the hope of stimulating the development of this area, which is vitally important for an in-depth understanding of the biological pathways in the nucleus.
| Materials and methods |
|---|
Protein sequences were collected from the Swiss-Prot database (version 52.0, released on 6 May 2007) at http://www.ebi.ac.uk/swissprot/ according to the annotation information in the CC (comment or notes) field. To collect as much of the desired information as possible while ensuring high quality for the working data sets, the data were screened strictly according to the following criteria. (1) Because the same subnuclear location (-!-SUBCELLULAR LOCATION) in the CC field might be annotated with different terms, several key words were used for each subnuclear location. For example, in searching for nuclear envelope proteins, the key words "nuclear envelope", "nuclear inner membrane" and "nuclear outer membrane" were used. (2) Sequences annotated with ambiguous or uncertain terms, such as "potential", "probable", "probably", "maybe", "likely" or "by similarity", were excluded. (3) Sequences annotated with two or more locations were not included because they lack uniqueness. (4) Sequences annotated with "fragment" were excluded; sequences with <50 amino acid residues were also removed because they might just be fragments. (5) To avoid any homology bias, a redundancy cutoff was imposed with a culling program to winnow those sequences that have ≥80% sequence identity to any other in the same subnuclear location.
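Screening criteria (1) to (4) above can be sketched as a simple filter. This is only an illustrative sketch, not the authors' collection script: the `Entry` record, the `screen_entry` function and the keyword table are hypothetical, and only the nuclear envelope key words are taken from the text. Criterion (5), the redundancy cutoff, needs a sequence-identity program and is not shown.

```python
import re
from dataclasses import dataclass

# Terms that mark an annotation as uncertain (criterion 2).
AMBIGUOUS_TERMS = ("potential", "probable", "probably", "maybe", "likely", "by similarity")

# Key words per location (criterion 1). Only the nuclear envelope entry comes
# from the text; the nucleolus entry is a hypothetical placeholder.
LOCATION_KEYWORDS = {
    "nuclear envelope": ("nuclear envelope", "nuclear inner membrane", "nuclear outer membrane"),
    "nucleolus": ("nucleolus",),
}

@dataclass
class Entry:
    accession: str
    location_comment: str  # text of the SUBCELLULAR LOCATION comment
    sequence: str
    is_fragment: bool = False

def screen_entry(entry, keywords=LOCATION_KEYWORDS, min_length=50):
    """Return the single matched location, or None if the entry is screened out."""
    text = entry.location_comment.lower()
    # Criterion 2: drop uncertain annotations.
    if any(re.search(r"\b" + re.escape(term) + r"\b", text) for term in AMBIGUOUS_TERMS):
        return None
    # Criterion 4: drop fragments and sequences shorter than min_length residues.
    if entry.is_fragment or len(entry.sequence) < min_length:
        return None
    # Criteria 1 and 3: collect matching locations and require exactly one.
    matched = {loc for loc, words in keywords.items() if any(w in text for w in words)}
    if len(matched) != 1:
        return None
    return next(iter(matched))
```

An entry whose comment mentions "nuclear inner membrane" is mapped to the nuclear envelope class, while an entry matching two locations, carrying an uncertainty term, or shorter than 50 residues returns `None`.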
After strictly following the above five procedures, we obtained 714 proteins, of which 99 belong to chromatin, 22 to heterochromatin, 61 to nuclear envelope, 29 to nuclear matrix, 79 to nuclear pore complex, 67 to nuclear speckle, 307 to nucleolus, 37 to nucleoplasm and 13 to nuclear PML body (Fig. 1). Each of the nine subnuclear locations corresponds to a subset $\mathbb{S}_i$ (i = 1, 2, ..., 9) as shown in Table I. Thus, the benchmark data set $\mathbb{S}$ is a union of nine subsets, i.e.

$$\mathbb{S} = \mathbb{S}_1 \cup \mathbb{S}_2 \cup \mathbb{S}_3 \cup \cdots \cup \mathbb{S}_9 \tag{1}$$

where $\cup$ is the symbol for union in set theory. The sequences of the 714 subnuclear proteins as well as their accession numbers are given in Online Supporting Information A available at PEDS online.
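As a quick arithmetic check on the subset sizes quoted above, the short sketch below sums the nine counts and reports the share of the largest class. Because proteins annotated with two or more locations were excluded, the subsets are disjoint and the size of the union is simply the sum. The variable names are illustrative only.

```python
# Subset sizes as stated in the text.
subset_sizes = {
    "chromatin": 99,
    "heterochromatin": 22,
    "nuclear envelope": 61,
    "nuclear matrix": 29,
    "nuclear pore complex": 79,
    "nuclear speckle": 67,
    "nucleolus": 307,
    "nucleoplasm": 37,
    "nuclear PML body": 13,
}

total = sum(subset_sizes.values())
largest = max(subset_sizes, key=subset_sizes.get)

print(total)  # 714
print(f"{largest}: {subset_sizes[largest] / total:.1%}")  # nucleolus: 43.0%
```

The nucleolus subset alone accounts for roughly 43% of the data set, so per-location measures are worth reading alongside the overall success rate.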
It is instructive to point out that the benchmark data set constructed here differs from that of Shen and Chou (2005a). The data set was reconstructed for two reasons: (i) much more nuclear protein data are now available in the Swiss-Prot database, which allows a benchmark data set of higher quality to be constructed, and (ii) the sequences in the original data set (Shen and Chou, 2005a) were not subjected to a cutoff procedure, as done here, to reduce redundancy and homology bias.
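To represent a protein sample P with L amino acid residues by its evolution information, the position-specific scoring matrix (PSSM) was introduced as its descriptor, i.e.

$$\mathbf{P}_{\text{PSSM}} = \begin{bmatrix} E_{1\to 1} & E_{1\to 2} & \cdots & E_{1\to 20} \\ E_{2\to 1} & E_{2\to 2} & \cdots & E_{2\to 20} \\ \vdots & \vdots & & \vdots \\ E_{L\to 1} & E_{L\to 2} & \cdots & E_{L\to 20} \end{bmatrix} \tag{2}$$

where $E_{i\to j}$ represents the score of the amino acid residue in the ith position of the protein sequence being mutated to amino acid type j during the evolution process. Here, to simplify the formulation without losing generality, we use the numerical codes 1, 2, ..., 20 to represent the 20 native amino acid types according to the alphabetical order of their single-character codes. The L × 20 scores in the matrix of Eq. (2) for $\mathbf{P}_{\text{PSSM}}$ were generated using PSI-BLAST (Schaffer et al., 2001) and converted by

$$E_{i\to j} = \frac{E^{0}_{i\to j} - \bar{E}^{0}_{i}}{\mathrm{SD}\left(E^{0}_{i}\right)}, \qquad \bar{E}^{0}_{i} = \frac{1}{20}\sum_{k=1}^{20} E^{0}_{i\to k} \qquad (i = 1, 2, \ldots, L;\ j = 1, 2, \ldots, 20) \tag{3}$$

where $E^{0}_{i\to j}$ represents the original scores directly created by PSI-BLAST, which are generally shown as positive or negative integers, $\bar{E}^{0}_{i}$ is their mean over the 20 amino acid types at position i, and $\mathrm{SD}(E^{0}_{i})$ is the corresponding standard deviation. The converted scores, in contrast, have a zero mean value over the 20 amino acids and remain unchanged if passed through the same conversion procedure again. A positive score means that the corresponding mutation occurs more frequently in the alignment than expected by chance, whereas a negative score means just the opposite. Large positive scores often indicate critical functional residues, such as active site residues and residues required for interactions with other molecules. However, according to the PSSM descriptor [Eq. (2)], proteins with different lengths will correspond to matrices with different numbers of rows. To make the PSSM descriptor a uniform representation, one possible approach is to represent a protein sample P by

$$\bar{\mathbf{P}}_{\text{PSSM}} = \left[\bar{E}_1, \bar{E}_2, \ldots, \bar{E}_{20}\right]^{\mathbf{T}} \tag{4}$$

where T is the transpose operator and

$$\bar{E}_j = \frac{1}{L}\sum_{i=1}^{L} E_{i\to j} \qquad (j = 1, 2, \ldots, 20) \tag{5}$$

where $\bar{E}_j$ represents the average score of the amino acid residues in the protein P being mutated to amino acid type j during the evolution process. However, if $\bar{\mathbf{P}}_{\text{PSSM}}$ of Eq. (4) was used to represent the protein P, all the sequence-order information would be lost. To avoid complete loss of the sequence-order information, the concept of the pseudo-amino acid composition as originally proposed in Chou (2001) was adopted; that is, the protein P is represented by the pseudo position-specific scoring matrix (PsePSSM)

$$\mathbf{P}^{\xi}_{\text{PsePSSM}} = \left[\bar{E}_1, \bar{E}_2, \ldots, \bar{E}_{20}, G^{\xi}_1, G^{\xi}_2, \ldots, G^{\xi}_{20}\right]^{\mathbf{T}} \tag{6}$$

where

$$G^{\xi}_j = \frac{1}{L-\xi}\sum_{i=1}^{L-\xi}\left[E_{i\to j} - E_{(i+\xi)\to j}\right]^2 \qquad (j = 1, 2, \ldots, 20;\ \xi < L) \tag{7}$$

is the correlation factor for amino acid type j obtained by coupling PSSM scores that are ξ positions apart along the protein chain. Although ξ can be 0, 1, 2, ..., or 49 (sequences with <50 residues were excluded from the benchmark data set), preliminary test results indicated that when ξ > 10, the corresponding success rate dropped. To simplify the problem, we can just focus on the optimal region of ξ = 0, 1, ..., and 10. When ξ = 0, all $G^{0}_j$ vanish and Eq. (6) degenerates to Eq. (4).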
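The score conversion and the PsePSSM construction described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the Nuc-PLoc implementation: the conversion is assumed to be a per-row standardization (subtract the row mean, divide by the row standard deviation, here the population SD), and the tier-ξ correlation term is assumed to be the mean squared difference between scores ξ positions apart. The function names `standardize_rows` and `pse_pssm` are hypothetical.

```python
from statistics import fmean, pstdev

def standardize_rows(raw_pssm):
    """Convert each row (one sequence position, 20 scores) to zero mean and unit SD."""
    converted = []
    for row in raw_pssm:
        mean = fmean(row)
        sd = pstdev(row) or 1.0  # assumption: a constant row is left at zero
        converted.append([(score - mean) / sd for score in row])
    return converted

def pse_pssm(pssm, xi):
    """Return column means followed by tier-xi correlation terms (40 values for 20 columns)."""
    length = len(pssm)
    if not 0 <= xi < length:
        raise ValueError("xi must satisfy 0 <= xi < L")
    n_cols = len(pssm[0])
    # Average score per amino acid type over all positions.
    means = [fmean(row[j] for row in pssm) for j in range(n_cols)]
    # Mean squared difference between scores xi positions apart, per amino acid type.
    corr = [
        fmean((pssm[i][j] - pssm[i + xi][j]) ** 2 for i in range(length - xi))
        for j in range(n_cols)
    ]
    return means + corr
```

With `xi = 0` every correlation term is zero, so the vector carries no more information than the 20 column averages, which mirrors the degenerate case noted in the text. Whatever the sequence length, the output has a fixed size, which is what makes proteins of different lengths comparable.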
On the other hand, according to the representation of the pseudo-amino acid (PseAA) composition as defined in Chou (2001), the protein P is formulated by

$$\mathbf{P}^{\lambda}_{\text{PseAA}} = \left[p_1, p_2, \ldots, p_{20}, p_{20+1}, \ldots, p_{20+\lambda}\right]^{\mathbf{T}} \tag{8}$$

where $p_1, \ldots, p_{20}$ correspond to the conventional 20-D amino acid composition and $p_{20+1}, \ldots, p_{20+\lambda}$ are the λ correlation factors that reflect the first tier, second tier, ..., and the λth tier sequence order correlation patterns, respectively (see Fig. 1 of Chou, 2001). The 20 + λ elements in Eq. (8) can be easily derived by the PseAAC web-server at http://chou.med.harvard.edu/bioinf/PseAAC/ or by Eqs (2–6) of Chou (2001); the λ additional elements are the factors that approximately incorporate the sequence-order effects. In this study, the optimal range for λ is from 1 to 20. Using the PseAA composition descriptor to represent protein samples in this way can significantly improve the prediction quality for the subcellular localization of proteins and their other attributes, as demonstrated by a series of recent publications (e.g. Pan et al., 2003).
According to the PsePSSM descriptor [Eq. (6)], a protein can be represented by 11 different vectors, each corresponding to a different ξ (0, 1, ..., or 10), whereas according to the PseAA composition descriptor [Eq. (8)], it can be represented by 20 different vectors, each corresponding to a different λ (1, 2, ..., or 20). Combining these 11 + 20 = 31 vectors of different ξ and λ into a single higher-dimensional vector would risk over-fitting and reduce the cluster-tolerance capacity. Instead, we introduce 31 individual basic classifiers, each trained and operated on one of the aforementioned 31 descriptors. The final result is determined by an ensemble classifier formed by fusing the 31 basic classifiers through a voting system, as detailed below.
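For convenience in the later formulation, let us use the following equation to cover both $\mathbf{P}^{\xi}_{\text{PsePSSM}}$ and $\mathbf{P}^{\lambda}_{\text{PseAA}}$ for representing a protein sample:

$$\mathbf{P}^{\varphi} = \begin{cases} \mathbf{P}^{\xi}_{\text{PsePSSM}} \ \text{with}\ \xi = \varphi - 1, & \varphi = 1, 2, \ldots, 11 \\ \mathbf{P}^{\lambda}_{\text{PseAA}} \ \text{with}\ \lambda = \varphi - 11, & \varphi = 12, 13, \ldots, 31 \end{cases} \tag{9}$$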
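The single index over the 31 descriptors can be made concrete with a small lookup. This is a sketch under an assumed ordering: the 11 PsePSSM descriptors (tiers 0 to 10) come first and the 20 PseAA descriptors (tiers 1 to 20) follow, which is consistent with index 1 denoting the degenerate 20-D PSSM space. The function names are hypothetical.

```python
def descriptor_for(phi):
    """Map a descriptor index (1..31) to its family and tier under the assumed ordering."""
    if 1 <= phi <= 11:
        return ("PsePSSM", phi - 1)   # tiers 0..10
    if 12 <= phi <= 31:
        return ("PseAA", phi - 11)    # tiers 1..20
    raise ValueError("descriptor index must be in 1..31")

def dimension(phi):
    """Dimension of the feature space behind a descriptor index."""
    kind, tier = descriptor_for(phi)
    if kind == "PsePSSM":
        return 20 if tier == 0 else 40  # tier 0 degenerates to the 20-D average PSSM
    return 20 + tier                    # 20 composition terms plus one factor per tier
```

For example, `descriptor_for(12)` gives the first-tier PseAA composition, and `dimension(2)` gives 40 for the first-tier PsePSSM space.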
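In this study, the optimized evidence-theoretic K nearest neighbor (OET-KNN) classifier was used to identify the subnuclear location of a query protein. The OET-KNN classifier is a very powerful classification engine, as demonstrated by its role in enhancing the success rates of predicting membrane types (Shen and Chou, 2005b), where a detailed formulation of the OET-KNN classifier can be found. Two parameters may directly affect the predicted result of an OET-KNN classifier. One is K, the number of nearest proteins counted against the query protein during the prediction process; the other is φ, i.e. which of the 31 descriptors in Eq. (9) is used as the basis of the classifier. Accordingly, the OET-KNN classifier is formulated here as an operator with the parameters K and φ explicitly shown, i.e.

$$\mathbb{C}(K, \varphi) \tag{10}$$

It is time-consuming and tedious to test different values of K and φ one by one to find the optimal result. To solve this problem, the following two-dimensional fusion approach was adopted. Preliminary tests indicated that the success rates obtained by $\mathbb{C}(K, \varphi)$ trained on the current benchmark data set became remarkably lower when K > 10, so it is sufficient to consider only

$$K \in \{1, 2, \ldots, 10\}, \qquad \varphi \in \{1, 2, \ldots, 31\} \tag{11}$$

where $\in$ is the set-theory symbol meaning "member of". We then have a set of 10 × 31 = 310 individual classifiers, as expressed by

$$\{\mathbb{C}(K, \varphi)\} = \begin{Bmatrix} \mathbb{C}(1,1) & \mathbb{C}(1,2) & \cdots & \mathbb{C}(1,31) \\ \mathbb{C}(2,1) & \mathbb{C}(2,2) & \cdots & \mathbb{C}(2,31) \\ \vdots & \vdots & & \vdots \\ \mathbb{C}(10,1) & \mathbb{C}(10,2) & \cdots & \mathbb{C}(10,31) \end{Bmatrix} \tag{12}$$

where $\mathbb{C}(1, 1)$ is the OET-KNN classifier trained according to the 1-nearest-neighbor rule in the degenerated 20-D PSSM space [cf. Eq. (6)], $\mathbb{C}(2, 2)$ is the classifier trained according to the 2-nearest-neighbor rule in the 40-D PsePSSM space with φ = 2, and so forth. The ensemble classifier formed by fusing these 310 individual classifiers is formulated by

$$\mathbb{C}^{E} = \mathbb{C}(1,1) \,\forall\, \mathbb{C}(1,2) \,\forall\, \cdots \,\forall\, \mathbb{C}(10,31) \tag{13}$$

where the symbol $\forall$ denotes the fusion operator. The detailed process of how the ensemble classifier $\mathbb{C}^{E}$ works is as follows. Suppose the predicted classification result by $\mathbb{C}(K, \varphi)$ for the query protein P is

$$\mathbb{C}(K, \varphi) \triangleright \mathbf{P} = C_{K,\varphi} \in \{\mathbb{S}_1, \mathbb{S}_2, \ldots, \mathbb{S}_9\} \tag{14}$$

where $\triangleright$ is the action operator, meaning that $\mathbb{C}(K, \varphi)$ is used to identify P, leading to the result $C_{K,\varphi}$, which is one of the subsets of $\mathbb{S}$ as defined by Eq. (1). The voting score for the query protein P belonging to the ith subset (subnuclear location) $\mathbb{S}_i$ is given by

$$Y_i = \sum_{K=1}^{10}\sum_{\varphi=1}^{31} w_{K,\varphi}\,\Delta\!\left(C_{K,\varphi}, \mathbb{S}_i\right) \qquad (i = 1, 2, \ldots, 9) \tag{15}$$

where $w_{K,\varphi}$ is the weight, which was set at 1 for simplicity, and the delta function in Eq. (15) is given by

$$\Delta\!\left(C_{K,\varphi}, \mathbb{S}_i\right) = \begin{cases} 1, & \text{if } C_{K,\varphi} = \mathbb{S}_i \\ 0, & \text{otherwise} \end{cases} \tag{16}$$

The query protein P is then predicted to belong to the subnuclear location with the highest voting score, i.e.

$$Y_{\mu} = \max\{Y_1, Y_2, \ldots, Y_9\} \tag{17}$$

where μ is the index of the predicted subset $\mathbb{S}_{\mu}$.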
To provide an intuitive picture, a flowchart is provided in Fig. 2 to show the process of how the ensemble classifier works in identifying protein subnuclear localization.
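The voting scheme described above can be sketched as follows. Note the substitution: a plain majority-vote K-nearest-neighbor rule with Euclidean distance stands in for the OET-KNN classifier, whose formulation is given in Shen and Chou (2005b) and is not reproduced here. The function names and the dictionary layout (one list of training vectors per descriptor index) are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k):
    """Plain majority-vote KNN (a stand-in for OET-KNN, not the authors' engine)."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: math.dist(train_vectors[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def ensemble_predict(train_by_descriptor, train_labels, query_by_descriptor,
                     k_values=range(1, 11), weight=1.0):
    """Fuse one base classifier per (K, descriptor) pair by weighted voting."""
    scores = Counter()
    for phi, train_vectors in train_by_descriptor.items():
        for k in k_values:
            label = knn_predict(train_vectors, train_labels,
                                query_by_descriptor[phi], k)
            scores[label] += weight  # every base classifier casts one vote
    winner = max(scores, key=scores.get)  # location with the highest voting score
    return winner, dict(scores)
```

With 31 descriptors and K running from 1 to 10, the loop casts 310 votes in total; setting `weight` to 1 reproduces the unweighted vote used in the text.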
| Results and discussion |
|---|
As a demonstration, the jackknife test was performed with the current approach on the benchmark data set (see Table I and Online Supporting Information A available at PEDS online). The jackknife test is deemed the most objective and rigorous cross-validation procedure in statistical prediction (Chou and Zhang, 1995).
The results thus obtained are given in Table II, where, to facilitate comparison, the corresponding results by ProtLoc (Cedano et al., 1997), support vector machines (SVM) (Vapnik, 1998) and the single OET-KNN classifier (Shen and Chou, 2005a) are also listed.
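The jackknife test mentioned above is a leave-one-out procedure: each protein is removed in turn, the predictor is trained on the remaining proteins, and the removed protein is then predicted. A minimal sketch follows, with a hypothetical 1-nearest-neighbor predictor used only to make the example runnable; it is not the predictor evaluated in Table II.

```python
import math

def nearest_neighbour(train_x, train_y, query):
    """Toy predictor: label of the closest training sample (Euclidean distance)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]

def jackknife_accuracy(samples, labels, predict):
    """Leave each sample out once, predict it from the rest, return the success rate."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)
```

Because every sample is tested exactly once and never appears in its own training set, the outcome for a given data set is unique, unlike subsampling tests that depend on how the data are split.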
The 20-D amino acid composition (a special case of the PseAA composition when λ = 0) has been widely used to represent protein samples in bioinformatics for predicting various attributes of proteins. However, the 20-D amino acid composition does not contain any sequence-order information. To avoid completely losing the sequence-order information, the PseAA composition [Eq. (8)] was proposed by Chou (2001). As can be seen from Table II, the success rate obtained by the single OET-KNN classifier based on the PseAA composition (λ = 14) is 7–19% higher than those based on the conventional 20-D amino acid composition.
Also, as shown in Table II, the overall jackknife success rate obtained with the current ensemble classifier by fusing PsePSSM and PseAA is 67.4%, which is about 31% and 19% higher than the rates obtained by ProtLoc (Cedano et al., 1997) and SVM (Vapnik, 1998), respectively, based on the conventional amino acid composition, and about 12% higher than the rate obtained by the single OET-KNN classifier based on the PseAA composition (Shen and Chou, 2005a). The SVM predictor used in this study was of the C-SVC type and was trained with a radial basis function (RBF) kernel with its kernel parameter set to 0.5.
To demonstrate the power of the ensemble classifier formulated in this paper, we also compared the performance of single base classifiers with that of the ensemble classifier. When the prediction was conducted in the PseAA composition space, the success rate obtained by the ensemble classifier was 10–13% higher than those of the individual classifiers; when the prediction was conducted in the PsePSSM space, the ensemble classifier was superior to the individual classifiers by 3–7%. All this evidence indicates that predictions obtained by individual classifiers may be biased and prone to false results.
Listed in Table III are the Matthews correlation coefficient (MCC) indexes for the nine subnuclear locations obtained by the jackknife tests with the SVM algorithm and the current predictor, respectively. The MCC is defined by

$$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}} \tag{18}$$

where TP represents the number of true positives; TN, true negatives; FP, false positives; and FN, false negatives (Fig. 3). It can be seen from Table III that the results obtained by the current predictor not only have higher success rates but are also more stable than those of the SVM approach, indicating that the new approach is indeed very powerful and promising.
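For a multi-class problem such as this one, the MCC of a given subnuclear location is usually computed one-versus-rest: proteins of that location count as positives and all others as negatives. The sketch below follows the standard MCC definition; returning 0 when the denominator vanishes is a common convention assumed here, not something stated in the text, and the function names are illustrative.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a marginal count is zero
    return (tp * tn - fp * fn) / denom

def location_mcc(true_labels, predicted_labels, location):
    """One-versus-rest MCC for a single location."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == location and p == location for t, p in pairs)
    tn = sum(t != location and p != location for t, p in pairs)
    fp = sum(t != location and p == location for t, p in pairs)
    fn = sum(t == location and p != location for t, p in pairs)
    return mcc(tp, tn, fp, fn)
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it stays informative when class sizes are very uneven, as they are in this benchmark.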
The Nuc-PLoc server is implemented in C and HTML on a Fedora Linux system and can be accessed freely at http://chou.med.harvard.edu/bioinf/Nuc-PLoc. Based on local computation on an AMD Athlon(tm) dual-core processor 4200+ with 2.0 GB of RAM, a prediction result is obtained in 30 ± 12 seconds for each query sequence.
| Conclusion |
|---|
The following conclusions have been drawn through this study. (i) The success rate in identifying the protein subnuclear localization can be significantly enhanced by incorporating the protein evolution information. (ii) The ensemble classifier formed by fusing a series of basic classifiers through a voting system is a very efficient approach that allows the predictor to cover as much information as possible without causing the over-fitting problem.
To support the people working in the relevant area, a web-server called Nuc-PLoc is provided at http://chou.med.harvard.edu/bioinf/Nuc-PLoc, which is freely accessible to the public.
| Footnotes |
|---|
4 Present address: Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| Acknowledgement |
|---|
The authors wish to express their gratitude to the two anonymous reviewers, whose constructive comments were very helpful in strengthening the presentation of this study.
| References |
|---|
Alberts B., Johnson A., Lewis J., Raff M., Roberts K., Walter P. Molecular Biology of the Cell (2002) 4th edn. New York: Garland Science.
Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. Nucleic Acids Res. (1997) 25:3389–3402.
Cao Y., Liu S., Zhang L., Qin J., Wang J., Tang K. BMC Bioinformatics (2006) 7:20.
Cedano J., Aloy P., Perez-Pons J.A., Querol E. J. Mol. Biol. (1997) 266:594–600.
Chen C., Tian Y.X., Zou X.Y., Cai P.X., Mo J.Y. J. Theor. Biol. (2006a) 243:444–448.
Chen C., Zhou X., Tian Y., Zou X., Cai P. Anal. Biochem. (2006b) 357:116–121.
Chen Y.L., Li Q.Z. J. Theor. Biol. (2007) doi:10.1016/j.jtbi.2007.05.019.
Chou K.C. Proteins Struct. Funct. Genet. (2001) 43:246–255. (Erratum: ibid. 2001, 44, 60).
Chou K.C., Cai Y.D. J. Biol. Chem. (2002) 277:45765–45769.
Chou K.C., Elrod D.W. Protein Eng. (1999) 12:107–118.
Chou K.C., Shen H.B. J. Proteome Res. (2007a) 6:1728–1734.
Chou K.C., Shen H.B. Anal. Biochem. (2007b) 370:1–16.
Chou K.C., Zhang C.T. J. Biol. Chem. (1994) 269:22014–22020.
Chou K.C., Zhang C.T. Crit. Rev. Biochem. Mol. Biol. (1995) 30:275–349.
Du P., Li Y. BMC Bioinformatics (2006) 7:518.
Emanuelsson O., Brunak S., von Heijne G., Nielsen H. Nat. Protoc. (2007) 2:953–971.
Emanuelsson O., Nielsen H., Brunak S., von Heijne G. J. Mol. Biol. (2000) 300:1005–1016.
Feng Z.P. Biopolymers (2001) 58:491–499.
Feng Z.P. In Silico Biol. (2002) 2:291–303.
Feng Z.P., Zhang C.T. Int. J. Biol. Macromol. (2001) 28:255–261.
Gao Q.B., Wang Z.Z. Protein Eng. Des. Sel. (2006) 19:511–516.
Gao Q.B., Wang Z.Z., Yan C., Du Y.H. FEBS Lett. (2005) 579:3444–3448.
Gardy J.L., et al. Nucleic Acids Res. (2003) 31:3613–3617.
Garg A., Bhasin M., Raghava G.P. J. Biol. Chem. (2005) 280:14427–14432.
Guo J., Lin Y., Liu X. Proteomics (2006) 6:5099–5105.
Hoglund A., Donnes P., Blum T., Adolph H.W., Kohlbacher O. Bioinformatics (2006) 22:1158–1165.
Hua S., Sun Z. Bioinformatics (2001) 17:721–728.
Huang Y., Li Y. Bioinformatics (2004) 20:21–28.
Jahandideh S., Abdolmaleki P., Jahandideh M., Asadabadi E.B. Biophys. Chem. (2007) 128:87–93.
Kedarisetti K.D., Kurgan L.A., Dick S. Biochem. Biophys. Res. Commun. (2006) 348:981–988.
Kurgan L.A., Stach W., Ruan J. J. Theor. Biol. (2007) doi:10.1016/j.jtbi.2007.05.017.
Lee K., Kim D.W., Na D., Lee K.H., Lee D. Nucleic Acids Res. (2006) 34:4655–4666.
Lei Z., Dai Y. BMC Bioinformatics (2005) 6:291.
Lin H., Li Q.Z. Biochem. Biophys. Res. Commun. (2007a) 354:548–551.
Lin H., Li Q.Z. J. Comput. Chem. (2007b) 28:1463–1466.
Lodish H., Baltimore D., Berk A., Zipursky S.L., Matsudaira P., Darnell J. Molecular Cell Biology, Chap. 3 (1995) 3rd edn. New York: Scientific American Books.
Matsuda S., Vert J.P., Saigo H., Ueda N., Toh H., Akutsu T. Protein Sci. (2005) 14:2804–2813.
Mondal S., Bhavna R., Mohan Babu R., Ramakumar S. J. Theor. Biol. (2006) 243:252–260.
Murphy R.F., Boland M.V., Velliste M. Proc. Int. Conf. Intell. Syst. Mol. Biol. (2000) 8:251–259.
Nair R., Rost B. Protein Sci. (2002) 11:2836–2847.
Nakai K. Adv. Protein Chem. (2000) 54:277–344.
Nakai K., Horton P. Trends Biochem. Sci. (1999) 24:34–36.
Nakai K., Kanehisa M. Genomics (1992) 14:897–911.
Nakashima H., Nishikawa K. J. Mol. Biol. (1994) 238:54–61.
Nakashima H., Nishikawa K., Ooi T. J. Biochem. (1986) 99:152–162.
Pan Y.X., Zhang Z.Z., Guo Z.M., Feng G.Y., Huang Z.D., He L. J. Protein Chem. (2003) 22:395–402.
Park K.J., Kanehisa M. Bioinformatics (2003) 19:1656–1663.
Pierleoni A., Martelli P.L., Fariselli P., Casadio R. Bioinformatics (2006) 22:e408–e416.
Pu X., Guo J., Leung H., Lin Y. J. Theor. Biol. (2007) 247:259–265.
Schaffer A.A., Aravind L., Madden T.L., Shavirin S., Spouge J.L., Wolf Y.I., Koonin E.V., Altschul S.F. Nucleic Acids Res. (2001) 29:2994–3005.
Shen H.B., Chou K.C. Biochem. Biophys. Res. Commun. (2005a) 337:752–756.
Shen H.B., Chou K.C. Biochem. Biophys. Res. Commun. (2005b) 334:288–292.
Shen H.B., Chou K.C. Biochem. Biophys. Res. Commun. (2007) 355:1006–1011.
Shi J.Y., Zhang S.W., Pan Q., Cheng Y.-M., Xie J. Amino Acids (2007) doi:10.1007/s00726-006-0475-y.
Spector D.L. J. Cell. Sci. (2001) 114:2891–2893.
Vapnik V. Statistical Learning Theory (1998) New York: Wiley-Interscience.
Xiao X., Shao S., Ding Y., Huang Z., Huang Y., Chou K.C. Amino Acids (2005) 28:57–61.
Yuan Z. FEBS Lett. (1999) 451:23–26.
Zhang S.W., Pan Q., Zhang H.C., Shao Z.C., Shi J.Y. Amino Acids (2006a) 30:461–468.
Zhang T., Ding Y., Chou K.C. Comput. Biol. Chem. (2006b) 30:367–371.
Zhang Z.H., Wang Z.H., Zhang Z.R., Wang Y.X. FEBS Lett. (2006c) 580:6169–6174.
Zhou G.P. J. Protein Chem. (1998) 17:729–738.
Zhou G.P., Doctor K. Proteins Struct. Funct. Genet. (2003) 50:44–48.
Received July 16, 2007; revised August 13, 2007; accepted September 13, 2007.