Protein Engineering vol. 16 no. 11 pp. 791-793, 2003
© 2003 Oxford University Press
SMoS: a database of structural motifs of protein superfamilies
1 National Centre for Biological Sciences (NCBS), Bangalore 560 065, India and 2 Centre for Biotechnology, Anna University, Chennai 600 025, India
3 To whom correspondence should be addressed. e-mail: mini{at}ncbs.res.in
Abstract
The Structural Motifs of Superfamilies (SMoS) database provides information about the structural motifs of aligned protein domain superfamilies. Such motifs among structurally aligned multiple members of protein superfamilies are recognized by the conservation of amino acid preference and solvent inaccessibility and are examined for the conservation of other features like secondary structural content, hydrogen bonding, non-polar interaction and residue packing. These motifs, along with their sequence and spatial orientation, represent the conserved core structure of each superfamily and also provide the minimal requirement of sequence and structural information to retain each superfamily fold.
Introduction
The superfamily is a hierarchical classification that contains proteins of different families and subfamilies having similar structure and function. These proteins might have very low sequence identities but retain the same fold through well conserved secondary structural parts. On the basis of conservation of criteria, like amino acid preference and solvent accessibility, several conserved segments of proteins belonging to the same superfamily have been identified. These segments are termed structural motifs. These motifs, along with their sequence and spatial orientation and preservation of various structural criteria, represent the conserved core of each superfamily. The structural features of such motifs for several superfamilies are integrated into the Structural Motifs of Superfamilies (SMoS) database. The definition of superfamilies is in direct correspondence with SCOP (Murzin et al., 1995
Description
Aligned sequences of superfamilies have been obtained from CAMPASS (Sowdhamini et al., 1998
Additional structural parameters such as secondary structural content, hydrogen bonding, non-polar interaction and residue packing (Ooi number; Nishikawa and Ooi, 1986) are examined among structurally aligned multiple members of protein superfamilies. The motifs are also ranked on the basis of conservation of all these criteria (Figure 1). A structural feature is considered conserved at an alignment position if it is present in all or all-but-one members of a superfamily. The average conservation score, considering all six structural features, has been calculated and is represented in a graphical format. The extent of conservation of the structural features is also compared between the identified motif regions and the rest of the protein.
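The "all or all-but-one" rule and the averaged score described above can be illustrated with a minimal sketch. The data layout (`features[name][member][position]` as booleans) and the function names are assumptions made for illustration; they are not taken from SMoS, and the exact scoring formula used by the database is not given in the text.

```python
from typing import Dict, List


def is_conserved(column: List[bool]) -> bool:
    """A feature is conserved if present in all or all-but-one members."""
    return sum(column) >= len(column) - 1


def conservation_scores(features: Dict[str, List[List[bool]]]) -> List[float]:
    """Per-position fraction of structural features that are conserved.

    features[name][member][position] is True when the named feature
    (hypothetical keys, e.g. "buried" or "hbond") is present for that
    member at that alignment position.
    """
    names = list(features)
    n_positions = len(features[names[0]][0])
    scores = []
    for pos in range(n_positions):
        # Count how many features pass the all-or-all-but-one rule here.
        conserved = sum(
            is_conserved([member[pos] for member in features[name]])
            for name in names
        )
        scores.append(conserved / len(names))
    return scores


# Toy alignment: two features, three members, four positions.
toy = {
    "buried": [
        [True, True, False, False],
        [True, False, False, True],
        [True, True, False, False],
    ],
    "hbond": [
        [True, True, True, False],
        [True, True, True, False],
        [True, False, False, False],
    ],
}
print(conservation_scores(toy))
```

With all six features supplied as keys, the same function would give a per-position average that could be plotted along the alignment.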
Despite similar topology or conservation of residues, evolutionary divergence and poor sequence identity amongst superfamily members are often reflected as differences in the orientations and positions of individual structural motifs. The motif regions are transformed into a vector representation by the least-squares fit method (Chou et al., 1984
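One way to picture a vector representation of a motif is a least-squares line fitted through its Cα coordinates. The sketch below is an assumption-laden illustration, not the procedure of Chou et al. or of SMoS: it finds the best-fit axis as the dominant eigenvector of the scatter matrix by power iteration, and the function names and the choice of Cα atoms are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def fit_axis(points: List[Point], iterations: int = 100) -> Tuple[tuple, tuple]:
    """Least-squares line through 3D points: (centroid, unit direction)."""
    if len(points) < 2 or list(points[0]) == list(points[-1]):
        raise ValueError("need at least two points with distinct end points")
    n = len(points)
    c = tuple(sum(p[i] for p in points) / n for i in range(3))
    centered = [[p[i] - c[i] for i in range(3)] for p in points]
    # 3x3 scatter matrix of the centred coordinates.
    s = [[sum(q[i] * q[j] for q in centered) for j in range(3)]
         for i in range(3)]
    # Start from the end-to-end vector so the axis runs first -> last point.
    v = [points[-1][i] - points[0][i] for i in range(3)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(s[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate case: keep the current direction
        v = [x / norm for x in w]
    return c, tuple(v)


def angle_deg(u: Point, v: Point) -> float:
    """Angle in degrees between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.degrees(math.acos(dot))


# Two toy "motifs": one along x, one along y (coordinates in angstroms).
motif_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
motif_b = [(5.0, 0.0, 0.0), (5.0, 1.0, 0.0), (5.0, 2.0, 0.0), (5.0, 3.0, 0.0)]
centre_a, axis_a = fit_axis(motif_a)
centre_b, axis_b = fit_axis(motif_b)
print(angle_deg(axis_a, axis_b), math.dist(centre_a, centre_b))
```

Comparing inter-axis angles and centroid distances across superfamily members is one simple way to quantify how motif orientations and positions differ.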
Applications
These motifs are useful because they are conserved sequence patterns that can assist in the identification of more members of an existing superfamily. Motifs derived from 12 superfamilies, when scanned against non-redundant sequence databases, successfully recognized 104 uncharacterized or hypothetical proteins that are distantly related to known superfamilies of proteins and unobtainable by other sensitive procedures (S.Chakrabarti and R.Sowdhamini, unpublished data). For example, the connection between a hypothetical protein and members of the cysteine hydrolase superfamily could be identified using this approach despite a poor sequence identity of 15%. Such structural templates or motifs provide constraints that are complementary to functional motifs obtained from various resources. Spatial restraints derived from structural templates also yield more accurate three-dimensional models of protein sequences by homology modelling when there is only a distant relationship between the query and any of the structural homologues; this is detailed elsewhere (S.Chakrabarti, J.John and R.Sowdhamini, submitted for publication). This strategy can be employed, in general, to overcome the inherent limitation of comparative modelling methods when using multiple distantly related templates.
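The search procedure itself is not described here (it is cited as unpublished), so the following is only a hypothetical sketch of what scanning a sequence with a motif could look like. It assumes a motif is stored as one set of allowed residues per alignment column and scores each window by the fraction of positions that match; the names `motif_from_columns`, `scan` and `min_fraction` are invented for illustration.

```python
from typing import List, Set, Tuple


def motif_from_columns(columns: List[str]) -> List[Set[str]]:
    """Allowed residues per motif position, taken from alignment columns."""
    return [set(col) - {"-"} for col in columns]


def scan(sequence: str, motif: List[Set[str]],
         min_fraction: float = 0.8) -> List[Tuple[int, float]]:
    """Return (start, matched fraction) for windows meeting the threshold."""
    width = len(motif)
    if width == 0:
        return []
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        matched = sum(res in allowed for res, allowed in zip(window, motif))
        fraction = matched / width
        if fraction >= min_fraction:
            hits.append((start, fraction))
    return hits


# Toy motif of four positions and a made-up query sequence.
motif = motif_from_columns(["LIV", "G", "AS", "D"])
print(scan("MKLGADPIGSDQ", motif, min_fraction=1.0))
```

A realistic search would use several motifs per superfamily, a substitution-aware score and constraints on motif order and spacing; none of those details are specified in the text.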
Discussion
Structural motifs can be used as sequence signatures for proteins belonging to a similar functional class under the classification strata of the superfamily. Therefore, these conserved regions can be utilized to identify and classify similar sequences of the superfamily of proteins. The objective definition of a structural motif is somewhat context dependent. We have used the conservation of structural features like amino acid sequence similarity and solvent burial as the primary requirement for identification of structural motifs since they represent the core of proteins. This has been emphasized even for homologous families (Zvelebil et al., 1987
The availability of a web resource for structural motifs of superfamilies is valuable since evolutionary divergence makes it impossible to derive conserved sequence segments simply by residue conservation. Identification and projection of structure-based motifs mapped on alignments will be useful for improving alignments and for building better three-dimensional models involving distant relationships. This is a natural follow-up of alignments of distantly related proteins that can be grouped into superfamilies (Sowdhamini et al., 1998). Structural motifs provided in the SMoS database have important applications in sequence searches, sequence alignments and distant homology modelling. They can also help to rationalize and design mutation experiments in proteins.
Availability
The SMoS database is accessible via http://www.ncbs.res.in/faculty/mini/SMoS/index.htm
References
Chou,K.C., Némethy,G. and Scheraga,H.A. (1984) J. Am. Chem. Soc., 106, 3161-3170.
Felsenstein,J. (1997) Syst. Biol., 46, 101-111.
Mallika,V., Bhaduri,A. and Sowdhamini,R. (2002) Nucleic Acids Res., 30, 284-288.
Murzin,A.G., Brenner,S.E., Hubbard,T. and Chothia,C. (1995) J. Mol. Biol., 247, 536-540.
Nishikawa,K. and Ooi,T. (1986) J. Biochem., 100, 1043-1047.
Overington,J.P., Johnson,M.S., Sali,A. and Blundell,T.L. (1990) Proc. R. Soc. Lond. B Biol. Sci., 241, 132-145.
Sali,A. and Blundell,T.L. (1990) J. Mol. Biol., 212, 403-428.
Sayle,R.A. and Milner-White,E.J. (1995) Trends Biochem. Sci., 20, 374-375.
Sowdhamini,R., Burke,D.F., Huang,J.F., Mizuguchi,K., Nagarajaram,H.A., Srinivasan,N., Steward,R.E. and Blundell,T.L. (1998) Structure, 6, 1087-1094.
Srinivasan,N., Sowdhamini,R., Ramakrishnan,C. and Balaram,P. (1991) In Balaram,P. and Ramaseshan,S. (eds), Molecular Conformation and Biological Interactions. Indian Academy of Sciences, Bangalore, pp. 59-73.
Zvelebil,M.J., Barton,G.J., Taylor,W.R. and Sternberg,M.J. (1987) J. Mol. Biol., 195, 957-961.
Received April 25, 2003; revised September 9, 2003; accepted September 24, 2003.