Protein Engineering, Vol. 15, No. 12, 979-986,
December 2002
© 2002 Oxford University Press
Binding free energy calculations of galectin-3–ligand interactions
Department of Chemistry, University of Calcutta, 92 A.P.C. Road, Kolkata-700 009, India
| Abstract |
|---|
Galectins show remarkable binding specificity towards β-galactosides. A recently developed method for calculating binding free energies between a protein and its substrates has been used to evaluate the binding specificity of galectin-3. Five disaccharides and a tetrasaccharide were used as the substrates. The calculated binding free energies agree quite well with the experimental data and the ranking of binding affinities is well reproduced. For all six protein–ligand complexes it was observed that electrostatic interactions oppose binding whereas the non-polar contributions drive complex formation. The observed binding specificity of galectin-3 for galactosides rather than glucosides is discussed in light of our results.
Keywords: binding free energies/binding specificity/galectin-3/molecular dynamics simulations
| Introduction |
|---|
Mammalian cell surfaces and the extracellular matrix that surrounds them are rich in glycoconjugates (e.g. glycoproteins and glycolipids). Complex carbohydrates, an integral part of the cell surface glycoconjugates, are now being recognized as molecules with an enormous capacity to encode meaningful messages in their monosaccharide components, linkages, branching patterns, etc. They can act as recognition units for indigenous receptors, e.g. lectins (for review see Lis and Sharon, 1998).
Amongst the oligosaccharides commonly found on the cell surface, the β-galactosides are prominent components of the sugar chains of both glycoproteins and glycolipids. The receptors for this linkage are galectins, animal lectins having at least one carbohydrate recognition domain (CRD) with specific β-galactoside binding activity (Barondes et al., 1994; Cooper and Barondes, 1999). These galectins are soluble, widely distributed in the vertebrates and perform diverse intra- and extracellular biological functions (Perillo et al., 1998). In mammals, so far, 12 glycoproteins have been defined as galectins (galectin-1 through galectin-12) (Cooper and Barondes, 1999; Yang et al., 2001). One of them, galectin-3, a well-studied and representative member of this family, contains a conserved ~14 kDa carbohydrate recognition domain showing high affinity for β-galactosides. Expression of galectin-3 is highest in activated macrophages, basophils and mast cells (Frigeri et al., 1993; Liu, 1993; Sato and Hughes, 1994), some epithelial cells, e.g. intestine, kidney (Foddy et al., 1990; Lindstedt et al., 1993; Lotz et al., 1993) and in some sensory neurons (Regan et al., 1986; Cameron et al., 1993). It has been shown to activate various cell types through cross-linking of appropriate cell surface glycoproteins, including cell adhesion molecules, to promote neurite growth (Pesheva et al., 1998) and to induce differentiation and angiogenesis of endothelial cells (Nangia-Makker et al., 2000). It acts as a chemoattractant for monocytes (Sano et al., 2000) and endothelial cells (Nangia-Makker et al., 2000). It has also been shown that galectin-3 is active in vitro in inducing pre-mRNA splicing (Dagher et al., 1995). Galectin-3 is over-expressed in some types of cancer in which the normal parental cells do not express the protein, including specific types of lymphomas (Hsu et al., 1996; Konstantinov et al., 1996), thyroid carcinoma (Xu et al., 1995; Fernandez et al., 1997; Hsu et al., 1999), etc. Studies of cells transfected with galectin-3 cDNA or treated with specific antisense oligonucleotide, however, have provided evidence for the involvement of galectin-3 in tumor development and metastasis (Raz et al., 1990; Bresalier et al., 1998). It is likely that the glycoconjugate-mediated recognition processes of galectin-3 are the key step in many of these biological processes.
It has been found experimentally that galectin-3 has different binding activity for galactose-containing oligosaccharides. The relative binding affinity of galectin-3 with different β-galactosides is Galβ(1–4)GlcNAc > Galβ(1–3)GlcNAc > Galβ(1–4)Glc > Galβ(1–3)GalNAc > Glcβ(1–4)Glc (Sparrow et al., 1987). The binding affinity of galectin-3 for oligosaccharides containing the above disaccharide linkages is stronger. This indicates that the primary binding site in galectin-3 may be specific for galactose but the secondary sites can have more flexibility in terms of the type of ligand. The structure of human galectin-3 carbohydrate recognition domain (CRD) complexed with lactose/N-acetyllactosamine has been solved at 2.1 Å resolution (Seetharaman et al., 1998). The high binding activity is attained by galectin-3 through a tightly coordinated combination of hydrogen bonding, hydrophobic aromatic residue–sugar interactions, and a precise steric fit.
To capitalize on knowledge about the subtleties of lectin–carbohydrate interaction for rational marker/drug design, the intimate details of the recognition process need to be understood and exploited. In order to provide mechanistic insight into the ligand–receptor interaction or to explain the binding affinity of a ligand, the individual contributions to the overall free energy change of the various factors originating from the receptor, the ligand and/or the solvent have to be estimated. It is essential to study several receptor–ligand complexes to arrive at a consensus. X-ray diffraction data, which provide the three-dimensional structure of the complexes, do not provide detailed information about the contributions of ligand and/or receptor flexibility. Also, the number of experimentally known ligand–receptor complexes is limited. Hence, to provide mechanistic insights into the origins of ligand binding activity, computer-assisted molecular modeling of the reactants before and after complex formation can be utilized.
We have chosen the relative binding activity data for Galβ(1–4)Glc, Galβ(1–4)GlcNAc, Galβ(1–3)GalNAc, Galβ(1–3)GlcNAc, Glcβ(1–4)Glc and an oligosaccharide as test cases. Åqvist et al. (Åqvist et al., 1994) have proposed a semi-empirical linear interaction energy (LIE) approach for estimating absolute ligand binding free energies which depends on molecular dynamics simulations of the bound and free ligands in solution. Here the binding free energy is approximated as:
$$\Delta G_{\mathrm{bind}} = \alpha\left(\langle V^{\mathrm{el}}_{\mathrm{bound}}\rangle - \langle V^{\mathrm{el}}_{\mathrm{free}}\rangle\right) + \beta\left(\langle V^{\mathrm{vdw}}_{\mathrm{bound}}\rangle - \langle V^{\mathrm{vdw}}_{\mathrm{free}}\rangle\right) \qquad (1)$$
where ⟨V^el⟩ and ⟨V^vdw⟩ are the MD-averaged electrostatic and van der Waals interaction energies between the ligand and its surroundings in the bound and free states, and α and β are empirical parameters. Åqvist et al. (Åqvist et al., 1994) suggested that α ≈ 0.5 and β ≈ 0.16 are suitable for different protein systems. However, other values of these two parameters have been reported for different protein systems and it appears that these two parameters may be protein dependent. Regardless of the transferability of these parameters, the LIE calculation method has been shown to be quite successful in predicting relative binding affinities of ligands. There are several advantages to this method. Since the LIE method simulates only the end states, it is quite fast; it takes into account the flexibility of both the reactants; and finally, since solvent molecules are explicitly included, the desolvation free energy can be reasonably handled (W.Wang et al., 1999).
| Materials and methods |
|---|
Disaccharide formation
Molecular modeling was carried out with the InsightII package (version 98.0, Accelrys Inc.). Molecular mechanics and molecular dynamics calculations used the DISCOVER module with the CVFF force field. All the disaccharides and the tetrasaccharide were built from monosaccharide templates. The glycosidic dihedral angles were varied from 0° to 360° at 30° intervals. Each conformation thus generated was minimized keeping the dihedral angles fixed. From the grid search the lowest energy conformation was identified and minimized without further constraint. The lowest energy conformation thus obtained was taken as the minimized conformation of the disaccharide. The tetrasaccharide was built by joining the minimized disaccharide conformations and further minimized without constraint. The energy was minimized using the conjugate gradient method with no non-bonded cutoff. A constant dielectric of 1.0 was used for all calculations.
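The grid search described above can be sketched as follows. This is only an illustration under stated assumptions: it supposes two glycosidic dihedrals (here called `phi` and `psi`) scanned jointly, and `toy_energy` is a hypothetical stand-in for the constrained force-field minimization, which in the study was done in DISCOVER with CVFF.

```python
import itertools
import math


def toy_energy(phi_deg, psi_deg):
    """Hypothetical energy surface; NOT the CVFF energy used in the paper."""
    phi = math.radians(phi_deg - 60)
    psi = math.radians(psi_deg - 120)
    return (1 - math.cos(phi)) + (1 - math.cos(psi))


def grid_search(energy_fn, step_deg=30):
    """Scan both dihedrals at step_deg intervals and return the best pair."""
    angles = range(0, 360, step_deg)  # 360 deg is the same conformation as 0 deg
    grid = itertools.product(angles, repeat=2)
    return min(grid, key=lambda pair: energy_fn(*pair))


best_phi, best_psi = grid_search(toy_energy)
print(best_phi, best_psi)
```

With 30° steps this scan visits 12 × 12 = 144 conformations; the lowest one would then be minimized again without the dihedral constraint.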
Ligand–water assembly
The minimized conformations of all the disaccharide ligands were surrounded by a layer of water molecules about 12 Å thick. The total number of water molecules required was about 359–362 for the disaccharides and 454 for the tetrasaccharide. The water layer was divided into two parts. The inner part was constructed by defining the interface water molecules within about 6 Å of the ligand. The rest of the water molecules were defined as the outer layer. The inner water layer contained about 134–144 water molecules for the disaccharides and 186 for the tetrasaccharide. The water layer of the ligand–water assemblies was minimized initially with the inner water layer fixed, then with the outer layer fixed, and finally with all the solvent molecules free. The minimization was performed by the steepest descent method followed by the conjugate gradient method, using a 15 Å non-bonded cutoff and a constant dielectric of 1.0. All the optimized ligand–water assemblies were subjected to a dynamics run at 300 K with the covalent bond lengths constrained using Rattle. Initial equilibration for 100 ps was followed by a 1.5 ns simulation run. The time step for the dynamics simulation was 2 fs.
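As a quick check of the simulation lengths quoted above, the number of integration steps follows directly from the 2 fs time step. The helper name `md_steps` is an illustrative assumption, not part of any simulation package.

```python
def md_steps(duration_ps, timestep_fs=2.0):
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps * 1000.0 / timestep_fs)


equilibration_steps = md_steps(100)   # 100 ps equilibration
production_steps = md_steps(1500)     # 1.5 ns production run
total_steps = equilibration_steps + production_steps
print(equilibration_steps, production_steps, total_steps)
# prints: 50000 750000 800000
```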
Galectin–ligand–water assembly
A recently published crystal structure (Seetharaman et al., 1998) of the carbohydrate recognition domain of galectin-3 complexed with Galβ(1–4)GlcNAc (PDB entry 1A3K) was used as the model of the galectin-3 molecule. Hydrogen atoms were added to the crystal structure and their positions were optimized by minimizing the structure keeping the heavy atoms fixed and removing all implicit water molecules. The minimization was performed using the steepest descent and conjugate gradient methods. The minimized conformations of the ligands Glcβ(1–4)Glc, Galβ(1–3)GalNAc, Galβ(1–4)Glc, Galβ(1–3)GlcNAc, Galβ(1–4)GlcNAc and Gal(1–3)GlcNAc(1–3)Gal(1–4)Glc were docked on the minimized structure of galectin. These docked complexes were further minimized, initially with all the protein atoms fixed and then with the side-chain atoms free. For the tetrasaccharide, either the first or the second galactose residue was docked in the binding site and the energy minimized. The complex with the second galactose residue at the primary binding site was ~20 kcal/mol lower in energy than the complex with the first galactose residue at the primary binding site. We used the former complex for the MD simulations. Water molecules were layered on the minimized structure of the galectin–ligand complexes with a thickness of 15 Å. A total of about 2800–2900 water molecules was needed. In the simulations reported here, the crystallographic water molecules were not included, as was also the case with Åqvist and Mowbray (Åqvist and Mowbray, 1995). Since the solvent molecules that were used have no bias in their positions, using some of them with defined positions might bias the calculations. The entire system was optimized, initially with the complex fixed and then with only the backbone atoms of the protein fixed. The minimized conformation was then superimposed on the crystal structure and it was observed that there are water molecules at most of the crystallographic water positions. During the minimization a 15 Å non-bonded cutoff and a constant dielectric of 1.0 were used. All the minimized structures of the galectin–ligand–water assemblies were subjected to dynamics simulation at 300 K, during which the interface water and protein atoms within a 10 Å sphere around the ligand molecule were allowed to move and the rest of the protein and water atoms were held fixed. The covalent bond lengths were constrained using Rattle (as implemented in DISCOVER). The time step of the dynamics simulation was 2 fs. The initial equilibration of 100 ps was followed by a 500 ps simulation run.
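The mobile region described above can be illustrated with a small selection routine. This is a sketch only: it assumes the 10 Å sphere is taken as "within 10 Å of any ligand atom" (the text does not say whether the sphere is centred on the ligand atoms or on its centroid), and the coordinates below are made-up points, not taken from the crystal structure.

```python
import math


def mobile_atoms(atoms, ligand_atoms, radius=10.0):
    """Return indices of atoms lying within `radius` Å of any ligand atom."""
    return [
        index
        for index, xyz in enumerate(atoms)
        if min(math.dist(xyz, lig) for lig in ligand_atoms) <= radius
    ]


# Hypothetical coordinates in Å, for illustration only.
ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
environment = [(5.0, 0.0, 0.0), (10.5, 0.0, 0.0), (12.0, 0.0, 0.0)]

free_to_move = mobile_atoms(environment, ligand)
print(free_to_move)
```

Atoms whose indices are returned would be left free during dynamics; all others would be held fixed.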
| Results and discussion |
|---|
We have used five disaccharides and a tetrasaccharide ligand to study the interactions with galectin-3. Table I
The coefficient used for the electrostatic interaction energy in Equation 1
in the paper by Åqvist et al.). The values obtained for different systems varied widely, ranging from 0.16 to 1.043 (Åqvist et al., 1994
The structural agreement between the average MD structures and the experimental one is very good. The root-mean-square deviation of the averaged MD coordinates with respect to the X-ray coordinates of the galectin-3–N-acetyllactosamine complex is between 0.63 and 0.76 Å for all the heavy atoms of the protein within the 10 Å sphere of the ligand, which were kept mobile during the MD simulations. The time-averaged structures of the lectin–ligand complexes obtained from the simulations were examined to understand the atomic details of the binding specificity. Figure 1a–f show the binding site of galectin-3 with the different ligands. The hydrogen bonded partners observed in the MD average conformations have been compared with those observed in the crystal structure of the galectin-3–N-acetyllactosamine complex and are given in Table II. The comparison shows that most of the hydrogen bonds are retained during the MD simulations. Figure 1a shows the MD average structure of the galectin-3–N-acetyllactosamine complex. The binding is mostly through the hydroxyl groups at the 4 and 6 positions and the O5 atom of the galactose residue, and the hydroxyl group at position 3 of the GlcNAc residue. The protein residues participating in direct hydrogen bonding are His158, Arg162, Asn174 and Glu184. In addition, several water molecules also participate in the hydrogen bonding network. Although crystallographic water molecules were not included in the simulations, the reported water-mediated hydrogen bonds with O2 and O6 of Gal and O6 of Glc/GlcNAc have been reproduced (Table II). In addition, several water-mediated hydrogen bonds that were not observed crystallographically have been observed in the solution simulations.
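The heavy-atom root-mean-square deviation quoted above can be computed as in the following sketch. It assumes the two coordinate sets list the same atoms in the same order and have already been superimposed; the coordinates shown are invented for illustration and are not from the simulations.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates in Å: every atom is displaced by 1 Å along z.
xray = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
md_average = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

deviation = rmsd(xray, md_average)
print(deviation)
```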
It is interesting to note that galectin can bind both Galβ(1–4) and Galβ(1–3) linkages (Sparrow et al., 1987
Comparison of the calculated binding free energies with the experimental ones shows that the calculations are reasonably successful in reproducing the small observed experimental range, despite the fact that the individual interaction energies are quite large. The correct order of the binding affinities is also reproduced in most of the cases. The specificity of galectins for the galactose-containing disaccharides (β-galactosides) is remarkable; however, it is observed that replacement of galactose by glucose at the non-reducing end of the disaccharide leads to a drastic reduction in binding by the lectin (Sparrow et al., 1987
Gcalc value for the disaccharide Glcβ(1–4)Glc. In the free form it is observed that Glcβ(1–4)Glc has the most favourable electrostatic interaction with the solvent molecules, and binding of the lectin to this particular ligand leads to a large reduction in the electrostatic contribution which is not compensated by the small favourable van der Waals component. On the other hand, the Galβ(1–4)GlcNAc ligand, which has a higher experimentally observed binding affinity, has a smaller unfavourable electrostatic component and a larger favourable van der Waals interaction. Superimposition of the MD average conformations of the binding sites containing these two ligands is shown in Figure 3
In a recent report on calculation of binding free energies for MHC class I protein–peptide interactions using the continuum method, Froloff et al. (Froloff et al., 1997
| Conclusion |
|---|
The present study involves the calculation of absolute binding free energies for one tetrasaccharide and five disaccharides to galectin-3 using the LIE approximation. The linear response method used here was able to reproduce the experimental data reasonably well. Unlike free energy perturbation and thermodynamic integration methods, this method is not CPU intensive. In addition, no ligand mutation is necessary during the simulations and this allows the use of a variety of different ligands. One limitation of this method is the choice of the van der Waals coefficient, for which system dependency has been observed. We have found that this method is able to describe the binding affinity of the lectin satisfactorily. Using this approach it is also possible to suggest modes of binding for a ligand for which crystallographic data are not available.
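The LIE estimate summarized above amounts to a weighted sum of two energy differences, which the following sketch makes concrete. It assumes the coefficients mentioned in the Introduction (0.5 for the electrostatic term and 0.16 for the van der Waals term); the four average energies are made-up numbers chosen only to illustrate the arithmetic and are not results from this study.

```python
import math


def lie_binding_free_energy(el_bound, el_free, vdw_bound, vdw_free,
                            alpha=0.5, beta=0.16):
    """LIE estimate: alpha * d<V_el> + beta * d<V_vdw>, bound minus free."""
    electrostatic = alpha * (el_bound - el_free)
    van_der_waals = beta * (vdw_bound - vdw_free)
    return electrostatic + van_der_waals


# Hypothetical MD-averaged ligand-surroundings energies in kcal/mol.
dg = lie_binding_free_energy(
    el_bound=-60.0, el_free=-70.0,    # electrostatics less favourable when bound
    vdw_bound=-50.0, vdw_free=-10.0,  # van der Waals more favourable when bound
)
print(round(dg, 2))
```

In this invented example the electrostatic term is positive and the van der Waals term is negative, mirroring the qualitative pattern reported in the Abstract; the net value depends strongly on the choice of `beta`, which is the system dependency noted above.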
The results reported here suggest that both hydrogen bonding and hydrophobic interactions are important for determining the affinity of the carbohydrates towards the lectin. While the aqueous environment might favour the solvation of the carbohydrates electrostatically, the binding to protein is favoured by van der Waals interaction. The polar residues of the protein, e.g. Arg, His, Glu, Asn, etc. provide the necessary hydrogen bonding partners, and residues like Trp, the CH2 groups of Arg side-chains, etc. can provide the van der Waals neighbours. The results also indicate that the water molecules can compete with protein side-chains to form hydrogen bonds with the ligands. It is interesting to see that in the case of a weakly binding ligand [Glcβ(1–4)Glc], the hydrogen bonding partners, though different from those of a stronger ligand, are quite high in number, whereas the van der Waals counterparts are fewer. We suggest this indicates that, contrary to the popular belief that the sugar binding specificity of the lectins is mostly determined by the number and/or pattern of the hydrogen bonding network, the non-polar interactions are equally crucial. We have also found that better binding ligands lead to a larger burial of solvent accessible surface area of the protein. The intimate details of the recognition process thus obtained can be exploited further for a rational design of mimetics in glycosciences.
| Notes |
|---|
1 To whom correspondence should be addressed. E-mail: chaitali{at}cucc.ernet.in, chaitalicu{at}yahoo.com
| Acknowledgments |
|---|
The work is funded by the Centre for Scientific and Industrial Research, Government of India [No. 01(1500)/98/EMR-II]. The use of the computational facility of the Computer Centre, University of Calcutta is gratefully acknowledged.
| References |
|---|
Åqvist,J. and Mowbray,S.L. (1995) J. Biol. Chem., 270, 9978–9981.
Åqvist,J., Medina,C. and Samuelsson,J.E. (1994) Protein Eng., 7, 385–391.
Barondes,S.H., Cooper,D.N., Gitt,M.A. and Leffler,H. (1994) J. Biol. Chem., 269, 20807–20810.
Bresalier,R.S., Mazurek,N., Sternberg,L.R., Byrd,J.C., Yunker,C.K., Nangia-Makker,P. and Raz,A. (1998) Gastroenterology, 115, 287–296.
Cameron,A.A., Dougherty,P.M., Garrison,C.J., Wills,W.D. and Carlton,S.M. (1993) Brain Res., 620, 64–71.
Cooper,D.N. and Barondes,S.H. (1999) Glycobiology, 9, 979–984 and references therein.
Dagher,S.F., Wang,J.L. and Patterson,R.J. (1995) Proc. Natl Acad. Sci. USA, 92, 1213–1217.
Elgavish,S. and Shaanan,B. (1997) Trends Biochem. Sci., 22, 462–467.
Fernandez,P.L., Merino,M.J., Gomez,M., Campo,E., Medina,T., Castronovo,V., Sanjuan,X., Cardesa,A., Liu,F.-T. and Sobel,M.E. (1997) J. Pathol., 181, 80–86.
Foddy,L., Stamatoglou,S.C. and Hughes,R.C. (1990) J. Cell Sci., 97, 139–148.
Frigeri,L.G., Zuberi,R.I. and Liu,F.T. (1993) Biochemistry, 32, 7644–7649.
Froloff,N., Windemuth,A. and Honig,B. (1997) Protein Sci., 6, 1293–1301.
Hsu,D.K., Hammes,S.R., Kuwabara,I., Greene,W.C. and Liu,F.T. (1996) Am. J. Pathol., 148, 1661–1670.
Hsu,D.K., Dowling,C.A., Jeng,K.C.G., Chen,J.T., Yang,R.Y. and Liu,F.T. (1999) Int. J. Cancer, 81, 519–526.
Kollman,P.A. (1993) Chem. Rev., 93, 2395–2417.
Konstantinov,K.N., Robbins,B.A. and Liu,F.T. (1996) Am. J. Pathol., 148, 25–30.
Lindstedt,R., Apodaca,G., Barondes,S.H., Mostov,K. and Leffler,H. (1993) J. Biol. Chem., 268, 11750–11757.
Lis,H. and Sharon,N. (1998) Chem. Rev., 98, 637–674.
Liu,F.T. (1993) Immunol. Today, 14, 486–490.
Lotz,M.M., Andrews,C.W.Jr, Korzelius,C.A., Lee,E.C., Steele,G.D.Jr, Clarke,A. and Mercurio,A.M. (1993) Proc. Natl Acad. Sci. USA, 90, 3466–3470.
Miyamoto,S. and Kollman,P.A. (1993) Proc. Natl Acad. Sci. USA, 90, 8402–8406.
Nangia-Makker,P., Honjo,Y., Sarvis,R., Akahani,S., Hogan,V., Pienta,K.J. and Raz,A. (2000) Am. J. Pathol., 156, 899–909.
Paulsen,M.D. and Ornstein,R.L. (1996) Protein Eng., 9, 567–571.
Perillo,N.L., Marcus,M.E. and Baum,L.G. (1998) J. Mol. Med., 76, 402–412.
Pesheva,P., Kuklinski,S., Schmitz,B. and Probstmeier,R. (1998) J. Neurosci. Res., 54, 639–654.
Raz,A., Zhu,D., Hogan,V., Shah,N., Raz,T., Karkash,R., Pazerini,G. and Carmi,P. (1990) Int. J. Cancer, 46, 871–877.
Regan,L.J., Dodd,J., Barondes,S.H. and Jessell,T.M. (1986) Proc. Natl Acad. Sci. USA, 83, 2248–2252.
Sano,H., Hsu,D.K., Yu,L., Apgar,J.R., Kuwabara,I., Yamanaka,T., Hirashima,M. and Liu,F.T. (2000) J. Immunol., 165, 2156–2164.
Sato,S. and Hughes,R.C. (1994) J. Biol. Chem., 269, 4424–4430.
Seetharaman,J., Kanigsberg,A., Slaaby,R., Leffler,H., Barondes,S.H. and Rini,J.M. (1998) J. Biol. Chem., 273, 13047–13052.
Sparrow,C.P., Leffler,H. and Barondes,S.H. (1987) J. Biol. Chem., 262, 7383–7390.
Still,W.C., Tempcyk,A., Hawley,R.C. and Hendrickson,T. (1990) J. Am. Chem. Soc., 112, 6127–6129.
Wang,J., Dixon,R. and Kollman,P.A. (1999) Proteins, 34, 69–81.
Wang,W., Wang,J. and Kollman,P.A. (1999) Proteins, 34, 395–402.
Weis,W.I. and Drickamer,K. (1996) Annu. Rev. Biochem., 65, 441–473.
Xu,X.C., El-Naggar,A.K. and Lotan,R. (1995) Am. J. Pathol., 147, 815–822.
Yang,R.Y., Hsu,D.K., Yu,L., Ni,J. and Liu,F.T. (2001) J. Biol. Chem., 276, 20252–20260.
Received March 6, 2002; revised September 24, 2002; accepted October 10, 2002.








