Ligands, dictionary and refinement
Garib N Murshudov
York Structural Biology Laboratory, University of York

Outline
1. Introduction
2. Dictionary of ligands
3. Sources of dictionary and idealised coordinates
4. Tools for ligand description in CCP4
5. How to use the dictionary in refinement (REFMAC)
6. Conclusions

The need for prior chemical knowledge
- Refinement
- Atomic model description
- Graphics
- Simulations
- ...

Atomic model description
ATOM 7 C LEU A
ATOM 8 O LEU A
ATOM 9 N ILE A
ATOM 10 CA ILE A
ATOM 11 CB ILE A
Default pointers in PDB file:
- Pointer to link description
- Pointer to monomer description
- Pointer to atom description

Refmac5 dictionary
Describes:
- All amino acids
- All nucleic acids
- Common sugars
- Many organic and inorganic compounds
- Links and modifications
There are tools to deal with the dictionary.
The dictionary format is mmCIF.

General category
data_comp_list
loop_
_chem_comp.id
_chem_comp.three_letter_code
_chem_comp.name
_chem_comp.group
_chem_comp.number_atoms_all
_chem_comp.number_atoms_nh
_chem_comp.desc_level
GLC-b-D GLC 'beta_D_glucose ' D-pyranose
Group: peptide, DNA/RNA, pyranose, non-polymer
Level: C or M (complete or minimal description)

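The loop_ layout used throughout the dictionary (a list of tags followed by whitespace-separated values) can be read with very little code. The following is a minimal sketch, not the parser used by REFMAC or LIBCHECK: the function names `tokenize` and `parse_loop` are hypothetical, only single-line quoted values are handled, and the numeric and desc_level values in the example row are illustrative placeholders rather than values taken from the slide.

```python
import re

# A token is a single-quoted string, a double-quoted string, or a bare word.
# As in CIF, a closing quote only counts when followed by whitespace or the
# end of the text, so atom names such as O5' remain bare words.
_TOKEN = re.compile(r"""'(.*?)'(?=\s|$)|"(.*?)"(?=\s|$)|(\S+)""")


def tokenize(text):
    """Split mmCIF text into tokens, stripping the quotes from quoted values."""
    return [m.group(m.lastindex) for m in _TOKEN.finditer(text)]


def parse_loop(text):
    """Return the rows of the first loop_ block as a list of {tag: value} dicts."""
    tokens = tokenize(text)
    i = tokens.index("loop_") + 1
    tags = []
    # Tags come first; every tag starts with an underscore.
    while i < len(tokens) and tokens[i].startswith("_"):
        tags.append(tokens[i])
        i += 1
    values = tokens[i:]
    if not tags or len(values) % len(tags) != 0:
        raise ValueError("number of values is not a multiple of the number of tags")
    return [dict(zip(tags, values[j:j + len(tags)]))
            for j in range(0, len(values), len(tags))]


# Example row: the last three values are illustrative placeholders.
EXAMPLE = """
data_comp_list
loop_
_chem_comp.id
_chem_comp.three_letter_code
_chem_comp.name
_chem_comp.group
_chem_comp.number_atoms_all
_chem_comp.number_atoms_nh
_chem_comp.desc_level
GLC-b-D GLC 'beta_D_glucose' D-pyranose 24 12 .
"""

rows = parse_loop(EXAMPLE)
print(rows[0]["_chem_comp.id"], rows[0]["_chem_comp.group"])
```

A real reader would also need to handle multi-line text fields, several loops per data block and quoted values that begin with an underscore; an established mmCIF library is the better choice for production use.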
Atom category
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.type_energy
_chem_comp_atom.partial_charge
_chem_comp_atom.x
_chem_comp_atom.y
_chem_comp_atom.z
GLC-b-D C1 C CH
GLC-b-D H1 H HCH

Bond category
loop_
_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.type
_chem_comp_bond.value_dist
_chem_comp_bond.value_dist_esd
GLC-b-D O1 C1 single
GLC-b-D C2 C1 single
Type: single, double, triple, aromatic, metal

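The value_dist and value_dist_esd columns give an ideal bond length and its estimated standard deviation. The slides do not spell out how these enter the refinement target; the sketch below assumes the usual least-squares form, in which each bond contributes ((d - value_dist) / value_dist_esd)^2. The function name `bond_residual` and all numbers are illustrative.

```python
import math


def bond_residual(xyz1, xyz2, value_dist, value_dist_esd):
    """Return (distance, z-score, squared z-score) for one bond restraint.

    Assumes a least-squares restraint: the z-score is the deviation from the
    ideal length in units of the esd, and its square is the contribution to
    the geometry term.
    """
    d = math.dist(xyz1, xyz2)  # Euclidean distance, Python 3.8+
    z = (d - value_dist) / value_dist_esd
    return d, z, z * z


# Illustrative numbers only: a 1.50 A bond against an ideal of 1.52 +/- 0.02 A.
d, z, contribution = bond_residual((0.0, 0.0, 0.0), (1.5, 0.0, 0.0), 1.52, 0.02)
print(f"d = {d:.2f} A, z = {z:.2f}, contribution = {contribution:.2f}")
```

A small esd makes the restraint tight, so a poorly chosen esd in a ligand dictionary directly changes how strongly the geometry is held during refinement.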
Angle category
loop_
_chem_comp_angle.comp_id
_chem_comp_angle.atom_id_1
_chem_comp_angle.atom_id_2
_chem_comp_angle.atom_id_3
_chem_comp_angle.value_angle
_chem_comp_angle.value_angle_esd
GLC-b-D H1 C1 O
GLC-b-D O1 C1 C

Torsion angles category
loop_
_chem_comp_tor.comp_id
_chem_comp_tor.id
_chem_comp_tor.atom_id_1
_chem_comp_tor.atom_id_2
_chem_comp_tor.atom_id_3
_chem_comp_tor.atom_id_4
_chem_comp_tor.value_angle
_chem_comp_tor.value_angle_esd
_chem_comp_tor.period
GLC-b-D var_1 C1 C2 O2 HO
GLC-b-D var_2 C1 C2 C3 C
Period: number of energetic minima

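Since the period is the number of energetic minima, a torsion with ideal value v and period n has equivalent minima every 360/n degrees, and an observed angle is best compared with the nearest of them. The sketch below illustrates that arithmetic under the assumption that the period is at least 1; the function names and angle values are hypothetical and this is not necessarily how REFMAC evaluates torsions.

```python
def torsion_minima(value_angle, period):
    """List the equivalent minima, in degrees, wrapped into [-180, 180)."""
    step = 360.0 / period
    return sorted(((value_angle + k * step + 180.0) % 360.0) - 180.0
                  for k in range(period))


def torsion_deviation(observed, value_angle, period):
    """Signed deviation (degrees) of an observed torsion from the nearest minimum."""
    step = 360.0 / period
    delta = (observed - value_angle) % step  # now in [0, step)
    if delta >= step / 2.0:
        delta -= step                        # choose the closer minimum
    return delta


# Illustrative values: ideal 60 degrees with period 3.
print(torsion_minima(60.0, 3))             # minima 120 degrees apart
print(torsion_deviation(-55.0, 60.0, 3))   # nearest minimum is -60
```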
Chirality category
1. Tetrahedral chirality: usually on C or N with sp3 hybridisation
2. Non-tetrahedral chirality: usually for metal coordination

Chirality category
loop_
_chem_comp_chir.comp_id
_chem_comp_chir.id
_chem_comp_chir.atom_id_centre
_chem_comp_chir.atom_id_1
_chem_comp_chir.atom_id_2
_chem_comp_chir.atom_id_3
_chem_comp_chir.volume_sign
GLC-b-D chir_01 C5 C4 O5 C6 positive
GLC-b-D chir_02 C4 C3 O4 C5 positive
GLC-b-D chir_03 C3 C2 O3 C4 negative
GLC-b-D chir_04 C2 C1 O2 C3 positive
.....
Sign: positive, negative, both, anomer

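The volume_sign of a tetrahedral centre can be checked from coordinates with the chiral volume, the scalar triple product of the vectors from the centre to atoms 1, 2 and 3 taken in dictionary order. The following is a minimal sketch with hypothetical function names; treating "both" as "either sign accepted" is an assumption, and "anomer" is deliberately left unhandled.

```python
def chiral_volume(centre, a1, a2, a3):
    """Signed volume v1 . (v2 x v3), with vi the vector from centre to atom i."""
    v1 = [p - c for p, c in zip(a1, centre)]
    v2 = [p - c for p, c in zip(a2, centre)]
    v3 = [p - c for p, c in zip(a3, centre)]
    cross = (v2[1] * v3[2] - v2[2] * v3[1],
             v2[2] * v3[0] - v2[0] * v3[2],
             v2[0] * v3[1] - v2[1] * v3[0])
    return sum(a * b for a, b in zip(v1, cross))


def matches_sign(volume, volume_sign):
    """Check a computed volume against a dictionary volume_sign."""
    if volume_sign == "positive":
        return volume > 0.0
    if volume_sign == "negative":
        return volume < 0.0
    if volume_sign == "both":  # assumption: either handedness is accepted
        return True
    raise ValueError(f"volume_sign {volume_sign!r} is not handled in this sketch")


# Illustrative coordinates: three neighbours along x, y and z.
centre, x, y, z = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(chiral_volume(centre, x, y, z))  # right-handed order gives a positive volume
print(chiral_volume(centre, x, z, y))  # swapping two atoms flips the sign
```

Because swapping any two of the three atoms flips the sign, the atom order in the dictionary row matters as much as the sign itself.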
Metal chirality
Metal chirality is only used to create coordinates.
loop_
_chem_comp_chir.comp_id
_chem_comp_chir.id
_chem_comp_chir.atom_id_centre
_chem_comp_chir.atom_id_1
_chem_comp_chir.atom_id_2
....
_chem_comp_chir.atom_id_8
_chem_comp_chir.volume_sign
MONid chir_id Ac Ab Af A1 A2 A3 A4 A5 A6 cross6
Where:
Ac: chiral centre atom
Ab: back atom
Af: forward atom
A1, A2, ..., AN: atoms in the same plane; N can be 0, 1, 2, 3, 4, 5 or 6. These atoms form the point group.
crossN: cross chirality specification

Example metal chirality (OC7)
OC7 chir_01 CA O5 O7 O1 O4 O2 O3 O6 . cross5
(Figure: CA with its coordinating atoms O5, O7, O1, O4, O2, O3 and O6.)

Plane category
loop_
_chem_comp_plane_atom.comp_id
_chem_comp_plane_atom.plane_id
_chem_comp_plane_atom.atom_id
_chem_comp_plane_atom.dist_esd
PHE plan CB
PHE plan CG
PHE plan CD

Example of a modification
The modification formalism allows a monomer to be changed.
A modification describes in detail the result of a chemical reaction.

Modification: general category
data_mod_list
loop_
_chem_mod.id
_chem_mod.name
_chem_mod.comp_id
_chem_mod.group_id
O1MET O1_metyl_of_sugar . pyranose
group_id: pyranose here means the modification applies only to sugars

Modification: atom category
loop_
_chem_mod_atom.mod_id
_chem_mod_atom.function
_chem_mod_atom.atom_id
_chem_mod_atom.new_atom_id
_chem_mod_atom.new_type_symbol
_chem_mod_atom.new_type_energy
_chem_mod_atom.new_partial_charge
O1MET change O1 . . O
O1MET delete HO
O1MET add . CM C CH
O1MET add . HM1 H HCH
function: one of change, delete or add

Modification: bond category
loop_
_chem_mod_bond.mod_id
_chem_mod_bond.function
_chem_mod_bond.atom_id_1
_chem_mod_bond.atom_id_2
_chem_mod_bond.new_type
_chem_mod_bond.new_value_dist
_chem_mod_bond.new_value_dist_esd
O1MET add O1 CM single
O1MET add CM HM1 single
O1MET add CM HM2 single
O1MET add CM HM3 single

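The change, delete and add functions of the atom and bond categories can be read as edit operations on a monomer. The sketch below applies rows shaped like the O1MET example to a toy monomer held in plain dictionaries. The data layout, the function name `apply_modification`, the starting atom types and the choice to drop bonds of a deleted atom automatically are all assumptions made for illustration, not the behaviour of REFMAC or LIBCHECK.

```python
import copy


def apply_modification(monomer, atom_ops, bond_ops):
    """Return a modified copy of `monomer`; "." means "not given / unchanged".

    atom_ops rows: (function, atom_id, new_atom_id, new_type_symbol, new_type_energy)
    bond_ops rows: (function, atom_id_1, atom_id_2, new_type)
    """
    result = copy.deepcopy(monomer)
    atoms, bonds = result["atoms"], result["bonds"]
    for function, atom_id, new_id, new_symbol, new_energy in atom_ops:
        if function == "add":
            atoms[new_id] = {"type_symbol": new_symbol, "type_energy": new_energy}
        elif function == "delete":
            del atoms[atom_id]
            # Sketch-level choice: bonds to a deleted atom are dropped as well.
            for pair in [p for p in bonds if atom_id in p]:
                del bonds[pair]
        elif function == "change":
            if new_symbol != ".":
                atoms[atom_id]["type_symbol"] = new_symbol
            if new_energy != ".":
                atoms[atom_id]["type_energy"] = new_energy
        else:
            raise ValueError(f"unknown atom function {function!r}")
    for function, id1, id2, new_type in bond_ops:
        pair = frozenset((id1, id2))
        if function in ("add", "change"):
            bonds[pair] = new_type
        elif function == "delete":
            del bonds[pair]
        else:
            raise ValueError(f"unknown bond function {function!r}")
    return result


# Toy monomer: only the atoms touched by the example; types are illustrative.
sugar = {
    "atoms": {"C1": {"type_symbol": "C", "type_energy": "CH"},
              "O1": {"type_symbol": "O", "type_energy": "OH"},
              "HO": {"type_symbol": "H", "type_energy": "HOH"}},
    "bonds": {frozenset(("O1", "C1")): "single",
              frozenset(("O1", "HO")): "single"},
}
atom_ops = [("change", "O1", ".", ".", "O"),
            ("delete", "HO", ".", ".", "."),
            ("add", ".", "CM", "C", "CH"),
            ("add", ".", "HM1", "H", "HCH")]
bond_ops = [("add", "O1", "CM", "single"),
            ("add", "CM", "HM1", "single")]

methylated = apply_modification(sugar, atom_ops, bond_ops)
print(sorted(methylated["atoms"]))
```

Working on a copy keeps the library monomer untouched, so the same entry can be reused for unmodified residues in the same model.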
Example of a peptide link
The link formalism allows monomers to be joined together.
A link describes in detail the result of a chemical reaction.

Link: general category
data_link_list
loop_
_chem_link.id
_chem_link.comp_id_1
_chem_link.mod_id_1
_chem_link.group_comp_1
_chem_link.comp_id_2
_chem_link.mod_id_2
_chem_link.group_comp_2
_chem_link.name
ALPHA1-4 . DEL-HO4 pyranose . DEL-O1 pyranose glycosidic_bond_alpha1-4
mod_id_1: modification of the first monomer before the linkage
mod_id_2: modification of the second monomer before the linkage

Link: bond category
loop_
_chem_link_bond.link_id
_chem_link_bond.atom_1_comp_id
_chem_link_bond.atom_id_1
_chem_link_bond.atom_2_comp_id
_chem_link_bond.atom_id_2
_chem_link_bond.type
_chem_link_bond.value_dist
_chem_link_bond.value_dist_esd
ALPHA1-4 1 O4 2 C1 single
atom_1_comp_id: refers to the first monomer
atom_2_comp_id: refers to the second monomer

Sources of dictionary and coordinates
- MSDchem
- PRODRG
- RELIBASE
- CORINA
- QM or other energy minimisation programs
- CSD

MSDchem
You can search by formula, substructure and other criteria. Results can be saved as a CIF file and used by LIBCHECK to create a dictionary for REFMAC.

MSDchem: JME
1) Draw a substructure, write a SMILES string or load an SDF, MOL or mmCIF file
2) Search

PRODRG server
JME
Load your file

PRODRG: JME
Draw your ligand, transfer it to the PRODRG window and run.

PRODRG output
It can write out dictionaries for CNS, REFMAC5, SHELX and others.

Tools in CCP4
LIBCHECK
 - creates the complete monomer description from the minimal one
 - creates coordinates from the complete monomer description
SKETCHER
 - graphical program that creates the minimal monomer description for LIBCHECK
MAKECIF
 - creates restraints

Ways to create a dictionary
1. From chemical structure
 - Using SKETCHER: the monomer is drawn by specifying atoms and bonds
 - From SMILES strings, SDF file, MOL2 file
2. From Cartesian coordinates
 - Coordinates from CSD
 - Energetically optimised coordinates
 - MOL2 file
 - SDF file

SMILES strings: an example
SMILES for ALA:
3D representation:
For a description of SMILES:

Sketcher
(Figure panels: initial 2D sketch; after LIBCHECK and REFMAC.)

Restraints: monomer linkage
1. Chain links (trans/cis, DNA/RNA, sugar links, gap)
2. Standard links (SS bridges, sugar-protein links)
3. Potential links
4. Links between alternative conformations
5. Symmetry links
6. User links

Modifications and links in PDB file
SSBOND 1 CYS L 88 CYS L 23
LINK SG CYS H SG BCYS H 140 SS
LINK TYR L 139 PRO L 140 PCIS
LINK GLY H 127 GLY H 133 gap
LINK MAG Y 1 GAL Y 2 BETA1-4
LINK O LEU B 61 NA NA X 6 LEU-NA
LINK OE1 GLU A 139 NA NA X symmetry
MODRES GAL Y 2 GAL-b-D RENAME
Labels on the slide: modification ID, standard name, name in PDB file, link ID

Conclusions
- Ligand dictionaries should be designed with care; the interpretation of the chemistry may depend on them.
- Resources such as MSDchem and PRODRG can help to create an accurate dictionary.
- Links and modifications are important components for understanding protein chemistry.
- Unfortunately, no automatic link generation programs are available yet (we are working on that).

Acknowledgments
Alexei Vagin, YSBL, York
Roberto Steiner, King's College
Andrey Lebedev, YSBL, York
Liz Potterton, YSBL, York
Fei Long, YSBL, York
Funding: Wellcome Trust, BBSRC, BIOXHIT, CCP4