Houssam Nassif, Hassan Al-Ali, Sawsan Khuri, Walid Keirouz, and David Page An ILP Approach to Model and Classify Hexose Binding Sites.

Problem Description
Hexoses play a key role in many cellular pathways. Hexose binding properties are of great interest to biomedical researchers. Current protein-sugar computational models are based on prior biochemical knowledge. Goal: investigate the empirical support for biochemical findings by comparing ILP-induced rules to actual biochemical results.

Biochemical Review
An amino acid consists of a central carbon atom C, an amino group NH2, a carboxyl group COOH, a hydrogen atom H, and a side chain R. Amino acids differ by their side chain R; the side chain gives each amino acid its distinctive properties. There are 20 different amino acids, also called residues.

Biochemical Review
A protein is a long chain of amino acids linked together. The residue sequence determines protein shape and function. Similar residues can be easily interchanged in a protein.

Biochemical Review
Hexoses are 6-carbon sugar molecules. They consist of a core pyranose ring with hydroxyl (-OH) groups sticking out.

Biochemical Review
The pyranose ring is apolar (no charge) and hydrophobic (water-hating). The hydroxyl (-OH) groups are polar (negative charge) and hydrophilic (water-loving), and they form hydrogen bonds. Interaction forces between hexoses and residues are due to charge, hydrogen bonding, and hydrophobicity.

Prior Biochemical Findings
Planar polar residues (Asn, Asp, Gln, Glu, Arg) are frequently involved in hydrogen bonding. The aromatic residues (Trp, Tyr, Phe, His) stack against the apolar surface of the sugar pyranose ring. Planar polar and aromatic residues are present at higher frequencies in hexose binding sites. Ordered water molecules and metal ions are involved in binding specificity and affinity. Hexoses and their binding sites are neither purely hydrophobic nor purely hydrophilic: they exhibit both properties, a dual nature.

Predicting Sugar Binding Sites
Prior biochemical findings have been incorporated into binding site classifiers, which are black-box models. We take the opposite approach: given hexose binding site data, what biochemical rules can we extract with no prior biochemical knowledge?

Dataset
Mine the Protein Data Bank for galactose, glucose and mannose. Remove redundancies and keep proteins with a hexose docked in the binding site. This yields 80 protein-hexose binding sites (the positive set). Extract 80 negatives: non-hexose binding sites and non-binding surface grooves. Total: 160 entries, equally divided.

Binding Site Representation
Only the few atoms present at the binding site determine the binding site affinity. We define the binding site as a sphere of radius 10 Å, centered at the binding site center, and extract all atoms in this sphere. We consider ONLY atoms, not residues. For every atom, we compute its charge, hydrogen bonding, and hydrophobicity properties.
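The sphere-based extraction described on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the dictionary layout, the function name `atoms_in_sphere`, the example coordinates, and the choice to treat the boundary as inclusive are all assumptions.

```python
import math

def atoms_in_sphere(atoms, center, radius=10.0):
    """Return the atoms whose coordinates lie within `radius` angstroms of `center`.

    `atoms` is a list of dicts with an "xyz" coordinate tuple (hypothetical layout).
    """
    return [a for a in atoms if math.dist(a["xyz"], center) <= radius]

# Made-up atoms around a binding site centered at the origin.
atoms = [
    {"name": "OD1", "xyz": (1.0, 2.0, 2.0)},   # 3 angstroms from the center
    {"name": "ND1", "xyz": (3.0, 4.0, 0.0)},   # 5 angstroms from the center
    {"name": "CG1", "xyz": (9.0, 9.0, 9.0)},   # about 15.6 angstroms, outside
]
site_atoms = atoms_in_sphere(atoms, center=(0.0, 0.0, 0.0))
print([a["name"] for a in site_atoms])
```

Only the first two atoms fall inside the 10 Å sphere, so only they would go on to be described by charge, hydrogen bonding, and hydrophobicity.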

Problem Formulation
Use Aleph heuristic search to learn first-order rules. Estimate the performance using 10-fold cross-validation. The consequent of any rule is bind(A), where A is predicted to be a hexose binding site. Restrict clause length to a maximum of 8 literals. Tolerate a clause coverage of up to 5 training-set negatives. Minimize the cost function: cost = (# covered negatives) − (# covered positives).
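To make the search constraints concrete, the following Python sketch scores candidate clauses with the cost function above and discards those that break the length or negative-coverage limits. It does not call Aleph; the candidate names and counts are hypothetical, and `length` here simply means whatever literal count the length limit applies to.

```python
MAX_LITERALS = 8    # clause length limit from the slide
MAX_NEGATIVES = 5   # tolerated training-set negatives from the slide

def clause_cost(pos_covered, neg_covered):
    """cost = (# covered negatives) - (# covered positives); lower is better."""
    return neg_covered - pos_covered

def admissible(length, neg_covered):
    """A clause is kept only if it respects both search limits."""
    return length <= MAX_LITERALS and neg_covered <= MAX_NEGATIVES

# Hypothetical candidates: (name, positives covered, negatives covered, length).
candidates = [
    ("c1", 22, 4, 5),
    ("c2", 30, 9, 4),   # covers too many negatives
    ("c3", 15, 0, 9),   # too long
]
kept = [c for c in candidates if admissible(c[3], c[2])]
best = min(kept, key=lambda c: clause_cost(c[1], c[2]))
print(best[0], clause_cost(best[1], best[2]))
```

Note that `c2` would have the lowest raw cost (9 − 30 = −21), but it is rejected because it covers more than 5 negatives.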

Literals
Individual atom literal: point(A,B,C,D,E,F,G,H,I,J), where A is the binding site; B is the atom number; C, D, E are the Cartesian coordinates; F is the charge value (nominal); G is the hydrogen bonding value (nominal); H is the hydrophobicity value (nominal); and I, J are the atomic element and its name.

Literals
Distance between two atoms literal: dist(A,B1,B2,M,N), where A is the binding site; B1, B2 are the two atom numbers; M is their Euclidean distance; and N is the error, resulting in M±N (N is set to 0.5 Å).
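One way to read the two literals is as a record per atom plus a tolerance test on pairwise distance. The Python sketch below mirrors the argument order of point/10 and checks whether dist(A,B1,B2,M,N) would hold; the class name, field names, nominal values, and coordinates are hypothetical, chosen only for illustration.

```python
import math
from typing import NamedTuple

class Point(NamedTuple):
    """Mirror of point(A,...,J); field names are illustrative only."""
    site: str       # A: binding site
    atom: int       # B: atom number
    x: float        # C, D, E: Cartesian coordinates
    y: float
    z: float
    charge: str     # F: nominal charge value
    hbond: str      # G: nominal hydrogen bonding value
    hydro: str      # H: nominal hydrophobicity value
    element: str    # I: atomic element
    name: str       # J: atom name

def dist_holds(p1, p2, m, n=0.5):
    """True if both atoms are in the same site and their distance is within m +/- n."""
    d = math.dist((p1.x, p1.y, p1.z), (p2.x, p2.y, p2.z))
    return p1.site == p2.site and abs(d - m) <= n

# Two made-up atoms exactly 5.0 angstroms apart.
a = Point("site1", 1, 0.0, 0.0, 0.0, "neg", "acceptor", "hydrophilic", "o", "od1")
b = Point("site1", 2, 3.0, 4.0, 0.0, "neg", "acceptor", "hydrophilic", "o", "od1")
print(dist_holds(a, b, 5.24), dist_holds(a, b, 8.53))
```

With the 0.5 Å tolerance, a literal stating 5.24 Å matches this pair, while one stating 8.53 Å does not.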

Results
32.5% error rate over 10 folds (p-value < ), comparable to other general sugar binding site classifiers. Although we only consider atoms, we can infer valuable information regarding amino acids. Example: ND1 atoms are only present in His residues, so a rule requiring ND1 is actually requiring His. We present the rules' English translation, with residue substitution, sorted by coverage.
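The atom-to-residue substitution can be pictured as a lookup from side-chain atom names to the standard residues that contain them. The table below is a small, partial sketch using standard PDB atom naming; the function name `residues_implied` is hypothetical, and this is not the authors' translation procedure.

```python
# Side-chain atom names and the standard residues that contain them
# (partial table, standard PDB naming).
ATOM_TO_RESIDUES = {
    "ND1": {"His"},
    "NE2": {"Gln", "His"},
    "OD1": {"Asp", "Asn"},
    "OE1": {"Glu", "Gln"},
    "CE1": {"Phe", "Tyr", "His"},
    "CG1": {"Val", "Ile"},
}

def residues_implied(atom_names):
    """Map each atom name used in a rule to the residues that could supply it."""
    return {name: sorted(ATOM_TO_RESIDUES.get(name, set())) for name in atom_names}

implied = residues_implied(["ND1", "OD1"])
print(implied)
```

An atom name that maps to a single residue (ND1) pins the residue down exactly; one that maps to several (OD1) yields an "Asp or Asn" style alternative, unless another property such as charge narrows it further.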

A is a hexose-binding site if:
1. It has a Trp residue and a Glu with an OE1 atom that is 8.53 Å away from a negatively charged Oxygen. [Pos cover = 22, Neg cover = 4]
2. It has a Phe or Tyr residue and an Asp with an OD1 atom that is 5.24 Å away from an Asp or Asn's OD1. [Pos cover = 21, Neg cover = 3]
3. It has a branching aliphatic residue (Leu, Val, Ile), an Asp and an Asn. Asp and Asn's OD1 atoms are 3.41 Å away. [Pos cover = 15, Neg cover = 0]

A is a hexose-binding site if:
4. It has a hydrophilic non-hydrogen bonding Nitrogen atom (Pro, Arg, His) that is 7.95 Å away from a His ND1 nitrogen and 9.60 Å away from a branching aliphatic residue's CG1. [Pos cover = 10, Neg cover = 0]
5. It has a hydrophobic CD2 atom, a hydrophilic Pro backbone or His ND1 nitrogen, and two Glu (or two Gln) distant by Å. [Pos cover = 11, Neg cover = 2]
6. It has an Asp B, two identical atoms Q and X, and a hydrophilic hydrogen-bonding atom K. Atoms K, Q and X have the same charge. B's ODE1 oxygen shares the same Y-coordinate with K and the same Z-coordinate with Q. Atom X is 8.29 Å away from atom K. [Pos cover = 8, Neg cover = 0]

A is a hexose-binding site if:
7. It has a Ser, and two Gln and/or His, with NE2 atoms that are 3.88 Å apart. [Pos cover = 8, Neg cover = 2]
8. It has an Asn and a Phe, Tyr or His residue, with a CE1 atom that is 7.07 Å away from a Calcium. [Pos cover = 5, Neg cover = 0]
9. It has a Lys or Arg, a Phe or Tyr, a Pro or His, and a Sulfate or a Phosphate. [Pos cover = 3, Neg cover = 0]
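The coverage figures quoted for the nine rules can be tabulated to compare them. The Python sketch below applies the cost function from the problem formulation to each rule and adds a per-rule precision, pos / (pos + neg); that precision column is an illustrative addition, not a metric reported in the slides.

```python
# (positives covered, negatives covered) for rules 1-9, as quoted in the slides.
RULES = {
    1: (22, 4), 2: (21, 3), 3: (15, 0),
    4: (10, 0), 5: (11, 2), 6: (8, 0),
    7: (8, 2),  8: (5, 0),  9: (3, 0),
}

def precision(pos, neg):
    """Fraction of covered examples that are positives."""
    return pos / (pos + neg)

# rule number -> (cost, precision rounded to 3 decimals)
summary = {
    rule: (neg - pos, round(precision(pos, neg), 3))
    for rule, (pos, neg) in RULES.items()
}
total_pos = sum(pos for pos, _ in RULES.values())

for rule, (cost, prec) in summary.items():
    print(f"rule {rule}: cost={cost}, precision={prec}")
print("sum of positive covers:", total_pos)
```

The positive covers sum to 103. If these counts are taken over the 80 positive sites in the dataset, that sum exceeding 80 means some binding sites must be covered by more than one rule.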

Discussion
We infer most of the established biochemical information. Rules 1 and 2, with the highest coverage, rely on the aromatic residues Trp, Tyr, and Phe. The fourth aromatic residue, His, is mentioned in many different rules. This highlights the docking interaction between the hexose and the aromatic residues. All rules require the presence of a planar polar residue (Asn, Asp, Gln, Glu, Arg). These residues are most frequently involved in hexose hydrogen-bonding.

Discussion
The residues most often mentioned in the rules are aromatic and planar polar, which mirrors the fact that they are present at higher frequencies in hexose binding sites. Rule 5 requires both a hydrophobic and a hydrophilic element, reflecting the dual nature of hexose docking. Rules 8 and 9 require the presence of different ions (Calcium, Sulfate, Phosphate), confirming the relevance of ions in hexose binding.

New discovery?
Rule 2 suggests a dependency between Phe/Tyr and Asn/Asp. Such a relation has been proven in lectins. Similarly, rule 1 suggests a dependency between Trp and Glu, a link not previously identified in the literature. Further investigation is needed to confirm this finding.

Conclusion
ILP achieves accuracy similar to that of other general sugar black-box classifiers. In addition, it offers insight into the discriminating process. Aleph was able to induce most of the known hexose-protein interaction biochemical rules. ILP finds a previously unreported dependency between Trp and Glu.