Introduction to Computational Biology BS123A/BME195/MB223 UC-Irvine Ray Luo, MBB, BS.



What to expect? This is an introduction to computational representations and algorithms for analysis of sequence, structure and function in molecular biology. It aims to give an understanding of the biological problems that arise and how algorithms are developed to address them.

Goals of this course Learn how to present molecular data using computer graphics Understand computational challenges in molecular biology Understand basic algorithms that establish context for rest of field Identify opportunities in this field, and perhaps formulate projects to explore further

How to reach these goals? Give you a feeling for main issues in computational molecular biology: sequence, structure, and function Give you exposure to classic computational problems as manifested in biology. Give you exposure to classic biological problems represented computationally.

Things that won't be emphasized Won't give you a balanced understanding of the experimental approaches used in molecular biology. Won't force you to make a novel contribution to computational biology. Won't ask you to write a computer program for any project, though you're required to describe in English how a program/algorithm works.

Prerequisite Freshman calculus, physics, and chemistry completed or being taken concurrently. Willing to think quantitatively in biology. Willing to use a computer in learning/working.

Course website Lecture notes will be posted before the lectures if possible. Handouts will be passed around for additional information.

Contacts Office hours by appointment 3206 Natural Sciences I

Lecture info Meeting place: NSI 2144 Meeting times: Tue/Thur 9:00-10:30am

Grading 50% based on weekly homework/projects; 25% by in-class final; 25% by take-home final

Grading: How to get an A? It is crucial to turn in weekly homework and projects to get a passing grade. The finals are 90% based on homework and projects, so make sure you know each assigned problem well for the finals. You have to do the in-class final very fast.

Uses of Computation in Biology Ecology Physiology Cell Biology Molecular Biology

Careers in Computational Molecular Biology Pharmaceutical/Biotechnology Industry Universities and Colleges

Life in Industry Identification of potential drug targets, mostly enzymes, with molecular biologists and biochemists Discovery of lead compounds, with medicinal chemists, aka organic chemists Optimization of lead compounds, with medicinal chemists Prediction of drug-like properties, ADME-T (absorption, distribution, metabolism, excretion, toxicity)

Why the things you learn here are important? Computer aided drug design

What is a drug? Defined composition with a pharmacological effect Regulated by the Food and Drug Administration (FDA) What is the process of Drug Discovery and Development?

Drugs and the Discovery Process Small Organic Molecules –Natural products fermentation broths plant extracts animal fluids (e.g., snake venoms) –Synthetic Medicinal Chemicals Project medicinal chemistry derived Combinatorial chemistry derived Biological Molecules –Natural products (isolation) –Recombinant products

Discovery vs. Development Discovery includes: Concept, mechanism, assay, screening, hit identification, lead demonstration, lead optimization Discovery also includes In Vivo proof of concept in animals and demonstration of a therapeutic effect Development begins when the decision is made to put a molecule into phase I clinical trials

Discovery and Development The time from conception to approval of a new drug is typically years The vast majority of molecules fail along the way The estimated cost to bring a successful drug to market is now $800 million (Dimasi, 2000) However, the profit from a single drug can be $1 billion per year The pharmaceutical industry has been one of the best performing sectors of the economy

Drug Discovery Disciplines Medicine Physiology/pathology Pharmacology Molecular/cellular biology Automation/robotics Medicinal, analytical,and combinatorial chemistry Structural and computational chemistries Computational biology

Drug Discovery Program Rationales Unmet Medical Need Me Too! - Market - ($$$s) Drugs in search of indications –Side-effects often lead to new indications Indications in search of drugs –Mechanism based, hypothesis driven, reductionism

Issues in Drug Discovery Hits and Leads - Is it a “Druggable” target? Resistance Delivery - oral and otherwise Metabolism Solubility, toxicity Patentability …

A Little History of Computer Aided Drug Design 1960’s - Review target-drug interactions 1980’s- Automation - high throughput target/drug selection 1980’s- Databases (information technology) - combinatorial libraries 1980’s- Fast computers - docking 1990’s- Faster computers - genome assembly - genomic based target selection 2000’s- Fast information handling - pharmacogenomics

From the Computer Perspective

Comparing Growth Rates

From the Target Perspective

Status - Numbers and Complexity (a) myoglobin (b) hemoglobin (c) lysozyme (d) transfer RNA (e) antibodies (f) viruses (g) actin (h) the nucleosome (i) myosin (j) ribosome. Courtesy of David Goodsell, TSRI

From the Drug Perspective

Combinatorial Libraries Blaney and Martin - Curr. Op. In Chem. Biol. (1997) 1:54-59 Thousands of variations to a fixed template Good libraries span large areas of chemical and conformational space - molecular diversity Diversity in - steric, electrostatic, hydrophobic interactions... Desire to be as broad as “Merck” compounds from random screening Computer aided library design is in its infancy
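The idea of "thousands of variations to a fixed template" can be sketched as a Cartesian product over substitution sites. This is only an illustration: the template string, the three sites, and the fragment names are hypothetical placeholders, not real chemistry from the slide.

```python
from itertools import product

# Hypothetical template with three substitution sites (illustrative only).
TEMPLATE = "scaffold({r1},{r2},{r3})"

# Hypothetical fragment choices for each site.
R1 = ["H", "CH3", "OH"]
R2 = ["F", "Cl", "Br", "NH2"]
R3 = ["phenyl", "benzyl"]


def enumerate_library(template, r1_options, r2_options, r3_options):
    """Return every combination of fragments placed on the fixed template."""
    return [
        template.format(r1=a, r2=b, r3=c)
        for a, b, c in product(r1_options, r2_options, r3_options)
    ]


library = enumerate_library(TEMPLATE, R1, R2, R3)
print(len(library))  # 3 * 4 * 2 = 24 library members
```

The library size is the product of the number of choices per site, which is why a few dozen fragments per site already give thousands of compounds.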

Computer-Assisted Drug Design Computer driven drug discovery Data driven drug discovery

An overview of biomolecules Living organisms are more ordered than their surroundings. So the first task is to maintain a separation between inside and outside. The second task is to spend energy to keep things in order. The functions of life are to facilitate the acquisition and expenditure of energy.

Cell Cells are the smallest compartments that are ordered and separated from the surroundings. Note that ordered compartments were difficult to get started de novo, and so have found ways to pass on the apparatus necessary to perpetuate themselves.

Tasks of a living cell Gather energy from surroundings. Use energy to maintain inside/outside distinction. Use extra energy to reproduce. Develop strategies for being efficient at their tasks: developing ways to move around; developing signaling capabilities; developing ways for energy capture; developing ways of reproduction.

Molecular means to realize these tasks Ability to separate inside from outside with lipids Ability to build three-dimensional molecules that assist their functions, proteins, RNA Ability to store information for these tasks, part of reproduction also, DNA

A simple model of a cell: proteins, lipid membrane, DNA

Lipids Made of hydrophilic (water loving) molecular fragment connected to hydrophobic fragment. Spontaneously form sheets (lipid bilayers, membranes) in which all the hydrophilic ends align on the outside, and hydrophobic ends align on the inside. Creates a very stable separation, not easy to pass through except for water and a few other small atoms/molecules.

Lipids

Lipid bilayers: Structure

Lipid bilayers: Functions

Lipid bilayers

A simple model of a cell: proteins, lipid membrane, DNA

Proteins: A chain of linked subunits These subunits are amino acids (also called protein residues for historical reasons). There are 20 different amino acids with different physical and chemical properties. The interaction of these properties allows a chain of amino acids (up to thousands long) to fold into a unique, reproducible 3D shape.

20 amino acids Common backbone Unique side chain Examples: Ala (A) Alanine; Glu (E) Glutamic Acid; Arg (R) Arginine

Amino acid structures Fig. 5.3

Amino acid properties Polar amino acids: THR, SER, ASN, GLN, TYR, HIS, TRP, CYS Charged amino acids: ASP, GLU, LYS, ARG, HIS, CYS Hydrophobic amino acids: VAL, LEU, ILE, PHE, ALA, PRO, GLY, MET, TYR, TRP
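The grouping above can be held in a small lookup table. The sketch below copies the three lists exactly as given on the slide; the names `PROPERTY_GROUPS` and `groups_of` are made up for illustration. Note that some residues (e.g. HIS, TYR) appear in more than one group.

```python
# Groups copied from the slide; a residue may belong to several groups.
PROPERTY_GROUPS = {
    "polar": {"THR", "SER", "ASN", "GLN", "TYR", "HIS", "TRP", "CYS"},
    "charged": {"ASP", "GLU", "LYS", "ARG", "HIS", "CYS"},
    "hydrophobic": {"VAL", "LEU", "ILE", "PHE", "ALA", "PRO", "GLY",
                    "MET", "TYR", "TRP"},
}


def groups_of(residue):
    """Return the sorted list of property groups that contain the residue."""
    code = residue.upper()
    return sorted(name for name, members in PROPERTY_GROUPS.items()
                  if code in members)


print(groups_of("HIS"))  # ['charged', 'polar']
print(groups_of("ala"))  # ['hydrophobic']
```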

Representations of proteins 1-d sequence: Alanine-Tyrosine-Valine= ALA-TYR-VAL= A-Y-V
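As a minimal sketch, the conversion between the three-letter and one-letter notations is a table lookup. The table below uses the standard one-letter codes for the 20 amino acids; the function name `to_one_letter` is hypothetical.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(three_letter_sequence):
    """Convert a dash-separated three-letter sequence to a one-letter string."""
    codes = three_letter_sequence.upper().split("-")
    return "".join(THREE_TO_ONE[code] for code in codes)


print(to_one_letter("ALA-TYR-VAL"))  # AYV
```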

Representations of proteins: 2-d THH HHHHHTLLLH HHHHHGGGLS STTEEEEEEE
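A per-residue string like the one above can be collapsed into segments with run-length encoding. This sketch assumes the letters are per-residue secondary-structure codes in a DSSP-like convention (H helix, E strand, T turn, and so on); that interpretation is an assumption, since the slide does not define the alphabet.

```python
from itertools import groupby

# The string from the slide, with the spacing between blocks removed.
SECONDARY = "THH HHHHHTLLLH HHHHHGGGLS STTEEEEEEE".replace(" ", "")


def segments(ss_string):
    """Collapse a per-residue string into (code, run_length) segments."""
    return [(code, len(list(run))) for code, run in groupby(ss_string)]


for code, length in segments(SECONDARY):
    print(code, length)
```

The segment lengths add up to the number of residues, so no information about ordering is lost, only the repetition.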

Representations of proteins: 3-d

Protein features Proteins can be stabilized by salt bridges Disulfide bonds can help hold a protein in its unique folded structure A protein may function as an enzyme, whose active site is crucial for its function

A simple model of a cell: proteins, lipid membrane, DNA

DNA structures DNA packs in the nucleus to form chromosomes

DNA structure

DNA is a sequence too It has a common backbone and side chains (the bases), though only 4 kinds. A sequence of these subunits is also specified as a string: ACTTAGGACATTTTAG, which is a simplified representation of a chemical structure.

DNA is a sequence too DNA uses an alphabet of 4 letters (ATCG), i.e. bases. Long sequences of these 4 letters are linked together to create genes and control information.
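Because DNA is written as a string over a 4-letter alphabet, basic analysis is ordinary string processing. A minimal sketch, using the example string from the slides (the function name `base_counts` is hypothetical):

```python
from collections import Counter

DNA_ALPHABET = set("ACGT")


def base_counts(sequence):
    """Count each base in a DNA string, rejecting letters outside ACGT."""
    sequence = sequence.upper()
    unknown = set(sequence) - DNA_ALPHABET
    if unknown:
        raise ValueError(f"not a DNA base: {sorted(unknown)}")
    return Counter(sequence)


counts = base_counts("ACTTAGGACATTTTAG")
print(counts["A"], counts["C"], counts["G"], counts["T"])  # 5 2 3 6
```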

Information in DNA DNA encodes proteins: each amino acid can be specified by 3 bases (a codon). The DNA is first copied into messenger RNA; the ribosome reads the RNA copy and creates the corresponding protein chain. GENETIC CODE: 64 mappings of 3 bases, 61 to an amino acid and 3 to a stop signal.
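The triplet mapping can be sketched in a few lines. The table below is the standard genetic code written in the usual T, C, A, G ordering, with `*` marking stop codons. As a simplification, the sketch reads the coding DNA strand directly rather than modelling an RNA intermediate; the function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA string codon by codon, stopping at a stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":  # stop codon ends the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# First nine codons of the myoglobin coding region shown in these slides.
print(translate("atggttc tgtctgaagg tgaatggcag"))  # MVLSEGEWQ
```

The reading frame matters: the function assumes the string starts at the first base of a codon.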

Genetic code

The gene for myoglobin ctgcagataa ctaactaaag gagaacaaca acaatggttc tgtctgaagg tgaatggcag ctggttctgc atgtttgggc taaagttgaa gctgacgtcg ctggtcatgg tcaggacatc ttgattcgac tgttcaaatc tcatccggaa actctggaaa aattcgatcg tttcaaacat ctgaaaactg aagctgaaat gaaagcttct gaagatctga aaaaacatgg tgttaccgtg ttaactgccc taggtgctat ccttaagaaa aaagggcatc atgaagctga gctcaaaccg cttgcgcaat cgcatgctac taaacataag atcccgatca aatacctgga attcatctct gaagcgatca tccatgttct gcattctaga catccaggta acttcggtgc tgacgctcag ggtgctatga acaaagctct cgagctgttc cgtaaagata tcgctgctaa ctgggttacc agggttaatg aggtacc BASE COUNT 155 a 108 c 115 g 129 t MVLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLKTEAEM KASEDLKKHGVTVLTALGAILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISEAI IHVLHSRHPGNFGADAQGAMNKALELFRKDIAAKYKELGYQG

Genes and control The set of all genes required for an organism is the organism's GENOME. The human genome has 3,000,000,000 bases divided into 23 linear segments (chromosomes). A gene has on average 1340 DNA bases, thus specifying a protein of about 447 amino acids. Humans have about 35,000 genes, i.e. roughly 47,000,000 DNA bases (35,000 x 1340), or about 1.6% of total DNA in the genome. The remaining roughly 2,950,000,000 bases include control information (e.g. when, where, how long, etc.).

Genotype and phenotype Genotype—the genetic sequences associated with an individual organism. Phenotype—the observable non-sequence features of an individual organism (e.g. color, shape, activity of an enzyme)

How do we proceed? In order to obtain insight into the ways in which genes and gene products function: Analyze DNA and protein sequences to search for clues to structure, function and control – sequence analysis Analyze structures to search for clues to sequences, function and control – structural analysis Understand how sequences and structures lead to functions – functional analysis

But what are functions of genes? Signal transduction: sensing a physical signal and turning it into a chemical signal Structural support: creating the shape of a cell or set of cells Enzymatic catalysis: accelerating chemical reactions otherwise too slow to be useful for living things Transport: getting things in and out of a compartment.

But what are functions of genes? Movement: contracting in order to pull things together or push things apart Transcription control: deciding when other genes should be turned on/off Trafficking: affecting where different elements end up inside a cell.

Evolution is the key Common descent of organisms implies that they will share many basic approaches Development of new phenotypes in response to environmental pressure can lead to specialized approaches More recent divergence implies more shared approaches between species The important thing is what is shared and what is not. This is also important for drug discovery in biomedicine.

Seeing is believing: Computer Graphics

Je-2147/HIV Protease Complex

HIV Integrase

The Small Ribosomal Subunit

The Large Ribosomal Subunit