Thanks to Harvard/MIT Team: Jake Jaffe, Kyriacos Leptos, Matt Wright, Daniel Segre, Martin Steffen DARPA BIOCOMP 23-May-2002 Model-data integration. Issues.


Thanks to Harvard/MIT Team: Jake Jaffe, Kyriacos Leptos, Matt Wright, Daniel Segre, Martin Steffen DARPA BIOCOMP 23-May-2002 Model-data integration. Issues of flux optimality & polymer mechanics of 4D cell models

gggatttagctcagtt gggagagcgccagact gaa gat ttg gag gtcctgtgttcgatcc acagaattcgcacca Post- 300 genomes & 3D structures

DoD Relevance: Accurate Bio I/O Engineering Over-determined Calculable Protein folding vs. crystallography Accurate Comprehensive/Quantitative Bio-Systems Embrace outliers Analytic & Synthetic Useful Computer-Aided-Design (CAD) >>INTEGRATION<<

DNA RNA Protein: in vivo & in vitro interactions Metabolites Replication rate Environment Technical challenge: Integrating Measures & Models Microbes Cancer & stem cells Darwinian In vitro replication Small multicellular organisms RNAi Insertions SNPs

Human Red Blood Cell ODE model: 200 measured parameters. [Figure: metabolic network diagram with nodes GLCe, GLCi, G6P, F6P, FDP, GA3P, DHAP, 1,3 DPG, 2,3 DPG, 3PG, 2PG, PEP, PYR, LACi, LACe; GL6P, GO6P, RU5P, R5P, X5P, S7P, E4P; ADO, INO, AMP, IMP, ADOe, INOe, ADE, ADEe, HYPX, PRPP, R1P; cofactors ATP/ADP, NADH/NAD, NADPH/NADP, GSH/GSSG; and K+, Na+, Cl-, pH, HCO3-.] Jamshidi, Edwards, Fahland, Church, Palsson, B.O. (2001) Bioinformatics 17: 286.

Linear Programming Flux Balance Analysis of gene deletions (v_ko = 0). [Plot: normalized optimal growth for each gene deletion.]
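A minimal sketch of the knockout calculation named on this slide, assuming a hypothetical four-reaction toy network (the stoichiometric matrix, flux bounds, and reaction names below are illustrative and not taken from the talk). Each deletion is modeled by forcing the knocked-out flux to zero and re-solving the linear program for maximal growth, using `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (an assumption for illustration only).
# Rows: metabolites A, B. Columns: reactions
#   v0: uptake -> A, v1: A -> B (main route),
#   v2: A -> B (bypass), v3: B -> biomass (growth)
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # metabolite A
    [0.0,  1.0,  1.0, -1.0],   # metabolite B
])
UPPER = [10.0, 10.0, 4.0, 10.0]  # assumed flux capacities
GROWTH = 3                       # index of the growth reaction


def fba_growth(knockout=None):
    """Maximize the growth flux subject to S v = 0 and flux bounds.

    A knockout is imposed by fixing that reaction's flux to zero (v_ko = 0).
    """
    bounds = [(0.0, ub) for ub in UPPER]
    if knockout is not None:
        bounds[knockout] = (0.0, 0.0)
    c = np.zeros(S.shape[1])
    c[GROWTH] = -1.0  # linprog minimizes, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


wild_type = fba_growth()
# Optimal growth of each single-reaction deletion, normalized to wild type.
normalized = {ko: fba_growth(ko) / wild_type for ko in range(S.shape[1])}
print(normalized)
```

In this toy case, deleting the main route leaves only the capacity-limited bypass, so growth drops but does not vanish; deleting uptake or the growth reaction itself is lethal.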

Minimal Perturbation Analysis for the analysis of non-optimal metabolic phenotypes Daniel Segre Challenge #1: Suboptimality of mutants --integrating growth rate and flux data

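This is a Quadratic Programming (QP) problem: Minimize $\mathrm{Dist} = \sum_i (x_i - a_i)^2$ given $Sx = b$; $x \ge 0$. Standard form: Minimize $\tfrac{1}{2} x^T Q x + c^T x$ given $Sx = b$; $x \ge 0$, where $Q = 2I$ and $c = -2a$ (the constant term $a^T a$ does not affect the minimizer and is dropped).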
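The QP above can be sketched numerically as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy stoichiometric matrix `S`, the right-hand side `b = 0`, and the wild-type flux vector `a` are hypothetical, the knockout is imposed as one extra equality row, and SciPy's general-purpose SLSQP solver stands in for a dedicated QP solver.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy network (same shape of problem as Sx = b, x >= 0).
# Columns: v0 uptake, v1 main route, v2 bypass, v3 growth.
S = np.array([
    [1.0, -1.0, -1.0,  0.0],
    [0.0,  1.0,  1.0, -1.0],
])
b = np.zeros(2)                       # steady state assumed
a = np.array([10.0, 6.0, 4.0, 10.0])  # assumed wild-type fluxes (S a = b)


def mpa_flux(S, b, a, knockout):
    """Find the flux vector closest (Euclidean) to a with v_knockout = 0."""
    n = S.shape[1]
    ko_row = np.zeros((1, n))
    ko_row[0, knockout] = 1.0
    A = np.vstack([S, ko_row])            # append the constraint x_ko = 0
    rhs = np.concatenate([b, [0.0]])
    x0 = a.copy()
    x0[knockout] = 0.0                    # starting guess, may be infeasible
    res = minimize(
        lambda x: np.sum((x - a) ** 2),   # Dist = sum_i (x_i - a_i)^2
        x0,
        jac=lambda x: 2.0 * (x - a),
        method="SLSQP",
        bounds=[(0.0, None)] * n,         # x >= 0
        constraints=[{"type": "eq",
                      "fun": lambda x: A @ x - rhs,
                      "jac": lambda x: A}],
        options={"ftol": 1e-10, "maxiter": 200},
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


x_ko = mpa_flux(S, b, a, knockout=1)
print(np.round(x_ko, 4))
```

For this toy input the feasible set is the line x = (t, 0, t, t), and minimizing the distance to `a` gives t = 8, so the mutant flux stays near the wild-type distribution rather than jumping to a new growth optimum.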

Test for prediction of essential genes: Optimal (FBA) vs. Suboptimal (MPA); p = 4·10^-3, p = 22

C009-limited. [Plots: experimental fluxes vs. predicted fluxes for Δpyk (LP), WT (LP), and Δpyk (QP).] Reported correlations: ρ=0.91, p=8e-8; ρ=-0.06, p=6e-1; ρ=0.56, p=7e-3.

Minimal Perturbation Analysis for the analysis of non-optimal metabolic phenotypes Challenge #1: Suboptimality of mutants --integrating growth rate and flux data

Polymer mechanics of 4D cell models (Automating integration of data) Challenge #2: integrating proteomics & in vivo crosslinking data

Mapping genome folding: DNA:DNA, DNA:protein, and protein:protein in vivo crosslinks. Dekker et al. Science: Capturing chromosome conformation.

In vivo crosslinking DNA-binding proteins

Multidimensional protein and peptide separations for MS quantitation
[Optional] 1st & 2nd protein dimensions: subcellular fractions, sizing of native protein complexes
1st peptide dimension: Strong Cation Exchange (charge)
2nd peptide dimension: Reverse Phase Chromatography (hydrophobicity)
3rd peptide dimension: Mass Spectrometry (mass per charge)
[Figure: example peptide sequences plotted by retention time (min) and m/z.]

[Figure: panels A to D showing retention times rt1, rt2, rt3 and MS1.]

Minimal Cell Projects The first FULL proteome model would benefit from a small number of natural cell states & genes. 3D-structure of a cell during replication & motility. Genome engineering / complete synthesis.

Small sequenced genomes (excludes organelle/symbionts)
Mollicutes = cell-wall-less bacteria, a subgroup of Clostridia (“gram-positive”)
Acholeplasmataceae: Acholeplasma, Anaeroplasma, Phytoplasma
Mycoplasmatales: Entomoplasmataceae (florum); Mycoplasmataceae (pulmonis, urealyticum, pneumoniae, genitalium, (mobile)); Spiroplasmataceae
[Chart axis: Megabases]

Motility (columns: species, speed in nm/sec, replication time, temperature)
M. mobile: 3000, 5 hr, 25
M. pneumoniae:
M. florum:
U. urealyticum: 0, >10, 37
E. coli:
H. sapiens: 1000, >10, 37
RNA Pol / ribosome: 20 (= 50 nt/s)
E. coli DNA Pol3: 300 (= 1000 nt/s)
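The nt/s and nm/s figures quoted for the polymerases can be related with a one-line conversion. The sketch below assumes a rise of about 0.34 nm per nucleotide (the B-form DNA value); applying the same spacing to a ribosome moving along mRNA is a simplifying assumption, and the constant and function names are illustrative.

```python
RISE_NM_PER_NT = 0.34  # assumed rise per nucleotide along B-form DNA, in nm


def nt_per_s_to_nm_per_s(rate_nt_per_s):
    """Convert a template-tracking rate in nt/s to a linear speed in nm/s."""
    return rate_nt_per_s * RISE_NM_PER_NT


rna_pol_speed = nt_per_s_to_nm_per_s(50)     # slide quotes about 20 nm/s
dna_pol_speed = nt_per_s_to_nm_per_s(1000)   # slide quotes about 300 nm/s
print(rna_pol_speed, dna_pol_speed)
```

Under this assumption the conversions give roughly 17 and 340 nm/s, matching the rounded slide values to within the precision they are quoted at.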

Attachment organelle replication Seto S, Layh-Schmitt G, Kenri T, Miyata M. J Bacteriol :1621 Visualization of the attachment organelle and cytadherence proteins of Mycoplasma pneumoniae by immunofluorescence microscopy.

Mycoplasma pneumoniae Regula, et al, Microbiology 147: , scale bar = 100 nm

Hypothetical mechanisms

Proteogenomic mapping (of peptide data in 3 forward & 3 reverse frames)

Use of proteogenomic mapping to discover B. a new ORF. C. a new ORF & delete an inaccurately predicted ORF. D. N-terminal extension of an existing ORF.
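Mapping peptides onto all six reading frames can be sketched as below. This is an illustrative outline, not the pipeline used in the talk: the function names are hypothetical, matching is exact substring search, and TGA is translated as tryptophan because Mycoplasma species use that variant of the genetic code (NCBI translation table 4).

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard codon table built in TCAG order, then patched for Mycoplasma.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
CODON_TABLE["TGA"] = "W"  # Mycoplasma reads TGA as Trp, not stop

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Return translations keyed by (strand, frame offset)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def map_peptide(dna, peptide):
    """List (strand, frame, nucleotide start on that strand) for each hit."""
    hits = []
    for (strand, offset), protein in six_frames(dna).items():
        start = protein.find(peptide)
        while start != -1:
            hits.append((strand, offset, offset + 3 * start))
            start = protein.find(peptide, start + 1)
    return hits


example = "ATGGCTTGAAAATAA"  # hypothetical sequence: M A W K stop
print(map_peptide(example, "MAWK"))
print(map_peptide(reverse_complement(example), "MAWK"))
```

A hit that falls outside every annotated ORF suggests a new ORF, and a hit just upstream of an annotated start in the same frame suggests an N-terminal extension, which are the cases listed on the slide.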

Constraints Replication Membrane-bound polyribosomes Other RNA and/or protein complexes Metabolism DNA Structural Forces

Genome folding & cell 3D structure. Seto & Miyata (1999) Partitioning, movement, and positioning of nucleoids in Mycoplasma capricolum. J. Bact. 181:6073. Cell = 0.5 µm; kbp genome; extended diameter = 80 µm; ~200 transverses, with each membrane encoding gene anchored to the cell surface. How to segregate this?

Paired fork model
Dingman CW. Bidirectional chromosome replication: some topological considerations. J Theor Biol 1974 Jan;43(1):
Sundin O, Varshavsky A. Terminal stages of SV40 DNA replication proceed via multiply intertwined catenated dimers. Cell Aug;21(1):
Hearst JE, Kauffman L, McClain WM. A simple mechanism for the avoidance of entanglement during chromosome replication. Trends Genet Jun;14(6):
Bouligand Y, Norris V (2000) “Both replication forks appear to be part of a single complex or factory, as first proposed by Dingman.”
Roos M, Lingeman R, Woldringh CL, Nanninga N. Experiments on movement of DNA regions in Escherichia coli evaluated by computer simulation. Biochimie 2001 Jan;83(1):67-74

Constraints Replication Membrane-bound polyribosomes could anchor the RNA polymerase and hence the gene’s DNA to within 20 nm of the cell surface. Other RNA and/or protein complexes Metabolism DNA Structural Forces

Origin Blue: First MPN gene# Green : Mid gene # 344 (ter) Red: Last gene# 688 Side view, no replication ( gene#)

Off-axial view, no replicated segments, unoptimized membrane Yellow: Membrane Pink: Ribosomal White: Hypothetical & abundant Green : Misc. abundant Blue: Weak

Axial view, no replicated segments Yellow: Membrane Pink: Ribosomal White: Hypothetical & abundant Green : Misc. abundant Blue: Weak

Origin Yellow: Membrane Pink: Ribosomal White: Hypothetical & abundant Green : Misc. abundant Blue: Weak Side view, no replicated segments

Origin Blue: Origin of replication Red: Terminus Side view, no replication (dis from ori)

Simple example cost function for chromosome structure optimization. [Figure labels: R1, R2, M1, M2, M3.]

Searching six helical parameters for chromosomal fold. [Plot: E_final for optimization runs labeled 2002_5_16_h18 through 2002_5_17_h8.]

Monte Carlo minimization of the model fit to constraints.
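A generic Monte Carlo minimization of this kind can be sketched with a Metropolis acceptance rule and a cooling schedule. Everything specific here is an assumption: the slides do not define the six helical parameters or the cost terms, so `toy_cost` is a stand-in quadratic penalty, and the step size, temperatures, and step count are arbitrary illustrative choices.

```python
import math
import numpy as np

# Hypothetical "best fit" values for six parameters (illustration only).
TARGET = np.array([0.5, -0.3, 0.8, 0.1, -0.6, 0.4])


def toy_cost(params):
    """Stand-in for the constraint-violation energy of a chromosome fold."""
    return float(np.sum((params - TARGET) ** 2))


def monte_carlo_minimize(cost, x0, n_steps=5000, step=0.05,
                         t_start=0.1, t_end=1e-4, seed=0):
    """Metropolis search with geometric cooling; returns the best point seen."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    e = cost(x)
    best_x, best_e = x.copy(), e
    for k in range(n_steps):
        # Temperature decays geometrically from t_start to t_end.
        temperature = t_start * (t_end / t_start) ** (k / (n_steps - 1))
        candidate = x + rng.normal(scale=step, size=x.shape)
        e_new = cost(candidate)
        # Always accept improvements; accept worse moves with Boltzmann odds.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / temperature):
            x, e = candidate, e_new
            if e < best_e:
                best_x, best_e = x.copy(), e
    return best_x, best_e


start = np.zeros(6)
best_params, best_energy = monte_carlo_minimize(toy_cost, start)
print(toy_cost(start), best_energy)
```

Occasionally accepting uphill moves at high temperature lets the search escape local minima, which matters for a rugged cost surface even though the toy quadratic used here has only one minimum.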

[Figure-only slides labeled: 2002_5_17_h5_, 2002_5_16_h20_, 2002_5_17_h4_, 2002_5_17_h4_, data_2002_5_19_h0_40, data_2002_5_16_h18_42, data_2002_5_16_h19_34, data_2002_5_16_h21_50, data_2002_5_16_h19_42, data_2002_5_16_h21_56, data_2002_5_16_h20_3, data_2002_5_16_h19_0, data_2002_5_16_h20_30, data_2002_5_16_h21_5]

Avoidance of entanglement throughout cell cycle. Origin marked. Blue: left replicated segment (yellow-green = high gene #); Red: right (i.e. middle) segment; Aqua: unduplicated segment of the circular genome.

M. pneumoniae genes generally point away from Ori. More significant if abundance data are integrated. Alignment of known motors: polymerases, ribosomes, F1 ATPase.
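One way to quantify "genes point away from Ori" is to count genes transcribed in the same direction that the replication fork travels through them, optionally weighting by protein abundance. The sketch below is a hedged illustration: the gene coordinates, strands, and abundances are made-up examples, the terminus is assumed to lie opposite the origin on a circular genome, and the one-sided binomial tail against a 50% null is just one possible significance measure.

```python
from math import comb


def points_away_from_ori(start, strand, genome_len, ori=0):
    """True if the gene is co-directional with the fork that replicates it.

    Assumes the terminus lies half a genome away from the origin.
    """
    clockwise_dist = (start - ori) % genome_len
    on_right_replichore = clockwise_dist < genome_len / 2
    # Right replichore: fork moves toward higher coordinates ('+' strand).
    # Left replichore: fork moves toward lower coordinates ('-' strand).
    return (strand == "+") == on_right_replichore


def away_fraction(genes, genome_len, weights=None):
    """Fraction of genes (or of total weight) pointing away from ori."""
    if weights is None:
        weights = [1.0] * len(genes)
    away = sum(w for (start, strand), w in zip(genes, weights)
               if points_away_from_ori(start, strand, genome_len))
    return away / sum(weights)


def binomial_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5): chance of k or more by luck."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


# Hypothetical genes as (start coordinate, strand) on a 1000-bp circle.
genes = [(100, "+"), (200, "+"), (300, "-"),
         (600, "-"), (700, "-"), (900, "+")]
abundance = [5, 3, 1, 4, 2, 1]  # hypothetical protein abundances

unweighted = away_fraction(genes, 1000)
weighted = away_fraction(genes, 1000, abundance)
p_value = binomial_tail(4, len(genes))
print(unweighted, weighted, p_value)
```

In this made-up example the two genes pointing toward the origin are low-abundance, so the abundance-weighted fraction is higher than the raw gene count, which is the kind of effect the slide alludes to.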

Biospice 2.0 Deliverables: toolsets for data integration & optimality assessment
#1: QP MPA flux & growth modeling
#2: 4D-model
Current plan: Chromosome segregation; Membrane-bound polysomes; Ribosomal protein/rRNA assembly; Motility (coordination with replication origin)
Next few months: Other protein complexes; Space filling metric; Replication entanglement metric; In vivo crosslinking