Thanks to the Lipper Center for Computational Genetics. Government and private grant agencies: NHLBI, NSF, ONR, DOE, DARPA, HHMI, Armenise. Corporate collaborators.

Presentation transcript:

Thanks to the Lipper Center for Computational Genetics. Government and private grant agencies: NHLBI, NSF, ONR, DOE, DARPA, HHMI, Armenise. Corporate collaborators & sponsors: Affymetrix, GTC, Mosaic, Aventis, Dupont, Cistran. CHI Macroresults through Microarrays 3. George Church, 1-May-02. Array quantitation for modeling mutations affecting RNA, protein interactions & cell proliferation.

gggatttagctcagtt gggagagcgccagact gaa gat ttg gag gtcctgtgttcgatcc acagaattcgcacca Post- 300 genomes & 3D structures

Biosystems Measures & Models. DNA, RNA, Protein: in vivo & in vitro interactions; Metabolites; Replication rate; Environment. Microbes; Cancer & stem cells; Darwinian in vitro replication; Small multicellular organisms. RNAi; Insertions; SNPs.

Functional Genomics Challenges. Systems dynamics and optimality modeling. Multiple genetic domains per gene: high density readout of whole genome mutant phenotypes. Multiple RNAs & regulatory proteins per gene. Many causative genes & haplotypes per disease. Polony RNA exon-typing. Multiplex in situ RNA & protein analyses. Automated differentiation. Homologous recombination genome engineering.

Human Red Blood Cell ODE model. 200 measured parameters. [Figure: metabolic network diagram covering glycolysis (GLC to LAC), the pentose phosphate pathway, glutathione (GSH/GSSG), adenine nucleotide metabolism, Na+/K+ transport, Cl-, pH and HCO3-.] Jamshidi, Edwards, Fahland, Church, Palsson, B.O. (2001) Bioinformatics 17: 286.

Modeling suboptimality: Segre, Edwards, Vitkup

Calculated & Observed Fluxes in wt. Wild type, C 0.4-limited. CC = 0.97. [Figure: scatter plot; axes: Calculated Flux, Observed Fluxes in wt.]

Replication rate of a whole-genome set of mutants Badarinarayana, et al. (2001) Nature Biotech.19: 1060

Replication rate challenge met: multiple homologous domains thrA metL lysC 10.4 probes Selective disadvantage in minimal media

Multiple mutations per gene Correlation between two selection experiments Badarinarayana, et al. (2001) Nature Biotech.19: 1060

Comparison of selection data with Flux Balance Optimization predictions on 488 genes. [Table: predictions (essential, reduced growth rate, non essential) against number of genes (negatively selected, not negatively selected), with a Chi Square P-value; cell values are missing.] Novel duplicates? Position effects, toxin accumulation, non-opt?

Biosystems Measures & Models. DNA, RNA, Protein: in vivo & in vitro interactions; Metabolites; Replication rate; Environment. Microbes; cancer & stem cells; in vitro replication; small multicellular organisms. RNAi; Insertions; SNPs.

RNA quantitation issues Small fold changes in RNA are important. Example: 1.5-fold in trisomies. Cross-hybridizing RNAs. Alternative RNAs, gene families. Mixed tissues. In situ hybridization has low multiplex.
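The 1.5-fold figure for trisomies follows from gene dosage: three copies of a chromosome instead of two. A minimal sketch of that arithmetic, assuming RNA level scales linearly with copy number (an assumption made here for illustration; the function name is hypothetical):

```python
import math


def expected_fold_change(copies_case, copies_control=2):
    """Expected RNA ratio if expression is proportional to gene copy number."""
    return copies_case / copies_control


# Trisomy: 3 copies of the affected chromosome versus the usual 2.
fold = expected_fold_change(3)
log2_fold = math.log2(fold)

print(f"fold change: {fold}, log2 fold change: {log2_fold:.3f}")
```

On a log2 scale this is well under one unit, which is why small fold changes place high demands on array quantitation.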

Gene Expression database. Aach, Rindone, Church, (2000) Genome Research 10: Microarrays [1]: experiment, control; R/G ratios; R, G values; quality indicators. Affymetrix [2]: ORF; PM, MM; averaged PM-MM; “presence”; feature statistics; 25-mers. Lynx-MPSS [3], SAGE [4]: counts of 14-mers sequence tags for each ORF. agactagcag. [1] DeRisi, et.al., Science 278: (1997). [2] Lockhart, et.al., Nat Biotech 14: (1996). [3] Brenner et al. Massively Parallel Signature Sequencing, Nat Biotechnol. 18:630-4 (2000). [4] Velculescu, et.al, Serial Analysis of Gene Expression, Science 270: (1995).
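The "averaged PM-MM" value listed for Affymetrix arrays can be sketched as the mean difference between perfect-match (PM) and mismatch (MM) probe intensities for one ORF. This is a minimal illustration only: the intensities below are hypothetical, the function name is invented here, and production software typically applies additional outlier handling that this sketch omits.

```python
def average_difference(pm, mm):
    """Mean of PM - MM over the probe pairs of one ORF."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("need equal, non-empty PM and MM lists")
    return sum(p - m for p, m in zip(pm, mm)) / len(pm)


# Hypothetical intensities for four 25-mer probe pairs of a single ORF.
pm_intensities = [1200, 950, 1100, 1300]
mm_intensities = [300, 400, 250, 350]

avg_diff = average_difference(pm_intensities, mm_intensities)
print(avg_diff)
```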

RNA Cluster Analyses: Cell Cycle. [Figure: MCB and SCB motifs in a cluster, N = 186; axes: Number of ORFs, Number of sites, Distance from ATG (b.p.).] Tavazoie, et al Nature Genetics 22:281.

Combining mouse knockouts with RNA array analysis (homeobox gene Crx-/-). Livesey, Furukawa, Steffen, Church, Cepko (2000) Current Biol. 10:301.

Biosystems Measures & Models. DNA, RNA, Protein: in vivo & in vitro interactions; Metabolites; Replication rate; Environment. Microbes; cancer & stem cells; in vitro replication; small multicellular organisms. RNAi; Insertions; SNPs.

Combinatorial arrays for binding constants. ds-DNA array. Human/Mouse EGR1. HMS: Martha Bulyk, Xiaohua Wang, Martin Steffen. MRC: Yen Choo.

Combinatorial arrays for binding constants. [Figure labels: Combinatorial DNA-binding protein domains; ds-DNA array; Phage (pVIII, pIII); Antibodies.]

Combinatorial arrays for binding constants. Martha Bulyk et al. [Figure labels: Phycoerythrin - 2º IgG; Combinatorial DNA-binding protein domains; ds-DNA array; Phage.]

Isalan et al., Biochemistry (‘98) 37: Interactions of Adjacent Basepairs in EGR1 Zinc Finger DNA Recognition

Wildtype EGR1 Microarray. [Figure labels: high [DNA]; (+) ctrl sequence for wt binding; alignment oligos etc.]

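[Figure: EGR1 zinc finger variants (Wildtype RSDHLTT, RGPDLAR, REDVLIR, LRHNLET, KASNLVS) with triplets and binding constants as shown: TGG 2.8 nM; GCG 16 nM; 2.5 nM; TAT 5.7 nM; AAT 240 nM; also AAA, AAT, ACT, AGA, AGC, AGT, CAT, CCT, CGA, CTT, TTC, TTT.] Motifs weight all 64 K_a^app.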

Biosystems Measures & Models. DNA, RNA, Protein: in vivo & in vitro interactions; Metabolites; Replication rate; Environment. Microbes; cancer & stem cells; in vitro replication; small multicellular organisms. RNAi; Insertions; SNPs.

Common diseases: billions of “new” alleles plus millions of balanced polymorphisms. 60 new mutations per generation * 5,000 generations since the major bottleneck(s) that set up the linkage patterns (= 300,000 per genome). Each of the 3 Gbp in the genome exists in all SNP forms (A, C, G, T), 600,000 of each SNP on earth (spread over the common haplotypes). The population frequency will be <0.01%. (Aach et al, 2001 Nature 409: 856) Functional genomics (FG) may provide better leads for therapies & diagnostics. (Accuracy goal 1 ppb?)
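The numbers on this slide can be checked with a few lines of arithmetic. This is a sketch of one reading of the slide, not the authors' calculation: the world population of 6 billion is an assumption added here, and the split among the three alternative bases at each position is ignored.

```python
# Values stated on the slide.
new_mutations_per_generation = 60
generations_since_bottleneck = 5_000
genome_size_bp = 3_000_000_000

# Assumption (not on the slide): roughly 6 billion people alive.
world_population = 6_000_000_000

# New mutations carried by one genome since the bottleneck.
new_mutations_per_genome = new_mutations_per_generation * generations_since_bottleneck

# Spread all those mutations evenly over every position in the genome.
carriers_per_position = new_mutations_per_genome * world_population // genome_size_bp

# Share of the population carrying a new variant at a given position.
frequency_percent = 100 * carriers_per_position / world_population

print(new_mutations_per_genome, carriers_per_position, frequency_percent)
```

Dividing the carriers at each position among the alternative bases pushes the per-allele frequency below the 0.01% ceiling quoted on the slide.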

Projected costs affect our view of what is possible. In 1985, the dawn of the genome project, $10 per bp would have been $30B per genome. In 2002, Perlegen or Lynx: $3M (10^3 bits/$, 4 logs). In 2001, the cost of video data collection? bits/$. Genotyping & functional genomics demand will probably be as high as permitted by costs.
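A short check of the cost figures above. The encoding of 2 bits per base pair is an assumption introduced here to convert base pairs into bits; the dollar amounts and genome size are the ones on the slide.

```python
import math

genome_size_bp = 3_000_000_000

# 1985 projection: $10 per base pair.
cost_1985 = 10 * genome_size_bp

# 2002 figure quoted for Perlegen or Lynx.
cost_2002 = 3_000_000

# Cost reduction in orders of magnitude ("logs").
logs_cheaper = math.log10(cost_1985 / cost_2002)

# Assumption: 2 bits encode one of four bases.
bits_per_dollar_2002 = 2 * genome_size_bp / cost_2002

print(cost_1985, logs_cheaper, bits_per_dollar_2002)
```

Under that assumption the 2002 price works out to about 2,000 bits per dollar, the same order of magnitude as the 10^3 bits/$ on the slide.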

Why lower-cost, high quality “sequencing”? Environmental, food, & biodiversity monitoring. Human genome haplotyping. RNA splicing & editing. Immune B&T cell receptor spectra. & How? Femtoliter (10^-15 L) scale & low-cost scanners. Polymerase DNA colonies (polonies). Fluorescent in situ sequencing (FISSEQ). Mitra & Church Nucleic Acids Res. 27: e34.

[Figure: polony amplification. A single molecule from the library, flanked by primer sites A’ and B, among free B primers. 1st round of PCR: primer is extended by polymerase.] Primer A has 5’ immobilizing (Acrydite) modification.

Sequence polonies by sequential, fluorescent single-base extensions. 1. Remove 1 strand of DNA. 2. Hybridize Universal Primer. 3. Add Red (Cy3) dTTP. 4. Wash; Scan Red Channel. [Figure: two polony templates (3’ to 5’), A G T.. and G C G..; a T is added opposite the first template only.]

Sequence polonies by sequential, fluorescent single-base extensions. 5. Add Green (FITC) dCTP. 6. Wash; Scan Green Channel. [Figure: template A G T.. now extended with T C; template G C G.. extended with C.]
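The add, wash and scan cycle in steps 1 to 6 can be mimicked with a small simulation. This is a toy sketch, not the authors' software: the function name `read_by_extension`, the dispensing order, and the rule that every complementary base in a run is incorporated within one cycle are illustrative assumptions. The two templates are the three-base examples drawn on the slides.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def read_by_extension(template_3_to_5, dispense_order, n_cycles):
    """Simulate sequential extension of a primer along one polony template.

    template_3_to_5: template bases in the order the polymerase meets them.
    dispense_order:  labeled nucleotides added one per cycle, repeating.
    Returns the extended sequence and a per-cycle list of
    (nucleotide, number of bases incorporated), a stand-in for signal.
    """
    position = 0
    read = []
    signals = []
    for cycle in range(n_cycles):
        nucleotide = dispense_order[cycle % len(dispense_order)]
        incorporated = 0
        # Extend while the next template base pairs with the added nucleotide.
        while (position < len(template_3_to_5)
               and COMPLEMENT[template_3_to_5[position]] == nucleotide):
            position += 1
            incorporated += 1
        read.append(nucleotide * incorporated)
        signals.append((nucleotide, incorporated))
    return "".join(read), signals


# Cycle 1 adds dTTP (red, Cy3); cycle 2 adds dCTP (green, FITC).
read_1, signals_1 = read_by_extension("AGT", "TC", 2)
read_2, signals_2 = read_by_extension("GCG", "TC", 2)
print(read_1, signals_1)
print(read_2, signals_2)
```

The first polony lights up in both channels and the second only in the green one, matching the drawings on the two slides.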

Polony Template (3’ P’ / P 5’). [Figure: sequences AATACAATTCACACAGGAAACAGCTATGACATTC and TATTGTTAAAGTGTGTCCTTTGTCGATACTGGTA…5’; labels FITC (C), CY3 (T); Mean Intensity: 58, , , , 43.] Primer Extension: 26 cycles, 34 Nucleotides.

Why lower-cost, high quality “sequencing”? Environmental, food, & biodiversity monitoring. Human genome haplotyping. RNA splicing & editing. Immune B&T cell receptor spectra. & How? Femtoliter (10^-15 L) scale & low-cost scanners. Polymerase DNA colonies (polonies). Fluorescent in situ sequencing (FISSEQ). Mitra & Church Nucleic Acids Res. 27: e34.


RNA Exon typing Single molecules of RNA dispersed. Multiplex polonies spanning all likely variable exons Sequential probing of each exon.

Functional Genomics Challenges. Systems dynamics and optimality modeling. Multiple genetic domains per gene: high density readout of whole genome mutant phenotypes. Multiple RNAs & regulatory proteins per gene. Many causative genes & haplotypes per disease. Polony RNA exon-typing. Multiplex in situ RNA & protein analyses. Automated differentiation. Homologous recombination genome engineering.

For more information: arep.med.harvard.edu