1 Use a circular template to get redundant reads and so more accuracy. Pacific Biosciences.
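A minimal sketch of why redundant passes around a circular template raise accuracy. It assumes each pass miscalls a given base independently with the same probability, and it treats the consensus as wrong only when more than half of the passes are wrong (an upper bound for a simple majority vote). The 15% per-pass error rate and the function name `majority_error` are illustrative assumptions, not figures from the slide.

```python
from math import comb

def majority_error(p: float, n_passes: int) -> float:
    """Probability that more than half of n_passes independent passes
    miscall the same base, given per-pass error probability p."""
    k_min = n_passes // 2 + 1  # smallest number of wrong passes that forms a majority
    return sum(
        comb(n_passes, k) * p**k * (1 - p) ** (n_passes - k)
        for k in range(k_min, n_passes + 1)
    )

if __name__ == "__main__":
    for n in (1, 3, 5, 9, 15):
        print(f"{n:2d} passes: consensus error <= {majority_error(0.15, n):.6f}")
```

Under these assumptions the bound falls steadily as the number of passes over the same molecule increases.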

2 DNA methylation detection by bisulfite conversion

3 Detection of methylated adenine in Pacific Biosciences (SMRT) sequencing

4 [Figure: IPD ratio (methylated/non-methylated template) plotted against template position. IPD = average interpulse duration.]
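A rough sketch of how the plotted quantity could be computed: the mean interpulse duration at each template position in the sample divided by the mean at the same position in a non-methylated control. The toy numbers, the function names, and the threshold of 2.0 are assumptions made for illustration only; they are not values from the slide or from Pacific Biosciences software.

```python
from statistics import mean

def ipd_ratios(sample_ipds, control_ipds):
    """Per-position ratio of mean IPD (sample) to mean IPD (control)."""
    return [mean(s) / mean(c) for s, c in zip(sample_ipds, control_ipds)]

def flag_positions(ratios, threshold=2.0):
    """Template positions whose IPD ratio meets a (hypothetical) threshold."""
    return [i for i, r in enumerate(ratios) if r >= threshold]

# Toy data: 4 template positions, 3 observations each (seconds).
sample = [[0.2, 0.3, 0.25], [1.2, 1.0, 1.1], [0.3, 0.2, 0.25], [0.22, 0.28, 0.25]]
control = [[0.25, 0.25, 0.25], [0.25, 0.3, 0.2], [0.25, 0.25, 0.25], [0.25, 0.25, 0.25]]

ratios = ipd_ratios(sample, control)
print(flag_positions(ratios))  # position 1 stands out in this toy data
```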

5 Pacific Biosciences
50,000 ZMWs (zero-mode waveguides; Aug. 2011), and density may climb
Long reads (e.g., full molecules to determine full-length splicing isoforms)
Direct RNA sequencing possible
DNA methylation detectable

6 Agilent SureSelect RNA Target Enrichment
Capture a subgenomic region of interest for economy and speed of sequencing, e.g.:
- the entire exome (all exons, without introns or intergenic regions)
- hundreds of cancer genes
- a particular genomic locus
Alternative: hybridize to a custom microarray.
Agilent

7 Nimblegen (Roche) subgenomic DNA capture options: beads or microarrays

8 Some results using DNA capture for subgenomic sequencing
Targeted Capture and Next-Generation Sequencing Identifies C9orf75, encoding Taperin, as the Mutated Gene in Nonsyndromic Deafness DFNB79. Rehman et al., American Journal of Human Genetics 86, 378–388, 2010

9 Detection of methylated C (~all in CpG dinucleotides)
Double-stranded (DS) DNA: unmethylated ----CpG----> / <----GpC----; methylated ----CmpG----> / <----GpCm----
Na bisulfite + heat (deamination, cytosine -> uracil): ----CpG----> becomes ----UpG---->; ----CmpG----> is unchanged
PCR: ----UpG----> is amplified as ----TpG----> / <----ApC----; ----CmpG----> is amplified as ----CpG----> / <----GpC----
All NON-methylated Cs changed to T. Sequence and compare to deduce the methylated Cs.
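The conversion logic can be sketched in a few lines. This is a simplified model that assumes complete conversion, follows only the top strand, and uses a made-up reference sequence; the function names are hypothetical.

```python
def bisulfite_convert(seq, methylated):
    """Top strand as read after bisulfite treatment and PCR:
    unmethylated C is read as T, methylated C is still read as C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )

def call_methylation(reference, converted):
    """Positions where a reference C is still read as C (inferred methylated)."""
    return [
        i for i, (ref, obs) in enumerate(zip(reference, converted))
        if ref == "C" and obs == "C"
    ]

reference = "ACGTCCGATCGA"   # hypothetical sequence with CpGs at positions 1, 5 and 9
methylated = {5}             # assume only the C at position 5 is methylated
converted = bisulfite_convert(reference, methylated)

print(converted)                               # ATGTTCGATTGA
print(call_methylation(reference, converted))  # [5]
```

Comparing the converted read with the reference recovers the methylated position, which is the "sequence and compare" step on the slide.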

10 DEEP SEQUENCING (Next generation sequencing, High throughput sequencing, Massively parallel sequencing) applications:
Human genome re-sequencing (mutations, SNPs, haplotypes, disease associations, personalized medicine)
Tumor genome sequencing
Microbial flora sequencing (microbiome, viruses)
Metagenomic sequencing (without cell culturing)
RNA sequencing (RNAseq; gene expression levels, miRNAs, lncRNAs, splicing isoforms)
Chromatin structure (ChIP-seq; histone modifications, nucleosome positioning)
Epigenetic modifications (DNA CpG methylation and hydroxymethylation)
Transcription kinetics (GROseq; nascent RNA, BrdU pulse labeled RNA)
High throughput genetics (QUEPASA; cis-acting regulatory motif discovery)
Drug discovery (bar-coded organic molecule libraries) [Manocci PNAS paper]

11 Ke et al, and Chasin, Quantitative evaluation of all hexamers as exonic splicing elements. Genome Res : ). Order an equal mixture of all 4 bases at these 6 positions
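An equal mixture of all 4 bases at 6 positions gives 4^6 = 4096 distinct hexamers. A small sketch that enumerates them (the variable name `hexamers` is just for illustration):

```python
from itertools import product

# Every possible 6-mer over the DNA alphabet, in lexicographic order.
hexamers = ["".join(bases) for bases in product("ACGT", repeat=6)]

print(len(hexamers))   # 4096
print(hexamers[:3])    # ['AAAAAA', 'AAAAAC', 'AAAAAG']
```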

12 Quantifying extensive phenotypic arrays from sequence arrays (= QUEPASA)

13 Rank | 6-mer | ESRseq score (~ -1 to +1)
Best exonic splicing enhancers (from rank 1): AGAAGA, GAAGAT, GACGTC, GAAGAC, TCGTCG, TGAAGA, CAAGAA, CGTCGA
:
Worst exonic splicing enhancers = best exonic splicing silencers (from rank 4086): TAGATA, AGGTAG, CGTCGC, CTTAAA, CCTTTA, GCAAGA, TAGTTA, TCGCCG, CCAGCA, CTAGTA, TAGTAG, TAGGTA, CTTTTA

14 Composite exon (from ~100,000): constitutive exons, alternative exons, pseudo exons

15 What the data looks like
Experiment: barcoding allows multiplexing of several or many experiments at once (in one channel of a sequencer) -> economy. Here, two biological replicates.
Read layout: 2 nt barcode (TA or CG) | constant region | variable region | constant region (constant regions peculiar to our expt.)
Sequence of 36 | Quality code
CGCACTGTGCTGGAGCTCCCGGGGTTAACTCTAGAA | abU^Vaa`a\aaa]aWaTNZ`aa`Q][TE[UaP_U]
TACACTGTGCTGGAGCTCCCAACGGCAACTCTAGAA | a`P^Wa`[`Wa^`X_X_XWVa^NSP]_]S^X_T\X^
CGCACTGTGCTGGAGCTCCCATGGAGAACTCTAGAA | aTa`^b``baaaa^aab^YaTQLOHIa`^a``TX]]
TACACTGTGCTGGAGCTCCCCTCCCAAACTCTAGAA | I_`aaaa`aaaaaaa_a_^[KZIGIGZ`U`\^P^^`
CGCACTGTGCTGGAGCTCCCAATAGTAACTTTAGAA | aY_\abb[T\abaaa`a`bZ[HXXIZa_`_LGMS[`
TATACTGTGCTGGAGCTCCCGACGTAAACTCTAGAA | aba]^aa_a]`aa]_]`XWSMFGGIPX[P]X`V_Y^
TACACTGTGCTGGAGCTCCCTGGTAAAACTCTAGAA | a_^a^aa`aYaaa_aY`Y_^[I]VY\`]V]R\W]VV
TACACTGTGCTGGAGCTCCCAATAAAAACTCTAGAA | XZababa`aZaaaaaYaYXX`baa``\\TaUa\aW`
Error: a mismatch within a constant region, as in the fifth and sixth reads
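A minimal sketch of how reads like these might be demultiplexed. The layout (2 nt barcode, 18 nt constant region, 6 nt variable region, 10 nt constant region) is inferred from the eight reads shown; the replicate labels, the exact-match rule for the constant regions, and all names in the code are assumptions for illustration.

```python
from collections import defaultdict

UPSTREAM = "CACTGTGCTGGAGCTCCC"    # constant region after the barcode
DOWNSTREAM = "AACTCTAGAA"          # constant region after the variable 6-mer
BARCODES = {"TA": "replicate_A", "CG": "replicate_B"}  # hypothetical labels

def parse_read(read):
    """Return (barcode, variable_region), or None if the read does not match
    the expected 36-nt layout exactly (e.g., an error in a constant region)."""
    if len(read) != 36:
        return None
    barcode, up, variable, down = read[:2], read[2:20], read[20:26], read[26:]
    if barcode not in BARCODES or up != UPSTREAM or down != DOWNSTREAM:
        return None
    return barcode, variable

reads = [
    "CGCACTGTGCTGGAGCTCCCGGGGTTAACTCTAGAA",
    "TACACTGTGCTGGAGCTCCCAACGGCAACTCTAGAA",
    "CGCACTGTGCTGGAGCTCCCATGGAGAACTCTAGAA",
    "TACACTGTGCTGGAGCTCCCCTCCCAAACTCTAGAA",
    "CGCACTGTGCTGGAGCTCCCAATAGTAACTTTAGAA",
    "TATACTGTGCTGGAGCTCCCGACGTAAACTCTAGAA",
    "TACACTGTGCTGGAGCTCCCTGGTAAAACTCTAGAA",
    "TACACTGTGCTGGAGCTCCCAATAAAAACTCTAGAA",
]

per_barcode = defaultdict(list)
rejected = 0
for read in reads:
    parsed = parse_read(read)
    if parsed is None:
        rejected += 1
    else:
        barcode, variable = parsed
        per_barcode[barcode].append(variable)

for barcode, variables in per_barcode.items():
    print(BARCODES[barcode], variables)
print("rejected:", rejected)
```

Rejecting any read with a constant-region mismatch is the strictest possible choice; a real pipeline might instead tolerate a mismatch or filter on the quality codes.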

16 Next generation methods for high throughput genetic analysis:
Use custom oligo libraries to construct minigene libraries (40,000, up to 60 nt long)
E.g., for saturation mutagenesis to identify all exonic bases contributing to splicing (or transcription or polyadenylation, ...)
Use bar codes to detect sequences missing from the selected molecules
E.g., long (200-mer) synthetic oligo library: Patwardhan RP, Lee C, Litvin O, Young DL, Pe'er D, Shendure J. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nat Biotechnol.

17 OUTLINE OF LECTURE TOPICS COMING UP
Expression and manipulation of transgenes in the laboratory
In vitro mutagenesis to isolate variants of your protein/gene with desirable properties
–Single base mutations
–Deletions
–Overlap extension PCR
–Cassette mutagenesis
To study the protein: Express your transgene
–Usually in E. coli, for speed, economy
–Expression in eukaryotic hosts
–Drive it with a promoter/enhancer
–Purify it via a protein tag
–Cleave it to get the pure protein
Explore protein-protein interaction
–Co-immunoprecipitation (co-IP) from extracts
–2-hybrid formation
–Surface plasmon resonance
–FRET (fluorescence resonance energy transfer)
–Complementation readout

18 Site-directed mutagenesis by overlap extension PCR
[Figure: PCR product flanked by restriction sites RS1 and RS2. Strachan and Read, Human Mol. Genet. 3, p. 148]
PCR fragment -> subsequent cloning in a plasmid (or not; the PCR product itself can be used in many ways, e.g., transfection)
Cut with RE 1 and 2; ligate into similarly cut vector

19 Cassette mutagenesis = random mutagenesis but in a limited region: 1) by error-prone PCR
Original sequence coding for, e.g., a transcription enhancer region
PCR the fragment with high Taq polymerase and Mn2+ instead of Mg2+ -> errors
[Figure: mutated copies of the fragment, each * marking an error]
Cut in primer sites and clone upstream of a reporter protein sequence.
Pick colonies -> analyze phenotypes -> sequence

20 Cassette mutagenesis = random mutagenesis but in a limited region: 2) by "doped" synthesis
Target = e.g., an enhancer element
[Figure: original enhancer sequence and mutated copies, each * marking a substitution]
Doping = e.g., 90% G, 3.3% A, 3.3% C, 3.3% T at each position
Buy 2 doped oligos; anneal. OK for up to ~80 nt. Clone upstream of a reporter.
Pick colonies -> analyze phenotypes -> sequence
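A small sketch of what doping at 90% wild type implies. It assumes every position is doped independently at the same rate and that the remaining ~10% is split evenly among the three other bases; the function names and the toy 80-nt template are hypothetical.

```python
import random

def dope(wild_type, wt_fraction=0.9, rng=None):
    """Simulate one doped oligo: each position keeps the wild-type base with
    probability wt_fraction, otherwise gets one of the other three bases."""
    rng = rng or random.Random()
    out = []
    for base in wild_type:
        if rng.random() < wt_fraction:
            out.append(base)
        else:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
    return "".join(out)

def expected_mutations(length, wt_fraction=0.9):
    """Mean number of substitutions per oligo."""
    return length * (1 - wt_fraction)

def fraction_unmutated(length, wt_fraction=0.9):
    """Fraction of oligos that carry no substitution at all."""
    return wt_fraction ** length

template = "ACGT" * 20  # toy 80-nt cassette
print(dope(template, rng=random.Random(0)))
print(expected_mutations(len(template)))
print(fraction_unmutated(len(template)))
```

For an 80-nt cassette at 90% wild type, the mean is 8 substitutions per molecule, so nearly every clone is mutated; a higher wild-type fraction would give a library enriched in single and double mutants.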

21 E. coli as a host
PROs: easy, flexible, high tech, fast, cheap; but problems
CONs:
Folding (can misfold)
Sorting within the cell -> can form inclusion bodies
Purification -- endotoxins
Modifications -- not done (glycosylation, phosphorylation, etc.)
Modifications:
Glycoproteins
Acylation: acetylation, myristoylation
Methylation (arg, lys)
Phosphorylation (ser, thr, tyr)
Sulfation (tyr)
Prenylation (farnesyl, geranylgeranyl on cys)
Vitamin C-Dependent Modifications (hydroxylation of proline and lysine)
Vitamin K-Dependent Modifications (gamma carboxylation of glu)
Selenoproteins (seleno-cys tRNA at UGA stop)

22 E. coli expression vectors
Promoter examples:
1) Lac promoter (with operator)-YFG, + lac repressor (I gene): induce expression by inactivation of the lac repressor with IPTG or lactose.
2) As above but with a hybrid Tac promoter (tryptophan operon + lac operon): stronger. Use the i q mutant of the lac I gene, which produces high levels of the lac repressor. Expression regulatable over several orders of magnitude.
3) BAD promoter-YFG. Arabinose utilization operon. Inducible by arabinose via the endogenous araC gene for a transcriptional activator. Background levels driven down by including glucose.
4) Phage T7 promoter-YFG. Vector carries gene for T7 polymerase, under control of the lac promoter. Add IPTG or lactose to induce T7 polymerase and thence YFG.
IPTG = isopropylthiogalactoside (non-metabolizable inducer)
YFG = your favorite gene

23 Myristoylation – myristic acid attached to the alpha amino group of an N-terminal glycine. Anchors protein to membrane.

24 Lysine epsilon amino group modifications (monomethyl and dimethyl also). Well studied in histones, microtubules.

25 Selenoproteins: selenocysteine is incorporated via seleno-cys tRNA at a UGA nonsense codon. Sequence context dictates efficiency.

26 Gamma carboxylation of glutamic acid. Binds calcium; used in coagulation proteins.

27 Some alternative hosts
Yeasts (Saccharomyces, Pichia)
Insect cells with baculovirus vectors
Mammalian cells in culture (later)
Whole organisms (mice, goats, corn) (not discussed)
In vitro (cell-free), for analysis only, not preparatively (good for radiolabeled proteins, discussed later)

28 Some popular yeast promoters
[Figure labels: ARS = autonomously replicating sequence element; selectable marker; ori]
http://biochemie.web.med.uni-muenchen.de/Yeast_Biol/04 Yeast Molecular Techniques.pdf

29 Yeast Expression Vector (example), Saccharomyces cerevisiae (baker's yeast)
[Figure: plasmid map with GAPD prom, your favorite gene (Yfg), GAPD term'n, LEU2, Amp r, ori E, 2μ]
2μ = 2 micron plasmid; 2 mu seq features: yeast ori
ori E = bacterial ori and Amp r = bacterial selection: for growth in E. coli
LEU2, e.g. = Leu biosynthesis for yeast selection
Complementation of an auxotrophy can be used instead of drug resistance
Auxotrophy = state of a mutant in a biosynthetic pathway resulting in a requirement for a nutrient
GAPD = the enzyme glyceraldehyde-3 phosphate dehydrogenase

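31 Yeast - genomic integration via homologous recombination
[Figure: vector DNA carrying a functional HIS4 gene and Yfg (with promoter p and terminator t) recombines with genomic DNA carrying a HIS4 mutation. The resulting genomic DNA contains a functional HIS4 gene, a defective HIS4 gene, and Yfg.]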

32 Yeast (integration in Pichia pastoris): double recombination
[Figure: vector DNA (AOX1p, Yfg, AOX1t, HIS4, 3'AOX1) integrates into genomic DNA at the AOX1 gene by double recombination.]
AOX1 = alcohol oxidase gene (-> ~30% of total protein)
P. pastoris:
-tight control
-methanol induced (AOX1)
-large scale production (gram quantities)

33 Expression in mammalian cells
Lab examples of immortal cell lines:
HEK293: Human embryonic kidney (high transfection efficiency)
HeLa: Human cervical carcinoma (historical, low RNase)
CHO: Chinese hamster ovary (hardy, diploid DNA content, mutants)
Cos: Monkey cells with SV40 replication proteins (-> high transgene copies)
3T3: Mouse or human exhibiting ~regulated (normal-like) growth
+ various others, many differentiated to different degrees, e.g.:
BHK: Baby hamster kidney
HepG2: Human hepatoma
GH3: Rat pituitary cells
PC12: Rat neuronal-like tumor cells
MCF7: Human breast cancer
HT1080: Human fibroblastic cells with near diploid karyotype
IPS: induced pluripotent stem cells
and: Primary cells cultured with a limited lifetime. E.g., MEF = mouse embryonic fibroblasts, HDF = human diploid fibroblasts
Common in industry:
NS1 (mAbs): Mouse plasma cell tumor cells
Vero (vaccines): African green monkey cells
CHO (mAbs, other therapeutic proteins): Chinese hamster ovary cells
PER.C6 (mAbs, other therapeutic proteins): Human retinal cells

34 Mammalian cell expression
Generalized gene structure for mammalian expression:
[Figure: Mam. prom., 5'UTR, cDNA gene, 3'UTR, polyA site, plus an intron]
Intron is optional but a good idea

35 Popular mammalian cell promoters
SV40 Large T Ag (Simian Virus 40)
RSV LTR (Rous sarcoma virus)
MMTV (steroid inducible) (Mouse mammary tumor virus)
HSV TK (low expression) (Herpes simplex virus)
Metallothionein (metal inducible, Cd++)
CMV early (Cytomegalovirus)
Actin
EIF2alpha
Engineered inducible / repressible: tet, ecdysone, glucocorticoid (tet = tetracycline)

36 Engineered regulated expression: Tetracycline-responsive promoters
Tet-OFF (add tet -> shut off) (Bujold et al.)
tTA = tet activator fusion protein: tetR domain (tetR = tet repressor, its original role) + VP16 transcription activation domain
[Figure: tTA cDNA between a CMV prom. and a polyA site]
tTA gene must be in cell (permanent transfection, integrated)
No tet: tTA binds the tet operator (multiple copies), provided tet is not also bound -> active
Tetracycline (tet), or, better, doxycycline (dox): allosteric change in conformation -> not active

37 Tet-OFF, cont.
Target construct: multiple tet operator elements, MIN. CMV prom., your favorite gene, polyA site
No doxycycline: tTA (tetR domain + VP16 tc'n act'n domain) bound at the tet operators, RNA pol -> active, plenty of transcription
Doxycycline present: tTA not bound -> not active, little transcription (2%?, bkgd)

38 Tetracycline-responsive promoters: Tet-ON (add tet -> turn on gene)
Different fusion protein (tetR domain + VP16 tc'n act'n domain): does NOT bind the tet operator if tet is not bound
[Figure: tTA cDNA between a full CMV prom. and a polyA site]
Without tet: not active. With tetracycline (tet), or, better, doxycycline (dox): active
Must be in cell (permanent transfection, integrated): commercially available (293, CHO) or do-it-yourself

39 Tet-ON
Target construct: multiple tet operator elements, MIN. CMV prom., your favorite gene, polyA site
Doxycycline absent: fusion protein (tetR domain + VP16 tc'n act'n domain) not bound -> not active, little transcription (bkgd.)
Add dox: fusion protein with doxycycline bound at the tet operators, RNA pol II -> active, plenty of transcription (> 50X)
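The two switches differ only in the sign of the response to doxycycline, which a toy truth table makes explicit. This is a deliberately simplified on/off model (it ignores the background transcription noted on the slides), and the function name and system labels are made up for illustration.

```python
def transcription_active(system: str, dox_present: bool) -> bool:
    """Simplified on/off model of the two tetracycline-responsive systems."""
    if system == "tet-off":
        # Activator binds the tet operators only when dox is absent.
        return not dox_present
    if system == "tet-on":
        # Activator binds the tet operators only when dox is present.
        return dox_present
    raise ValueError(f"unknown system: {system!r}")

for system in ("tet-off", "tet-on"):
    for dox in (False, True):
        state = "active" if transcription_active(system, dox) else "background only"
        print(f"{system:8s} dox={'yes' if dox else 'no':3s} -> {state}")
```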