Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,

Cells respond to environment
- Cells respond to environmental conditions, such as heat and food supply
- Cells also respond to various external messages

Genome is fixed – Cells are dynamic
- A genome is static
  - Every cell in our body has a copy of the same genome
- A cell is dynamic
  - Responds to external conditions
  - Most cells follow a cell cycle of division
- Cells differentiate during development

Gene regulation
- Gene regulation is responsible for the dynamic behavior of cells
- Gene expression varies according to:
  - Cell type
  - Cell cycle
  - External conditions
  - Location

Where gene regulation takes place
- Opening of chromatin
- Transcription
- Translation
- Protein stability
- Protein modifications

Transcriptional Regulation
- Strongest regulation happens during transcription
- Best place to regulate: no energy is wasted making intermediate products
- However, it has the slowest response time. After a receptor notices a change:
  1. Cascade message to nucleus
  2. Open chromatin & bind transcription factors
  3. Recruit RNA polymerase and transcribe
  4. Splice mRNA and send to cytoplasm
  5. Translate into protein

Transcription Factors Binding to DNA
- Transcription regulation: certain transcription factors bind DNA
- Binding recognizes specific DNA substrings: regulatory motifs
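As a rough illustration of what "recognizing DNA substrings" can mean computationally, the sketch below scans a sequence for a motif written with IUPAC degenerate codes. The function name `find_motif`, the TATA-like motif, and the example sequence are made up for illustration; in practice, binding sites are usually modeled with weight matrices instead of exact patterns.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes
# (only a subset of the codes is listed here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

def find_motif(sequence, motif):
    """Return 0-based start positions where the motif matches the sequence."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead lets overlapping occurrences be reported as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]

# Hypothetical example: a TATA-like motif in a made-up sequence.
positions = find_motif("GGCTATAAAGGCTATATAGC", "TATAWA")
print(positions)
```

The same scan would have to be repeated on the reverse complement to cover both strands.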

Promoter and Enhancers
- A promoter is necessary to start transcription
- Enhancers can affect transcription from afar

Regulation of Genes
[Figure, shown in three steps: a transcription factor (protein) binds a regulatory element on the DNA near a gene; RNA polymerase (protein) is positioned at the gene; a new protein is produced.]

Example: A human heat shock protein
- TATA box: positioning transcription start
- TATA, CCAAT: constitutive transcription
- GRE: glucocorticoid response
- MRE: metal response
- HSE: heat shock element
[Figure: promoter of heat shock hsp gene, with elements labeled in order: TATA, SP1, CCAAT, AP2, HSE, AP2, CCAAT, SP1]

Gene expression
DNA:     CCTGAGCCAACTATTGATGAA
         (transcription)
RNA:     CCUGAGCCAACUAUUGAUGAA
         (translation)
Protein: PEPTIDE
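The example on this slide can be reproduced with a few lines of code. The sketch below assumes the DNA shown is the coding strand (so transcription amounts to replacing T with U) and uses the standard genetic code; the function names `transcribe` and `translate` are illustrative choices, not part of the original slides.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons enumerated in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def transcribe(dna):
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.upper().replace("T", "U")

def translate(mrna):
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = transcribe("CCTGAGCCAACTATTGATGAA")
protein = translate(mrna)
print(mrna, protein)
```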

The Genetic Code

Eukaryotes vs Prokaryotes
- Eukaryotic cells are characterized by membrane-bound compartments, which are absent in prokaryotes.
[Figure: “Typical” human and bacterial cells drawn to scale. BIOS Scientific Publishers Ltd, 1999; Brown, Fig 2.1]

Prokaryotic genes – searching for ORFs
- Small genomes have high gene density
  - Haemophilus influenzae: 85% genic
- No introns
- Operons: one transcript, many genes
- Open reading frame (ORF): a contiguous set of codons that starts with a Met codon and ends with a stop codon

Example of ORFs
- Each sequence has six possible reading frames in which to search for ORFs: three for each direction of transcription.
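A minimal sketch of a six-frame ORF scan follows, using the definition from the previous slide (ATG start, in-frame stop). The names `orfs_in_strand` and `six_frame_orfs`, the `min_codons` threshold, and the test sequence are assumptions made for illustration; coordinates on the "-" strand are reported relative to the reverse-complemented sequence.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def orfs_in_strand(seq, min_codons=2):
    """Yield (frame, start, end) for ATG...stop ORFs on one strand.

    `end` is exclusive; the length threshold counts the start and stop codons.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon opens an ORF
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    yield frame, start, i + 3
                start = None                   # look for the next ORF

def six_frame_orfs(seq):
    """Collect ORFs from the three forward and three reverse frames."""
    seq = seq.upper()
    found = [("+", f, s, e) for f, s, e in orfs_in_strand(seq)]
    found += [("-", f, s, e)
              for f, s, e in orfs_in_strand(reverse_complement(seq))]
    return found

print(six_frame_orfs("GATGAAATAGC"))
```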

Gene structure
[Figure: a gene drawn as exon1, intron1, exon2, intron2, exon3, with arrows for transcription, splicing, and translation]
- exon = protein-coding
- intron = non-coding
- Codon: a triplet of nucleotides that is converted to one amino acid

Finding genes
[Figure: gene model drawn 5’ to 3’: start codon ATG, Exon 1, Intron 1, Exon 2, Intron 2, Exon 3, stop codon TAG/TGA/TAA, with splice sites at the exon-intron boundaries]

[Figure: sequence signals marked along the gene: atg, tga, ggtgag, caggtg, cagatg, cagttg, caggcc, ggtgag]

0. We can sequence the mRNA
- Expressed Sequence Tag (EST) sequencing is expensive
- It has some false positives (aberrant splicing)
- The method sequences all RNAs, not just those that code for proteins
- It is difficult for rare genes (those that are expressed rarely or in low quantities)
- Still, this is an invaluable source of information (when available)

Biology of Splicing

1. Consensus splice sites
- Donor: 7.9 bits
- Acceptor: 9.4 bits
(Stephens & Schneider, 1996)
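The "bits" quoted for the donor and acceptor sites are information content summed over the positions of aligned splice sites. A minimal sketch of that calculation follows; the aligned sites are made up for illustration, and no small-sample correction is applied, so this is not expected to reproduce the published 7.9 and 9.4 bit values.

```python
import math
from collections import Counter

def information_content(sites):
    """Total information (bits) of equal-length aligned DNA sites.

    Each position contributes 2 - H, where H is the Shannon entropy of the
    observed base frequencies and 2 bits is the maximum for four bases.
    """
    total = 0.0
    for column in zip(*sites):
        counts = Counter(column)
        entropy = -sum((n / len(column)) * math.log2(n / len(column))
                       for n in counts.values())
        total += 2.0 - entropy
    return total

# Made-up donor-like sites, for illustration only.
toy_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
bits = information_content(toy_sites)
print(round(bits, 2))
```

A fully conserved position contributes 2 bits and a position with uniform base usage contributes 0.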

2. Recognize “coding bias”
- Each exon can be in one of three frames:
    ag-gattacagattacagattaca-gtaag   Frame 0
    ag-gattacagattacagattaca-gtaag   Frame 1
    ag-gattacagattacagattaca-gtaag   Frame 2
- The frame of the next exon depends on how many nucleotides are left over from the previous exon
- Codons “tag”, “tga”, and “taa” are STOP
  - No STOP codon appears in-frame until the end of the gene
  - A stretch with no in-frame STOP is called an open reading frame (ORF)
- Different codons appear with different frequencies: coding bias
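The frame bookkeeping on this slide can be sketched as follows. The convention used here is an assumption, since the slide does not fix one: `frame` is the number of bases of the current codon that were already read in the previous exon (0, 1, or 2). Both function names are hypothetical.

```python
STOP_CODONS = {"taa", "tag", "tga"}

def next_exon_frame(frame, exon_length):
    """Frame at the start of the next exon.

    `frame` counts the bases of the codon in progress that were already
    read before this exon starts (an assumed convention).
    """
    return (frame + exon_length) % 3

def has_in_frame_stop(exon, frame):
    """Check whether the exon contains a stop codon when read in `frame`."""
    # Skip the bases that complete the codon carried over from the last exon.
    first_codon = (3 - frame) % 3
    return any(exon[i:i + 3].lower() in STOP_CODONS
               for i in range(first_codon, len(exon) - 2, 3))

exon = "gattacagattacagattaca"
print([has_in_frame_stop(exon, f) for f in range(3)])
print(next_exon_frame(0, len(exon)))
```

An exon whose length is a multiple of three, like the 21-base example, leaves the frame unchanged for the next exon.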

2. Recognize “coding bias”

Amino Acid      SLC   DNA codons
Isoleucine      I     ATT, ATC, ATA
Leucine         L     CTT, CTC, CTA, CTG, TTA, TTG
Valine          V     GTT, GTC, GTA, GTG
Phenylalanine   F     TTT, TTC
Methionine      M     ATG
Cysteine        C     TGT, TGC
Alanine         A     GCT, GCC, GCA, GCG
Glycine         G     GGT, GGC, GGA, GGG
Proline         P     CCT, CCC, CCA, CCG
Threonine       T     ACT, ACC, ACA, ACG
Serine          S     TCT, TCC, TCA, TCG, AGT, AGC
Tyrosine        Y     TAT, TAC
Tryptophan      W     TGG
Glutamine       Q     CAA, CAG
Asparagine      N     AAT, AAC
Histidine       H     CAT, CAC
Glutamic acid   E     GAA, GAG
Aspartic acid   D     GAT, GAC
Lysine          K     AAA, AAG
Arginine        R     CGT, CGC, CGA, CGG, AGA, AGG
Stop codons     Stop  TAA, TAG, TGA

- Can map the 61 non-stop codons to frequencies & take log-odds ratios
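A minimal sketch of the log-odds idea: estimate the frequency of each of the 61 non-stop codons from known coding sequences and compare it with a background frequency. The uniform 1/61 background, the pseudocount, the function names, and the one-sequence training set are all simplifying assumptions made for illustration; a real gene finder would estimate the background from non-coding sequence.

```python
import math
from collections import Counter

STOPS = {"TAA", "TAG", "TGA"}
CODONS = [a + b + c for a in "ACGT" for b in "ACGT" for c in "ACGT"
          if a + b + c not in STOPS]          # the 61 non-stop codons

def codon_log_odds(coding_seqs, pseudocount=1.0):
    """Log2 odds of each non-stop codon versus a uniform 1/61 background."""
    counts = Counter()
    for seq in coding_seqs:
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3].upper()
            if codon in CODONS:
                counts[codon] += 1
    total = sum(counts.values()) + pseudocount * len(CODONS)
    background = 1.0 / len(CODONS)
    return {c: math.log2(((counts[c] + pseudocount) / total) / background)
            for c in CODONS}

def coding_score(seq, log_odds):
    """Sum of codon log-odds over the sequence read in frame 0."""
    return sum(log_odds.get(seq[i:i + 3].upper(), 0.0)
               for i in range(0, len(seq) - 2, 3))

log_odds = codon_log_odds(["GAAGAAGAA"])      # toy training set
print(coding_score("GAAGAA", log_odds), coding_score("GAGGAG", log_odds))
```

A positive score means the codon usage looks more like the training coding sequences than like the background.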

3. Genes are “conserved”

Approaches to gene finding
- Homology
  - Procrustes
- Ab initio
  - Genscan, Genie, GeneID
- Comparative
  - TBLASTX, Rosetta
- Hybrids
  - GenomeScan, GenieEST, Twinscan, SLAM…

HMMs for single species gene finding: Generalized HMMs

HMMs for gene finding
[Figure: the sequence GTCAGAGTAGCAAAGTAGACACTCCAGTAACGC labeled with the hidden states exon, intron, and intergene]

GHMM for gene finding
[Figure: the sequence TAAAAAAAAAAAAAAAATTTTTTTTTTTTTTTGGGGGGGGGGGGGGGCCCCCCC segmented into Exon1, Exon2, and Exon3, each with its own duration]

Observed duration times

Better way to do it: negative binomial
- EasyGene: prokaryotic gene-finder (Larsen TS, Krogh A)
- Negative binomial with n = 3
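To see why a negative binomial is a better fit for durations than the geometric distribution implied by a plain HMM self-loop, the sketch below compares the two. It assumes one common parameterization, in which the duration is the total time spent in n identical geometric stages in series; the exit probability `p` and the function names are hypothetical, and this is not claimed to be exactly how EasyGene is implemented.

```python
import math

def geometric_pmf(length, p):
    """P(duration = length) for a state that exits with probability p per step."""
    return (1 - p) ** (length - 1) * p if length >= 1 else 0.0

def negative_binomial_pmf(length, p, n=3):
    """P(duration = length) for n identical geometric stages in series."""
    if length < n:
        return 0.0
    return math.comb(length - 1, n - 1) * p ** n * (1 - p) ** (length - n)

p = 0.1  # hypothetical per-step exit probability
for length in (1, 3, 10, 20, 40):
    print(length, geometric_pmf(length, p), negative_binomial_pmf(length, p))
```

The geometric distribution always puts its highest probability on the shortest duration, while the negative binomial with n = 3 has a peak away from the minimum length.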

Splice Site Models
- WMM: weight matrix model = PSSM (Staden 1984)
- WAM: weight array model = 1st-order Markov (Zhang & Marr 1993)
- MDD: maximal dependence decomposition (Burge & Karlin 1997)
  - Decision-tree-like algorithm that takes significant pairwise dependencies into account
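The simplest of these models, the WMM, can be sketched in a few lines: estimate per-position base frequencies from aligned sites and score a candidate by summing log-odds against a background. The toy donor sites, the pseudocount, the uniform background of 0.25, and the function names are assumptions made for illustration.

```python
import math

def build_wmm(sites, pseudocount=1.0, background=0.25):
    """Per-position log2-odds of each base, estimated from aligned sites."""
    wmm = []
    for column in zip(*sites):
        total = len(column) + 4 * pseudocount
        wmm.append({
            base: math.log2(((column.count(base) + pseudocount) / total)
                            / background)
            for base in "ACGT"
        })
    return wmm

def score_site(wmm, candidate):
    """Sum position-specific log-odds; positions are treated as independent."""
    return sum(column[base] for column, base in zip(wmm, candidate))

# Made-up donor-like sites, for illustration only.
toy_donors = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
wmm = build_wmm(toy_donors)
print(score_site(wmm, "AGGTAAGT"), score_site(wmm, "CCCCCCCC"))
```

A WAM would extend this by conditioning each position's probabilities on the preceding base, and MDD would go further by splitting the training sites on the positions with the strongest dependencies.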

Splice site detection 5’ 3’ Donor site Position 