Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,

1 Genes: Regulation and Structure Many slides from various sources, including S. Batzoglou,

2 Cells respond to environment. Heat; food supply; various external messages. The cell responds to environmental conditions.

3 Genome is fixed – Cells are dynamic. A genome is static: every cell in our body has a copy of the same genome. A cell is dynamic: it responds to external conditions; most cells follow a cell cycle of division; cells differentiate during development.

4 Gene regulation. Gene regulation is responsible for the dynamic cell. Gene expression varies according to: cell type; cell cycle; external conditions; location.

5 Where gene regulation takes place: opening of chromatin; transcription; translation; protein stability; protein modifications.

6 Transcriptional Regulation. The strongest regulation happens during transcription. It is the best place to regulate: no energy is wasted making intermediate products. However, it has the slowest response time. After a receptor notices a change:
1. Cascade message to nucleus
2. Open chromatin & bind transcription factors
3. Recruit RNA polymerase and transcribe
4. Splice mRNA and send to cytoplasm
5. Translate into protein

7 Transcription Factors Binding to DNA. Transcription regulation: certain transcription factors bind DNA. Binding recognizes DNA substrings: regulatory motifs.

8 Promoter and Enhancers. A promoter is necessary to start transcription. Enhancers can affect transcription from afar.

9 Regulation of Genes. [Figure labels: gene, regulatory element, DNA, RNA polymerase (protein), transcription factor (protein).]

10 Regulation of Genes. [Figure labels: gene, regulatory element, DNA, RNA polymerase, transcription factor (protein).]

11 Regulation of Genes. [Figure labels: gene, regulatory element, DNA, RNA polymerase, transcription factor, new protein.]

12 Example: a human heat shock protein. TATA box: positioning of the transcription start. TATA, CCAAT: constitutive transcription. GRE: glucocorticoid response. MRE: metal response. HSE: heat shock element. [Figure: promoter of heat shock gene hsp70, positions -158 to 0 upstream of the gene, with sites labeled TATA, SP1, CCAAT, AP2, HSE, AP2, CCAAT, SP1.]

13 Gene expression. DNA is transcribed into RNA, and RNA is translated into protein.
DNA: CCTGAGCCAACTATTGATGAA
RNA: CCUGAGCCAACUAUUGAUGAA
Protein: PEPTIDE
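The example on this slide can be reproduced in a few lines. The sketch below is illustrative only: it assumes the DNA shown is the coding strand read in frame 0, and the function names `transcribe` and `translate` are introduced here, not taken from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(dna: str) -> str:
    """Coding-strand DNA -> mRNA: replace T with U."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate an mRNA in frame 0, stopping at the first stop codon."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "CCTGAGCCAACTATTGATGAA"
mrna = transcribe(dna)
print(mrna)             # CCUGAGCCAACUAUUGAUGAA
print(translate(mrna))  # PEPTIDE
```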

14 The Genetic Code

15 Eukaryotes vs Prokaryotes Eukaryotic cells are characterized by membrane-bound compartments, which are absent in prokaryotes. “Typical” human & bacterial cells drawn to scale. BIOS Scientific Publishers Ltd, 1999 Brown Fig 2.1

16 Prokaryotic genes – searching for ORFs. - Small genomes have high gene density (Haemophilus influenzae: 85% genic). - No introns. - Operons: one transcript, many genes. - Open reading frame (ORF): a contiguous set of codons that starts with a Met codon (ATG) and ends with a stop codon.

17 Example of ORFs. Each sequence has six possible reading frames: three on each strand, one set for each direction of transcription.
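A six-frame ORF scan can be sketched as follows. This is a minimal illustration, not a real gene finder: it takes the first ATG in a frame as the start, reports coordinates relative to the strand being scanned, and the `min_codons` threshold is an arbitrary assumption.

```python
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna: str, min_codons: int = 2):
    """Return (strand, frame, start, end, sequence) for each ATG...stop ORF.

    start and end are 0-based offsets in the strand that was scanned
    (the input for '+', its reverse complement for '-').
    """
    orfs = []
    forward = dna.upper()
    for strand, seq in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, seq[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


for orf in find_orfs("CCATGAAATAGGCTATTTCATAA"):
    print(orf)
```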

19 Gene structure. [Figure: exon 1, intron 1, exon 2, intron 2, exon 3; steps: transcription, splicing, translation.] Exon = protein-coding; intron = non-coding. Codon: a triplet of nucleotides that is converted to one amino acid.

20 Gene structure. [Figure: exon 1, intron 1, exon 2, intron 2, exon 3; steps: transcription, splicing, translation.] Exon = coding; intron = non-coding.

21 Finding genes. [Figure: gene drawn 5’ to 3’ with start codon ATG, exon 1, intron 1, exon 2, intron 2, exon 3, stop codon TAG/TGA/TAA, and splice sites.]

24 [Figure: sequence with highlighted signals: atg, tga, ggtgag, caggtg, cagatg, cagttg, caggcc, ggtgag.]

25 0. We can sequence the mRNA. Expressed Sequence Tag (EST) sequencing is expensive. It has a nonzero false positive rate (aberrant splicing). The method sequences all RNAs, not just those that code for proteins. It is difficult for rare genes (those that are expressed rarely or in low quantities). Still, this is an invaluable source of information (when available).

26 Biology of Splicing (http://genes.mit.edu/chris/)

27 1. Consensus splice sites (http://www-lmmb.ncifcrf.gov/~toms/sequencelogo.html) Donor: 7.9 bits Acceptor: 9.4 bits (Stephens & Schneider, 1996)
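The "bits" figures quoted here are information content summed over the positions of aligned sites, as in a sequence logo. A minimal sketch of that calculation follows, assuming equal-length aligned sites and a uniform background, and omitting the small-sample correction used in published logos. The toy alignments are made up and do not reproduce the 7.9 or 9.4 bit values.

```python
from collections import Counter
from math import log2


def information_content(sites):
    """Total information (bits) of equal-length aligned DNA sites.

    Each column contributes 2 - H(column), where H is the Shannon entropy
    of the observed base frequencies. No small-sample correction is applied.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = 0.0
    for column in zip(*sites):
        n = len(column)
        entropy = -sum((c / n) * log2(c / n) for c in Counter(column).values())
        total += 2.0 - entropy
    return total


# Hypothetical toy alignment of donor-like sites (not real data).
toy_donors = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
print(round(information_content(toy_donors), 2))
```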

28 2. Recognize “coding bias”. Each exon can be in one of three frames: the exon ag-gattacagattacagattaca-gtaag can be read in frame 0, frame 1, or frame 2. The frame of the next exon depends on how many nucleotides are left over from the previous exon. Codons “tag”, “tga”, and “taa” are STOP: no STOP codon appears in-frame until the end of the gene, and the absence of STOP is called an open reading frame (ORF). Different codons appear with different frequencies: coding bias.
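The leftover-nucleotide bookkeeping can be sketched as below. The convention used here (the number of bases of an incomplete codon carried into each exon) is an assumption; gene finders differ in how they number frames or phases. The exon lengths are hypothetical.

```python
def exon_phases(exon_lengths):
    """For each exon, return how many nucleotides of an incomplete codon
    are carried in from the upstream exons (0, 1, or 2)."""
    phases = []
    carried = 0
    for length in exon_lengths:
        phases.append(carried)
        carried = (carried + length) % 3  # leftover bases after this exon
    return phases


# Hypothetical coding exon lengths: 100 + 50 + 30 = 180 nt = 60 codons.
print(exon_phases([100, 50, 30]))  # [0, 1, 0]
```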

29 2. Recognize “coding bias”
Amino acid      SLC   DNA codons
Isoleucine      I     ATT, ATC, ATA
Leucine         L     CTT, CTC, CTA, CTG, TTA, TTG
Valine          V     GTT, GTC, GTA, GTG
Phenylalanine   F     TTT, TTC
Methionine      M     ATG
Cysteine        C     TGT, TGC
Alanine         A     GCT, GCC, GCA, GCG
Glycine         G     GGT, GGC, GGA, GGG
Proline         P     CCT, CCC, CCA, CCG
Threonine       T     ACT, ACC, ACA, ACG
Serine          S     TCT, TCC, TCA, TCG, AGT, AGC
Tyrosine        Y     TAT, TAC
Tryptophan      W     TGG
Glutamine       Q     CAA, CAG
Asparagine      N     AAT, AAC
Histidine       H     CAT, CAC
Glutamic acid   E     GAA, GAG
Aspartic acid   D     GAT, GAC
Lysine          K     AAA, AAG
Arginine        R     CGT, CGC, CGA, CGG, AGA, AGG
Stop codons     Stop  TAA, TAG, TGA
Can map the 61 non-stop codons to frequencies & take log-odds ratios.
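One way to turn codon frequencies into a log-odds score is sketched below. The training sequence, the uniform background, and the add-one pseudocount are all assumptions made for illustration. A real gene finder would estimate both distributions from annotated genes and from non-coding sequence.

```python
from collections import Counter
from itertools import product
from math import log2

STOPS = {"TAA", "TAG", "TGA"}
SENSE_CODONS = ["".join(c) for c in product("ACGT", repeat=3) if "".join(c) not in STOPS]


def codon_frequencies(in_frame_seqs, pseudocount=1.0):
    """Relative frequency of each of the 61 sense codons, with smoothing."""
    counts = Counter()
    for seq in in_frame_seqs:
        seq = seq.upper()
        counts.update(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    total = sum(counts[c] + pseudocount for c in SENSE_CODONS)
    return {c: (counts[c] + pseudocount) / total for c in SENSE_CODONS}


def coding_score(seq, coding_freq, background_freq):
    """Sum of log2 odds over in-frame codons; stop and non-ACGT codons are skipped."""
    seq = seq.upper()
    score = 0.0
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in coding_freq:
            score += log2(coding_freq[codon] / background_freq[codon])
    return score


# Hypothetical training data: one tiny "known coding" sequence.
coding = codon_frequencies(["ATGGCCGCCCTGGCCGAG"])
uniform = {c: 1 / len(SENSE_CODONS) for c in SENSE_CODONS}
print(coding_score("GCCCTGGCC", coding, uniform))
```

A positive score means the window uses codons that are more common in the coding model than in the background. Scoring the same window in all three frames gives a simple frame predictor.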

30 3. Genes are “conserved”

31 Approaches to gene finding. Homology: Procrustes. Ab initio: Genscan, Genie, GeneID. Comparative: TBLASTX, Rosetta. Hybrids: GenomeScan, GenieEST, Twinscan, SLAM…

32 HMMs for single species gene finding: Generalized HMMs

33 HMMs for gene finding. Sequence: GTCAGAGTAGCAAAGTAGACACTCCAGTAACGC. States: exon, intron, intergene.
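A toy version of this idea is Viterbi decoding over the three states named on the slide. All start, transition, and emission probabilities below are invented for illustration. They are not taken from the slides or from any trained gene finder, and a realistic model would need many more states (frames, splice signals, both strands).

```python
from math import log

STATES = ("intergene", "exon", "intron")

# Made-up parameters; every probability is nonzero so log() is defined.
START = {"intergene": 0.8, "exon": 0.1, "intron": 0.1}
TRANS = {
    "intergene": {"intergene": 0.90, "exon": 0.09, "intron": 0.01},
    "exon":      {"intergene": 0.05, "exon": 0.90, "intron": 0.05},
    "intron":    {"intergene": 0.01, "exon": 0.09, "intron": 0.90},
}
EMIT = {
    "intergene": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
    "exon":      {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "intron":    {"A": 0.25, "C": 0.15, "G": 0.15, "T": 0.45},
}


def viterbi(seq, start, trans, emit):
    """Most probable state path for seq under an HMM (log-space Viterbi)."""
    score = [{s: log(start[s]) + log(emit[s][seq[0]]) for s in STATES}]
    back = []
    for symbol in seq[1:]:
        prev = score[-1]
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + log(trans[p][s]))
            row[s] = prev[best] + log(trans[best][s]) + log(emit[s][symbol])
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):  # trace back from the best final state
        state = ptr[state]
        path.append(state)
    return path[::-1]


SEQ = "GTCAGAGTAGCAAAGTAGACACTCCAGTAACGC"
path = viterbi(SEQ, START, TRANS, EMIT)
print("".join(s[0] if s != "intron" else "n" for s in path))  # i/e/n labels
```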

34 GHMM for gene finding. [Figure: sequence TAAAAAAAAAAAAAAAATTTTTTTTTTTTTTTGGGGGGGGGGGGGGGCCCCCCC segmented into Exon 1, Exon 2, Exon 3, each with its own duration.]

35 Observed duration times

36 Better way to do it: negative binomial. EasyGene: prokaryotic gene-finder (Larsen TS, Krogh A). Negative binomial with n = 3.
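To see why a negative binomial is a better duration model than the geometric distribution implied by a single self-looping HMM state, compare the two directly. The sketch assumes one common parameterization: n identical states chained in series, each with self-loop probability `p_stay`. The value 0.9 is arbitrary, and this is not claimed to be EasyGene's exact formulation.

```python
from math import comb


def geometric_pmf(length, p_stay):
    """P(duration = length) for one state with a self-loop."""
    if length < 1:
        return 0.0
    return p_stay ** (length - 1) * (1 - p_stay)


def negative_binomial_pmf(length, p_stay, n=3):
    """P(total duration = length) for n identical self-looping states in series."""
    if length < n:
        return 0.0
    return comb(length - 1, n - 1) * p_stay ** (length - n) * (1 - p_stay) ** n


p = 0.9  # hypothetical self-loop probability
for length in (1, 5, 20, 50):
    print(length, geometric_pmf(length, p), negative_binomial_pmf(length, p))
```

The geometric distribution always peaks at length 1, whereas the chained version peaks at an intermediate length, which is closer in shape to observed gene and exon length distributions.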

37 Splice Site Models. WMM: weight matrix model = PSSM (Staden 1984). WAM: weight array model = 1st-order Markov (Zhang & Marr 1993). MDD: maximal dependence decomposition (Burge & Karlin 1997), a decision-tree-like algorithm that takes significant pairwise dependencies into account.
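The simplest of these models, the WMM, treats each position independently. A minimal sketch follows, assuming a uniform background of 0.25 per base and an add-one pseudocount. The training sites are made up, and the WAM and MDD models, which add dependencies between positions, are not shown.

```python
from math import log2


def build_wmm(sites, pseudocount=1.0):
    """Per-position base probabilities from aligned, equal-length sites."""
    wmm = []
    for column in zip(*sites):
        total = len(column) + 4 * pseudocount
        wmm.append({b: (column.count(b) + pseudocount) / total for b in "ACGT"})
    return wmm


def wmm_score(window, wmm, background=0.25):
    """Log2-odds of the window under the WMM versus a uniform background."""
    return sum(log2(col[base] / background) for col, base in zip(wmm, window))


def best_site(seq, wmm):
    """Start offset of the highest-scoring window in seq."""
    width = len(wmm)
    return max(range(len(seq) - width + 1),
               key=lambda i: wmm_score(seq[i:i + width], wmm))


# Hypothetical donor-like training sites (not real data).
training_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
wmm = build_wmm(training_sites)
print(best_site("CCCCAGGTAAGTCCCC", wmm))
```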

38 Splice site detection. [Figure: donor site, 5’ to 3’, by position.]

