Monday, October 18, 1:43:47 PM Outline for today Lec 06


1 Outline for today
Gene Prediction: What are genes? Where are genes? Why do we care about a definition?
Prokaryotic vs. eukaryotic gene models
Introns/exons
Splicing
Alternative splicing
Genes-in-genes, genes-ad-genes
Multi-subunit proteins
Gene identification
Homology-based gene prediction
Similarity searches (e.g. BLAST, BLAT)
Genome browsers
RNA evidence (ESTs)
Ab initio gene prediction
Gene prediction programs: prokaryotes, eukaryotes
Promoter prediction
PolyA-signal prediction
Splice site, start/stop-codon predictions
Slide 132

2 Alternative splicing
Alternative splicing can be either constitutive or regulated.
Constitutive alternative splicing: more than one product is always made from a pre-mRNA.
Regulated alternative splicing: different forms of mRNA are produced at different times, under different conditions, or in different cell or tissue types.
Alternative splicing is regulated by activators and repressors.
The regulating sequences are exonic or intronic splicing enhancers (ESE, ISE) and silencers (ESS, ISS). Enhancers promote splicing and silencers repress it. Proteins that regulate splicing act by binding to these specific sites.
Mo Chen & James L. Manley (2009): Nature Reviews Molecular Cell Biology 10,
Slide 133

3 Alternative splicing
Alternative splicing can generate tens of thousands of mRNAs from a single primary transcript.
Alternative splicing generates segments of mRNA variability that can insert or remove amino acids, shift the reading frame, or introduce a termination codon.
The typical human gene contains an average of 8 exons.
Up to 59% of human genes generate multiple mRNAs by alternative splicing, and ∼80% of alternative splicing results in changes in the encoded protein.
A large fraction of alternative splicing undergoes cell-specific regulation, in which splicing pathways are modulated according to cell type, developmental stage, gender, or in response to external stimuli.
Figure: pre-mRNA exons 1 2 3 4 5; heart muscle mRNA exons 1 2 3 5; uterine muscle mRNA exons 1 3 4 5.
Slide 134
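The exon patterns on this slide (pre-mRNA 1 2 3 4 5, heart muscle 1 2 3 5, uterine muscle 1 3 4 5) can be illustrated with a minimal sketch. The exon sequences below are invented placeholders, since the slide gives only exon numbers, and `splice` is a hypothetical helper, not a model of the splicing machinery.

```python
# Hypothetical exon sequences; the slide gives exon numbers only.
PRE_MRNA_EXONS = {
    1: "ATGGCT",
    2: "GGTCCA",
    3: "TTCGAA",
    4: "CAGGTC",
    5: "AAGTGA",
}


def splice(exons, keep):
    """Join the selected exons in numeric order; all other exons are skipped."""
    return "".join(exons[n] for n in sorted(keep))


heart_mrna = splice(PRE_MRNA_EXONS, [1, 2, 3, 5])    # exon 4 skipped
uterine_mrna = splice(PRE_MRNA_EXONS, [1, 3, 4, 5])  # exon 2 skipped

print("heart  :", heart_mrna)
print("uterine:", uterine_mrna)
```

Both products come from the same exon set, so the difference between them lies entirely in which exons are retained.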

4 Alternative splicing
Alternative splicing is the process by which one gene produces more than one type of mRNA.
Figure: DNA to mRNA isoform proportions, influenced by environment and development. Cell type 1: 80% / 20%. Cell type 2: 10% / 90%. Cell type 3: absent / 100%.
The phenotype is determined by the proteome and transcriptome.
Selection acts on the phenotype and is blind to the genotype.
Therefore, two species or individuals that have different forms of a protein will be selected differently, even if the gene's DNA sequence is identical.
Slide 135

5 Alternative splicing
Alternative splicing can generate mRNAs encoding proteins with different, even opposite functions.
Figure: alternative splicing of the Fas apoptosis receptor. Labels: Fas pre-mRNA (exons 5, 6, 7; intron 1, intron 2); Fas ligand; Fas (membrane-associated); soluble Fas; apoptosis (+) / (-).
Therefore, understanding the mechanism of RNA splicing in normal cells, and how it is regulated in different tissues and at different stages of an organism's development, is essential for developing strategies to correct aberrant splicing in human pathologies.
Slide 136

6 Pathologies resulting from aberrant splicing
Pathologies resulting from aberrant splicing can be grouped in two major categories:
1. Mutations affecting a specific messenger RNA and disturbing its normal splicing pattern. Examples: β-Thalassemia, Duchenne Muscular Dystrophy, Cystic Fibrosis, Frasier Syndrome, Frontotemporal Dementia and Parkinsonism.
2. Mutations affecting proteins that are involved in splicing. Examples: Spinal Muscular Atrophy, Retinitis Pigmentosa, Myotonic Dystrophy.
Slide 137

7 Splice variant detection
PCR method: simple and sensitive, and accurate enough when a standard curve is used; however, only internal changes are detectable and it cannot be scaled up.
Capture probe: very sensitive and accurate, but probe design is complicated and the method is expensive.
Microarray method: can be scaled up to an entire genome (high throughput), so any type of splice variant is detectable, but it is not very accurate and is complex and expensive.
Figure labels: 1, 2A, 3, 2B.
Slide 138

8 Outline for today
Gene Prediction: What are genes? Where are genes? Why do we care about a definition?
Prokaryotic vs. eukaryotic gene models
Introns/exons
Splicing
Alternative splicing
Genes-in-genes, genes-ad-genes
Multi-subunit proteins
Gene identification
Homology-based gene prediction
Similarity searches (e.g. BLAST, BLAT)
Genome browsers
RNA evidence (ESTs)
Ab initio gene prediction
Gene prediction programs: prokaryotes, eukaryotes
Promoter prediction
PolyA-signal prediction
Splice site, start/stop-codon predictions
Slide 139

9 Bidirectional and partially overlapping genes
Not very common in the human genome.
They provide a possibility for common regulation of a gene pair.
Partially overlapping genes are usually encoded by opposite DNA strands.
Found in gene-dense regions, such as the HLA class III complex on 6p21.3.
They may represent a sense-antisense pair in which one gene encodes mRNA and the other is non-coding.
Slide 140

10 Genes within genes
Nested intronic genes: the neurofibromatosis gene (NF1) contains OMGP (oligodendrocyte myelin glycoprotein) and EVI2A and EVI2B, homologues of ecotropic viral integration sites in mouse.
Two overlapping genes are encoded by the same strand of mtDNA (a unique example): two independent AUG codons lie in different reading frames relative to each other, and the second stop codon is derived from TA plus an A from the poly-A tail.
Slide 141

11 Gene prediction Comparative Genomics
When we BLAST a sequence, is that comparative genomics?
Entire genome compared to other entire genomes.
Use information from many genomes to learn more about the individual genes.
What are some questions that comparative genomics can address?
How has the organism evolved?
What differentiates species?
Which genes are required for organisms to survive in a certain environment?
Which non-coding regions are important?
Slide 142

12 Gene prediction through comparative genomics
Different questions require different comparisons.
Highly similar (conserved) regions between two genomes are presumably useful to the organism; otherwise they would have diverged.
If genomes are too closely related, all regions are similar, not just genes.
If genomes are too far apart, analogous regions may be too dissimilar to be found.
Slide 143
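One simple way to look for conserved regions is to measure percent identity in windows along a pairwise alignment. The sketch below assumes two sequences that have already been aligned to equal length (with `-` for gaps); `window_identity` and the toy sequences are hypothetical illustrations, not a method taken from the lecture.

```python
def window_identity(aln_a, aln_b, window=10):
    """Percent identity in consecutive, non-overlapping windows.

    Both inputs must be pre-aligned strings of equal length; gap
    characters ('-') never count as matches.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    identities = []
    for i in range(0, len(aln_a), window):
        a, b = aln_a[i:i + window], aln_b[i:i + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        identities.append(100.0 * matches / len(a))
    return identities


# Toy alignment: the first window is fully conserved, the second has diverged.
profile = window_identity("ACGTACGTACGGGGGGGGGG",
                          "ACGTACGTACGGGGGTTTTT", window=10)
print(profile)
```

If the two genomes are very closely related, every window scores near 100% and conservation carries little signal; if they are too distant, no window stands out.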

13 Prokaryotic gene prediction
NCBI ORF Finder
ORF Finder identifies all possible ORFs in a DNA sequence by locating the standard and alternative stop and start codons.
The deduced amino acid sequences can then be used in BLAST searches against GenBank.
ORF Finder is also packaged in the sequence submission software Sequin.
NCBI ORF Finder identified 90 ORFs in Contig3 (28715 bp).
On its own, this method is still not a proper way to identify genes!
Slide 144
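The idea behind an ORF scan can be sketched in a few lines. This is a simplified illustration, not NCBI's implementation: it assumes only ATG as a start codon and the three standard stop codons, whereas ORF Finder also supports alternative codons. The function name `find_orfs` and the `min_len` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG"}  # assumption: standard start codon only

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_len=30):
    """Scan all six reading frames for start-to-stop ORFs.

    Returns (strand, frame, start, end, orf_seq) tuples. Coordinates are
    0-based, end-exclusive, and relative to the strand being scanned.
    Each ORF begins at the first start codon after the previous stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


for orf in find_orfs("CCATGAAATTTGGGTAACC", min_len=9):
    print(orf)
```

A scan like this reports every open frame above the length cutoff, which is why the ORF count for a contig can greatly exceed the number of real genes.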

14 Prokaryotic gene prediction
Gene calling anomalies
Short genes: a gene is called 'short' when it has been truncated significantly at the 5'-end. Such genes are significantly shorter than their homologs in other species. Often this truncation causes the loss of important functional domains, resulting in a theoretical loss of function of the gene.
Slide 145

15 Prokaryotic gene prediction
Gene calling anomalies
Long genes: a gene is called 'long' when it has been extended at the 5'-end. Such genes are significantly longer than their homologs in other species. A long gene can create overlaps with neighbouring features, with the result that neighbouring genes are called short or features in the flanking intergenic regions are missed.
Slide 146
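Because 'short' and 'long' calls are defined relative to homologs in other species, a first-pass check can compare a gene's length with the median homolog length. The sketch below is an assumption-laden illustration: the 0.8 and 1.2 cutoffs are arbitrary, `classify_length` is a hypothetical name, and a length ratio alone cannot show that the difference lies at the 5'-end (that would need an alignment).

```python
from statistics import median


def classify_length(gene_len, homolog_lens, low=0.8, high=1.2):
    """Flag a gene call as 'short', 'long', or 'normal' by length ratio.

    low/high are illustrative thresholds, not values from the lecture.
    """
    if not homolog_lens:
        return "no homologs"
    ratio = gene_len / median(homolog_lens)
    if ratio < low:
        return "short"
    if ratio > high:
        return "long"
    return "normal"


homologs = [900, 950, 1000]  # hypothetical homolog lengths in bp
print(classify_length(600, homologs))
print(classify_length(1300, homologs))
print(classify_length(940, homologs))
```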

16 Prokaryotic gene prediction
Gene calling anomalies
Unique gene: a gene is called 'unique' when it has no known homologs in other species. For such genes, BLAST comparisons at the amino acid level with genes in other organisms return no hits. Often, such a gene call is an anomaly which, in turn, causes other anomalies, e.g. neighbouring genes called short.
Example: DdesDRAFT_0263 is a unique gene. If DdesDRAFT_0264 were detected as a short gene, DdesDRAFT_0263 would actually be responsible for this short call.
Dubious (uncertain) gene: a gene called as unique that is too short to be a functional gene is classified as 'dubious'. In practice, very few (1-10) dubious genes are found in the gene calls.
When present, both unique and dubious genes are included when searching intergenic regions for missed genes.
Slide 147

17 Prokaryotic gene prediction
Gene calling anomalies
Split genes interrupted by frame-shifts and stop codons: a reported split gene could be a good gene that is interrupted by frame-shifts or stop codons. Such a gene is called as a series of consecutive smaller genes, all of which have many BLAST hits in common.
Example: split genes DdesDRAFT_1032 and DdesDRAFT_1033, interrupted by a frame-shift.
Slide 148
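The defining signal of a split gene, consecutive calls that share many BLAST hits, can be checked with a simple set intersection. In this sketch only the two gene names come from the slide; the subject IDs, the `min_shared` cutoff, and the function name `find_split_candidates` are hypothetical.

```python
def find_split_candidates(gene_order, blast_hits, min_shared=2):
    """Return adjacent gene pairs that share at least min_shared BLAST hits.

    gene_order: gene IDs in genomic order.
    blast_hits: dict mapping gene ID -> set of BLAST subject IDs.
    """
    candidates = []
    for left, right in zip(gene_order, gene_order[1:]):
        shared = blast_hits.get(left, set()) & blast_hits.get(right, set())
        if len(shared) >= min_shared:
            candidates.append((left, right, sorted(shared)))
    return candidates


# Hypothetical BLAST subject IDs; only the gene names come from the slide.
example_hits = {
    "DdesDRAFT_1032": {"subjA", "subjB", "subjC"},
    "DdesDRAFT_1033": {"subjB", "subjC", "subjD"},
}
split_pairs = find_split_candidates(
    ["DdesDRAFT_1032", "DdesDRAFT_1033"], example_hits
)
print(split_pairs)
```

Pairs flagged this way are only candidates; confirming a frame-shift or internal stop codon still requires inspecting the alignment.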

18 Prokaryotic gene prediction
Gene calling anomalies
Missed genes: gene prediction programs often miss genes.
Example: no genes had been predicted in the region between DdesDRAFT_0231 and DdesDRAFT_0232. However, an alignment of this region indicates the presence of a perfectly good gene.
Slide 149
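A search for missed genes usually starts by listing the intergenic regions large enough to hold one. The sketch below assumes 1-based inclusive coordinates and an arbitrary 300 bp minimum gap; the coordinates in the example are invented, and `intergenic_gaps` is a hypothetical helper.

```python
def intergenic_gaps(genes, min_gap=300):
    """List uncovered regions of at least min_gap bp between gene calls.

    genes: iterable of (gene_id, start, end), 1-based inclusive.
    Returns (left_id, right_id, gap_start, gap_end) tuples. Overlapping
    or nested calls are handled by tracking the furthest end seen so far.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    if not ordered:
        return []
    gaps = []
    furthest_id, furthest_end = ordered[0][0], ordered[0][2]
    for gene_id, start, end in ordered[1:]:
        if start - furthest_end - 1 >= min_gap:
            gaps.append((furthest_id, gene_id, furthest_end + 1, start - 1))
        if end > furthest_end:
            furthest_id, furthest_end = gene_id, end
    return gaps


# Hypothetical coordinates; only the gene names come from the slide.
calls = [("DdesDRAFT_0231", 100, 900), ("DdesDRAFT_0232", 2100, 3000)]
print(intergenic_gaps(calls))
```

Each reported region can then be aligned against known proteins to see whether it contains an unpredicted gene.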

19 Free gene prediction software
GeneMark: Georgia Institute of Technology, Atlanta, Georgia, USA.
GeneMark predicted 14 genes in Contig3 (28715 bp).
Slide 150

20 Free gene prediction software
Softberry (FGENESB): bacterial operon and gene prediction.
Softberry FGENESB predicted 6 genes in Contig3 (28715 bp).
Slide 151

21 Free gene prediction software
EasyGene: gene finding in prokaryotes (1.2b Server).
The EasyGene 1.2b Server predicted 6 genes in Contig3 (28715 bp).
Slide 152

22 Free gene prediction software
Glimmer: NCBI Microbial Genome Annotation Tools.
Glimmer predicted 81 genes in Contig3 (28715 bp).
Slide 153
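The counts reported for Contig3 across these slides can be put on a common scale by converting them to calls per kilobase. The numbers below are the ones quoted in the slides (including the 90 ORFs from NCBI ORF Finder); `density_per_kb` is a hypothetical helper, and the comparison says nothing about which tool is correct.

```python
CONTIG_LEN_BP = 28715  # Contig3 length as given in the slides

# Calls per tool as reported in the slides.
predictions = {
    "NCBI ORF Finder": 90,
    "GeneMark": 14,
    "FGENESB": 6,
    "EasyGene": 6,
    "Glimmer": 81,
}


def density_per_kb(count, length_bp=CONTIG_LEN_BP):
    """Number of calls per 1000 bp of sequence."""
    return count / (length_bp / 1000)


for tool, count in predictions.items():
    print(f"{tool:16s} {count:3d} calls  {density_per_kb(count):.2f} per kb")
```

The spread, from roughly 0.2 to more than 3 calls per kb on the same contig, shows how strongly the result depends on the tool and its settings.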

