1
SS 2008, lecture 4: Biological Sequence Analysis
V4 Genome of Arabidopsis thaliana
Review of lecture V3:
- What are tandem repeats?
- How does one find CpG islands?
- What are the Gardiner-Garden/Frommer and Takai/Jones parameters?
- Why do we need t-tests?
- What are the findings of Hutter et al. (2006)?
2
Arabidopsis thaliana
Arabidopsis thaliana is a small flowering plant that is widely used as a model organism in plant biology. Arabidopsis is a member of the mustard (Brassicaceae) family, which includes cultivated species such as cabbage and radish. Arabidopsis is not of major agronomic significance, but it offers important advantages for basic research in genetics and molecular biology.
Source: TAIR
3
Some useful statistics for Arabidopsis thaliana
- Small genome: 114.5 Mb of the 125 Mb total was sequenced in 2000.
- Extensive genetic and physical maps of all 5 chromosomes.
- A rapid life cycle (about 6 weeks from germination to mature seed).
- Prolific seed production and easy cultivation in restricted space.
- Efficient transformation methods utilizing Agrobacterium tumefaciens.
- A large number of mutant lines and genomic resources, many of which are available from Stock Centers.
- Multinational research community of academic, government and industry laboratories.
Such advantages have made Arabidopsis a model organism for studies of the cellular and molecular biology of flowering plants. TAIR collects and makes available the information arising from these efforts.
Source: TAIR
4
Arabidopsis thaliana genome sequence
Representation of the Arabidopsis chromosomes. Sequenced portions are red, telomeric and centromeric regions are light blue, heterochromatic knobs are shown in black and the rDNA repeat regions are magenta. Left: DAPI-stained chromosomes.
- Gene density (`Genes') ranged from 38 per 100 kb to 1 per 100 kb.
- Expressed sequence tag matches (`ESTs') ranged from more than 200 per 100 kb to 1 per 100 kb.
- Transposable element densities (`TEs') ranged from 33 per 100 kb to 1 per 100 kb.
- Mitochondrial and chloroplast insertions (`MT/CP') were assigned black and green tick marks, respectively.
- Transfer RNAs and small nucleolar RNAs (`RNAs') were assigned black and red tick marks, respectively.
Nature 408, 796 (2000)
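The densities in the legend above are counts per 100 kb window. A minimal sketch of how such a track could be computed is given below; the function name, the input format (a list of 0-based feature start coordinates) and the toy coordinates are assumptions made for illustration, not the procedure used in the Nature paper.

```python
from collections import Counter

WINDOW = 100_000  # 100 kb, the window size quoted in the figure legend


def density_per_window(starts, chrom_length, window=WINDOW):
    """Count features per fixed-size window along one chromosome.

    starts: iterable of 0-based feature start coordinates (hypothetical input).
    Returns a list with one count per window, in chromosome order.
    """
    n_windows = (chrom_length + window - 1) // window  # ceiling division
    # Assign each feature to the window that contains its start coordinate.
    counts = Counter(pos // window for pos in starts if 0 <= pos < chrom_length)
    return [counts.get(i, 0) for i in range(n_windows)]


# Toy coordinates, not real Arabidopsis annotations.
toy_gene_starts = [5_000, 42_000, 99_999, 100_000, 250_000]
print(density_per_window(toy_gene_starts, chrom_length=300_000))
```

Assigning a feature to a window by its start coordinate is the simplest choice; features that span a window boundary could instead be counted by midpoint or by overlap.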
5
Arabidopsis thaliana genome sequence
The proportion of Arabidopsis proteins having related counterparts in eukaryotic genomes varies by a factor of 2 to 3 depending on the functional category.
Only 8-23% of Arabidopsis proteins involved in transcription have related genes in other eukaryotic genomes, reflecting the independent evolution of many plant transcription factors. In contrast, 48-60% of genes involved in protein synthesis have counterparts in the other eukaryotic genomes, reflecting highly conserved gene functions.
The relatively high proportion of matches between Arabidopsis and bacterial proteins in the categories `metabolism' and `energy' reflects both the acquisition of bacterial genes from the ancestor of the plastid and high conservation of sequences across all species.
Finally, a comparison between unicellular and multicellular eukaryotes indicates that Arabidopsis genes involved in cellular communication and signal transduction have more counterparts in multicellular eukaryotes than in yeast, reflecting the need for sets of genes for communication in multicellular organisms.
Nature 408, 796 (2000)
6
Many genes were duplicated
Nature 408, 796 (2000)
7
Segmental duplication
Segmentally duplicated regions in the Arabidopsis genome. Individual chromosomes are depicted as horizontal grey bars (with chromosome 1 at the top); centromeres are marked in black. Coloured bands connect corresponding duplicated segments. Similarities between the rDNA repeats are excluded. Duplicated segments in reversed orientation are connected with twisted coloured bands.
Nature 408, 796 (2000)
8
Membrane channels and transporters
Transporters in the plasma and intracellular membranes of Arabidopsis are responsible for the acquisition, redistribution and compartmentalization of organic nutrients and inorganic ions, as well as for the efflux of toxic compounds and metabolic end products, energy and signal transduction.
Unlike animals, which use a sodium ion P-type ATPase pump to generate an electrochemical gradient across the plasma membrane, plants and fungi use a proton P-type ATPase pump to form a large membrane potential. Consequently, plant secondary transporters are typically coupled to protons rather than to sodium.
- Almost half of the Arabidopsis channel proteins are aquaporins, which emphasizes the importance of hydraulics in a wide range of plant processes.
- Compared with other sequenced organisms, Arabidopsis has 10-fold more predicted peptide transporters, primarily of the proton-dependent oligopeptide transport (POT) family, emphasizing the importance of peptide transport or indicating that there is broader substrate specificity than previously realized.
- Nearly 1,000 Arabidopsis genes encode Ser/Thr protein kinases, suggesting that peptides may have an important role in plant signalling.
Nature 408, 796 (2000)
9
What is TAIR*?
- NSF-funded project begun in 1999
- Web resource for Arabidopsis data and stocks
- Literature-based manual annotation of gene function
- Genome annotation (gene structure, computational gene function)
* URL
The following slides were borrowed from a talk at the TAIR7 workshop by Eva Huala & Donghui Li.
10
Portals
11
Tools
12
Search
14
Names, Description
15
GO annotations, Expression
16
Sequences, Maps
17
Mutations, Seed lines
18
Seed lines, Links to other sites
22
Comments, References
27
GBrowse - coming soon
28
Overview of releases to date
- 26,819 protein coding genes
- 3,866 alternatively spliced
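To put the two counts above in proportion, the short calculation below derives the share of alternatively spliced genes. It assumes that the 3,866 alternatively spliced genes are a subset of the 26,819 protein coding genes, which the slide does not state explicitly.

```python
# Counts taken from the slide; the subset relationship is an assumption.
protein_coding_genes = 26_819
alternatively_spliced = 3_866

fraction = alternatively_spliced / protein_coding_genes
print(f"{fraction:.1%} of protein coding genes are alternatively spliced")
```

Under that assumption, roughly one in seven annotated protein coding genes has more than one splice variant.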