SS 2008, lecture 4, Biological Sequence Analysis: V4 Genome of Arabidopsis thaliana

Presentation transcript:

SS 2008, lecture 4, Biological Sequence Analysis

V4 Genome of Arabidopsis thaliana

Review of lecture V3:
- What are tandem repeats?
- How does one find CpG islands?
- What are Gardiner-Frommer and Takai-Jones parameters?
- Why do we need t-tests?
- What are the findings of (Hutter et al. 2006)?

Arabidopsis thaliana

Arabidopsis thaliana is a small flowering plant that is widely used as a model organism in plant biology. Arabidopsis is a member of the mustard (Brassicaceae) family, which includes cultivated species such as cabbage and radish. Arabidopsis is not of major agronomic significance, but it offers important advantages for basic research in genetics and molecular biology. (Source: TAIR)

Some useful statistics for Arabidopsis thaliana

- Small genome (114.5 Mb sequenced of 125 Mb total), sequenced in the year 2000.
- Extensive genetic and physical maps of all 5 chromosomes.
- A rapid life cycle (about 6 weeks from germination to mature seed).
- Prolific seed production and easy cultivation in restricted space.
- Efficient transformation methods utilizing Agrobacterium tumefaciens.
- A large number of mutant lines and genomic resources, many of which are available from Stock Centers.
- Multinational research community of academic, government and industry laboratories.

Such advantages have made Arabidopsis a model organism for studies of the cellular and molecular biology of flowering plants. TAIR collects and makes available the information arising from these efforts. (Source: TAIR)

Arabidopsis thaliana genome sequence

Representation of the Arabidopsis chromosomes. Sequenced portions are red, telomeric and centromeric regions are light blue, heterochromatic knobs are shown in black and the rDNA repeat regions are magenta. Left: DAPI-stained chromosomes.

- Gene density ('Genes') ranged from 38 per 100 kb to 1 gene per 100 kb.
- Expressed sequence tag matches ('ESTs') ranged from more than 200 per 100 kb to 1 per 100 kb.
- Transposable element densities ('TEs') ranged from 33 per 100 kb to 1 per 100 kb.
- Mitochondrial and chloroplast insertions ('MT/CP') were assigned black and green tick marks, respectively.
- Transfer RNAs and small nucleolar RNAs ('RNAs') were assigned black and red tick marks, respectively.

Nature 408, 796 (2000)
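The densities in the legend are counts per 100 kb window. As a minimal sketch of how such a track could be computed (not the procedure used in the Nature paper), the snippet below bins feature start coordinates into fixed windows. The function name `density_per_window`, the 1-based coordinate convention and the toy gene positions are assumptions made for illustration only.

```python
from collections import Counter

WINDOW = 100_000  # 100 kb, the window size quoted in the figure legend


def density_per_window(starts, chrom_length, window=WINDOW):
    """Count features whose start coordinate falls into each window.

    starts: iterable of 1-based start coordinates (assumed convention).
    chrom_length: chromosome length in bp.
    Returns a list with one count per window, in chromosome order.
    """
    n_windows = (chrom_length + window - 1) // window  # ceiling division
    counts = Counter((s - 1) // window for s in starts)
    return [counts.get(i, 0) for i in range(n_windows)]


# Hypothetical gene starts on a 350 kb toy chromosome (not real data).
gene_starts = [1_200, 45_000, 99_999, 100_001, 150_000, 320_000]
densities = density_per_window(gene_starts, chrom_length=350_000)
print(densities)
```

The same function can be reused for EST matches or transposable elements by passing a different list of start coordinates.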

Arabidopsis thaliana genome sequence

The proportion of Arabidopsis proteins having related counterparts in eukaryotic genomes varies by a factor of 2 to 3 depending on the functional category.

Only 8–23% of Arabidopsis proteins involved in transcription have related genes in other eukaryotic genomes, reflecting the independent evolution of many plant transcription factors. In contrast, 48–60% of genes involved in protein synthesis have counterparts in the other eukaryotic genomes, reflecting highly conserved gene functions.

The relatively high proportion of matches between Arabidopsis and bacterial proteins in the categories 'metabolism' and 'energy' reflects both the acquisition of bacterial genes from the ancestor of the plastid and high conservation of sequences across all species.

Finally, a comparison between unicellular and multicellular eukaryotes indicates that Arabidopsis genes involved in cellular communication and signal transduction have more counterparts in multicellular eukaryotes than in yeast, reflecting the need for sets of genes for communication in multicellular organisms.

Nature 408, 796 (2000)

Many genes were duplicated

Nature 408, 796 (2000)

Segmental duplication

Segmentally duplicated regions in the Arabidopsis genome. Individual chromosomes are depicted as horizontal grey bars (with chromosome 1 at the top); centromeres are marked in black. Coloured bands connect corresponding duplicated segments. Similarities between the rDNA repeats are excluded. Duplicated segments in reversed orientation are connected with twisted coloured bands.

Nature 408, 796 (2000)
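The figure distinguishes duplicated segments in direct orientation from those in reversed orientation (twisted bands). The sketch below shows one simple way such a call could be made, assuming we already have pairs of matching gene positions in the two copies. It is an illustrative heuristic, not the method of the Nature paper; `segment_orientation` and the anchor coordinates are hypothetical.

```python
def segment_orientation(anchors):
    """Classify a pair of duplicated segments as 'direct' or 'reversed'.

    anchors: list of (pos_a, pos_b) tuples, the positions of matching
    genes in copy A and copy B (hypothetical input format).
    Walking along copy A, we check whether positions in copy B mostly
    increase (direct) or mostly decrease (reversed).
    """
    ordered = sorted(anchors)  # sort by position in copy A
    steps = [b2 - b1 for (_, b1), (_, b2) in zip(ordered, ordered[1:])]
    up = sum(1 for d in steps if d > 0)
    down = sum(1 for d in steps if d < 0)
    if up == down:
        return "ambiguous"  # too few anchors or no clear trend
    return "direct" if up > down else "reversed"


# Toy anchors (not real data).
same_strand = [(10, 500), (20, 510), (30, 530)]
flipped = [(10, 900), (20, 850), (30, 800)]
print(segment_orientation(same_strand), segment_orientation(flipped))
```

A majority vote over consecutive anchors tolerates a few locally rearranged genes, which is why it is used here instead of requiring strictly monotonic positions.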

Membrane channels and transporters

Transporters in the plasma and intracellular membranes of Arabidopsis are responsible for the acquisition, redistribution and compartmentalization of organic nutrients and inorganic ions, as well as for the efflux of toxic compounds and metabolic end products, energy and signal transduction.

Unlike animals, which use a sodium ion P-type ATPase pump to generate an electrochemical gradient across the plasma membrane, plants and fungi use a proton P-type ATPase pump to form a large membrane potential. Consequently, plant secondary transporters are typically coupled to protons rather than to sodium.

- Almost half of the Arabidopsis channel proteins are aquaporins, which emphasizes the importance of hydraulics in a wide range of plant processes.
- Compared with other sequenced organisms, Arabidopsis has 10-fold more predicted peptide transporters, primarily of the proton-dependent oligopeptide transport (POT) family, emphasizing the importance of peptide transport or indicating that there is broader substrate specificity than previously realized.
- There are nearly 1,000 Arabidopsis genes encoding Ser/Thr protein kinases, suggesting that peptides may have an important role in plant signalling.

Nature 408, 796 (2000)

What is TAIR*?

- NSF-funded project begun in 1999
- Web resource for Arabidopsis data and stocks
- Literature-based manual annotation of gene function
- Genome annotation (gene structure, computational gene function)

* URL

The following slides were borrowed from a talk at the TAIR7 workshop by Eva Huala & Donghui Li.

Portals

Tools

Search

Names, Description

GO annotations, Expression

Sequences, Maps

Mutations, Seed lines

Seed lines, Links to other sites

Comments, References

GBrowse - coming soon

Overview of releases to date

- 26,819 protein-coding genes
- 3,866 alternatively spliced
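To put the two counts in relation, the short calculation below computes the share of protein-coding genes that are annotated as alternatively spliced. It assumes that the 3,866 alternatively spliced genes are a subset of the 26,819 protein-coding genes; the slide text does not state this explicitly, and the variable names are illustrative.

```python
# Counts as given on the slide.
protein_coding = 26_819
alt_spliced = 3_866  # assumed to be a subset of protein_coding

fraction = alt_spliced / protein_coding
print(f"{fraction:.1%} of protein-coding genes are alternatively spliced")
```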