The Sense of Sequense. Chris Evelo, BiGCaT Bioinformatics, Universiteit Maastricht.



The Sense of Sequense. Databases: what have we got to compare our sequences with? (Chris Evelo). Is that gene on my array? A simple question with a complicated answer (Gontran Zepeda). Annotating a full array: evading the EST trap and the Affymetrix challenge (Stan Gaj).

First questions to answer. How to sequence an entire genome? Typical errors? Why not start with chromosome 1? Is it useful?

How to sequence an entire genome

Example trace file. DNA sequence trace showing a portion of the nucleotide sequence of the gene encoding the envelope protein of the Human Immunodeficiency Virus, HIV-1.

Typical errors. Not all base/dye combinations have the same mobility (typically corrected by software). Quality is bad at the start and end of sequences. Separation is bad among the front runners. Peaks at the end are typically low and broad; as a result, multiple identical bases overlap.
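The bad quality at the start and end of a read is usually handled by trimming. The sketch below is one simple way to do that, not the method used in the presentation. It assumes each base comes with a Phred-style quality score (an assumption; the slide does not name a scoring scheme), and the function name `trim_ends` and the threshold of 20 are illustrative choices.

```python
def trim_ends(bases, quals, min_q=20):
    """Trim low-quality bases from both ends of a read.

    bases: string of base calls.
    quals: list of per-base quality scores (assumed Phred-style).
    min_q: hypothetical cut-off; Phred 20 corresponds to a 1% error rate.
    Returns the trimmed sequence plus the start and end indices kept.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    start, end = 0, len(bases)
    # Walk inwards from the start while quality is below the cut-off.
    while start < end and quals[start] < min_q:
        start += 1
    # Walk inwards from the end while quality is below the cut-off.
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return bases[start:end], start, end


read = "NNACGTACGTNN"
scores = [5, 8, 30, 35, 40, 40, 38, 36, 30, 25, 9, 4]
print(trim_ends(read, scores))
```

Only the ends are trimmed here; a low-quality base in the middle of the read is left in place, which is a deliberate simplification.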

Why not start with chromosome 1? …

Is it useful?

Are genome databases useful? We copied DNA to computer disks. Computers can read bits more easily than bases. But why read them? Or better, how do we read them? We need more information.

Gene expression. Figure: The transfer of information from DNA to protein. The transfer proceeds by means of an RNA intermediate called messenger RNA (mRNA). In procaryotic cells the process is simpler than in eucaryotic cells. In eucaryotes the coding regions of the DNA (in the exons, shown in color) are separated by noncoding regions (the introns). As indicated, these introns must be removed by an enzymatically catalyzed RNA-splicing reaction to form the mRNA. (Alberts et al., Molecular Biology of the Cell, 3rd edn.)
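The flow in the figure (DNA to pre-mRNA, splicing out the introns, then translation) can be sketched in a few lines. This is a toy illustration under stated assumptions: the gene sequence and the exon coordinates are invented for the example, the DNA is given as the coding strand, and the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Coding-strand DNA to pre-mRNA: same sequence with U instead of T."""
    return coding_dna.upper().replace("T", "U")


def splice(pre_mrna, exons):
    """Join the exons, given as (start, end) slices, dropping the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def translate(mrna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical gene: exon 1, a 12-base intron, exon 2.
gene = "ATGGCC" + "GTAAGTTTTCAG" + "AAATGA"
exons = [(0, 6), (18, 24)]

mrna = splice(transcribe(gene), exons)
print(mrna, translate(mrna))
```

In a procaryote the `splice` step would simply be skipped, which is the sense in which the process is simpler there.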

Three levels, and we need them all: DNA, mRNA and protein. Protein information comes from biochemistry and physiology. The main database is Swissprot (high quality, highly curated); the US has PIR. Hypothetical proteins: the main database is trEMBL. These databases are now combined in UniProt.

Swissprot

SwissProt

Three levels. DNA: genome data. mRNA: ?? Protein: Swissprot + trEMBL = UniProt.

mRNA. Measuring mRNA is easy: use the poly-A tail to isolate it, PCR and blot (use a primer if known), then clone and sequence. And what do you know then? “It’s an expressed sequence tag…”
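The poly-A tail that makes isolation possible in the lab also tends to show up at the end of the sequenced tag. As a rough sketch, and not something the slides describe, a read could be checked for a trailing run of A's like this. The function name `strip_poly_a` and the minimum run length of 10 are assumptions chosen for the example.

```python
import re


def strip_poly_a(read, min_len=10):
    """Remove a trailing poly-A run of at least min_len bases.

    Returns the read without the tail and the length of the tail removed
    (0 if no tail of sufficient length was found).
    """
    match = re.search(r"A{%d,}$" % min_len, read.upper())
    if match is None:
        return read, 0
    return read[:match.start()], match.end() - match.start()


print(strip_poly_a("GATTACAGGC" + "A" * 12))
print(strip_poly_a("ACGTAAAA"))
```

A short run of A's at the end of a read, as in the second call, is left alone because it could just as well be ordinary sequence.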

Three levels. DNA: genome data. mRNA: ESTs (EMBL). Protein: Swissprot + trEMBL = UniProt.

Annotate! DNA: genome data. mRNA: ESTs (EMBL), clustered in Unigene. Protein: Swissprot + trEMBL = UniProt.
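One way to picture the annotation step is as a lookup that groups ESTs by cluster and then attaches a protein entry where one is known. The sketch below is purely illustrative: every identifier in it (`EST_0001`, `CLUSTER_A`, `PROTEIN_X`) is a made-up placeholder, not a real EMBL, Unigene or UniProt accession, and the function `annotate` is a hypothetical helper.

```python
from collections import defaultdict


def annotate(est_to_cluster, cluster_to_protein):
    """Group ESTs per cluster and attach the linked protein, if any."""
    clusters = defaultdict(list)
    for est, cluster in est_to_cluster.items():
        clusters[cluster].append(est)
    return {
        cluster: {
            "ests": sorted(ests),
            # None marks a cluster with no known protein entry.
            "protein": cluster_to_protein.get(cluster),
        }
        for cluster, ests in clusters.items()
    }


# Placeholder identifiers, invented for the example.
est_to_cluster = {
    "EST_0001": "CLUSTER_A",
    "EST_0002": "CLUSTER_A",
    "EST_0003": "CLUSTER_B",
}
cluster_to_protein = {"CLUSTER_A": "PROTEIN_X"}

for cluster, info in annotate(est_to_cluster, cluster_to_protein).items():
    print(cluster, info)
```

Clusters without a protein entry stay in the result, since an expressed sequence with no known protein is still worth keeping track of.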