Genomics, Proteomics, and Bioinformatics Biology 224 Instructor: Tom Peavy August 31, 2009.



What is bioinformatics?
- The interface of biology and computers
- Analysis of genomes, genes, mRNA and proteins using computer algorithms and computer databases

What is Genomics? What is Proteomics? What is the Transcriptome?

On bioinformatics “Science is about building causal relations between natural phenomena (for instance, between a mutation in a gene and a disease). The development of instruments to increase our capacity to observe natural phenomena has, therefore, played a crucial role in the development of science - the microscope being the paradigmatic example in biology. With the human genome, the natural world takes an unprecedented turn: it is better described as a sequence of symbols. Besides high-throughput machines such as sequencers and DNA chip readers, the computer and the associated software becomes the instrument to observe it, and the discipline of bioinformatics flourishes.” Martin Reese and Roderic Guigó, Genome Biology (Suppl I):S1, introducing EGASP, the Encyclopedia of DNA Elements (ENCODE) Genome Annotation Assessment Project

What do you want out of this course?

Themes throughout the course: gene/protein families
Retinol-binding protein 4 (RBP4)
- member of the lipocalin family
- small, abundant carrier protein
We will study it in a variety of contexts including
- homologs in various species
- sequence alignment
- gene expression
- protein structure
- phylogeny

[Diagram: tool-users and tool-makers; bioinformatics, public health informatics, medical informatics; infrastructure, databases, algorithms]

DNA -> RNA -> protein -> phenotype
- DNA: genomic DNA databases
- RNA: cDNA, ESTs, UniGene, microarrays
- protein: protein sequence databases

There are three major public DNA databases
- GenBank: housed at NCBI (National Center for Biotechnology Information)
- EMBL: housed at EBI (European Bioinformatics Institute)
- DDBJ: housed in Japan

Growth of GenBank
[Figure: base pairs of DNA (billions) and sequences (millions), by year]
Updated: >40 billion base pairs

Growth of GenBank + Whole Genome Shotgun (1982-November 2008)
[Figure: number of sequences in GenBank (millions), base pairs of DNA in GenBank (billions), and base pairs in GenBank + WGS (billions)]

Taxonomy at NCBI: ~200,000 species are represented in GenBank

The most sequenced organisms in GenBank (billions of bases)
- Homo sapiens: 13.1
- Mus musculus: 8.4
- Rattus norvegicus: 6.1
- Bos taurus: 5.2
- Zea mays: 4.6
- Sus scrofa: 3.6
- Danio rerio: 3.0
- Oryza sativa (japonica): 1.5
- Strongylocentrotus purpuratus: 1.4
- Nicotiana tabacum: 1.1
Updated GenBank release
Excluding WGS, organelles, metagenomics

Go to NCBI website

PubMed is…
- National Library of Medicine's search service
- 12 million citations in MEDLINE
- links to participating online journals
- PubMed tutorial (via “Education” on side bar)

Entrez integrates…
- the scientific literature
- DNA and protein sequence databases
- 3D protein structure data
- population study data sets
- assemblies of complete genomes

Entrez is a search and retrieval system that integrates NCBI databases

BLAST is…
- Basic Local Alignment Search Tool
- NCBI's sequence similarity search tool
- supports analysis of DNA and protein databases
- 80,000 searches per day
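To make "local alignment" concrete, here is a minimal Python sketch of Smith-Waterman scoring, the exact dynamic-programming method that BLAST approximates with faster heuristics. It is not BLAST itself, and the scoring values (match, mismatch, gap) are illustrative assumptions rather than NCBI defaults.

```python
def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of two sequences.

    The scoring parameters are illustrative assumptions, not BLAST defaults.
    """
    cols = len(seq_b) + 1
    previous = [0] * cols          # scores for the previous row of the DP table
    best = 0
    for a in seq_a:
        current = [0] * cols
        for j, b in enumerate(seq_b, start=1):
            diagonal = previous[j - 1] + (match if a == b else mismatch)
            up = previous[j] + gap        # gap in seq_b
            left = current[j - 1] + gap   # gap in seq_a
            # A local alignment may restart anywhere, so scores never drop below 0.
            current[j] = max(0, diagonal, up, left)
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # The query is found intact inside the longer sequence: 4 matches x 2 = 8.
    print(local_alignment_score("ACGT", "TTACGTTT"))
```

Only two rows of the table are kept because the sketch reports the score, not the aligned positions; recovering the alignment itself would need the full table and a traceback.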

OMIM is…
- Online Mendelian Inheritance in Man
- catalog of human genes and genetic disorders
- edited by Dr. Victor McKusick, others at JHU

Books is… searchable resource of on-line books

TaxBrowser is…
- browser for the major divisions of living organisms (archaea, bacteria, eukaryota, viruses)
- taxonomy information such as genetic codes
- molecular data on extinct organisms

Structure site includes…
- Molecular Modeling Database (MMDB)
- biopolymer structures obtained from the Protein Data Bank (PDB)
- Cn3D (a 3D-structure viewer)
- Vector Alignment Search Tool (VAST)

Review of Genetics, Biochemistry & Evolution

Human Genome Project

What is a typical genomic structure for a eukaryotic gene?

Synonymous vs. nonsynonymous changes

Synonymous Substitution Non-synonymous Substitution
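A synonymous substitution leaves the encoded amino acid unchanged, while a non-synonymous substitution changes it. The Python sketch below classifies a single-codon change using the standard genetic code; the function name and the example codons are chosen here for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Label a codon change as synonymous or non-synonymous."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "non-synonymous"


if __name__ == "__main__":
    print(classify_substitution("GAG", "GAA"))  # Glu -> Glu
    print(classify_substitution("GAG", "GTG"))  # Glu -> Val
```

The first change sits in the third codon position, where the genetic code is most redundant; the second sits in the middle position and swaps glutamate for valine.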

Central Dogma
DNA -> RNA -> protein
sequence -> structure -> function -> evolution
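The first two steps of the central dogma can be sketched in a few lines of Python: transcription rewrites the coding strand with U in place of T, and translation reads the mRNA three bases at a time until a stop codon. The example sequence is made up for illustration, and the sketch ignores splicing, reading-frame detection and alternative genetic codes.

```python
from itertools import product

# Standard genetic code keyed by RNA codons ("*" marks a stop codon).
RNA_BASES = "UCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA matching a DNA coding strand (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGAGAAGTAA")  # hypothetical coding sequence
    print(mrna)             # AUGGAGAAGUAA
    print(translate(mrna))  # MEK
```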

What kinds of modifications are made to eukaryotic mRNAs?

RNA Modifications

What are cDNAs?

Protein structures
Determined by X-ray crystallography and nuclear magnetic resonance (NMR)
- Primary structure: linear amino acid sequence
- Secondary structure: alpha helix and beta sheet
- Tertiary structure: 3-D fold that exposes binding domains etc.

Linkage maps
YAC (yeast artificial chromosome) and BAC (bacterial artificial chromosome)
- used to clone large pieces of DNA
- overlapping clones
Are genes linked?

Organization of genomes
- Groups of genes within a species
- Comparative genomics
- Plastid genomes and mitochondrial (mt) genomes

How do we determine functions of genes?

Expression patterns
- Northerns
- RT-PCR
- SAGE
- Microarrays
Transgenics
- insert genes: what results?
Mutants
- classical genetics
- molecular genetics
And functional protein assays

Charles Darwin
Descent with modification
- species change through time and are related to a common ancestor
Natural selection is the process by which this change occurs

Understanding natural selection
Natural selection acts on individuals, though the consequences occur in populations
- An individual's phenotype is the reason it survived and reproduced
- After a time this will change the distribution in the population
- What ultimately changes? The gene pool
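How selection on individuals shifts the gene pool can be illustrated with a deliberately simple model, sketched below in Python. It assumes a haploid population with two alleles, constant relative fitnesses, and no drift or mutation; the fitness values are hypothetical.

```python
def next_allele_frequency(p, fitness_a, fitness_b):
    """One generation of haploid selection.

    p is the frequency of allele A; carriers of A and B leave offspring in
    proportion to fitness_a and fitness_b (hypothetical values).
    """
    mean_fitness = p * fitness_a + (1 - p) * fitness_b
    return p * fitness_a / mean_fitness


def simulate_selection(p0, fitness_a, fitness_b, generations):
    """Return the frequency of allele A at generation 0, 1, ..., generations."""
    frequencies = [p0]
    for _ in range(generations):
        frequencies.append(
            next_allele_frequency(frequencies[-1], fitness_a, fitness_b)
        )
    return frequencies


if __name__ == "__main__":
    # Allele A starts rare but gives a 10% reproductive advantage.
    for generation, p in enumerate(simulate_selection(0.01, 1.1, 1.0, 100)):
        if generation % 25 == 0:
            print(generation, round(p, 3))
```

No individual changes during its lifetime in this model; only the proportion of each allele in the next generation does, which is the sense in which the gene pool is what ultimately changes.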

New alleles
A point change is all that is needed
- not always a "big deal": it can be a neutral change
- it can be a big deal, as in sickle cell anemia

Gene duplication creates an additional copy of a gene
- unequal cross-over
- X-rays
Are these duplicates maintained in populations?
- Pseudogenes

Polyploidy: an additional set of chromosomes
- Found in plants
- Amphibians, invertebrates: through a type of parthenogenesis
- Triploid: poor fertility
Hybridization or meiosis malfunction

Homology
Literally, the study of likeness
Similarity between species (or genes) that results from inheritance of traits from a common ancestor
- Unless a common ancestor is known, be careful when using this word.

Orthologous vs Paralogous Genes
[Diagram: gene tree showing a gene duplication and a speciation event, with gene copies in Species 1 and Species 2]
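Two homologous genes are orthologs if they diverged at a speciation event and paralogs if they diverged at a gene duplication. The Python sketch below applies that definition to a small gene tree whose internal nodes are already labeled with their event; the tree, the gene names and the `Node` class are hypothetical, and inferring the event labels from real data is a separate problem not shown here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A gene-tree node; internal nodes carry 'speciation' or 'duplication'."""
    name: str
    event: Optional[str] = None
    children: list = field(default_factory=list)


def path_to(node, leaf_name, path=()):
    """Return the tuple of nodes from `node` down to the named leaf, or None."""
    path = path + (node,)
    if not node.children:
        return path if node.name == leaf_name else None
    for child in node.children:
        found = path_to(child, leaf_name, path)
        if found is not None:
            return found
    return None


def relationship(root, gene_a, gene_b):
    """Classify two distinct leaf genes by the event at their last common ancestor."""
    if gene_a == gene_b:
        raise ValueError("need two different genes")
    path_a, path_b = path_to(root, gene_a), path_to(root, gene_b)
    if path_a is None or path_b is None:
        raise ValueError("gene not found in tree")
    ancestor = root
    for node_a, node_b in zip(path_a, path_b):
        if node_a is not node_b:
            break
        ancestor = node_a
    return "orthologs" if ancestor.event == "speciation" else "paralogs"


# Hypothetical tree: one duplication, then a speciation in each copy's lineage.
TREE = Node("ancestral gene", "duplication", [
    Node("copy A", "speciation", [Node("species1_A"), Node("species2_A")]),
    Node("copy B", "speciation", [Node("species1_B"), Node("species2_B")]),
])

if __name__ == "__main__":
    print(relationship(TREE, "species1_A", "species2_A"))  # orthologs
    print(relationship(TREE, "species1_A", "species1_B"))  # paralogs
    print(relationship(TREE, "species1_A", "species2_B"))  # paralogs
```

The last case shows why species membership alone is not enough: genes from different species are still paralogs when the duplication predates the speciation.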

Species
All organisms alive today can trace their ancestry back to the origin of life some 3.8 billion years ago
- Since then millions if not billions of branching events have occurred
Mechanisms have to be in place for change to occur
- genetic drift and natural selection
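Genetic drift, unlike selection, changes allele frequencies by chance alone. A minimal Python sketch of the Wright-Fisher model illustrates this: each generation, every gene copy is drawn at random from the previous generation's frequencies. The population size, starting frequency and random seed are arbitrary assumptions, and the model ignores selection, mutation and migration.

```python
import random


def wright_fisher(p0, population_size, generations, seed=0):
    """Simulate drift of one allele in a diploid population of fixed size.

    Returns the allele frequency at generation 0, 1, ..., generations.
    The seed only makes the run repeatable; it has no biological meaning.
    """
    rng = random.Random(seed)
    gene_copies = 2 * population_size
    frequencies = [p0]
    p = p0
    for _ in range(generations):
        # Each copy in the new generation inherits the allele with probability p.
        count = sum(1 for _ in range(gene_copies) if rng.random() < p)
        p = count / gene_copies
        frequencies.append(p)
    return frequencies


if __name__ == "__main__":
    for size in (10, 1000):
        trajectory = wright_fisher(0.5, size, 50)
        print(size, [round(p, 2) for p in trajectory[::10]])
```

Running it with a small and a large population shows the usual pattern: frequencies wander far more in the small population, and once an allele reaches 0 or 1 it stays there.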