Genomics, Proteomics, and Bioinformatics Biology 224 Instructor: Tom Peavy January 29, 2008.

Presentation transcript:

Genomics, Proteomics, and Bioinformatics Biology 224 Instructor: Tom Peavy January 29, 2008

What is bioinformatics?
- Interface of biology and computers
- Analysis of genomes, genes, mRNA and proteins using computer algorithms and computer databases

What is Genomics? What is Proteomics? What is the Transcriptome?

What do you want out of this course?

Top ten challenges for bioinformatics
[1] Precise models of where and when transcription will occur in a genome (initiation and termination)
[2] Precise, predictive models of alternative RNA splicing
[3] Precise models of biological pathways; ability to predict cellular responses to external stimuli
[4] Determining protein:DNA, protein:RNA, protein:protein recognition codes
[5] Accurate ab initio protein structure prediction

Top ten challenges for bioinformatics (continued)
[6] Rational design of small molecule inhibitors of proteins
[7] Mechanistic understanding of protein evolution
[8] Mechanistic understanding of speciation
[9] Development of effective gene ontologies: systematic ways to describe gene and protein function
[10] Education: development of bioinformatics curricula
Source: Ewan Birney, Chris Burge, Jim Fickett

Themes throughout the course: gene/protein families
Retinol-binding protein 4 (RBP4)
- member of the lipocalin family
- small, abundant carrier protein
We will study it in a variety of contexts, including:
--homologs in various species
--sequence alignment
--gene expression
--protein structure
--phylogeny

[Figure: tool-users and tool-makers. Labels: bioinformatics, public health informatics, medical informatics; infrastructure, databases, algorithms]

[Figure: DNA, RNA, protein, phenotype. Associated resources: genomic DNA databases; cDNA, ESTs, UniGene; protein sequence databases]

There are three major public DNA databases
- GenBank: housed at NCBI (National Center for Biotechnology Information)
- EMBL: housed at EBI (European Bioinformatics Institute)
- DDBJ: housed in Japan

[Figure: Growth of GenBank. Axes: year; base pairs of DNA (billions); sequences (millions)]
Updated: >40 billion base pairs

Press Release (August 22, 2005)
- 100 gigabases of sequence data (NCBI, EMBL, & DDBJ)
- over 165,000 organisms

The growth of GenBank. The blue area shows the total number of bases including those from whole genome shotgun sequencing projects (WGS). The checkered area shows only the non-WGS portion. With release 149, the number of WGS bases exceeded the number of bases in the traditional GenBank divisions.

Go to NCBI website

PubMed is…
- National Library of Medicine's search service
- 12 million citations in MEDLINE
- links to participating online journals
- PubMed tutorial (via “Education” on side bar)

Entrez integrates… the scientific literature; DNA and protein sequence databases; 3D protein structure data; population study data sets; assemblies of complete genomes

Entrez is a search and retrieval system that integrates NCBI databases
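Entrez can also be queried from a program through NCBI's E-utilities web interface. The following is a minimal sketch that only builds an ESearch query URL and does not send it. The helper name `esearch_url` and the example query for RBP4 are illustrative assumptions, not part of the lecture.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(db, term, retmax=5):
    """Build (but do not send) an ESearch query URL for one Entrez database.

    db     -- Entrez database name, e.g. "protein", "nucleotide", "pubmed"
    term   -- Entrez query string, optionally with field tags in brackets
    retmax -- maximum number of record IDs to ask for
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)

# Hypothetical example query: human RBP4 records in the protein database.
url = esearch_url("protein", "RBP4[Gene Name] AND human[Organism]")
print(url)
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should return an XML list of matching record IDs; the same pattern applies to other Entrez databases by changing `db`.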

BLAST is…
- Basic Local Alignment Search Tool
- NCBI's sequence similarity search tool
- supports analysis of DNA and protein databases
- 80,000 searches per day
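To make "local alignment" concrete, here is a minimal sketch of the Smith-Waterman dynamic-programming score for two short sequences. This is not how BLAST itself works: BLAST uses a faster heuristic to approximate this kind of search against large databases. The scoring values (match +2, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    Only the score is computed (no traceback), keeping one row of the
    dynamic-programming table in memory at a time.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]  # column 0 is always 0 in a local alignment
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

# The two sequences share only the local region "ACGT".
print(local_alignment_score("TTTACGTAAA", "GGACGTCC"))
```

The score reflects only the best-matching region, which is why a local method can find a conserved domain inside two otherwise unrelated sequences.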

OMIM is…
- Online Mendelian Inheritance in Man
- catalog of human genes and genetic disorders
- edited by Dr. Victor McKusick and others at JHU

Books is… searchable resource of on-line books

TaxBrowser is…
- browser for the major divisions of living organisms (archaea, bacteria, eukaryota, viruses)
- taxonomy information such as genetic codes
- molecular data on extinct organisms

Structure site includes…
- Molecular Modeling Database (MMDB)
- biopolymer structures obtained from the Protein Data Bank (PDB)
- Cn3D (a 3D-structure viewer)
- Vector Alignment Search Tool (VAST)

Review of Genetics, Biochemistry & Evolution

Human Genome Project

What is a typical genomic structure for a eukaryotic gene?

Synonymous vs. nonsynonymous changes

Synonymous Substitution Non-synonymous Substitution
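The distinction can be checked mechanically: a substitution is synonymous if the codon before and after encodes the same amino acid. The sketch below builds the standard genetic code (DNA alphabet) and classifies a single-codon change. The function name `classify` and the example codons are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify(codon_before, codon_after):
    """Classify a codon change as synonymous or nonsynonymous.

    Returns (kind, amino_acid_before, amino_acid_after) using one-letter
    amino acid codes; '*' denotes a stop codon.
    """
    aa1 = CODON_TABLE[codon_before.upper()]
    aa2 = CODON_TABLE[codon_after.upper()]
    kind = "synonymous" if aa1 == aa2 else "nonsynonymous"
    return kind, aa1, aa2

print(classify("GAA", "GAG"))  # third-position change, both codons encode Glu
print(classify("GAG", "GTG"))  # second-position change, Glu to Val
```

Changes at the third codon position are often synonymous because of the redundancy of the code, while first- and second-position changes usually alter the amino acid.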

Central Dogma
DNA → RNA → protein
sequence → structure → function → evolution
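The first two steps of the central dogma can be sketched in a few lines: transcription copies the coding strand into mRNA (U in place of T), and translation reads the mRNA three bases at a time until a stop codon. The short input sequence is made up for illustration, and the sketch assumes the standard genetic code and a reading frame that starts at the first base.

```python
from itertools import product

# Standard genetic code in the RNA alphabet, bases in the order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Translate mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODONS[mrna[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)

dna = "ATGGAGTGGTAA"        # made-up example: Met-Glu-Trp-stop
mrna = transcribe(dna)
print(mrna, translate(mrna))
```

Real eukaryotic genes add steps that this sketch ignores, such as intron removal and other mRNA processing before translation.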

What kinds of modifications are made to eukaryotic mRNAs?

RNA Modifications

What are cDNAs?

Protein structures
Determined by X-ray crystallography and nuclear magnetic resonance (NMR)
- Primary structure: linear amino acid (AA) sequence
- Secondary structure: alpha helix and beta sheet
- Tertiary structure: 3-D fold that exposes binding domains, etc.

Linkage maps
YAC (yeast artificial chromosome) and BAC (bacterial artificial chromosome)
- used to clone large pieces of DNA
- overlapping clones
Are genes linked?

Organization of genomes
Groups of genes within a species
- Comparative genomics
Plastid genomes and mitochondrial (mt) genomes

How do we determine functions of genes?

Expression patterns
- Northerns
- RT-PCR
- SAGE
- Microarrays
Transgenics
- insert genes: what results?
Mutants
- classical genetics
- molecular genetics
And functional protein assays

Charles Darwin
Descent with modification
- species change through time and are related to a common ancestor
Natural selection is the process by which this change occurs

Understanding natural selection
Natural selection acts on individuals, though its consequences occur in populations
- an individual's phenotype is the reason it survived and reproduced
- after a time this will change the distribution in the population
- what ultimately changes? The gene pool
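How selection on individuals shifts the gene pool can be illustrated with the textbook one-locus, two-allele model. This is a sketch under strong assumptions (random mating, a very large population, constant fitnesses); the fitness values and starting frequency are invented for illustration and do not come from the lecture.

```python
def next_allele_freq(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection.

    p is the current frequency of A; w_AA, w_Aa, w_aa are the relative
    fitnesses of the three genotypes. Genotypes start each generation in
    Hardy-Weinberg proportions (p^2, 2pq, q^2).
    """
    q = 1.0 - p
    mean_w = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # A alleles come from all of the AA survivors and half of the Aa survivors.
    return (p * p * w_AA + p * q * w_Aa) / mean_w

# Hypothetical scenario: allele A is rare but slightly advantageous.
p = 0.10
for generation in range(50):
    p = next_allele_freq(p, w_AA=1.00, w_Aa=0.95, w_aa=0.90)
print(round(p, 3))
```

No individual changes its genotype in this model; only the proportions of alleles in the population change from one generation to the next.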

New alleles
A point change is all that is needed
- not always a "big deal": can be a neutral change
- can be significant, as in sickle cell anemia

Gene duplication creates an additional copy of a gene
- unequal cross-over
- X-rays
Are these duplicates maintained in populations?
- Pseudogenes

Polyploidy: additional set of chromosomes
- Found in plants
- Amphibians, invertebrates
Through a type of parthenogenesis
- Triploid
Poor fertility
Hybridization or meiosis malfunction

Homology
- literally, the study of likeness
- similarity between species (or genes) that results from inheritance of traits from a common ancestor
- unless a common ancestor is known, be careful when using this word

[Figure: Orthologous vs. paralogous genes. Tree diagram with labels: gene duplication, speciation, Species 1, Species 2]

Species
All organisms alive today can trace their ancestry back to the origin of life some 3.8 billion years ago
- since then, millions if not billions of branching events have occurred
Mechanisms have to be in place for change to occur
- genetic drift and natural selection