Genomics, Proteomics, and Bioinformatics Biology 224 Instructor: Tom Peavy August 31, 2009.

1 Genomics, Proteomics, and Bioinformatics Biology 224 Instructor: Tom Peavy August 31, 2009

2 What is bioinformatics? The interface of biology and computers; analysis of genomes, genes, mRNA and proteins using computer algorithms and computer databases

3 What is Genomics? What is Proteomics? What is the Transcriptome?

4 On bioinformatics “Science is about building causal relations between natural phenomena (for instance, between a mutation in a gene and a disease). The development of instruments to increase our capacity to observe natural phenomena has, therefore, played a crucial role in the development of science - the microscope being the paradigmatic example in biology. With the human genome, the natural world takes an unprecedented turn: it is better described as a sequence of symbols. Besides high-throughput machines such as sequencers and DNA chip readers, the computer and the associated software becomes the instrument to observe it, and the discipline of bioinformatics flourishes.” Martin Reese and Roderic Guigó, Genome Biology 2006 7(Suppl I):S1, introducing EGASP, the Encyclopedia of DNA Elements (ENCODE) Genome Annotation Assessment Project

5 What do you want out of this course?

6 Themes throughout the course: gene/protein families. Retinol-binding protein 4 (RBP4): member of the lipocalin family; small, abundant carrier protein. We will study it in a variety of contexts including: homologs in various species; sequence alignment; gene expression; protein structure; phylogeny

7

8 [Diagram labels: tool-users; tool-makers; bioinformatics; public health informatics; medical informatics; infrastructure; databases; algorithms]

9 [Diagram labels: DNA; RNA; protein; phenotype; genomic DNA databases; cDNA, ESTs, UniGene; microarrays; protein sequence databases]

10 There are three major public DNA databases: GenBank (housed at NCBI, the National Center for Biotechnology Information); EMBL (housed at EBI, the European Bioinformatics Institute); DDBJ (housed in Japan)

11 Growth of GenBank [Figure: base pairs of DNA (billions) and sequences (millions) by year, 1982 to 2002] Updated 8-12-04: >40 billion base pairs

12 Growth of GenBank + Whole Genome Shotgun (1982-November 2008) [Figure: number of sequences in GenBank (millions), base pairs of DNA in GenBank (billions), and base pairs in GenBank + WGS (billions), by year, 1982 to 2007]

13 Taxonomy at NCBI: ~200,000 species are represented in GenBank. http://www.ncbi.nlm.nih.gov/Taxonomy/txstat.cgi (11/08)

14 The most sequenced organisms in GenBank: Homo sapiens 13.1 billion bases; Mus musculus 8.4b; Rattus norvegicus 6.1b; Bos taurus 5.2b; Zea mays 4.6b; Sus scrofa 3.6b; Danio rerio 3.0b; Oryza sativa (japonica) 1.5b; Strongylocentrotus purpuratus 1.4b; Nicotiana tabacum 1.1b. Updated 11-6-08, GenBank release 168.0. Excluding WGS, organelles, metagenomics

15 Go to NCBI website http://www.ncbi.nlm.nih.gov/

16 PubMed is… the National Library of Medicine's search service; 12 million citations in MEDLINE; links to participating online journals; PubMed tutorial (via “Education” on the side bar)

17 Entrez integrates… the scientific literature; DNA and protein sequence databases; 3D protein structure data; population study data sets; assemblies of complete genomes

18 Entrez is a search and retrieval system that integrates NCBI databases
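
As a rough illustration of searching one Entrez database programmatically, the sketch below builds a query URL for NCBI's E-utilities (ESearch). E-utilities are not mentioned on the slides, so treat their use here as an assumption; the function name `esearch_url` and the example search term are hypothetical. The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build (but do not send) an ESearch query URL for one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax}
    return EUTILS_BASE + "esearch.fcgi?" + urlencode(params)


# Hypothetical query: protein records for the course's example gene, RBP4.
url = esearch_url("protein", "RBP4[Gene Name] AND Homo sapiens[Organism]")
print(url)
```

Changing `db` (for example to `pubmed` or `nucleotide`) points the same search term at a different Entrez database, which is the integration the slide describes.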

19 BLAST is… the Basic Local Alignment Search Tool; NCBI's sequence similarity search tool; supports analysis of DNA and protein databases; 80,000 searches per day
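
To make "local alignment" concrete, here is a minimal sketch of Smith-Waterman scoring, the exact dynamic-programming method for local alignment. This is not BLAST itself: BLAST is a faster heuristic that approximates this kind of search. The scoring values (match 2, mismatch -1, gap -2) and the function name are assumptions chosen for illustration.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0: that is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared core "ACGT" is found even though the flanking bases differ.
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Only the score is computed here; recovering the aligned region itself would require keeping the full matrix and tracing back from the best cell.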

20 OMIM is… Online Mendelian Inheritance in Man; a catalog of human genes and genetic disorders; edited by Dr. Victor McKusick and others at JHU

21 Books is… searchable resource of on-line books

22 TaxBrowser is… a browser for the major divisions of living organisms (archaea, bacteria, eukaryota, viruses); taxonomy information such as genetic codes; molecular data on extinct organisms

23 Structure site includes… the Molecular Modeling Database (MMDB); biopolymer structures obtained from the Protein Data Bank (PDB); Cn3D (a 3D-structure viewer); the Vector Alignment Search Tool (VAST)

24 Review of Genetics, Biochemistry & Evolution

25 Human Genome Project

26 What is a typical genomic structure for a eukaryotic gene?

27

28

29

30

31 Synonymous vs. nonsynonymous changes

32 Synonymous Substitution Non-synonymous Substitution
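
The distinction can be sketched in a few lines: a substitution is synonymous if the codon before and after the change encodes the same amino acid, and nonsynonymous otherwise. The sketch assumes the standard genetic code and DNA codons on the coding strand; the function name `classify_substitution` is hypothetical.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(codon_before: str, codon_after: str):
    """Return (kind, amino acid before, amino acid after) for a codon change."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    kind = "synonymous" if before == after else "nonsynonymous"
    return kind, before, after


# GAG -> GAA: both codons encode glutamate (E), so the protein is unchanged.
print(classify_substitution("GAG", "GAA"))
# GAG -> GTG: glutamate (E) becomes valine (V), the change seen in sickle cell anemia.
print(classify_substitution("GAG", "GTG"))
```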

33 Central Dogma: DNA → RNA → protein sequence → structure → function → evolution
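
The first two steps of the central dogma can be sketched as string operations. This is a simplified illustration, assuming the standard genetic code, an input that is the coding strand read 5' to 3', and a reading frame that starts at the first base; the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(coding_dna: str) -> str:
    """DNA -> RNA: the mRNA matches the coding strand with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """RNA -> protein: read codons from the first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA codons
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":                    # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna, translate(mrna))
```

Real eukaryotic transcripts are also spliced and otherwise processed before translation, which this sketch ignores.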

34 What kinds of modifications are made to eukaryotic mRNAs?

35 RNA Modifications

36

37

38 What are cDNAs?

39 Protein structures: determined by X-ray crystallography and nuclear magnetic resonance (NMR). Primary structure: linear amino acid (AA) sequence. Secondary structure: alpha helix and beta sheet. Tertiary structure: 3-D fold that exposes binding domains, etc.

40

41 Linkage maps. YAC (yeast artificial chromosome) and BAC (bacterial artificial chromosome): used to clone large pieces of DNA; overlapping clones. Are genes linked?

42 Organization of genomes: groups of genes within a species; comparative genomics; plastid genomes and mitochondrial (mt) genomes

43

44 How do we determine functions of genes?

45 Expression patterns: Northerns; RT-PCR; SAGE; microarrays. Transgenics: insert genes; what results? Mutants: classical genetics; molecular genetics. And functional protein assays

46 Charles Darwin. Descent with modification: species change through time and are related to a common ancestor. Natural selection is the process by which this change occurs

47 Understanding natural selection: selection acts on individuals, though the consequences occur in populations. An individual's phenotype is the reason it survived and reproduced; after a time this changes the distribution in the population. What ultimately changes? The gene pool
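
How selection on individuals shifts the gene pool can be sketched with a textbook haploid selection model, which is an assumption added here and not taken from the slides. If allele A has frequency p and relative fitness w_A, and the alternative allele has fitness w_B, then the next generation's frequency is p' = p·w_A / (p·w_A + (1 − p)·w_B). The function names and fitness values below are hypothetical.

```python
def next_frequency(p: float, w_a: float, w_b: float) -> float:
    """One generation of haploid selection on an allele with frequency p."""
    mean_fitness = p * w_a + (1 - p) * w_b
    return p * w_a / mean_fitness


def simulate(p0: float, w_a: float, w_b: float, generations: int) -> list:
    """Return the allele frequency at generation 0, 1, ..., generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_a, w_b))
    return freqs


# A rare allele with a 5% fitness advantage (illustrative numbers only).
trajectory = simulate(0.01, 1.05, 1.00, 200)
print(f"start: {trajectory[0]:.3f}, after 200 generations: {trajectory[-1]:.3f}")
```

The model ignores genetic drift, diploidy and mutation; it only shows that a small, consistent advantage for individuals carrying an allele changes its frequency in the population over time.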

48 New alleles: a point change is all that is needed. It is not always a "big deal" (a neutral change), but it can be, as in sickle cell anemia

49 Gene duplication creates an additional copy of a gene: unequal cross-over; X-rays. Are these duplicates maintained in populations? Pseudogenes

50

51 Polyploidy: an additional set of chromosomes. Found in plants; also in amphibians and invertebrates, through a type of parthenogenesis. Triploid: poor fertility. Arises through hybridization or meiosis malfunction

52 Homology: the study of likeness (literal). Similarity between species (or genes) that results from inheritance of traits from a common ancestor. Unless a common ancestor is known, be careful when using this word

53 Orthologous vs. Paralogous Genes [Diagram labels: gene duplication; speciation; Species 1; Species 2]

54 Species. All organisms alive today can trace their ancestry back to the origin of life some 3.8 billion years ago; since then millions if not billions of branching events have occurred. Mechanisms have to be in place for change to occur: genetic drift and natural selection

